A bilingual benchmark for evaluating large language models
This work introduces a new benchmark for the bilingual evaluation of large language models (LLMs) in English and Arabic. While LLMs have transformed various fields, their evaluation in Arabic remains limited. This work addresses that gap by proposing a novel evaluation method for LLMs in both Arabic...
| Main Author: | Mohamed Alkaoud |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | PeerJ Inc., 2024-02-01 |
| Series: | PeerJ Computer Science |
| Subjects: | |
| Online Access: | https://peerj.com/articles/cs-1893.pdf |
Similar Items

- A Comprehensive Study of ChatGPT: Advancements, Limitations, and Ethical Considerations in Natural Language Processing and Cybersecurity
  by: Moatsum Alawida, et al.
  Published: (2023-08-01)
- A Survey on Large Language Model (LLM) Security and Privacy: The Good, The Bad, and The Ugly
  by: Yifan Yao, et al.
  Published: (2024-06-01)
- Large language models and political science
  by: Mitchell Linegar, et al.
  Published: (2023-10-01)
- Evaluating the Utility of a Large Language Model in Answering Common Patients’ Gastrointestinal Health-Related Questions: Are We There Yet?
  by: Adi Lahat, et al.
  Published: (2023-06-01)
- A Mathematical Investigation of Hallucination and Creativity in GPT Models
  by: Minhyeok Lee
  Published: (2023-05-01)