A bilingual benchmark for evaluating large language models
This work introduces a new benchmark for the bilingual evaluation of large language models (LLMs) in English and Arabic. While LLMs have transformed many fields, their evaluation in Arabic remains limited. This work addresses that gap by proposing a novel evaluation method for LLMs in both Arabic...
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | PeerJ Inc., 2024-02-01 |
| Series: | PeerJ Computer Science |
| Subjects: | |
| Online Access: | https://peerj.com/articles/cs-1893.pdf |