Large Language Model Routing with Benchmark Datasets

Bibliographic Details
Main Author:	Ou, Anthony C.
Other Authors:	Thompson, Neil
Format:	Thesis
Published:	Massachusetts Institute of Technology 2024
Online Access:	https://hdl.handle.net/1721.1/153846

collection	MIT
description	The number of open-source Large Language Models (LLMs), and of benchmark datasets for comparing them, is growing rapidly. Although some models dominate these benchmarks, no single model typically achieves the best accuracy across all tasks and use cases, so it can be difficult to determine which LLM is best suited to a new dataset. This work addresses the challenges of selecting the best LLM from a collection of models for a new task. To do so, benchmark datasets are repurposed to learn a “router” model for this selection, such that the “router” model solves a collection of binary classification tasks. The work demonstrates the utility and limitations of learning model routers from various benchmark datasets, with performance improving on that of any single model used for all tasks.
first_indexed	2024-09-23T08:45:34Z
format	Thesis
id	mit-1721.1/153846
institution	Massachusetts Institute of Technology
last_indexed	2024-09-23T08:45:34Z
publishDate	2024
publisher	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/153846 2024-03-22T04:06:30Z Large Language Model Routing with Benchmark Datasets Ou, Anthony C. Thompson, Neil Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science M.Eng. 2024-03-21T19:10:03Z 2024-03-21T19:10:03Z 2024-02 2024-03-04T16:38:12.047Z Thesis https://hdl.handle.net/1721.1/153846 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
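
The description above says the router solves a collection of binary classification tasks in order to pick an LLM for a new task. The following is a minimal sketch of that general idea, not the thesis's actual method: it assumes one binary "will this model answer correctly?" classifier per model, here a simple k-nearest-neighbour estimate over benchmark inputs. The feature vectors, the model names (model_a, model_b), the benchmark records, and the helper names predict_correct and route are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical benchmark records (assumption, not data from the thesis):
# (input features, {model name: 1 if the model answered correctly, else 0}).
BENCHMARK = [
    ((0.9, 0.1), {"model_a": 1, "model_b": 0}),
    ((0.8, 0.2), {"model_a": 1, "model_b": 0}),
    ((0.7, 0.3), {"model_a": 1, "model_b": 1}),
    ((0.2, 0.8), {"model_a": 0, "model_b": 1}),
    ((0.1, 0.9), {"model_a": 0, "model_b": 1}),
    ((0.3, 0.7), {"model_a": 1, "model_b": 1}),
]


def predict_correct(features, model, k=3):
    """One binary classifier per model: k-NN estimate of P(model is correct on this input)."""
    nearest = sorted(BENCHMARK, key=lambda rec: math.dist(features, rec[0]))[:k]
    return sum(labels[model] for _, labels in nearest) / k


def route(new_inputs, models=("model_a", "model_b")):
    """Score each model by its mean predicted correctness on the new task's inputs."""
    scores = {
        m: sum(predict_correct(x, m) for x in new_inputs) / len(new_inputs)
        for m in models
    }
    # The routed model is the one with the highest estimated accuracy.
    return max(scores, key=scores.get), scores


# Inputs from a hypothetical new task that resembles the second half of the benchmark.
new_task = [(0.15, 0.85), (0.25, 0.75), (0.05, 0.95)]
best, scores = route(new_task)
print(best, scores)
```

On this toy data the sketch selects model_b, because the benchmark inputs nearest to the new task are ones that model_b answered correctly. A learned router would presumably replace the hand-made features with text embeddings and the k-NN estimate with a trained classifier; both substitutions are assumptions here.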
