[Submitted on 6 Apr 2024 (v1), last revised 16 Jun 2024 (this version, v3)]

Title: PhyloLM : Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks

Authors: Nicolas Yax and 2 other authors

Abstract: This paper introduces PhyloLM, a method that adapts phylogenetic algorithms to Large Language Models (LLMs) to explore whether and how they relate to each other and to predict their performance characteristics. Our method calculates a phylogenetic distance metric based on the similarity of LLMs' outputs. The resulting metric is then used to construct dendrograms, which satisfactorily capture known relationships across a set of 111 open-source and 45 closed models. Furthermore, our phylogenetic distance predicts performance in standard benchmarks, thus demonstrating its functional validity and paving the way for a time- and cost-effective estimation of LLM capabilities. To sum up, by translating population genetics concepts to machine learning, we propose and validate a tool to evaluate LLM development, relationships and capabilities, even in the absence of transparent training information.

Submission history
From: Nicolas Yax
[v1] Sat, 6 Apr 2024 16:16:30 UTC (4,036 KB)
[v2] Thu, 23 May 2024 16:03:29 UTC (11,764 KB)
[v3] Sun, 16 Jun 2024 14:39:20 UTC (11,764 KB)
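The pipeline the abstract describes (a distance computed from the similarity of model outputs, then a dendrogram built from those distances) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the similarity is a normalized overlap of token frequencies on shared prompts, the distance is its negative logarithm, the tree is built by plain average-linkage clustering, and the model names and sampled tokens are made up.

```python
import math
from collections import Counter


def token_frequencies(completions):
    """Map each prompt to the relative frequency of each sampled token."""
    freqs = {}
    for prompt, tokens in completions.items():
        counts = Counter(tokens)
        total = sum(counts.values())
        freqs[prompt] = {tok: n / total for tok, n in counts.items()}
    return freqs


def output_distance(completions_a, completions_b):
    """Assumed distance: -log of a normalized token-frequency overlap."""
    fa = token_frequencies(completions_a)
    fb = token_frequencies(completions_b)
    shared = fa.keys() & fb.keys()
    if not shared:
        raise ValueError("the two models share no prompts")
    cross = sum(p * fb[g].get(tok, 0.0) for g in shared for tok, p in fa[g].items())
    norm_a = sum(p * p for g in shared for p in fa[g].values())
    norm_b = sum(p * p for g in shared for p in fb[g].values())
    # Clamp to 1.0 to absorb floating-point rounding (the ratio is at most 1).
    similarity = min(1.0, cross / math.sqrt(norm_a * norm_b))
    return math.inf if similarity == 0 else -math.log(similarity)


def build_dendrogram(names, distance):
    """Average-linkage clustering; returns the tree as nested tuples."""
    clusters = [(name, [name]) for name in names]  # (subtree, leaf names)
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i][1] for b in clusters[j][1]]
                d = sum(distance[frozenset(p)] for p in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical data: four sampled next tokens per prompt for three toy models.
samples = {
    "base": {"p1": ["a", "a", "b", "a"], "p2": ["x", "y", "x", "x"]},
    "finetune": {"p1": ["a", "a", "b", "b"], "p2": ["x", "x", "x", "y"]},
    "other": {"p1": ["c", "c", "d", "c"], "p2": ["z", "z", "y", "z"]},
}
names = list(samples)
distances = {
    frozenset((a, b)): output_distance(samples[a], samples[b])
    for idx, a in enumerate(names)
    for b in names[idx + 1:]
}
tree = build_dendrogram(names, distances)
```

In this toy setup the two models with similar token distributions are merged first and the unrelated one joins last, which is the kind of grouping a dendrogram is meant to expose. A real analysis would need many prompts, many samples per prompt, and the distance actually defined in the paper.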