Meta has distinguished itself positively by releasing three generations of Llama, a semi-open LLM with weights available if you ask nicely (and provide your full legal name, date of birth, and full organization name with all corporate identifiers). So no, it’s not open source. Anyway, on Saturday (!), April 5, Meta released Llama 4.

This is a draft. Come back later for the final version.

Source: https://x.com/burkov/status/1909088837554291049

LM Arena

As has become standard practice, Meta tested the model anonymously on LM Arena before release. The model ended up second on the leaderboard, which is great, and this is where the controversy starts.

LM Arena is the most popular online benchmark, and it releases some conversations along with the associated user preferences. The popularity gives companies the motive to overfit the benchmark, and the released data gives them the means. If you look at the leaderboard, about half of the models there are marked as “experimental”, “preview”, or something like that. This might well mean that what you get in normal use is not what you get on LM Arena. People usually don’t pay much attention to this when a model delivers. Llama 4 is exceptional in that it does not deliver.

By the way, that is not to say that the “experimental” Llama 4 is good. It’s just yappy, explains QKV in transformers through family reunions, and also hallucinates.

Mistakes happen

On release, Meta published a chart of various models’ performance vs. price, with Llama on top. They forgot, however, to put Gemini 2.5 on the chart. They probably ran out of vertical space and Gemini 2.5 just didn’t fit, being about 20 points above Llama. Honest mistakes like this happen, honestly, even to honest people, you know.

Source: https://x.com/AIatMeta/status/1908618302676697317

Early nail in the coffin

On April 8, LM Arena tweeted this:

Meta’s interpretation of our policy did not match what we expect from model providers.
Meta should have made it clearer that “Llama-4-Maverick-03...