Apple just released a weirdly interesting coding language model

https://news.ycombinator.com/rss Hits: 4
Summary

Apple quietly dropped a new AI model on Hugging Face with an interesting twist. Instead of writing code the way traditional LLMs generate text (left to right, top to bottom), it can also write out of order and improve multiple chunks at once. The result is faster code generation, with performance that rivals top open-source coding models. Here’s how it works.

The nerdy bits

Here are some (overly simplified, in the name of efficiency) concepts that are important to understand before we can move on.

Autoregression

Traditionally, most LLMs have been autoregressive. This means that when you ask them something, they process your entire question, predict the first token of the answer, reprocess the entire question together with that first token, predict the second token, and so on. This makes them generate text the way most of us read: left to right, top to bottom.

Temperature

LLMs have a setting called temperature that controls how random the output can be. When predicting the next token, the model assigns probabilities to all possible options. A lower temperature makes it more likely to choose the most probable token, while a higher temperature gives it more freedom to pick less likely ones.

Diffusion

An alternative to autoregressive models is diffusion models, which have more often been used for image generation, as in Stable Diffusion. In a nutshell, the model starts with a fuzzy, noisy image and iteratively removes the noise while keeping the user request in mind, steering the result towards something that looks more and more like what the user asked for.

[Image: Diffusion model processes moving to and from data and noise. Credit: NVIDIA]

Still with us? Great! Lately, some large language models have looked to the diffusion architecture to generate text, and the results have been pretty promising. If you want to dive deeper into how it works, here’s a great explainer:

Why am I telling you all this?
Because now you can see why diffusion-based text models can be faster than autoregressive ones, since they...
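
To make the autoregression and temperature ideas above concrete, here is a minimal Python sketch. It is not Apple's model or any real library's API: `toy_logits`, `VOCAB`, and `TARGET` are hypothetical stand-ins for a trained network, invented purely for illustration. The loop shows the two properties described in the summary: tokens are produced one at a time, with the whole context re-read at every step, and temperature rescales the scores before a token is picked.

```python
import math
import random

# Hypothetical toy vocabulary and "ideal" answer; a real model learns these from data.
VOCAB = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "+", "<eos>"]
TARGET = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b", "<eos>"]


def toy_logits(context, n_prompt):
    """Stand-in for a model: score every vocabulary entry given the full context."""
    position = len(context) - n_prompt  # how many answer tokens exist so far
    wanted = TARGET[min(position, len(TARGET) - 1)]
    return [4.0 if tok == wanted else 0.0 for tok in VOCAB]


def softmax_with_temperature(logits, temperature):
    """Turn scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt, temperature, max_tokens=20, seed=0):
    """Autoregressive decoding: one token per step, strictly left to right."""
    rng = random.Random(seed)
    context = list(prompt)
    output = []
    for _ in range(max_tokens):
        # The model sees the prompt plus everything generated so far.
        logits = toy_logits(context, len(prompt))
        if temperature == 0:
            # Greedy decoding: always take the highest-scoring token.
            token = VOCAB[logits.index(max(logits))]
        else:
            probs = softmax_with_temperature(logits, temperature)
            token = rng.choices(VOCAB, weights=probs, k=1)[0]
        if token == "<eos>":
            break
        context.append(token)
        output.append(token)
    return output


if __name__ == "__main__":
    # Greedy decoding reproduces the toy target exactly.
    print(" ".join(generate(["write", "add"], temperature=0)))
    # A high temperature lets less likely tokens through, so the output varies with the seed.
    print(" ".join(generate(["write", "add"], temperature=2.0)))
```

With `temperature=0` the sketch prints `def add ( a , b ) : return a + b`. Note that every generated token costs one full pass through `toy_logits`, which is the sequential, one-token-at-a-time behaviour the summary attributes to autoregressive models.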

First seen: 2025-07-08 11:30

Last seen: 2025-07-08 14:30