Llama 4 Now Live on Groq

Feed: https://news.ycombinator.com/rss
Hits: 1
Summary

Our vertically integrated GroqCloud and inference-first architecture deliver unmatched performance and price. With the Llama 4 models, developers can run cutting-edge multimodal workloads while keeping costs low and latency predictable. Llama 4 Scout is currently running at over 460 tokens/s, with Llama 4 Maverick coming later today. Stay tuned for official third-party benchmarks from Artificial Analysis. Groq is offering the first models of the Llama 4 herd at the following pricing:
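
For developers who want to try the new models, the sketch below shows a single chat completion request against GroqCloud's OpenAI-compatible endpoint. It is a minimal example under stated assumptions: the model identifier "meta-llama/llama-4-scout-17b-16e-instruct" and the GROQ_API_KEY environment variable are assumptions here, so check Groq's current model list and your own credentials before running it.

    # Minimal sketch: call a Llama 4 model on GroqCloud through the
    # OpenAI-compatible chat completions API.
    # Assumptions: the model id below and the GROQ_API_KEY env var.
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.groq.com/openai/v1",
        api_key=os.environ["GROQ_API_KEY"],
    )

    response = client.chat.completions.create(
        model="meta-llama/llama-4-scout-17b-16e-instruct",  # assumed model id
        messages=[
            {"role": "user", "content": "Summarize the Llama 4 Scout release in one sentence."}
        ],
    )
    print(response.choices[0].message.content)

The same request works with Groq's own Python client or a plain HTTPS POST to the chat completions route; the OpenAI-compatible form is shown only because it requires no Groq-specific dependency.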

First seen: 2025-04-05 21:10

Last seen: 2025-04-05 21:10