Our vertically integrated GroqCloud and inference-first architecture deliver unmatched performance and pricing. With the Llama 4 models, developers can run cutting-edge multimodal workloads while keeping costs low and latency predictable. Llama 4 Scout is currently running at over 460 tokens/s, and Llama 4 Maverick launches today. Stay tuned for official third-party benchmarks from Artificial Analysis. Groq is offering the first of the Llama 4 model herd at the following pricing:
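As a minimal sketch of what running one of these workloads looks like: GroqCloud exposes an OpenAI-compatible chat completions endpoint, so a request can be built and sent with nothing but the standard library. The model id shown is an assumption for illustration; check the console for the exact identifier.

```python
import json
import os
import urllib.request

# OpenAI-compatible chat completions endpoint on GroqCloud.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"
# Assumed model id for Llama 4 Scout; verify against the model list.
MODEL = "meta-llama/llama-4-scout-17b-16e-instruct"

def build_request(prompt: str) -> dict:
    """Build a chat-completion payload for a Llama 4 model."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str) -> str:
    """Send the payload to GroqCloud; requires GROQ_API_KEY to be set."""
    req = urllib.request.Request(
        GROQ_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The same payload works with any OpenAI-compatible client SDK pointed at the GroqCloud base URL, so existing applications can switch models by changing only the model id and API key.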