I have an RTX 3070 with 8GB VRAM and for me Qwen3:30B-A3B is fast enough. It's not lightning fast, but more than adequate if you have a _little_ patience.I've found that Qwen3 is generally really good at following instructions and you can also very easily turn on or off the reasoning by adding "/no_think" in the prompt to turn it off.The reason Qwen3:30B works so well is because it's a MoE. I have tested the 14B model and it's noticeably slower because it's a dense model.
First seen: 2025-05-30 12:23
Last seen: 2025-05-30 20:24