How Many Tokens Per Second Is ‘Good’ for Local LLMs?
You’ve set up a local LLM and it’s generating at 15 tokens per second. Is that good? Should you be happy, or is your setup underperforming? Unlike cloud services, where you simply accept whatever speed you get, local LLMs put performance optimization in your hands—but that requires knowing what benchmarks to target.