AI models just doubled in speed - but there's still one major bottleneck
“Like, three weeks ago, these models were doing, like, 30 tokens per second, which is, like, what is that, 15 words per second.”
Like, three weeks ago, these models were doing, like, 30 tokens per second, which is, like, what is that, 15 words per second. And now they're up to, like, 50 or 60. So that speed alone definitely, like, makes it feel like these models are getting faster.
Yeah. The two constraining issues I'm having right now with OpenClaw are the speed and the threadedness of it. I want to be able to have my one replicant doing, like, 10 jobs at once as opposed to 10 replicants. What do you think the future of that is? The speed and the parallel nature of it, and any tips on optimizing your replicant slash OpenClaw agent?
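A quick note on the arithmetic in the exchange above: the speaker's on-the-fly conversion treats a token as roughly half an English word (common rules of thumb range from about 0.5 to 0.75 words per token). A minimal sketch of that conversion, with the ratio as an assumption rather than anything stated in the clip:

```python
# Tokens-per-second to words-per-second, using the clip's implied ratio.
# The words-per-token figure is an assumption; rough estimates for English
# range from about 0.5 to 0.75 words per token.
WORDS_PER_TOKEN = 0.5

for tok_per_s in (30, 50, 60):
    print(f"{tok_per_s} tok/s ≈ {tok_per_s * WORDS_PER_TOKEN:.0f} words/s")
# 30 tok/s ≈ 15 words/s, as in the clip; 50-60 tok/s ≈ 25-30 words/s
```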
About this clip
An early OpenClaw contributor discusses the dramatic speed improvements in AI models over recent weeks, from roughly 30 to 50-60 tokens per second. The conversation focuses on current technical limitations around speed and threading, specifically the challenge of running multiple parallel tasks with a single agent rather than spinning up multiple agent instances.
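On the threading point, the ask is for one agent to work through many jobs concurrently instead of spawning one agent per job. As a rough illustration only, not OpenClaw's actual internals, the sketch below fans ten jobs out from a single Python event loop; `call_model`, `run_job`, and the simulated latency are hypothetical placeholders.

```python
import asyncio

# Hypothetical stand-in for a single model call; the latency is loosely
# based on the clip's figures (~50-60 tokens/second), not on any real
# OpenClaw internals.
async def call_model(prompt: str, tokens: int = 300) -> str:
    tokens_per_second = 55
    await asyncio.sleep(tokens / tokens_per_second)  # simulate generation time
    return f"response to {prompt!r}"

# One agent ("replicant") working through many jobs concurrently,
# instead of spinning up one agent instance per job.
async def run_job(job_id: int) -> str:
    result = await call_model(f"job {job_id}")
    return f"job {job_id} done: {result}"

async def main() -> None:
    # Fan out 10 jobs on one event loop; because the waits overlap,
    # total wall time is roughly one job's latency rather than ten.
    results = await asyncio.gather(*(run_job(i) for i in range(10)))
    for line in results:
        print(line)

if __name__ == "__main__":
    asyncio.run(main())
```

Overlapping the waits this way improves throughput across jobs, but each individual response is still bounded by the model's own generation speed, which is the other constraint the speaker raises.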
Why this clip
This clip provides specific technical insights about AI model performance improvements and practical optimization challenges that developers are currently facing.
What they said next
Early OpenClaw contributor reveals the token cost tradeoff of AI sub-agents
28:19 - 38s · technical insight
Want clips like this for your podcast?
We find your top 5-8 clips, write the hooks, and deliver ready-to-post content. First 2 episodes are free.
Get 2 Episodes Clipped Free