14 results for “scaling operations”
...scaling in context length. So this can mean just having more text inputs for your models, but it can also mean things like taking a lot of visual token inputs, image inputs to your models, or generating lots of outputs. And one thing that's been
But by itself, that's not enough, because invariably, that one model running on a set of hardware is gonna get more traffic than it can handle. And at that point, you need to horizontally scale it. And that's not an ML problem. That's no
...scaling law for it, which is effectively for how much compute you put in, the architecture will get to different levels of performance at test tasks. And mixture of experts is one of the ones at training time, even if you don't consider the inference
...we're seeing scaling not only during training time, but also during test time. So this is the iconic image from the OpenAI o1 release. Not only are we starting to scale train time compute, but we're also starting to scale te
...scaling test time compute more generally is the way to go to have smarter models without having to keep scaling the training compute by, you know, 10x to get sort of diminishing returns in terms of the performance. Then again, the t
And based on when you deploy a model, whether it's your own custom weights or an open source model, you get dedicated resources, dedicated inference. And so when it comes to the quantization question, where it matters is
...scaling. And I think less famously for also showing that you can scale reinforcement learning training and get kind of this log x axis and then a linear increase in performance on y axis. So there's kind of these three axes now where the traditional
...if scaling is actually giving you a better model, like, is it going to be financially worth it? And I think it'll kind of slowly push it out as AI solves more compelling tasks. So the likes of Claude Opus 4.5, making Claude Code just work for
predicting your next edit and saving you some time. And we want to serve both of those demographics. And so I think to pick a single acceleration factor is difficult because it depends where you were previously. And I think compared to many other com
...about the scaling law behind it. So if you take, like, Gusto, for example, or Rippling or an HR service like this, when they're buying from an AWS or a GCP, they're buying CPUs and they're running web servers. Those web servers, they kind of buy up t
chips is actually not 4x. So you have what? 3x or 4x? You have 3x in here. And it's like 2x maybe, or it's more of like a power story rather than like a share sort of compute tokens efficiency story. But, yeah, what's
reasoning thing, these O series, the strawberry models were that big of a deal. It was like they thought we were making a bigger deal of it than it really deserved to be. And then when we announced O1 and they saw the reaction of their coworkers at t
And, you know, he says, oh, yeah. I think data centers can scale up by about this much. I think that you can scale up the data and some other things by this much. But one of the things that, like, makes the rest of that order of magnitude growth po
we had grown really fast until that point, and we thought, okay, maybe doubling next year would be a good target. And we said, like, okay, maybe we're gonna double, and that would be the OTE. And during, like, the interviews and during negotiat