How Chipotle's training data could teach every robot to make burritos

It's video plus language understanding plus how that can be translated into actions.


…and we have Chef Robotics' VLA in our system, right? So you can just plug and play, and the robot will learn how to do it.

What does VLA stand for?

Vision-language-action models. It's just like LLMs: it's video, plus language understanding, plus how that can be translated into actions. So even the robot just waving its arm, each and every torque, every motor movement, is fed into the software.
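
To make that loop concrete, here is a minimal, hypothetical sketch in Python. The names (`ToyVLAPolicy`, `TrainingLog`, `act`) are invented for illustration and are not Chef Robotics' actual API; the point is only that a camera frame and a language instruction go in, joint torques come out, and every executed torque is logged back as training data.

```python
# Hypothetical sketch of a vision-language-action (VLA) control loop.
# None of these class or method names come from a real system; they only
# illustrate the flow described in the clip: video + language in, motor
# commands out, with every movement fed back into the software as data.

from dataclasses import dataclass, field
from typing import List
import numpy as np


@dataclass
class Action:
    joint_torques: np.ndarray  # one torque value per motor


@dataclass
class TrainingLog:
    """Every executed motor movement is recorded as training data."""
    records: List[dict] = field(default_factory=list)

    def record(self, frame: np.ndarray, instruction: str, action: Action) -> None:
        self.records.append({
            "frame": frame,
            "instruction": instruction,
            "torques": action.joint_torques.copy(),
        })


class ToyVLAPolicy:
    """Stand-in for a trained vision-language-action model."""

    def __init__(self, num_joints: int = 7):
        self.num_joints = num_joints

    def act(self, frame: np.ndarray, instruction: str) -> Action:
        # A real VLA model would run vision and language encoders and an
        # action decoder here; this toy version just returns zero torques.
        return Action(joint_torques=np.zeros(self.num_joints))


if __name__ == "__main__":
    policy = ToyVLAPolicy()
    log = TrainingLog()
    frame = np.zeros((224, 224, 3), dtype=np.uint8)  # one camera image
    instruction = "scoop rice into the burrito bowl"

    action = policy.act(frame, instruction)
    log.record(frame, instruction, action)  # torques become training data
    print(f"logged {len(log.records)} step(s), "
          f"{action.joint_torques.shape[0]} joint torques")
```

A real policy would replace the zero-torque stub with trained encoders and an action decoder; the logging path is what turns day-to-day operation into the training data discussed next.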

So if Chipotle, for example, shared their training data, do they get paid for it? Like, can people then remix it?

Yeah. So there are two parts to the platform, right? One is software as a service for the training itself, just like the Cursor, ChatGPT, or Grok APIs you see today.
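
By analogy with those hosted LLM APIs, the training side of such a platform could look something like the sketch below. Everything here is an assumption for illustration: the base URL, endpoints, payload fields, and the `vla-base` model name are invented, not a real Chef Robotics API.

```python
# Hypothetical "training as a service" client, by analogy with the
# ChatGPT-style APIs mentioned above. All URLs, endpoints, and payload
# fields are placeholders invented for illustration.

import requests

BASE_URL = "https://api.example-robot-platform.com/v1"  # placeholder URL
API_KEY = "YOUR_API_KEY"  # placeholder credential


def upload_demonstrations(dataset_path: str) -> str:
    """Upload a company's demonstration data (e.g. burrito assembly runs)."""
    with open(dataset_path, "rb") as f:
        resp = requests.post(
            f"{BASE_URL}/datasets",
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": f},
        )
    resp.raise_for_status()
    return resp.json()["dataset_id"]


def train_policy(dataset_id: str) -> str:
    """Kick off fine-tuning of the base VLA model on that dataset."""
    resp = requests.post(
        f"{BASE_URL}/training-jobs",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"dataset_id": dataset_id, "base_model": "vla-base"},
    )
    resp.raise_for_status()
    return resp.json()["job_id"]


if __name__ == "__main__":
    dataset_id = upload_demonstrations("chipotle_demos.tar.gz")
    job_id = train_policy(dataset_id)
    print(f"training job {job_id} started on dataset {dataset_id}")
```

The usage at the bottom mirrors the plug-and-play claim: a customer uploads demonstration data and gets back a fine-tuning job, the same way they would call a hosted LLM API.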

About this clip

This clip explains how Vision Language Action (VLA) models work in robotics - combining video, language understanding, and physical actions to train robots. The discussion covers how companies like Chipotle could potentially share their training data through a software-as-a-service platform, creating a plug-and-play system for robot learning.

Why this clip

This moment reveals the practical business model behind robot training platforms and how real-world company data becomes valuable training assets.

7:24 - 8:05 · 41s · market insight

