Search the Top VC & Business Podcasts

Marc Andreessen: Who Runs the World’s AI? with Marc Andreessen

“China copies American AI breakthroughs but can't make them 10x better”

...distillation where you basically train the next model on the answers of the previous model, and I think for sure China is doing some of that, and there's a lens on that t

market insight 14:57 – 15:44

Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

...model. Yeah. And even if that's the case, what they did is still amazing, by the way, what DeepSeek did efficiency-wise. Distillation is standard practice in not if you're at a closed lab where you care about terms of service and IP closely, you disti

from transcript 3:35:15 – 3:36:16

The Twenty Minute VC (20VC)

20VC: Why Google Will Win the AI Arms Race & OpenAI Will Not | NVIDIA vs AMD: Who Wins and Why | The Future of Inference vs Training | The Economics of Compute & Why To Win You Must Have Product, Data & Compute with Steeve Morin @ ZML

...distillation. So smaller models become better than bigger models purely because of the quality of the data that's inputted through them? One theory is that the smaller model is better at generating output that you would want it to generate, essential

from transcript 55:30 – 56:52

The Twenty Minute VC (20VC)

20VC: Anthropic CPO Mike Krieger: Where Will Value Be Created in a World of AI | Have Foundation Models Commoditized | When Do Model Providers Become Application Providers | What Anthropic Learned from Deepseek with Mike Krieger

...models and so many different providers, open source is a very viable possible route, and distillation is looked at in a shady way. Is distillation really wrong if it ultimately propels spaces forward? Well, even, like, let's take within the labs

from transcript 29:59 – 31:06

a16z Podcast

What You Missed in AI This Week (Google, Apple, ChatGPT)

...models that are able to do similar things at a lower cost.

from transcript 6:00 – 7:03

a16z Podcast

What You Missed in AI This Week (Google, Apple, ChatGPT)

...Hopefully, we'll see more sort of condensed, optimized, distilled models that are able to do similar things at a lower cost. Okay. So there was a lot of news last week, so this got kind of lost. But I heard there was a big update to ChatGPT's advanced

from transcript 6:46 – 7:47

The Twenty Minute VC (20VC)

20VC: a16z's Martin Casado on Anthropic vs OpenAI: Where Value Accrues | Cursor vs Replit vs Lovable: Who Wins and Who Loses | The One Sin in AI Investing | Why Open Source is a National Security Risk with China

...released in a while, certainly around code, so that's gonna show up. And so I just feel like the players, the money behind the players, the fact that these models distill, this will end up in an oligopoly. To what extent do you think the large model p

from transcript 9:33 – 10:56

The Twenty Minute VC (20VC)

20VC: a16z's Martin Casado on Anthropic vs OpenAI: Where Value Accrues | Cursor vs Replit vs Lovable: Who Wins and Who Loses | The One Sin in AI Investing | Why Open Source is a National Security Risk with China

...with different flavors, and there's gonna be a lot of new flavor models that will come out. You know, Mira and Ilya are out there creating models. I mean, you got these very legit teams that were some of the pioneers. We're just starting up models fo

from transcript 10:23 – 12:04

The Twenty Minute VC (20VC)

20VC: Anthropic CPO Mike Krieger: Where Will Value Be Created in a World of AI | Have Foundation Models Commoditized | When Do Model Providers Become Application Providers | What Anthropic Learned from Deepseek with Mike Krieger

...open source models, take Llama, for example, like, they've been able to do that from their own research and perspective and data ingestion and training. And so I guess I would say distillation does not feel essential in order to unlock those thin

from transcript 30:42 – 31:46

a16z Podcast

Monopolies vs Oligopolies in AI

...model. It's a great model. And if you actually look at it, you know, on the price performance, I would say in many use cases, it's the one that I actually use as my standard model, it's better than Anthropic for some use cases if you actually, you kn

from transcript 7:18 – 8:32

a16z Podcast

Monopolies vs Oligopolies in AI

...models distill, like, this will end up in an oligopoly. But, I mean, I don't know. That's just my guess. To what extent do you think the large model providers in ten years' time have already been created, or are they yet to be founded? I think that

from transcript 8:16 – 9:18

Latent Space

⚡️The new OpenAI Agents Platform

...model distillation to make these 4o fine-tunes really good. But, yeah, the main thing is, like, can it remain factual? Can it answer questions based on what it retrieves? And get it cited accurately? And that's what this fine-tuned model really

from transcript 9:03 – 10:10

Invest Like the Best

Gaurav Misra & Dwight Churchill - Building Captions - [Invest Like the Best, EP.405] with Dwight Churchill

...and it's still, oh, it's 10 billion parameters, 20 billion parameters, whatever. On the inference side, it's similar learnings happening simultaneously. We don't need to do 100 steps of diffusion for inference, like 100 denoising steps to

from transcript 17:15 – 18:16

The Twenty Minute VC (20VC)

20VC: Mercor: From $1M to $500M in 17 Months: The Fastest Growing Company in the World | How to Think About Margins and Revenue Sustainability in AI | Why Evaluation Benchmarks in AI are BS Today with Brendan Foody

...models and make them an order of magnitude more efficient in twelve months, then it could make sense to run really aggressive margins on serving models. It really comes down to the stickiness and whether those subsidies today are driving large LTVs t

from transcript 36:40 – 37:51

Invest Like the Best

Gaurav Misra & Dwight Churchill - Building Captions - [Invest Like the Best, EP.405] with Dwight Churchill

...can distill models and have them work with a few steps of diffusion now. I think we're definitely the most inefficient we'll ever be, and it's only gonna get more and more efficient. It could be a factor of at least an order of magnitude, like, 10x

from transcript 17:56 – 19:06

a16z Podcast

How OpenAI Built Its Coding Agent

...model distillation, or do you think that comes from just better orchestration of tools? Where do you think that's the speed? I think, like, the low-hanging fruit is just, like, plain old deterministic, like, DevOps-y type stuff. Okay. You know, like,

from transcript 44:02 – 45:17

Latent Space

[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka

...models different than external because the external models have been getting a lot smaller,

from transcript 23:14 – 24:20

The Twenty Minute VC (20VC)

20VC: Why Google Will Win the AI Arms Race & OpenAI Will Not | NVIDIA vs AMD: Who Wins and Why | The Future of Inference vs Training | The Economics of Compute & Why To Win You Must Have Product, Data & Compute with Steeve Morin @ ZML

...distillation is that sometimes the smaller models become better than the bigger model through distillation.

from transcript 54:45 – 55:45

The Twenty Minute VC (20VC)

20VC: How to Fix the UK Tech Ecosystem | Why We Need to Flood the UK with Venture Capital | What the UK Can Learn From Sequoia, Stripe and Norway | Why Now is the Time to be Bullish on China & Lessons from Jensen Huang with Tom Hulme & Stan Boland

...seen distillation of foundation models. We're going to start to see distillation of business models, businesses. And so I would expect these really successful businesses to get copied ridiculously quickly. So I think this idea that you're gonna have

from transcript 1:08:13 – 1:09:15

Latent Space

[LIVE] Anthropic Distillation & How Models Cheat (SWE-Bench Dead) | Nathan Lambert & Sebastian Raschka

...model with supervised learning, supervised fine-tuning on it. But then how would you even detect that this is a distillation attack versus just an evaluation? Because right now, I'm actually running, I mean, I'm distilling myself for chapter eight o

from transcript 6:44 – 7:57

Run your podcast through Clypt