
Jensen Huang's podcast: Nvidia's moat runs much deeper than the chip

2026/04/18 01:45
ODAILY

The computing system and developer ecosystem that Nvidia has built are defining how AI operates and shaping the industry landscape.


Video title: Jensen Huang: Will Nvidia's moat last?

Image by Dwarkesh Patel

Compiled by Peggy, BlockBeats

Editor's note: While the outside world is still debating whether Nvidia's moat comes from its supply chain, this conversation argues that what is genuinely hard to replicate is not the chip itself but the full-system capability to turn "electrons into tokens": the computing architecture, the software stack, and the developer ecosystem.

This post is compiled from Dwarkesh Patel's conversation with Jensen Huang. Dwarkesh Patel hosts one of Silicon Valley's most closely followed podcasts, the YouTube channel Dwarkesh Podcast, known for in-depth, research-driven interviews and long-form conversations with AI researchers and figures at the core of the technology industry.

Dwarkesh Patel on the right, Jensen Huang on the left

Around this core, the conversation can be understood on three levels.

The first is the shift in technology and industrial structure.

Nvidia's strength is not just hardware performance but the developer ecosystem that CUDA carries and the path built around the full computing stack. In this system, raw compute is no longer the only variable: algorithms, systems engineering, networking, and energy efficiency together determine how fast AI advances. This leads to an important judgment: software is not simply being "commoditized" by AI; rather, its value is amplified as agents proliferate, driving an exponential increase in software usage.

Second are the commercial boundaries and strategic choices.

Facing an ever-expanding AI industrial chain, Nvidia chose to "do whatever is necessary, but not everything." It does not enter cloud computing, nor does it pursue excessive vertical integration; instead it expands the overall market through investment and ecosystem support. This restraint lets it retain control of the critical layers while avoiding becoming a competitor to its own ecosystem, thereby drawing more participants into its technical system.

Third are the disagreements about technology diffusion and the industry landscape.

The sharpest part of the dialogue is not any concrete conclusion but how to understand the "risk" itself. One perspective emphasizes the first-mover advantage of leading in compute; the other focuses on the long-term ownership of ecosystems and standards as the technology diffuses. Perhaps more critical than any short-term capacity gap is the question of which technology stack future AI models and developers will run on.

In other words, the endgame of this competition is not just "who trains the better model" but "who defines the infrastructure the models run on."

In this sense, Nvidia is no longer just a chip company but something closer to the AI era's "underlying operating system provider": it seeks to ensure that, however compute diffuses, the path along which value is generated keeps revolving around itself.

The following is the original text (consolidated for readability):

TL;DR

• Nvidia's moat is not the "chip" but the "full-system capability from electrons to tokens." The core is not hardware performance but the end-to-end ability to turn compute into value (architecture + software + ecosystem).

• The essence of CUDA is not a tool but the world's largest AI developer ecosystem. Developers, frameworks, and models are all tied to the same technology stack, forming a path dependence that is hard to replace.

• The key to competition is not raw compute alone but the combination of "computing stack × algorithms × systems engineering." Co-design across architecture, networking, energy efficiency, and software yields improvements far beyond what process scaling alone provides.

• Compute bottlenecks are short-term; driven by clear demand signals, supply will catch up within 2-3 years. The real long-term constraints are not chips but energy and infrastructure.

• AI will not commoditize software; instead, the explosion of agents will drive exponential growth in software usage. The future is not cheaper software but a surge in software invocations.

• The core strategy behind not building clouds: do what is necessary, but do not swallow the whole value chain. Grow the overall market through investment and ecosystem support rather than vertical integration.

• The real strategic risk is not rivals getting compute, but the global AI ecosystem no longer being built on American technology. Once models and developers migrate, long-term technical standards and industry ownership will shift.

Interview

Where is Nvidia's moat: the supply chain, or control of "electrons to tokens"?

Dwarkesh Patel (Host):

We've seen the valuations of many software companies decline because AI is expected to turn software into a standardized commodity. There's also a slightly naive way of understanding Nvidia: look, starting from the design files (GDSII), the fab builds the logic chips and the switch chips on its wafers, packages them with HBM produced by SK Hynix, Micron, and Samsung, and then sends everything to ODMs to assemble the full racks.

Note: HBM (High Bandwidth Memory) is an advanced memory technology designed for high-performance computing and AI; ODM (Original Design Manufacturer) refers to a contract manufacturer responsible not only for production but also for product design.

So from this point of view, Nvidia is essentially software, and the physical parts are made by someone else. If software is commoditized, Nvidia will be commoditized too.

Jensen Huang:

But in the final analysis, there has to be a process that turns electrons into tokens. From electrons to tokens, and making those tokens more valuable over time: I think that is very hard to commoditize completely.

The transformation from electrons to tokens is itself an extraordinary process. And making one token more valuable than another is just like making one molecule more valuable than another.

In this process there is a great deal of art, engineering, science, and invention that gives the token its value.

Obviously, we are observing this in real time. The transformation process, the manufacturing process, and the signals involved are far from fully understood, and the journey is far from over. So I don't think commoditization is going to happen.

Of course, we will make it more efficient. In fact, the way you described the problem is actually my mental model of Nvidia: the input is electrons, the output is tokens, and Nvidia is in the middle.

Our job is to "do as much as is necessary, and as little as is necessary" to achieve that transformation at the highest possible capability.

When I say "do as little as necessary," I mean that whatever we don't have to do ourselves, we work with others and bring them into our ecosystem. If you look at Nvidia today, we may have one of the largest partner ecosystems across both the upstream and downstream supply chains: from computer makers and application developers to model developers. You can think of AI as a five-layer cake, and we have an ecosystem presence at all five layers.

Related reading: Nvidia CEO Jensen Huang's latest speech on the "five-layer cake" of AI

So we try not to do what others can do, but the part we must do ourselves is extremely difficult. And I don't think that part will be commoditized.

In fact, I don't think enterprise software companies will be commoditized either. The reality, though, is that most software companies today are indeed tool providers.

There are exceptions, of course, some in coding and in codified workflow systems, but many companies are essentially tool companies.

Excel is a tool, PowerPoint is a tool, what Cadence makes is a tool, Synopsys is a tool.


And what I see is the opposite of what many people think. I believe the number of agents will grow exponentially, and the number of users of these tools will grow exponentially with them.

There is also a potential surge in the number of tool invocations. For example, usage of Synopsys Design Compiler is likely to increase dramatically.

There will be huge numbers of agents using floorplanners, layout tools, and design-rule-check tools.

Today we are limited by the number of engineers; tomorrow those engineers will be supported by large numbers of agents, and we will explore the design space in unprecedented ways. To anyone using these tools today, this change will be very clear.

Usage of the tools will drive these software companies to boom. It hasn't happened yet because today's agents are not yet good enough at using tools.

So either these companies build their own agents, or agents themselves become capable enough to use the tools. I think it will end up being a combination of both.
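
The multiplicative claim above can be made concrete with a toy calculation. The numbers below are purely illustrative (not from the interview): the point is that total tool invocations scale with agents per engineer times calls per agent, not with headcount alone.

```python
# Toy model of Huang's claim: tool invocations grow multiplicatively
# once each engineer supervises many agents. All numbers are hypothetical.
def total_tool_calls(engineers, agents_per_engineer, calls_per_agent):
    """Total tool invocations = people x agents each x calls each agent makes."""
    return engineers * agents_per_engineer * calls_per_agent

# Today: each engineer drives the tools directly (one "agent" each).
today = total_tool_calls(engineers=1000, agents_per_engineer=1, calls_per_agent=50)

# Tomorrow: each engineer supervises many agents exploring the design space.
tomorrow = total_tool_calls(engineers=1000, agents_per_engineer=100, calls_per_agent=50)

print(today, tomorrow, tomorrow // today)  # usage grows 100x at the same headcount
```

Even if per-call prices fall, usage growing orders of magnitude faster than headcount is what the "surge in software invocations" argument rests on.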

Dwarkesh Patel:

I recall from your latest disclosures that you have nearly $100 billion in purchase commitments for wafers, memory, packaging, and so on. The SemiAnalysis report suggests the figure could reach $250 billion.

One interpretation is that Nvidia's moat is that you have locked up the supply of these scarce components for years to come. In other words, others might be able to design accelerators, but can they get enough memory? Can they get enough logic chips?

Is this the core advantage for the years ahead?

Jensen Huang:

That is one thing we can do that is hard for others to do. We have been able to make huge upstream commitments, partly in visible form, namely the purchase commitments you mentioned, and partly in hidden form.

For example, much of the upstream investment is actually made by our supply chain partners, because I will say to their CEOs: "Let me tell you how big this industry will be, let me explain why, let me walk you through it, let me tell you what I see."

Through that process of delivering information, sparking a vision, and building consensus, I get aligned with CEOs across different upstream industries, and they become willing to make these investments.

Then why would they invest for me rather than for others? Because they know I can buy their capacity and digest it through my downstream. It is the downstream demand and the scale of Nvidia's supply chain that make them willing to invest upstream.

Look at GTC: the scale of the conference has shocked many people. It is essentially a 360-degree AI universe that brings the whole industry together. Everyone gathers because they need to see each other. I bring the upstream together to see the downstream, and the downstream to see the progress of AI.

More importantly, they get access to the leading AI companies and startups and see the innovation that is happening, so they can verify what I'm telling them.

So I spend a lot of time explaining, directly or indirectly, the opportunities ahead to our supply chain and ecosystem partners. A lot of people say my keynotes don't read like traditional press events; parts of them sound like a class. And that's actually what I'm doing.

I need to make sure the entire supply chain, upstream and downstream, understands what happens next, why, when, and how big, and can reason it through as systematically as I do.

So the kind of moat you're describing does exist. If this market reaches trillions of dollars in the coming years, we have the ability to build the supply chain to sustain it. Like cash flow, a supply chain has movement and turnover. No one would build a supply chain for an architecture that doesn't move fast enough. We can maintain this scale because downstream demand is extremely strong, and everyone can see that.

That is what allows us to do these things at such a scale.

Dwarkesh Patel:

Let me see if I can keep up. Over the past several years, your revenue has roughly doubled year over year, and global compute has grown even faster, tripling.

Jensen Huang:

And it keeps doubling even at this scale.

Dwarkesh Patel:

Right. So look at logic chips: you're one of the biggest customers on the N3 process, and on N2 as well.

By some analyses, AI may account for 60% of N3 capacity this year, and as much as 86% next year.

Note: N3 refers to TSMC's 3-nanometer (3nm) node, one of the most advanced chip manufacturing processes of the current generation.

How can you double from this position, and keep doubling every year? Are we at a stage where AI's growth in compute must slow because of upstream constraints? Is there any way around these limits? How on earth are you going to do it?

Jensen Huang:

At certain moments, demand does exceed supply across the entire industry, upstream and downstream. In some cases we have even been limited by the number of plumbers. That really happened.

Dwarkesh Patel:

Then next year's GTC should invite the plumbers.

Jensen Huang:

Yes. It's actually a healthy phenomenon. You want to be in a market where immediate demand exceeds the industry's total supply. The reverse is certainly not good.

If the gap between the two gets too large, a specific link or component becomes an obvious bottleneck, and the whole industry moves to address it. For example, I've noticed that nobody talks about CoWoS anymore. That's because over the past two years we invested in it and expanded it on a very large scale; capacity has doubled.

Now I think the whole industry is in a better position. It is also now understood that CoWoS supply must keep pace with growth in logic chips and memory. So they are expanding CoWoS, and also expanding future advanced packaging technologies, at the same pace as logic chips.

This is very important, because in the past CoWoS and HBM memory were treated more like "specialty capacity." Not anymore. They are now recognized as part of mainstream computing technology.

At the same time, we are now better positioned to influence the wider supply chain than we were at the beginning of the AI revolution; I was already making these calls five years ago.

Some people believed and invested, like Sanjay's team at Micron. I still remember that meeting: I laid out very clearly what was going to happen, why it would happen, and predicted the results we see today. They chose to commit significant capacity at the time, and we built a partnership with them. They have invested in many directions, such as LPDDR and HBM, and it has clearly delivered considerable returns for them. Other companies have followed since, and we are all at this stage together.

So I think every generation of technology, every bottleneck, gets a lot of attention. Now we are already working on the bottlenecks of the years ahead. For example, we work with Lumentum, Coherent, and the whole silicon photonics field. Over the past few years we have effectively reshaped that entire ecosystem and supply chain.

In the case of silicon photonics, we built a complete supply chain around it, co-developed the technology with partners, invented many new techniques, and licensed those patents to the supply chain to keep the ecosystem open. We prepare the supply chain by creating new technologies, new workflows, and new test equipment (including double-sided inspection), investing in companies, and helping them expand production.

So you can see we are proactively shaping this ecosystem so that the supply chain can support the scale of the future.

Dwarkesh Patel:

It sounds like some bottlenecks are easier to solve than others. Which ones are harder to expand than CoWoS?

Jensen Huang:

I just mentioned the hardest example.

Dwarkesh Patel:

Which one?

Jensen Huang:

The plumbers. Yes, really. I was talking about the hardest one: plumbers and electricians. This is also part of why the doomers worry me, the ones always talking about jobs disappearing and being replaced. If we convince people to stop becoming software engineers, we will have a real shortage of software engineers in the future.

Similar predictions were made 10 years ago. People said, "Whatever you do, don't become a radiologist." You can still find those videos online claiming radiology would be the first profession eliminated and the world would no longer need radiologists. The reality is that we now have a shortage of radiologists.

Dwarkesh Patel:

All right, back to the question: some links can be expanded and some can't. So how do you double logic chip capacity? After all, the real bottlenecks here are memory and logic. What about EUV? How do you double its count every year?

Jensen Huang:

None of this is impossible. It's true that rapid expansion isn't easy, but within two to three years it's not that hard. The key is a clear demand signal. Once you can build one, you can build ten; once you can build ten, you can build a million. Replication is not the hard part.

Dwarkesh Patel:

How deeply do you pass that judgment into the supply chain? Would you, for example, go to ASML and say: "Looking out over the next three years, for Nvidia's annual revenue to reach $2 trillion, we need this many more EUV machines"?

Jensen Huang:

Some of it I do directly, some indirectly. If I can convince the fab, then ASML will be convinced. So we need to identify the key bottlenecks. But as long as the fab believes in the trend, you'll have enough EUV equipment within a few years.

What I mean is: none of these bottlenecks will last more than two to three years. None of them.

At the same time, we keep improving computing efficiency. From Hopper to Blackwell, roughly 10x, 20x, and in some cases 30 to 50x. We also keep proposing new algorithms. Because CUDA is flexible enough, we can develop new methods that raise efficiency even as we expand capacity.

So none of this worries me. What really worries me are factors outside our upstream and downstream, such as energy policy. Without energy you cannot expand; without energy you cannot build an industry; without energy you cannot build an entirely new manufacturing system.

Now we want to drive America's re-industrialization: bringing back chips, computers, and packaging, and building new industries like electric vehicles and robotics. When we build AI factories, everything comes down to energy, and energy construction cycles are long. By contrast, increasing chip capacity is a two-to-three-year problem; increasing CoWoS capacity is also a two-to-three-year problem.

Dwarkesh Patel:

Interesting. Some of the guests I've interviewed gave the opposite judgment. I just don't really have the technical background to adjudicate.

Jensen Huang:

But the good news is, you're talking to the expert now.

Does Google's TPU shake Nvidia's position?

Dwarkesh Patel:

Yes, indeed. I'd like to ask about your competition. Looking at the TPU: arguably two of the world's top three models today, Claude and Gemini, are trained on TPUs. What does this mean for Nvidia's future?

Note: TPU (Tensor Processing Unit) is a specialized chip designed by Google for artificial intelligence (especially deep learning) workloads.

Jensen Huang:

We do something completely different. What Nvidia builds is "accelerated computing" rather than a specialized processor like the TPU.

Accelerated computing can be applied to a wide variety of tasks: molecular dynamics, quantum chromodynamics, data processing, data frames, structured and unstructured data, fluid dynamics, particle physics, and of course AI. So the application of accelerated computing is much broader.

Although the discussion now focuses on AI, which is indeed very important and influential, "computing" itself is much bigger than AI. What Nvidia did was reinvent computing, moving it from general-purpose to accelerated computing. Our market coverage is much greater than any TPU or other specialized accelerator.

If you look at our position, we're the only company that can accelerate every kind of application. We have a huge ecosystem, and frameworks and algorithms all run on the Nvidia platform. And our computing systems are designed to be operated by others: any operator can buy our systems and run them.

Most in-house chips are not designed for others to use; you basically have to operate them yourself, because they were never designed with the flexibility for outside users. Because our systems can be run by anyone, we are on every major platform, including Google, Amazon, Azure, and OCI.

Whether you operate the systems to rent out compute or for your own use: if you're renting compute out, you need a large customer ecosystem spanning many industries to absorb that capacity. If you run the systems for yourself, we certainly have the ability to help you do that. For example, Elon's xAI.

Because we can get any industry, any company, to operate our systems, you can use them to build supercomputers for companies like Eli Lilly for scientific research and drug discovery. We can help them run their own supercomputers across the whole spectrum of drug R&D and bioscience applications, all areas we can accelerate.

So we can cover a huge range of applications, and the TPU cannot. CUDA, built by Nvidia, can serve as an excellent tensor-processing platform, but it is more than that, covering the entire life cycle of data processing, computing, AI, and more. So we have much larger market opportunities and much wider coverage. And because we now support virtually every type of application around the globe, you can deploy the system anywhere and be confident there will be customers to use it.

So it's a completely different thing.

Dwarkesh Patel:

This question will be a bit longer.

Your current revenue is astonishing, and it doesn't mainly come from pharmaceuticals or quantum computing. You're not making $60 billion a quarter from those businesses. But AI is an unprecedented technology moving at unprecedented speed.

So the question is: looking only at AI, what's the best option for it? I don't work at that low level myself, but I've talked with friends who are AI researchers, and they say: "When I use a TPU, it's one big array that's perfect for matrix multiplication; the GPU is more flexible, good for lots of branching and irregular memory access."

But if you look at AI, isn't it essentially matrix multiplication over and over again? So you don't really need to spend chip area on warp scheduling, context switching, memory banks, and so on. The TPU, then, is highly optimized for the main application driving the current wave of demand and revenue growth.

What do you make of that?

Jensen Huang:

Matrix multiplication is indeed an important part of AI, but it is not all of AI.

If you want to invent a new mechanism, or do things in a different way; if you want to design a completely new architecture, like a hybrid SSM; if you want to build a model that combines diffusion and autoregression: what you need is a general-purpose programmable architecture, and we can run anything you can think of.

That's our advantage. It makes new algorithms much easier to try. It's because it's a programmable system that AI has been able to make such rapid progress.
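
The split the two sides are debating can be made concrete with rough arithmetic. In a transformer layer, matrix multiplies dominate the raw FLOP count, yet the remaining operations (softmax, normalization, activations) are exactly the irregular ones that benefit from programmability. The sizes below are illustrative, not taken from any real model:

```python
# Rough FLOP accounting for one transformer layer at decode time, batch 1.
# Illustrative sizes only: hidden d=4096, MLP width d_ff=16384, context T=2048.
d, d_ff, T = 4096, 16384, 2048

# Matmul FLOPs: QKV + output projections (4 * d*d weights) and the MLP
# (2 * d*d_ff weights), 2 FLOPs per multiply-add, plus attention
# score/value products against T cached keys and values (2 * T*d each).
matmul = 2 * (4 * d * d + 2 * d * d_ff) + 2 * 2 * T * d

# Non-matmul work: softmax over T scores, layernorms and residual adds over d,
# activation over d_ff. Tiny in FLOPs, but full of exps, reductions, branching.
other = 5 * T + 10 * d + d_ff

print(matmul / (matmul + other))  # matmuls are >99% of FLOPs...
# ...yet the <1% of "other" ops is where fixed-function hardware gets awkward.
```

So both statements are true at once: a matmul engine captures most of the FLOPs, while the programmable remainder is what lets new mechanisms (SSMs, diffusion heads, new attention variants) run at all.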

The TPU, like any other hardware, is subject to Moore's law. We know Moore's law brings roughly 25 percent improvement per year. So if you want to jump 10x or 100x, the only way is to change the algorithms and the way the computation is done, every year.
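
The gap between process scaling and algorithmic jumps is simple compound-growth arithmetic, using the 25%-per-year figure cited above:

```python
import math

# At ~25%/year hardware improvement alone, how long do 10x and 100x take?
rate = 1.25
years_10x = math.log(10) / math.log(rate)    # solve 1.25**n = 10
years_100x = math.log(100) / math.log(rate)  # solve 1.25**n = 100

print(round(years_10x, 1), round(years_100x, 1))  # ≈ 10.3 and 20.6 years
```

A decade per 10x from silicon alone is the arithmetic behind the claim that order-of-magnitude generational gains must come from algorithms and system co-design, not the process node.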

That is Nvidia's core strength.

That's how we managed, with Blackwell, to make a big jump: I said 35 times better energy efficiency than Hopper, and nobody believed it.

Then Dylan wrote an article saying I was actually being conservative and the real figure was closer to 50x, and that this cannot be achieved by Moore's law alone. Our way of getting there is to introduce new model structures, such as MoE, and to extend the computation across the entire system through parallelization, disaggregation, and distributed processing. It's hard to do this without the ability to go down to the bottom of the stack and develop new compute kernels with CUDA.

Note: This refers to Dylan Patel, a well-known analyst in semiconductors and AI infrastructure, and founder of the research firm SemiAnalysis.

So our advantage is the programmability of the architecture, and the fact that Nvidia is a highly co-designed company. We can even offload some of the computation into the interconnect fabric, like NVLink, or the network layer, like Spectrum-X. In other words, we can drive change simultaneously across processors, systems, interconnects, software libraries, and algorithms. All of it happens at once. Without CUDA underpinning all of this, I wouldn't even know where to start.
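
To illustrate one ingredient named above, here is a toy mixture-of-experts (MoE) routing layer in NumPy. It is a generic sketch, not Nvidia's or anyone's production implementation: the point is that each token activates only its top-k experts, so total parameters can grow while per-token compute stays roughly fixed, which is what makes spreading the experts across a whole system worthwhile.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, gate_w, experts, k=2):
    """Minimal MoE layer: route each token to its top-k experts and mix
    their outputs by the renormalized gate weights. Only k experts run
    per token regardless of how many experts (parameters) exist."""
    logits = x @ gate_w                         # (tokens, n_experts) gate scores
    top = np.argsort(logits, axis=-1)[:, -k:]   # indices of each token's top-k
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        w = np.exp(logits[t, top[t]])
        w /= w.sum()                            # softmax over the chosen k only
        for weight, e in zip(w, top[t]):
            out[t] += weight * experts[e](x[t])  # only k expert MLPs execute
    return out

d, n_experts = 8, 4
# Each "expert" is a tiny random MLP; in a real model these would be large
# and sharded across GPUs, with routing traffic carried by the interconnect.
experts = [(lambda W: (lambda v: np.tanh(v @ W)))(rng.standard_normal((d, d)))
           for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts))
x = rng.standard_normal((3, d))
print(moe_forward(x, gate_w, experts, k=2).shape)  # (3, 8)
```

Note the system implication: the expensive part of MoE at scale is not this arithmetic but moving tokens to wherever their experts live, which is why the interview ties MoE to interconnects like NVLink.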

Dwarkesh Patel:

This also raises a question about Nvidia's customer structure: roughly 60 percent of your revenue comes from the five hyperscalers. In an earlier era you faced a different kind of customer, such as a professor running a lab, who was deeply dependent on CUDA. They couldn't use other accelerators; they used PyTorch + CUDA and needed everything already optimized for them.

But these hyperscalers can write their own kernels. In fact, they have to, to extract the last 5% of performance. Anthropic and Google often train on in-house accelerators or TPUs. Even OpenAI, when using GPUs, uses Triton; they say, "We need our own kernels." So they write CUDA C++ themselves instead of using the cuBLAS and NCCL libraries, build their own stacks, and can even compile to other accelerators.

So for most of your customers by revenue, they can and do replace parts of CUDA. To what extent, then, does CUDA remain the key that keeps frontier AI tied to Nvidia?

Jensen Huang:

First, CUDA is a very rich ecosystem. If you're going to develop for any computer, starting from CUDA is a very wise choice. We support all the mainstream frameworks because the ecosystem is so rich.

If you need to write custom kernels, with something like Triton: we contribute a lot of Nvidia technology to Triton's backend, and we're happy to help those frameworks get better. There are many frameworks: Triton, vLLM, SGLang, and more.

This area is expanding rapidly with the development of post-training and reinforcement learning. You have veRL, NeMo RL, and a series of new frameworks. If you're going to develop an architecture, it makes the most sense to start from CUDA, because you know the ecosystem is mature. When something breaks, it's probably your own code, not the huge pile of code underneath.

Don't forget, the code behind these systems is enormous. When something goes wrong, you want to know whether it's you or the platform.

Of course you'd rather the problem be yours, not the computing platform's. We certainly have our own bugs, but our system is mature enough that you are at least building on a reliable foundation.

The second point is the size of the installed base. If you're a developer, whatever you're building, the most important thing is the installed base. You want your software to run on as many computers as possible. You don't write software just for yourself; you write it for your whole cluster, even for the whole industry, because you're a framework developer.

Nvidia's CUDA ecosystem is essentially our most important asset. There are hundreds of millions of GPUs around the world, across every cloud: from V100, A100, H100, and H200 to the L series and P series, in every configuration.

And they exist in different form factors. If you're a robotics company, you want CUDA running directly on the robot itself. We are almost everywhere.

This means that once you've developed software or a model, it can run anywhere. So the installed base itself is extremely valuable.

Finally, flexibility of deployment location. We are present on every cloud platform, and that makes us unique. As an AI company or developer, you don't know which cloud provider you'll end up working with or where your system will run. We can run everywhere, including on-premises.

So the combination of ecosystem richness, installed-base scale, and deployment flexibility is extremely valuable.

Dwarkesh Patel:

That makes sense. But I wonder whether these advantages really matter to your biggest customers. Many people benefit from them, but the ones who can build their own stacks are precisely the customers who contribute most of your revenue. Especially in a world where AI keeps getting better at tasks with verifiable feedback loops, such as reinforcement learning settings, kernel optimization for attention or an MLP is a very easy feedback loop.

So can't these hyperscalers write those kernels themselves? Of course, they may still choose Nvidia for the value. But the question is whether this eventually becomes a simple spec comparison: who delivers more compute (FLOPS) and more memory bandwidth per unit cost? Because historically your margins on hardware and software have been very high, largely because of the CUDA moat.

The question, then, is: if most customers can build their own software stacks instead of relying on CUDA, can that profit margin be maintained?

Jensen Huang:

The number of engineers we embed in these AI labs is remarkable, working alongside them, helping them optimize their entire technology stack. The reason is that nobody knows our architecture better than we do. And these architectures are not as general-purpose as a CPU.

The CPU is a bit like a family car, a cruiser: it doesn't go very fast, but everyone can drive it well, it has cruise control, everything is simple. A GPU accelerator is more like an F1 race car. I can imagine anyone driving it at 100 miles an hour, but really pushing it to its limit takes a lot of expertise.

And we use a lot of AI to generate these kernels. I am quite sure we will remain indispensable for a considerable time. Our expertise can help these AI lab partners easily double their performance. Often, when we optimize their stack or some kernel, their models speed up by 3x, 2x, or even 50 percent. That is a very big uplift, especially when you consider how many Hopper and Blackwell clusters they own.

If you double performance, that means double the revenue; it maps directly to revenue. On Nvidia's computing platforms, TCO (total cost of ownership) is the best in the world, bar none. No company can show me a platform with better performance per TCO dollar than ours. None. And these benchmarks are open.
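
The performance-per-TCO argument is simple arithmetic. A hedged sketch, with entirely hypothetical numbers (not from the interview or any benchmark), showing why a slower but cheaper-per-token system can win the comparison:

```python
# Illustrative performance/TCO comparison; every number here is made up.
def tokens_per_tco_dollar(tokens_per_sec, capex, annual_opex, years=4):
    """Tokens served over the system's life per dollar of total cost of
    ownership (purchase price plus power/operations over `years`)."""
    seconds = years * 365 * 24 * 3600
    tco = capex + annual_opex * years
    return tokens_per_sec * seconds / tco

# System A: lower peak throughput, but cheaper to buy and run.
a = tokens_per_tco_dollar(tokens_per_sec=1200, capex=300_000, annual_opex=50_000)
# System B: faster on paper, but more expensive capex and power.
b = tokens_per_tco_dollar(tokens_per_sec=1500, capex=450_000, annual_opex=60_000)

print(a > b)  # the better tokens-per-TCO-dollar system wins despite lower speed
```

This is why the interview keeps returning to performance/TCO rather than peak FLOPS: the buyer's unit economics are tokens per dollar, not operations per second.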

Dylan is right. InferenceMAX is public; anyone can run it. But no TPU team is willing to use it to demonstrate an inference-cost advantage. It's hard to do; nobody wants to prove it.

MLPerf is the same. I welcome them to demonstrate the 40% advantage they've been claiming. I'd love to see them prove the TPU's cost advantage. In my view it doesn't hold up; it doesn't make sense on the fundamentals. It doesn't make any sense.

So I think the reason we're successful is that our TCO is very good.

And the other thing: you said 60 percent of our business comes from the top five cloud providers, but most of that business actually serves outside customers. For example, on AWS, the majority of Nvidia compute serves external customers, not AWS itself. On Azure, our capacity mostly serves external customers; the same on OCI. They chose us because of our broad reach.

We can bring them the world's best customers, who themselves build on the Nvidia platform. And these companies build on Nvidia because our reach and flexibility are strong.

So I think the flywheel works: the installed base, the programmability of the architecture, the continuous accumulation of the ecosystem. And now there are thousands of AI companies around the world. If you were one of those AI start-ups, what architecture would you choose? You'd choose the most popular, the most foundational, the most ecosystem-rich architecture. That's the logic of this flywheel.

So the reasons are:

• First, our performance per unit cost is the highest, so the customer's cost is the lowest.

• Second, we have the highest performance per unit of power in the world. If partners build a 1 GW data centre, they need it to produce the most tokens and the most revenue, and our architecture produces the most tokens per unit of energy.

• Third, if your goal is to rent out compute, we bring you the largest customer base in the world.

That is why the flywheel holds.

Dwarkesh Patel:

Very interesting. I think the core question is what this market structure is. Even if there are thousands of AI companies, it's a real question whether the market is roughly evenly divided among them.

But the truth is, beyond these hyperscale cloud providers, it's the foundation model companies like Anthropic and OpenAI that have the ability to get different accelerators running.

Jensen Huang:

I think your premise is wrong.

Dwarkesh Patel:

Maybe. Then let me ask another question. If all these claims about performance and cost hold, why did a company like Anthropic just announce a gigawatt-scale TPU collaboration with Google the other day? And a large share of their compute comes from those systems. For Google itself, TPUs are the main source of compute. So if you look at these big AI companies: where they once ran entirely on Nvidia, they no longer do.

If these advantages are real, why do they choose other accelerators?

Jensen Huang:

Anthropic is a special case. Without Anthropic, TPU growth would barely exist; TPU growth came almost entirely from Anthropic. Likewise, without Anthropic, the growth in Trainium demand would be almost non-existent.

That is a very clear fact. There aren't many similar opportunities; in fact, there is only one Anthropic.

Dwarkesh Patel:

But OpenAI also works with AMD, and they're developing their own Titan accelerator.

Note: AMD (Advanced Micro Devices) is a U.S. semiconductor company that primarily designs compute chips and is a major competitor of Nvidia and Intel.

Jensen Huang:

But the vast majority of their compute is still Nvidia, and we will continue to cooperate substantially. I'm not unhappy that others try other options. If they don't try other options, how would they know how good our solution is?

Sometimes this really does need to be reconfirmed by comparison. And we must constantly prove that we deserve our position.

There have always been a variety of accelerators on the market. Look at how many ASIC projects have been cancelled. Just because you start building an ASIC doesn't mean you can make something better than Nvidia.

In fact, it's not easy. You could even say that, rationally, it doesn't quite hold up, unless Nvidia makes a serious mistake somewhere. But given our scale and our speed, we are the only company in the world making a significant leap every year.

Dwarkesh Patel:

Their logic is that they don't need to be better than Nvidia; they just need to not be 70 percent worse, because they think you have a 70 percent margin.

Jensen Huang:

But don't forget, even with ASICs, the margins are really high. Nvidia's margin is about 60 to 70 percent, while an ASIC vendor's margin is probably 65 percent. How much did you really save?

You always have to pay somebody. So from what I've seen, the margins on these custom (ASIC) operations are actually very high; the vendors themselves believe that, and they're quite proud of it.
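The margin comparison can be made concrete. A buyer never escapes the vendor's gross margin; switching to an ASIC only changes whose margin is paid. A sketch with hypothetical build costs and margins (not real figures for any vendor):

```python
# Hypothetical sketch of the "how much did you really save" margin math.
# Buying an ASIC does not remove the vendor margin; it changes whose
# margin you pay. All figures below are illustrative assumptions.

def selling_price(build_cost: float, gross_margin: float) -> float:
    """Price the vendor must charge to hit a target gross margin."""
    return build_cost / (1 - gross_margin)

gpu_cost, gpu_margin = 10_000, 0.70    # hypothetical GPU build cost and margin
asic_cost, asic_margin = 9_000, 0.65   # hypothetical ASIC build cost and margin

gpu_price = selling_price(gpu_cost, gpu_margin)     # ≈ $33,333
asic_price = selling_price(asic_cost, asic_margin)  # ≈ $25,714

saving = 1 - asic_price / gpu_price
print(f"sticker saving: {saving:.0%}")  # far smaller than the 70% margin headline
```

Even with a roughly 70 percent margin on one side and 65 percent on the other, the sticker saving here comes out near 23 percent, and it shrinks further if the ASIC delivers fewer tokens per dollar of TCO.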

A long time ago, we simply weren't able to do this. And honestly, I didn't really understand how hard it is to build a foundation model lab like OpenAI or Anthropic, nor did I fully appreciate that they actually need large-scale investment support from the supply side.

We didn't have the capacity to make multi-billion-dollar investments in a company like Anthropic so it would use our compute. But Google and AWS could: they invested a lot of money up front, and in return Anthropic uses their accelerators.

We had neither the capacity to do so, and I would say that was a mistake of mine: I didn't really realize that they had no other choice. Venture capital firms could not invest $5 billion or $10 billion to support an AI lab and expect it to grow into an Anthropic.

It's my fault. But even when I realized it, I don't think we were capable of doing it at that stage.

But I won't make the same mistake again. I'm happy to invest in OpenAI and help them expand; I think it's necessary. And when Anthropic came to us, I was happy to be an investor and help them develop.

It's just that we couldn't do it at the time. If I could do it over, and if we were as strong then as we are now, I would be very happy to do these things.

Why didn't Nvidia build a cloud

Dwarkesh Patel:

It's interesting. For years, Nvidia has been the company selling shovels in the AI gold rush and making a lot of money. Now you're starting to put that money back in. There are reports that you invested $30 billion in OpenAI and $10 billion in Anthropic, and those companies' valuations keep rising.

So if you look back over the past few years: you gave them compute, you saw the trend, and their valuations were a tenth of what they are now, even a year ago far below today's. And you already had plenty of cash.

In fact, Nvidia could have become a foundation model company itself, or made these large investments much earlier at far lower valuations, just as you're doing now.

So I'm curious: why didn't you do it earlier?

Jensen Huang:

We did it as soon as we could. If I could have, I would have done it earlier. When Anthropic needed our support, I would have given it. But we didn't have that strength.

It was neither within our means nor within our decision-making habits.

Dwarkesh Patel:

Is it a matter of money, or...?

Jensen Huang:

Yeah, it's the size of the investment. We had little tradition of outside investment, let alone at that scale. And we didn't realize it was necessary.

My thinking at the time was that they could go to venture capital, like other companies. But what they wanted to do was beyond what venture capital could support. What OpenAI wanted to do could not be funded by venture capital.

That's what I realized later. But that's where they were smart: they realized at the time that they had to take that path. I'm glad they did. Even though our absence led Anthropic to turn to other partners, I still think it's a good thing. Anthropic's existence is good for the world, and I'm happy about it. Some regrets are acceptable.

Dwarkesh Patel:

The question comes back again: now that you have so much cash and are still growing, how should you use the money?

One idea is that there is now an intermediate ecosystem that helps these AI labs convert capital expenditure (capex) into operating expense (opex), so they can rent their compute.

Because GPUs are expensive, but as models advance they can keep producing higher-value tokens over their life cycle. And Nvidia itself has the capacity to cover these upfront capital expenditures. For example, there are reports that you've backed CoreWeave by up to $6.3 billion and invested $2 billion.

Then why didn't Nvidia become a cloud provider itself? Why not be a hyperscaler, build your own cloud, and rent out your compute? After all, you have the cash capacity.

Jensen Huang:

It's a corporate philosophy, and I think a wise one: we should do everything that is necessary, and as little as possible.

That means we do something only if, were we not to do it, it truly wouldn't get done.

If we hadn't taken these risks, if we hadn't built NVLink, if we hadn't built entire warehouse-scale systems, if we hadn't built this ecosystem, if we hadn't invested 20 years in CUDA, if we hadn't built the CUDA-X domain libraries, whether for ray tracing, image generation, early AI models, data processing, structured data, or vector processing, they would not exist.

I am fully convinced of that. We even developed a library called cuLitho for computational lithography; if we didn't do it, no one would.

That is why we did these things, and that is where we should be fully committed.

But at the same time, there are already many cloud providers in the world. Even if we don't do it, someone will. So, following the principle of "do everything necessary, but as little as possible", this idea has long existed inside the company, and every decision I make is taken from that perspective.

In the cloud space, if we hadn't supported CoreWeave, these new kinds of AI clouds might not exist. Without our support they would not have reached today's scale. The same for Nscale and Nebius; they wouldn't be here without our support. Now they are thriving.

But is this a business we should run ourselves? No. We stick to the principle: do what is necessary, and as little as possible. So we invest in the ecosystem because I want the whole ecosystem to flourish. I want our architecture to connect as many industries and as many countries as possible, so that AI can be built on a global scale, and built on American technology.

That is the vision we are advancing.

At the same time, as you just mentioned, there are a lot of good foundation model companies, and we try to invest in them.

And the other thing is, we're not going to "pick the winner." We want to support everyone. This is both our business needs and what we are willing to do. So when I invest in one of them, I invest in others。

Dwarkesh Patel:

Then why don't you pick the winner?

Jensen Huang:

Because it's not our role. That's the first point.

Second, when Nvidia got started, there were about 60 graphics companies, 60 companies doing 3D graphics. Only we survived. If you had had to pick one of those 60 companies, we were probably among the least favoured.

And this was before your time, but Nvidia's first graphics architecture was completely wrong. Not slightly off; fundamentally wrong. We designed an architecture that developers could barely support, one destined to fail. We started from a rational first-principles theory and ended up with the wrong solution.

Everyone thought we couldn't succeed, but we survived. So I have enough humility to admit that, and not to pick winners. Let them develop on their own, or support everyone.

Dwarkesh Patel:

One thing I don't understand. You said you don't deliberately prioritize supporting these new cloud providers, but you just said that without Nvidia they might not exist. How do those two things fit together?

Jensen Huang:

First, they themselves must want to exist and come to us for help. When they have a clear will, a business plan, capability, passion (of course they must have some capability of their own), and some investment support is needed in the initial phase, we will be there.

But it is essential that they build their own flywheel as soon as possible. Your question was whether we want to get into the financing business. The answer is no. We don't want to be a financial institution. There are already many financiers in the market, and we would rather work with those financial institutions than do it ourselves.

So our goal is to focus on our own business and keep the business model as simple as possible, while supporting the whole ecosystem.

When a company like OpenAI needed $30 billion of investment before its IPO, we trusted them very much. I personally believe they are already a remarkable company and will become even better. The world needs them, everyone wants them, and I want them. They have all the elements of a winner, so we support them and help them expand.

So we do it because they really need us to. But our principle remains: do everything necessary, and as little as possible.

Dwarkesh Patel:

This question may be a little obvious, but we've been in a GPU shortage for years, and it only intensifies as the models get stronger.

Jensen Huang:

Yes, we do have GPU shortages.

Dwarkesh Patel:

In allocating these scarce resources, Nvidia is thought not to simply sell to the highest bidder, but to consider, for example, ensuring these new cloud providers survive: some for CoreWeave, some for Crusoe, some for Lambda.

First of all, do you agree? And second, what's in it for Nvidia?

Jensen Huang:

I think your premise is wrong. Of course, we look at these things very carefully.

First of all, if you don't place a purchase order (PO), there's nothing to discuss. So first, we work with all customers on demand forecasts, because these products have long production cycles and data centres have long construction cycles. Matching supply to demand through forecasting comes first.

Second, we forecast with as many customers as possible. But in the end you still have to place the order; if you don't order, there's nothing I can do. So at some point, it's first come, first served.

That said, if your data centre isn't ready, or some key components aren't ready and you can't deploy the system for the time being, we may prioritize other customers. That's just to maximize the overall throughput of our factories.

Beyond that, the rule of priority is first come, first served. You have to place the order; if you don't, there's no way.

Of course there are plenty of stories out there, like people saying Larry and Elon asked me for GPUs over dinner. We did have dinner together, and it was a nice dinner, but they never asked for GPUs. They just needed to place orders. Once the order is in, we do our best to supply the capacity. It's not that complicated.

Dwarkesh Patel:

So it sounds like a queueing mechanism, depending on when you place the order and whether the data centre is ready. But it's still not "highest bidder wins," is it?

Jensen Huang:

We never do that.

Dwarkesh Patel:

Never to the highest bidder?

Jensen Huang:

Never. Because it's bad business practice.

You set the price; the customer decides whether to take it. I know some companies in the industry raise prices when demand rises, but we don't. That has never been our approach. Customers can rely on us. I prefer to be a reliable presence, the foundation of the industry. You don't have to speculate about price changes.

If I give you a quote, that's the final price. Even a surge in demand won't change it.

Dwarkesh Patel:

That's one of the reasons your supply relationships are so stable, right?

Jensen Huang:

We've been working together for almost 30 years. There is no formal legal contract between Nvidia and TSMC; it's more a fair understanding between us. Sometimes I'm right and sometimes I'm wrong; sometimes I get better terms, sometimes worse. But on the whole the relationship is remarkable, and I can completely trust and rely on them.

And with Nvidia, one thing is certain: this year Rubin will be great, next year Vera Rubin Ultra launches, the year after that Feynman, and the year after that a chip whose name I haven't announced yet. Every year, you can count on us. Go around the world and try to find another ASIC team to whom you could say: "I can bet the whole company on you, and I trust you'll be here every year to support me."

"My token costs will drop by an order of magnitude every year, and I can trust that the way I trust a clock." I just said the same thing about TSMC: there has never been another fab in history you could say that about.

But today you can say that about Nvidia. You can count on us every year.

If you want to buy a billion dollars of AI compute, no problem. A hundred million, no problem. Ten million, even a single rack, no problem. Even if you just want to buy one graphics card. And if you want to place the next $100 billion AI factory order, that's fine too.

Today, only one company in the world can say that. And I can say the same about electricity: I want to buy a billion dollars' worth, no problem. All we have to do is plan together, walk through the process, and do what mature businesses do.

So I think it took us decades to become the bedrock of the AI industry. It takes enormous investment and enormous focus, and corporate stability and consistency matter a great deal.

Why does Nvidia refuse to make multi-path bets

Dwarkesh Patel:

This actually raises an interesting question. We've been talking about power and memory bottlenecks. Now, if you're in a world where you already own most of N3 capacity, and you'll probably have most of N2, would you consider turning back to the idle capacity of older process nodes, like 7 nanometers?

For example, because AI demand is so great and leading-edge capacity can't keep up, you might redo a Hopper or even an Ampere, with all of today's experience in numerical optimization and system design. Do you think that happens before 2030?

Jensen Huang:

It's not necessary. The reason is that each generation's progress is not just a shrink in transistor size. We also do a lot of engineering in packaging, stacking, numerical formats, and system architecture. Once you've reached this point, going back to an old version requires a scale of R&D investment that isn't affordable. We can afford to keep moving forward, but I don't think we can afford to go back.

Of course, as a thought experiment: suppose one day the world says leading-edge capacity will never grow again. Would I go back and use 7 nanometers right away? Of course, no question.

Dwarkesh Patel:

I've talked with people about this before: why doesn't Nvidia advance several completely different chip projects at the same time? For example, you could build a wafer-scale architecture like Cerebras, a big die like Dojo, or something without CUDA.

You have the resources and the engineering talent; you could do it in parallel. If no one knows where AI or its architectures are going, why put all the eggs in one basket?

Jensen Huang:

Of course we could. It's just that we don't see a better solution. We've simulated all of them, and in our simulations they often come out worse. So we won't do it. What we do now are the projects we truly want to do and believe are right.

Of course, if the workload itself changes dramatically in the future (not just the algorithms, but the workload itself), then we might also add other types of accelerators.

For example, we recently engaged with Grok, and we'll integrate Grok into the CUDA ecosystem. We're doing this right now. This is because the value of a token has become very high, so the same model, served at different response speeds, may correspond to different price levels.

A few years ago, tokens were almost free, or nearly so. But now, different customers demand tokens differently, and the customers themselves can make a lot of money. For example, for software engineers: if I can give them faster token responses and make them more productive than they are today, they would pay for it.

But that's a recent market. So I think, for the first time, we really have the ability to create different market tiers on top of the same model, based on response time.

That's why we decided to extend the Pareto frontier toward a branch of inference that responds faster but at lower throughput. In the past, throughput was always what mattered most. But we now believe there may be high-ASP (average selling price) tokens in the future: even if the factory produces fewer of them, the unit price would more than compensate.

That is why we did it. But if we're talking about the architecture itself, I'd say that if I had more money, I would invest more in the existing architecture.
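The tiering idea above can be sketched numerically: serve the same model in a bulk tier (maximum tokens per second) and a fast tier (fewer tokens per second at lower latency, but a higher ASP per token). All rates and prices below are hypothetical, chosen only to show how a lower-throughput branch of the Pareto frontier can still win on revenue:

```python
# Hypothetical sketch of tiering the same model by response time.
# A fast, low-latency tier produces fewer tokens per GPU-hour but can
# charge a higher ASP (average selling price) per token.
# All numbers are illustrative assumptions, not Nvidia figures.

def revenue_per_gpu_hour(tokens_per_second: float, usd_per_m_tokens: float) -> float:
    """Revenue one GPU earns per hour at a given throughput and price."""
    return tokens_per_second * 3600 / 1e6 * usd_per_m_tokens

bulk = revenue_per_gpu_hour(tokens_per_second=20_000, usd_per_m_tokens=2.00)
fast = revenue_per_gpu_hour(tokens_per_second=5_000, usd_per_m_tokens=10.00)

print(f"bulk tier: ${bulk:.2f}/GPU-hour")  # high throughput, low price
print(f"fast tier: ${fast:.2f}/GPU-hour")  # 4x fewer tokens, 5x the price
```

Under these assumed numbers, the fast tier earns more per GPU-hour despite producing a quarter of the tokens, which is the economic case for extending the frontier toward low-latency operating points.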

Dwarkesh Patel:

I find this idea of "high-premium tokens," and the market tiering that follows from it, very interesting.

Last question. If the deep learning revolution had never happened, what would Nvidia be doing today?

Jensen Huang:

Gaming, of course. But beyond that, accelerated computing. That is what we have always done.

Our company's founding premise was that Moore's law would slow down, and that general-purpose computing, while good for many things, is not ideal for many computing tasks. So we paired the GPU architecture with the CPU and let it take over the CPU's heavy workloads. Different code, kernels, and algorithms can be offloaded to run on the GPU, and an application can then accelerate 100x or 200x.
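The 100x-to-200x figure is governed by Amdahl's law: only the offloaded fraction of the runtime speeds up, so whole-application gains depend on how much of the code lives in accelerable kernels. A small sketch (the fraction and kernel speedup are hypothetical):

```python
# Amdahl's-law sketch of GPU offload: only the accelerated fraction of the
# runtime speeds up. The fraction and kernel speedup below are hypothetical.

def overall_speedup(offloaded_fraction: float, kernel_speedup: float) -> float:
    """Whole-application speedup when part of the runtime is offloaded."""
    return 1.0 / ((1.0 - offloaded_fraction) + offloaded_fraction / kernel_speedup)

# 99% of the runtime in kernels that run 100x faster on the GPU:
print(f"{overall_speedup(0.99, 100):.0f}x")  # about 50x overall
```

This is one reason domain libraries matter: as kernel speedups grow large, the overall gain is capped near 1/(1 - p), so raising the offloaded fraction from 90 percent to 99 percent raises the ceiling from 10x to 100x.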

So where does it apply? Engineering, science, physics, data processing, computer graphics, image generation, all sorts of places.

So even if AI didn't exist today, Nvidia would still be a very big company. The reason is truly fundamental: general-purpose computing's ability to keep scaling has largely come to an end, and one way of improving performance, not the only way but the most important one, is domain-specific acceleration.

We started with computer graphics, but there are many other areas. Scientific computing, particle physics, fluid simulation, structured data processing, and more would all benefit from various kinds of accelerated algorithms.

So our mission has been to bring accelerated computing to the world: to keep advancing applications for which general-purpose computing is infeasible or cannot scale to sufficient capability, and to help achieve breakthroughs in science. Some of our earliest applications were molecular dynamics, seismic processing for energy exploration, and of course image processing.

In all these areas, general-purpose computing is simply too inefficient. So yes, without AI I'd be sad. But it is precisely our computing progress that democratized deep learning. We made it possible for any researcher, any scientist, any student, anywhere, to do amazing scientific work with a PC or a GeForce graphics card. That fundamental commitment has never changed, not at all.

So if you look at GTC, you'll find that a significant part of the opening isn't really about AI. It's not about AI, but it's still very important. I know AI is interesting and exciting.

But there are still a lot of people doing very important work that has nothing to do with AI. Tensor math is not the only kind of computation they do. And we want to help all of them.

Dwarkesh Patel:

Jensen, thank you very much.

Jensen Huang:

You're welcome. I enjoyed this conversation.
