
In the AI era, the ultimate battle over token supply and demand

2026/04/29 01:01

The strongest models are becoming a weapon for the few

Video title: The Supply and Demand of AI Tokens | Dylan Patel Interview
Video by Invest Like The Best
Compiled by Peggy, BlockBeats

Editor's note: Against the backdrop of a continuing leap in AI model capability and the mass adoption of tools like Claude Code and Cursor inside enterprises, industry discussion is shifting from "how powerful are the models" to "how do models go into production". But as AI programming, automated analysis and data modelling become the new consensus, a more fundamental question is emerging: when the cost of implementation is collapsing rapidly, is the real scarcity manpower, capital, or access to frontier models and tokens?

Left: Patrick O'Shaughnessy; right: Dylan Patel

This post is compiled from a conversation between Patrick O'Shaughnessy and Dylan Patel, founder of SemiAnalysis. Dylan has long focused on AI infrastructure, the semiconductor supply chain and model economics. Starting from his own firm's soaring Claude Code spending, the dialogue covers how AI is changing corporate organization, information services, token demand, the compute supply chain, and social sentiment.

What is most interesting about this conversation is not that another model has been updated or another benchmark beaten, but that it offers a way to understand the AI economy: seeing AI as a production system that is reallocating implementation capacity, organizational efficiency and industrial profits, not merely an upgrade of software tools.

This conversation can be roughly understood from five angles.

First, the cost of implementation has collapsed. In the past, ideas were not scarce; what was genuinely hard was turning them into products, systems and deliverables. Now Claude Code lets non-technical staff write code, build applications and do data analysis: starting from a few people using the model, they can produce work that would once have required a long-lived maintenance team. Annual Claude Code spending at SemiAnalysis has reached $7 million, more than a quarter of its salary bill, which means AI is no longer just an efficiency tool but is turning into new production capital for enterprises.

Second, the information services industry will be rewritten first. Dylan's business is essentially selling analysis, consulting and datasets, which is the area most easily commoditized by AI. Chip reverse engineering, power-grid modelling, macroeconomic indicators: work that once required sustained team effort can now be turned into usable products by a few people in a few weeks. AI's pressure on information service providers is therefore not "will it replace us?" but "who can rebuild fastest?". Firms that do not use AI will be commoditized sooner, while those that do must keep raising the bar to avoid being displaced by the next, more efficient competitor.

At a deeper level, tokens are becoming a new production resource. In the past, enterprises bought software subscriptions and the central question was whether the tool worked well; now, access to frontier models, rate limits, enterprise contracts and token budgets are starting to determine production capacity directly. Stronger models do not necessarily mean higher costs, since smarter tokens may need fewer steps to complete higher-value tasks. The real competition is moving from "who uses AI" to "who gets the strongest model and spends the most expensive tokens on the highest-value scenarios".

This demand will also propagate through the entire supply chain. Surging token usage ultimately becomes constant pressure on capital expenditure for GPUs, CPUs, memory, FPGAs, PCBs, copper, semiconductor equipment and wafer fabs. This is the logic behind the "bullwhip effect" mentioned in the conversation: downstream, it looks like nothing more than rising demand for model calls, but transmitted upstream it becomes amplified orders, capacity expansion and price increases. AI's redistribution of industry profits is therefore not limited to model companies and NVIDIA, but keeps spilling along the semiconductor and data-centre supply chains.
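The bullwhip effect described here can be sketched numerically. The following is a minimal toy model with invented parameters, not real industry data: each upstream tier over-orders a little relative to the demand signal it observes, so a modest rise in downstream token demand amplifies into much larger orders at fabs and equipment makers.

```python
# Toy bullwhip-effect sketch (hypothetical numbers): each upstream tier adds
# a safety margin on top of the demand growth it sees, so growth amplifies
# as it propagates up the supply chain.

def upstream_orders(demand_growth, tiers, overshoot=0.5):
    """Each tier orders for the growth it observes plus an overshoot margin."""
    orders = [demand_growth]
    for _ in range(tiers):
        orders.append(orders[-1] * (1 + overshoot))
    return orders

# 20% growth in model API calls, passing through four upstream tiers
# (e.g. GPUs -> memory -> PCBs/substrates -> wafer fabs/equipment).
growth = upstream_orders(0.20, tiers=4)
print([f"{g:.0%}" for g in growth])
```

With these invented parameters, a 20% rise in downstream demand turns into order growth several times larger by the time it reaches the top of the chain, which is the shape of the dynamic the text describes.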

Finally, the social backlash against AI could come early. Public concerns about job substitution, energy consumption, data-centre expansion and the concentration of power will rise in step as AI enters the workflow. Dylan even predicted a massive protest against AI within three months. For model companies, continuing to stress that "AI will change the world" does not necessarily ease anxiety; it may instead reinforce ordinary people's sense of losing control. The AI industry therefore needs to prove not just its technological capability, but how it creates concrete, perceptible public value in the present.

Today, the core issue of AI is moving from "what models can do" to "who can access models, how they are used, and who captures the value they create". In this sense, the subject of this discussion is not just Claude Code, Anthropic or any single AI company, but a structural reordering around productivity, capital spending, organizational efficiency and social acceptance.

The following is the transcript (consolidated for readability):

TL;DR

• The core variable of AI is moving from "can it be done" to "is it worth doing"; with implementation costs collapsing, the real scarcity is high-value ideas that models can scale.

• Claude Code spending has already reached 25 percent of salary costs, and that is just the beginning; AI is shifting from software tool to new corporate production capital.

• Competition over frontier models is no longer just a contest of capability but a contest over token access; new business moats may belong to those who can obtain the strongest models earlier and more reliably.

• The information services industry will be recast by AI first, as the production costs of data, analysis and research fall rapidly; slow firms will be commoditized sooner.

• Token demand will not slow down because older models get cheaper, since each stronger model unlocks new high-value use cases and pushes users toward more expensive frontier models.

• The greatest change AI brings is not that people work less, but that the same number of people do several times the work; those who cannot create and capture token value will be locked into a "permanent bottom".

• Compute shortages are spreading through the entire semiconductor supply chain, from GPUs, CPUs and memory to PCBs, copper, wafer fabs and equipment makers; AI demand has become a price driver for the whole industry chain.

• The real question is not just how much money model companies earn, but how much "Phantom GDP" is created by the decision-making, efficiency and chain effects of tokens.


Claude Code became a new labor force

Patrick O'Shaughnessy (Moderator):
You told me a wonderful story about your team's huge change in usage this year. Can you repeat it? What does it tell you about what is happening in the world?

Dylan Patel (SemiAnalysis founder):
Last year, we thought we were already heavy AI users. Everybody was using ChatGPT, everybody was using Claude, and I gave the team all the subscriptions they wanted. At that point, the company was spending on the order of tens of thousands of dollars.

But this year, spending started to soar. The real starting point was probably last December, with the appearance of Opus. That includes Doug, our CEO Douglas Lawler. He basically took the lead in pushing non-technical people to write code with AI, and bit by bit he pulled the whole company in. The engineers were using it too, of course, but from January this year our spending was clearly climbing, and then it broke out fast.

We later signed an enterprise contract with Anthropic. Last time I talked to you, our annual spending was about $5 million; it's now $7 million.

Patrick O'Shaughnessy:
And that's last week's figure.

Dylan Patel:
Yeah, and much of that is usage itself. What's really interesting is that people who had never written code before are now using Claude Code, and some spend thousands of dollars a day. In company terms, we're spending $7 million a year on Claude Code against a salary bill of about $25 million. That is, Claude Code spending has passed 25 percent of salary spending.

If this trend continues, it may even exceed 100 percent of total salaries by the end of the year. That's a little scary. Fortunately, I don't need to choose between "people" and "AI" right now, because the company is growing fast. It's more like: I don't need to hire as quickly, but I can spend more money on AI, it genuinely works, and the company can grow faster.

But I think other companies will sooner or later face the question: if one person can do the work of five, ten, even 15 with Claude Code, what happens next? First, there may indeed be downsizing; and second, the range of such uses is now extremely wide.

For example, we have a reverse engineering laboratory in Oregon that we've been building for a year and a half. It has a lot of high-end equipment, like microscopes and scanning electron microscopes. The lab's core work is reverse-analyzing chips, extracting chip structures and analyzing the materials used to manufacture them. That's also one of the datasets we sell.

Analyzing that data, however, used to be a very slow process. Now one person on our team has spent just thousands of dollars in Claude tokens, and the result is an application. The application is GPU-accelerated and runs on our servers at CoreWeave. All we have to do is send it a chip image, and it automatically labels the material at each location: here's copper, here's tantalum, here's platinum, here's cobalt. Then you can very quickly run finite element analysis on the entire chip stack, fully visualized, with a complete graphical interface and dashboard.

This person had worked at Intel before, and he said that in the past it took a whole team to build and maintain something like this. It's incredible to have something like that just drop into the company.

There's another example I find particularly interesting: Malcolm. He was an economist at a big bank, whose economics department might have 100 to 200 people. He's building something amazing now.

He brought in data including FRED data, employment reports and other datasets from different APIs. We also signed contracts with some data providers and got API access. He pulled all the data in and started running regressions, analysing the effects of different economic changes on inflation or deflation.

The U.S. Bureau of Labor Statistics has a job classification system with approximately 2,000 tasks. Malcolm uses AI to assess which tasks AI can now perform and which it cannot, scoring each against a rubric. The results show that about 3% of the tasks can now be performed by AI.

So he created an indicator measuring what AI could do and how deflationary it would be once AI did it. Output may rise, but because costs fall so much, GDP may in theory contract. He calls it "Phantom GDP".
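The mechanics of this idea can be sketched in a few lines. This is a toy version with invented task names, scores and dollar figures (the analysis described above used roughly 2,000 BLS task categories): score each task against a rubric, count the share AI can do, and compare real output with measured dollar GDP once AI collapses the price of that work.

```python
# Toy sketch of the "Phantom GDP" idea (all numbers invented): score tasks
# against a rubric, find the automatable share, and measure how dollar GDP
# shrinks even though the same output is still produced.

tasks = [  # (task name, rubric score 0-1, annual spend in $B)
    ("write earnings summary", 0.9, 4.0),
    ("basic data cleaning", 0.8, 6.0),
    ("negotiate contract", 0.2, 5.0),
    ("on-site equipment repair", 0.1, 10.0),
]
THRESHOLD = 0.7          # rubric score above which we call a task automatable
AI_COST_FACTOR = 0.05    # assume AI does the work at ~5% of the human cost

automatable = [t for t in tasks if t[1] >= THRESHOLD]
share = len(automatable) / len(tasks)

human_gdp = sum(spend for _, _, spend in tasks)
# Same output, but automated tasks are now priced at AI cost:
measured_gdp = sum(
    spend * AI_COST_FACTOR if score >= THRESHOLD else spend
    for _, score, spend in tasks
)
phantom_gdp = human_gdp - measured_gdp  # value still created, no longer measured

print(f"automatable share: {share:.0%}")
print(f"measured GDP falls by ${phantom_gdp:.1f}B despite unchanged output")
```

The point of the sketch is the sign of the effect: output is constant, yet measured GDP contracts by exactly the spending that moved from human prices to AI prices.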

Based on this concept he built a whole set of analyses and created a completely new language-model benchmark containing about 2,000 evals.

Patrick O'Shaughnessy:
He did all this by himself?

Dylan Patel:
Yeah, he did it all by himself. He said to me, "Man, this would have taken a 200-person team of economists a year." He's all in on Claude; he says everything has changed.

Patrick O'Shaughnessy:
As a business operator, how do you think about this? You've gone from spending almost nothing to nearly 25 percent of salary spending, and it's still rising. At what point do you think, "Wait, should I hit the brakes? Should I control costs?" Maybe we don't always have to use the frontier model that came out today, like Opus 4.7; we could switch to a cheaper one?

Dylan Patel:
After all, I'm in the information business. We sell analysis, we consult, we create datasets. I don't see any reason these things won't be commoditized at a very fast pace.

If I don't keep improving, then take the first data product I ever sold: more people are now doing similar things. We can still sell it because we keep making it better and better. But the way we did it in 2023 is not really how people do it now. If I don't keep raising the bar, I'll be commoditized. If I don't move fast enough, I'll lose my advantage.

So yes, AI will commoditize a lot of things, just as it's commoditizing software. But those who move fast enough, own the customer relationships, keep delivering excellent service and keep improving it won't shrink; they'll grow faster. Those who are complacent and do nothing will lose.

So it's kind of a survival problem: if I don't use AI, someone else will, and then they'll beat me.

Another simple example is the energy sector. Over the past year we've had several energy analysts trying to build an energy model. The model is very complex, and the energy data services market is about $900 million in size, so it's obviously a big market I'd like to enter. But even though some of our team worked on it for a year, we weren't really in the energy data services business.

Then Claude Code went crazy. We have a person in charge of energy and industrial data for data centres, Jeremy. After he started using Claude Code, things suddenly changed. In three weeks, he spent a lot of money, about $600,000 a day, which is genuinely wild. But he captured every power plant in the United States and every transmission line above a certain voltage level, and built a map of the entire U.S. grid from various open data sources, with extensive demand-side data feeds.

We've made it into a dashboard that lets us view and analyse power shortages and surpluses across U.S. microregions, along with many other details. It'll go live in a few weeks.

We then showed it to a number of customers who had already bought our datasets, including energy traders. After seeing it they said, "Wow, how long have you been working on this? It's good, better than that other company's." And then we learned that that company had 100 people who had been doing this for 10 years.

Of course, our product isn't as complete or robust as theirs, but in some respects it's better. So now I'm the one commoditizing these energy data services. But in turn, if I don't run faster, who will commoditize me?

So from a business point of view, the question isn't "did I spend a lot of money?" Yes, I spent a lot. The question is what I get for that money. Did it bring in more revenue? If the answer is yes, the money is worth it.

Patrick O'Shaughnessy:
Are you worried that, eventually, the people who control capital and are responsible for investing it, who often hire you for what you do, will say, "We have analysts ourselves, and they're smart. Why don't we just do it ourselves?" If it becomes that easy, at what point does all this flow back to the investment institutions? After all, they stand to gain maximum leverage from these data and insights.

Dylan Patel:
First, any information services business is essentially the same: I capture value from a piece of information, and obviously the client captures value from that information too.

If I sell you information for a dollar, you're willing to spend it because you know it helps you make a decision that earns you more than a dollar. In other words, you get the arbitrage: you make more money from the information than I make from selling it.

Investment funds certainly have their own information-service capability. Institutions like Jane Street and Citadel in particular go very deep and detailed with data. But they still purchase our data, they continue to do so, and our cooperation keeps growing.

I think there's some kind of "it factor". We move faster, we're more flexible, we have smaller teams, and we focus on a very specific area: AI infrastructure and the enormous changes it's driving, including AI, the token economy and everything attached to it. We can see the direction earlier and build things faster.

So of course investment professionals try to do some of it themselves. But more often they buy our data directly and build on top of it. For them, buying our data is usually cheaper than building from scratch. And of course, eventually someone will try to do it themselves.

Tokens become a new production resource

Patrick O'Shaughnessy:
I think every time I talk to you, I end up at the same question: the supply and demand of tokens. That's what interests me most right now. Has your own experience given you any new understanding of the demand side? After feeling it so directly, has your judgment about token demand changed?

Dylan Patel:
If we step back and look at it from a macro level, Anthropic's ARR may have grown from $9 billion to around $35 billion or $40 billion. By the time this show airs, it might have reached $40 billion to $45 billion.

But their compute has not grown to the same extent. If you run the numbers, and assume they haven't cut R&D compute — obviously they haven't, because they keep publishing new models like Metis, Opus 4 and Opus 4.7 — then one thing follows: even if you attribute all of their added compute to inference, the implied gross margin is about 72 percent.

In reality, part of the added compute is likely going to R&D, so their actual gross margin may be higher than 72 percent. And at the beginning of this year, when someone leaked parts of their financing documents, it was about 30 percent.
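The implied-margin argument here is back-of-the-envelope arithmetic. The sketch below uses the conversation's rough revenue figures; the added-compute cost is an assumed number chosen purely for illustration, not a reported figure:

```python
# Back-of-the-envelope version of the implied gross-margin argument above.
# Revenue figures are the interview's approximations; the compute cost is an
# invented assumption, attributing ALL added compute to inference.

revenue_start, revenue_now = 9e9, 40e9   # ARR in $/yr (approximate)
added_inference_cost = 8.7e9             # assumed annual cost of added compute

added_revenue = revenue_now - revenue_start
gross_margin = 1 - added_inference_cost / added_revenue
print(f"implied gross margin on new revenue: {gross_margin:.0%}")
```

The logic is the interesting part: if revenue grew far faster than the compute that could possibly be serving it, the implied margin on inference is high, and any compute actually diverted to R&D only pushes the true figure higher.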

How does a business raise its gross margin that much in such a short time? In principle, because demand is too high. They can tighten usage tiers, rate limits and restrictions. What really matters is whether you have an Anthropic account manager and an enterprise contract and can get the priority you need. Otherwise, token access ends up being rationed very aggressively.

Whoever can pay gets access. Anthropic faces the same problem — not really a problem, just how capitalism works. Yes, clients may pay them $40 billion a year for tokens, but the value those tokens create for clients is far more than $40 billion.

The value each enterprise creates per token is different. But as models become more intelligent, what really matters is who gets the most intelligent tokens and uses them for the most valuable things.

As an individual, you have to decide how to use these tokens to grow a business and create value. Lots of people want tokens and will consume them. But the ordinary SaaS startups in San Francisco producing software with Claude don't actually make much money. So sooner or later they'll be priced out of tokens.

Patrick O'Shaughnessy:
That's exactly what happened to me on the way here today. The moment Opus 4.7 was published, I wanted to use it immediately. And then I got cut off. I couldn't use it. I can't even imagine going back to 4.6, even though I've been perfectly satisfied with 4.6 these past few weeks — it's already strong.

Are you surprised that people are so determined to use the most expensive, most frontier model?

Dylan Patel:
Not at all. One of the funniest memories I have from the past month and a half is practically getting on my knees with my friend Leopold in front of a co-founder of Anthropic, begging him to give us Metis access.

We knew it existed, so we said, "Please, let us use it." And he said, "I don't know what you're talking about."

Patrick O'Shaughnessy:
How did you react when that price sheet, or the eval card, came out?

Dylan Patel:
There were rumors around the Bay Area beforehand, so we roughly knew it would be very strong. If you look at the benchmarks — and of course benchmarks will change — Mephisto / Metis is probably the biggest leap in model capability in the last two years.

I think this is very important: it's strong enough that Anthropic doesn't even want to release it fully. They have published pricing to some clients, but only selectively, for example in security-related settings. It's maybe five or ten times the token cost, and they still don't want to release it broadly because they're worried about its real-world impact.

So what we have now, Opus 4.7, is a worse, weaker version of it. And they made it clear in the model card that the cybersecurity capabilities were deliberately degraded. I don't know if you read that part.

So my advice is: whoever you are, as long as you have enough capital, go buy an Anthropic enterprise subscription and pay per token instead of using a regular subscription. That way you won't be rate-limited so easily.

And then you have to think about how to point these tokens at the highest-value tasks and make money from them. Because fundamentally, in a year or two, a lot of business will essentially be token arbitrage. Tokens are powerful; the point is where you aim them.

In the next three or four years, the model itself may know how tokens should be used and how to maximize their value.

If you look back at any benchmark, you'll find that reaching a given capability level once cost X, and now costs maybe one hundredth, or even one thousandth, of that. For example, when DeepSeek reached GPT-4-level capability, it cost about one percent of what GPT-4 did. Since then, the cost of GPT-4-class models has kept declining.

Of course, nobody really cares about GPT-4-class models anymore. What you want is a frontier model, because that's what creates real economic value. GPT-4-class models can still be used for some scenarios, but those are usually smaller.

So what really drives demand is not that old capabilities get cheaper, but that new use cases keep emerging. Right now you're using an Opus 4.6 or 4.7-level model. A year from now, acquiring the same quality of model capability as today might cost $70,000, maybe 100 times less.

But it doesn't matter, because by then I'll be using a stronger model to do something more valuable.

Patrick O'Shaughnessy:
Anthropic's Metis is more expensive as a model, but it consumes far fewer tokens to finish the same thing. So for most tasks it's actually cheaper than Opus 4.6.

Dylan Patel:
Because it's much more efficient. Even if each token is "smarter" and more expensive, it needs fewer tokens to get the job done.
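The per-task arithmetic is simple enough to write down. The prices and token counts below are hypothetical, not Anthropic's actual pricing; the point is only that a higher per-token price can be dominated by a lower token count.

```python
# Sketch of the per-task cost point above (invented prices and token counts):
# a model that costs more per token but solves the task in far fewer tokens
# can still be cheaper per task.

def task_cost(price_per_mtok, tokens_used):
    """Dollar cost of a task: price per million tokens times tokens consumed."""
    return price_per_mtok * tokens_used / 1_000_000

older = task_cost(price_per_mtok=15.0, tokens_used=400_000)    # cheap tokens, many steps
frontier = task_cost(price_per_mtok=75.0, tokens_used=50_000)  # 5x price, 8x fewer tokens

print(f"older model:    ${older:.2f} per task")
print(f"frontier model: ${frontier:.2f} per task")
```

Under these assumptions the frontier model is five times the price per token yet cheaper per task, which is the trade-off the exchange above is describing.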

Patrick O'Shaughnessy:
The last time I saw you, Metis might have just been released, or the model card had just come out. You said it was strong enough to scare you a little. What did you mean by that?

Dylan Patel:
Anthropic's goal for 2025, going back even to 2024, was to have an L4-level software engineer inside the model by the end of 2025. Broadly speaking, they basically achieved that with Opus 4.6.

But what they didn't say is that if you look at Metis against the benchmarks, it's more like an L6 engineer. L4 is roughly a relatively junior software engineer, while L6 is already a fairly experienced one.

I remember Anthropic saying this model had been available internally since February. So in two months they jumped from L4 to L6. What happens next?

And when you look at the evolution of models, it's actually accelerating. Anthropic's release rhythm has compressed, and so has OpenAI's. Why? Because usually, to make a better model, you need a few things.

First, you need a lot of compute. Compute is very expensive and has its own timescale. We track these things; compute is actually growing, but it's largely fixed in the short term. You've already signed the contracts for it. It's pretty much settled. Of course there will be delays and adjustments along the way, and maybe some additions, but the whole is fixed.

Second, you need very good researchers. Companies are now willing to pay tens of millions of dollars for these people.

Finally, you need execution. Historically, execution has been very hard: if I had an idea, I had to make it real, and that was difficult. But now ideas are everywhere and realizing them has become very easy. Expensive, but very easy.

So the question becomes: how does a person decide which ideas to realize? When realization becomes that easy, you can realize more ideas and run faster on this treadmill.

This can happen in AI model research, which is why the model release cycle has compressed from six months to two. It can also happen in other areas. For example, I wanted to model every power plant in the United States and every transmission line, run regressions, and analyze supply and demand in microregions — and now I can.

The idea itself is cheap. The point is: which idea makes sense? Which idea is worth spending your capital on tokens to realize? The capability to execute is already there. That's the most critical change.

If costs keep declining — and they do — remember we haven't even really gotten Metis yet. Opus 4.7 was released just a few hours ago, and we're very excited inside the team.

What does this bring to the world? I think it reorders how the economy works.

In the past, implementation mattered because it was hard, and ideas were cheap. Now ideas are not only cheap but abundant, and implementation is easy too. So what's really worth doing comes down to ideas good enough to justify paying for, even when realizing them is already extremely cheap.

Patrick O'Shaughnessy:
So are you genuinely scared? Or does it simply introduce an uncertainty that's hard to grasp?

Dylan Patel:
Uncertainty certainly exists. But I do feel it carries a kind of fear. The question is: how does society restructure itself?

What matters when you live in a world where "the ability to build something" is no longer the constraint? What matters is: can you choose the right idea for AI to realize? Can you sell it, or sell what AI produces? Can you raise capital for that direction? That's what becomes important.

And that goes back to the earlier question: it's important to always have the latest model. Who gets the latest model?

Anthropic has a project — I know it's not actually called Earwig, but I like to call it Earwig on purpose to tease the Anthropic people. They only offer Metis to certain companies for cybersecurity. I think this is where things are going: models will become narrower and less accessible.

Note: an earwig is an insect, a small bug whose name suggests something that crawls into your ear. The nickname is a pun of sorts: on the one hand it sounds like some kind of bug; on the other, there's the association of "sneaking into the ear" and whispering to people.

I know OpenAI, Anthropic and other companies say they want everyone to have strong AI. But AI is very expensive. Who will pay for trillions of dollars of infrastructure? The people who have money and can build useful things with AI.

And you don't want anyone to distill your model, so you won't release it broadly. You'll give it to fewer and fewer clients. Then those customers will start fighting over tokens.

Unless Anthropic raises prices massively. They could double the price of Opus and I'd keep paying. I bet most users would keep paying. But I don't think even that solves their huge capacity problem.

So the question is: where does this cycle end? What happens when token usage, and the value added by those tokens, becomes increasingly concentrated in the hands of a few companies?

I don't have Metis right now. But who does? Top banks do. Right now they're probably only using it for cybersecurity, but I can imagine a world where, because I have an Anthropic enterprise contract and because the Anthropic people like me, they might give us slightly earlier access, or a slightly higher rate limit. I certainly hope that happens.

Then my competitors don't have that access, and I can beat them.

It could also go another way. Take Ken Griffin of Citadel, who has a very strong pulse on the market and is very rich. He might go sign an agreement with OpenAI or Anthropic: "I'll buy $10 billion a year of tokens, first in line. Every time you publish a new model, I get the first $10 billion of tokens, before anyone else."

What happens then? He could crush everyone in the market.

That's just one example. The same could happen in cybersecurity — which is why Anthropic worries that models make it easier to hack into systems. It could happen in an information services business like mine, where I use it to crush competitors.

I believe the impact of this is extremely broad. We don't know what these models can do. Anthropic doesn't know, OpenAI doesn't know, nobody knows. Ultimately it's up to end users to find out: where can these tokens be used? What can they build? What can you imagine?

Of course, this will greatly increase productivity, and it has a very positive side for humanity. But the question is: how concentrated will resources and access become?

Robots will take on the next wave of demand

Patrick O'Shaughnessy:
Right now robots, or robotics, consume almost negligible tokens compared to other fields. What do you think? Will it become the second demand curve? Every day, within a mile of here, a new robotics business appears trying to build something interesting.

Dylan Patel:
There's a concept called the "software-only singularity": the idea that the world's AI singularity might first happen only in software. But the problem is that much of the world is still physical. Ultimately the world will be organized around hardware, not just software. So I think the so-called software-only singularity is only a short phase, not the end state, because we end up in the physical world.

Once software becomes very easy, what is the genuinely hard part of robots? It's the programming, the microcontrollers, the actuators, and the control of all of that. Those are very hard right now.

AI models have an interesting property: they're actually very inefficient. It's only because we fed them huge amounts of data that they learned anything and in some ways surpassed humans.

But the current robot models, like VLA — Vision-Language-Action models — are hot right now, yet I don't think they're what will ultimately scale. They're inefficient, and we can't scale robotics data fast enough.

There must be some future way to train robot models at scale. Humans keep absorbing data throughout their lives, and the really remarkable thing about humans is that we're very sample-efficient: one example, two examples, and we can learn.

If that capability were applied to robots, it would be completely different. Once the software-level singularity arrives, it becomes very cheap for anyone to start building these models. And then people can start building genuinely useful robots.

So I think in the next six to 18 months we're going to start seeing real breakthroughs in robotics. The key capability is few-shot learning. Then there'll be a pre-trained robot model; you hire or buy a robot, show it a few examples, and it does the job.

You tell it to fold these two things together, and it can. You show it the quirks of the task, and it carries it through from start to finish. Believe me, I've done this myself many times.

So I think robots have a little learning ability。

It is true that there are already companies doing robotics, some for advertising and some for simple tasks. But it's going to be very subdivided. For example, robots that are used to fold clothes, or robots that are more subdivided into blackboards. It could be a rental service, it could be a model package, you could download it on a standard robot, it could do it, and then you could pay for it。

In any case, the physical-goods sector will see significant acceleration and a deflationary effect. And that will keep driving crazy growth in token demand. So I personally don't think token demand will slow down.

Patrick O'Shaughtnessy:
Did you learn anything new about the world from Metis — from its results and how it was built? In other words, if you take apart all the components of the scaling laws, like pre-training…

Dylan Patel:
It's a much bigger model than before — 100,000 Blackwells, equivalent to hundreds of thousands of previous-generation chips. Of course, TPU and Triton have different release rhythms, so the comparison isn't exact. But ultimately, yes, Metis is a significantly bigger model. It proves the scaling laws are still valid. It shows the trend line continues: put more into the models, and the models get better.

And it's not just "more compute makes models better." At the same time, we keep gaining compute efficiency. All the R&D energy the labs pour in turns into one thing: for a model at a given capability level, the cost of reaching that capability drops significantly every six months — now maybe every two months. But if I scale up big enough, I can still make a huge leap.
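As a toy illustration of that cost curve — assuming, purely hypothetically, that the cost of serving a fixed capability level halves every period (the interview only says it "drops significantly"):

```python
def capability_cost(initial_cost, months_elapsed, halving_months):
    """Cost to serve a fixed capability level, assuming it halves
    every `halving_months` months (a hypothetical illustration,
    not a figure from the conversation)."""
    return initial_cost * 0.5 ** (months_elapsed / halving_months)

# Hypothetical: $10 per million tokens today for some capability level.
print(capability_cost(10.0, 12, 6))  # six-month halving: 2.5
print(capability_cost(10.0, 12, 2))  # two-month halving: 0.15625
```

The point of the toy numbers: shortening the halving period from six months to two turns a 4x cost decline over a year into a 64x decline, which is why the same capability keeps getting dramatically cheaper even as frontier runs get bigger.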

So yes, it proves this keeps happening. Google and Anthropic are not GPU users on the training side. OpenAI should also launch a new generation of models. I think they're taking a more rational, principled small step forward on scaling. And this time, Anthropic made a huge leap.

This year we'll see better and better models, and the rhythm will only get faster.

Patrick O'Shaughtnessy:
We've been talking for a long time in this conversation and have hardly mentioned OpenAI. That would have been strange in the past.

Dylan Patel:
That's the fun part. A lot of people now would say, "So Anthropic has won, right?" They had Metis in February but hadn't even released it, because they felt they didn't need to. Their compute is sold out and revenue is growing by $10 billion a month. And today Opus 4.7 was released — all of this before OpenAI's rumored release that outlets like The Information have covered.

So on the surface Anthropic clearly leads, and OpenAI seems finished. But what's interesting is that Anthropic's compute is very limited, and they can only expand at a limited speed. Dario used to argue that OpenAI was too aggressive on compute and that Anthropic's scaling was more rational. But now Anthropic is probably thinking: we really should have had more compute.

OpenAI is fully capable of paying these bills. In fact, they've already raised a lot of money to add incremental capacity. Beyond that, they've been buying capacity from companies like Oracle, CoreWeave, SoftBank, Microsoft and others on a very aggressive, even "irresponsible" scale. Now they've also got Trainium from Amazon.

So OpenAI has done some crazy things on compute — they knew they needed more.

Interestingly, look at Opus 4.6 — set the model itself aside for a moment and consider the diffusion of the technology. You and I could use it on day one of release, but other businesses need time, and people need time to learn. The "Claude awakening" won't hit everyone at the same time. So by the end of the year, assuming an Opus 4.6-grade model, I don't think it's an exaggeration that the whole economy would be willing to spend $100 billion a year on it. After all, it's already at $40 billion.

Patrick O'Shaughtnessy:
It's basically just a linear extrapolation.

Dylan Patel:
Yeah, it's linear, not exponential. To get exponential growth you need better models. But Anthropic doesn't have the capacity to meet this demand. So assuming OpenAI or Google reaches this capability level next, whoever gets there can serve it.

Anthropic may be able to charge a 70% gross margin, but if OpenAI reaches the same capability level next, even at only a 50% gross margin it will eat all that incremental demand. And it probably also won't have enough compute to serve every user. So a model like Metis, if the world had enough compute, could have brought in $500 billion — maybe even more. The market demand for these tokens is that strong, and the available compute is extremely limited.

We've seen this in the H100 price boom. GPU service lives are lengthening too. Obviously even second-tier labs are sold out, never mind first-tier labs. First-tier labs have better margins, but second-tier labs are sold out, and even third-tier labs are probably close to sold out.

The economic value of the strongest models is growing faster than infrastructure's ability to deliver those tokens to people. So this gap will keep widening. The model labs' profitability will also keep growing — until people in the hardware and infrastructure supply chains react: wait, why don't I just raise my own margins?

Patrick O'Shaughtnessy:
So your judgment on the demand side — especially your own SemiAnalysis example — is that it's completely explosive. And more broadly, as people enter what you call "AI psychosis" and feel what they can do, the difficulties almost completely disappear — as happened to me. In a few weeks my own token spending has soared.

That sounds like a pretty solid demand-side judgment. Is there anything we're missing on the demand side? You've said that if you don't use more tokens, you'll never escape the permanent underclass. Can you expand on that?

In other words, either you use more tokens and create extra economic value through them — or not. Many people are bored and lazy. They'll think, "I'll work one hour a day instead of eight, and AI will do most of my work."

Dylan Patel:
That's the boring path. The cooler path is: I still work eight hours a day, but I do eight times the work and maybe make five times the money. Maybe not five times, but it should work that way.

Of course, if you only have one job, that's hard. Some people work multiple jobs at once; some people start companies and start selling. You need to capture AI's economic value before everyone uses it and it becomes the industry baseline — because it isn't the baseline yet. If you don't use more tokens, if you don't create value from those tokens and capture it, you can't escape the permanent underclass.

There are three distinct steps here: first, use more tokens; second, create value from those tokens; third, capture the value you create from them. If you can't do all three, then as model capabilities keep surging and resources grow more concentrated, you'll never escape the permanent underclass.

Patrick O'Shaughtnessy:
Okay, let's talk about the supply side. What's actually happening there? If the demand curve is rising, what's changing across the entire supply chain to serve all these tokens?

Dylan Patel:
As demand surges, everything on the supply side is rising. Prices are going up, whether for NVIDIA GPUs or anything else. And at the same time, useful lives are being extended.

Look at the price trend for the H100. People used to argue that GPUs had useful lives of under five years, which was total nonsense. Some Hopper clusters from three or four years ago are now re-contracting for another three or four years; some A100 clusters are also renewing contracts for the coming years.

So the useful life of a GPU is clearly not five years — maybe even seven or eight. We don't know yet; wait until Hopper really reaches that stage. But obviously it's not five. And at renewal, prices are rising.

That means the gross margin on a cluster is actually not 35 percent but higher. Cloud profits are expanding. The hardware layer has very healthy margins, and NVIDIA is still charging roughly a 75% gross margin. Further down the supply chain, the memory chain's margins have clearly risen significantly. Optical modules, logic chips and similar areas have also made big gains, with margins slowly rising.
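The effect of a longer service life on cluster economics can be sketched with hypothetical numbers (none of these figures come from the conversation):

```python
def cluster_gross_margin(capex, annual_rental, annual_opex, years_in_service):
    """Lifetime gross margin of a GPU cluster: revenue minus
    (capex + operating cost), as a fraction of revenue.
    All inputs are hypothetical illustrations."""
    revenue = annual_rental * years_in_service
    cost = capex + annual_opex * years_in_service
    return (revenue - cost) / revenue

# Hypothetical cluster: $100M capex, $40M/yr rental, $8M/yr opex.
five_year = cluster_gross_margin(100, 40, 8, 5)   # 0.30
seven_year = cluster_gross_margin(100, 40, 8, 7)  # ~0.44
```

With the same capex and rental rate, simply keeping the hardware earning for seven years instead of five lifts the lifetime margin from 30% to roughly 44% in this toy model — which is the mechanism behind "not 35 percent but higher."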

More importantly, chipmakers like NVIDIA are paying huge prepayments. So even if a supplier's gross margin doesn't rise significantly, from a cost-of-capital and cash-flow standpoint, its return on invested capital is rising.

You can see this across the supply chain. ASML is completely sold out and needs Carl Zeiss to expand faster. All along the chain, each link is either sold out with rising margins, or receiving prepayments — which raises return on invested capital, because less has to be invested up front.

It's a consistent trend across the supply chain. Even PCBs are like this: PCB manufacturing requires copper foil, which is sold out, and people are starting to prepay for copper foil.

You could say that as long as something has a pulse — as long as it's in this supply chain and sold out — people will fight for incremental supply, and for supply in the years ahead.

The compute shortage is transmitted through the entire supply chain

Dylan Patel:
Supply chains usually react quickly. But this time there's something unique: today's supply chain is more complex than ever, and what we're building is more complex than ever, so delivery cycles are longer. It's not that other industries never had 18-month lead times — it's that this time, building the new supply itself takes years.

Take memory. Memory capacity can only grow by a low double-digit percentage per year — say 20%, 30%. NAND even lower, DRAM slightly higher. Even though demand signals were strong by the end of 2025 and memory companies responded immediately, real new capacity won't arrive.

Beyond the 20-30% growth that would have happened each year anyway, they can certainly squeeze out a bit more output. But real new supply doesn't arrive until 2028 — maybe late 2027 at the earliest, but probably 2028. That's what's unique: even if they want to expand production as fast as possible, the supply won't arrive right away.

As a result, memory prices have risen. And I'm telling you — DRAM especially — prices will at least double, or even triple.

Some will say, "the memory story is played out; everyone gets it now." But no — you don't really get it. DRAM could still double or triple from here, because that's how much it takes. They have to seize capacity from elsewhere, and in a capitalist economy the only way to seize capacity from elsewhere is to destroy part of demand with higher prices. We're not in a rationing system, so it's bound to happen. Margins will keep rising.

I think logic chips also have a huge capacity problem. TSMC just reported, and they keep raising capital spending. But building fabs takes a long time. They're doing everything they can to extract more output from each existing fab. Yet TSMC hasn't raised prices sharply, because they're "nice guys" — their price increases are probably only single digits, not the triple-digit increases memory makers are getting.

So eventually you see a market where TSMC is a great company — but will it really capture all that value? Not necessarily.

I just mentioned some of these — like the copper foil, glass and lasers that PCBs need. These are relatively well-understood but very niche supply chains, and they're also very tight. Looking further upstream, semiconductor wafer-fab equipment is widely agreed to have grown significantly, yet the market still seriously underestimates its importance.

TSMC's capital expenditure this year is $56 billion. We entered January projecting $57.4 billion, and it will likely end up slightly higher still, since we see several ways capital spending could increase.

But no one is really asking: what does that mean for next year? What does it mean for the year after?

So within three years, TSMC could raise capital spending to $100 billion. Maybe even in two years — by 2028 — they could really be spending $100 billion. I'm serious: TSMC's capex in 2028 could be $100 billion.

Many people can't imagine that. But what does it mean for the downstream supply chain — for companies like Lam Research, Applied Materials, ASML? What does it mean for companies even further down the chain, like MKS Instruments?

The bullwhip effect gets amplified further up the chain.

Note: the "bullwhip effect" refers to amplification along a supply chain. At the bottom, AI demand looks like nothing more than surging token usage; but as it propagates upstream, each layer magnifies it, turning it into far more dramatic expansion, price increases and capacity grabs.
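The amplification the note describes can be shown with a toy model in which each tier orders enough to cover the demand it observes plus a safety buffer proportional to the demand increase (all numbers hypothetical, not calibrated to any real data):

```python
def tier_orders(demand, buffer=0.5):
    """Orders placed by one supply-chain tier: cover current demand
    plus a safety buffer proportional to the demand increase.
    A toy bullwhip model for illustration only."""
    orders = [demand[0]]
    for prev, cur in zip(demand, demand[1:]):
        orders.append(cur + buffer * (cur - prev))
    return orders

# Hypothetical token demand at the bottom tier: +60% over the window.
demand = [100, 110, 130, 160]
tier1 = tier_orders(demand)   # e.g. chip buyers:      [100, 115, 140, 175]
tier2 = tier_orders(tier1)    # e.g. equipment makers: growth > 90%
```

A 60% rise in end demand shows up two tiers upstream as a >90% rise in orders — each layer's safety buffer compounds, which is why a token-usage surge turns into "more exaggerated" expansion at the equipment level.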

If TSMC really does spend $100 billion on capex in 2028 — and I think it's possible, though many would call it crazy — it could really happen.

Patrick O'Shaughtnessy:
What about the rest of the chip ecosystem? GPUs have been absolutely dominant. But are CPUs, ASICs or other things emerging as new opportunities and bottlenecks? Beyond NVIDIA's GPU dominance, what else should we watch?

Dylan Patel:
Yes, ASICs are obviously taking off. But first let me step away from AI chips themselves and talk about something else. We did a project on FPGAs, and it turns out every next-generation AI rack will need 120 FPGAs. So what does that mean for all the FPGA companies?

The same goes for CPUs. All these reinforcement-learning environments — plus the "slop code" you and I generate — are now all running on a Vercel instance, an AWS instance, or some cloud resource we spin up. All of it needs CPUs. So CPUs are now completely sold out, and demand is rising fast.

Patrick O'Shaughtnessy:
Let's dig into the role the CPU plays in the system.

Dylan Patel:
There are two main reasons you need a lot of CPUs.

First, reinforcement learning. When you're doing reinforcement learning, CPUs are crucial.

In the past you'd throw the entire internet's data into the model and the model would spit out results. Now you still put internet data into the model, but then you put the model into an environment and say, "go on, try it." The model tries lots of different things, and finally the environment evaluates how successful each attempt was and scores it. These environments can be anything. They can be simple — say, checking whether the output text matches the correct format, or whether structured output is valid. Or they can be very complicated.

Now people are moving into very complex scenarios. For example: I want you to open this file, modify it, edit it, update it, then submit it to a website. Or, "I want you to open Siemens' physics simulation software and edit this CAD model." So the environments are getting more complex. And these environments run on CPUs — not on GPUs, not on ASICs.

The ASIC or GPU is responsible for running the model itself: taking input from the environment, feeding it to the model, and generating different output paths — different ways the model thinks it can solve the problem. Those paths are then evaluated and scored. The successful paths are used to keep training the model, the model gets updated, and the loop repeats. So that's the first place CPUs are very useful.
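A minimal sketch of the environment side of this loop — here just a format check on structured output, one of the simple cases mentioned above. The function names are illustrative, not any lab's actual API:

```python
import json

def reward(output: str) -> float:
    """Environment-side check (this is the part that runs on CPU):
    reward 1.0 if the model's output is valid JSON containing the
    required key, else 0.0. A hypothetical toy environment."""
    try:
        return 1.0 if "answer" in json.loads(output) else 0.0
    except json.JSONDecodeError:
        return 0.0

def filter_rollouts(rollouts):
    """Keep only rollouts the environment scored as successful;
    these are what would be fed back to update the model."""
    return [r for r in rollouts if reward(r) == 1.0]

# The model (on GPU/ASIC) proposes several output paths;
# the environment scores them here.
rollouts = ['{"answer": 42}', 'not json', '{"other": 1}']
kept = filter_rollouts(rollouts)  # only the valid, well-formed path
```

Even this trivial grader is ordinary CPU work — parsing, checking, scoring — and the complex environments (file editing, CAD software) are just much heavier versions of the same CPU-bound job.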

The second place is deployment.

When you have these powerful models and you deploy them, they generate code and all kinds of useful output. But that output doesn't go straight from the GPU into a human brain. It comes out of the GPU or ASIC into an application you've deployed — and that application itself usually runs on CPUs.

So that's another area of huge demand. CPUs are largely sold out.

AI's value is hard to capture in GDP statistics

Patrick O'Shaughtnessy:
As you continuously assess supply and demand trends — trying to be the most knowledgeable person in the world on both — what do you want to know but don't yet?

Dylan Patel:
I think the hardest part for us — for everyone — is tokenomics, token economics. The cost of running the infrastructure, the cost of tokens, the cost of models, the labs' margins — we have a very good handle on all of that. What's really hard to model is usage and the speed of adoption.

In January we made some very aggressive predictions for February, and Anthropic easily exceeded them. How do we calibrate that model? What data sources should we use? In February we made a very aggressive assumption for March, and they beat it again. When you see the $10 billion revenue figure, the reaction is: what is this? How did they actually add $10 billion of revenue? Who's using these tokens, and why? What are they building with them? And more importantly, how does what they build with these tokens propagate through the economy? How much value has it created?

That's not something GDP statistics easily capture. For example, all the value I create using tokens eventually turns into better information. Then I sell that information — at a lower price than anyone could have in the past.

That information then enters the entire economic system, letting people make better investment decisions or better competitive decisions — semiconductor companies, data-center companies, hyperscalers. So what's the value of that information? What's its impact on the economy?

By any subjective indicator, it's clearly astonishing. But the question is: where's the ghost GDP — the phantom GDP? What is it? How do we track the real economic value?

Because existing GDP indicators aren't accurate. If you ask, "Dylan Patel, how much GDP have you created?", the number will be very small — out of all proportion to the value I think I've actually produced.

So the final question is: how much value have these tokens created? Not just the direct revenue, but the ripple effects they bring. What are the downstream consequences of everything they do?

I think that's the real problem — the hardest thing to measure. We have a very good handle on the supply side, and very good reads on many demand-side signals. But it's hard to quantify what value these tokens have created. I hope we can redo this every three months, because it's moving too fast.

Anti-AI protests, probably within three months

Patrick O'Shaughtnessy:
What do you think happens next? I'll see you in San Francisco in three months. What do you expect?

Dylan Patel:
Mass protests.

Patrick O'Shaughtnessy:
A protest against AI.

Dylan Patel:
People hate AI. AI now polls even worse than ICE, worse than politicians. I don't know how Pew ran the survey, but apparently AI is less popular than politicians.

As Anthropic adds this much revenue, it starts triggering business changes downstream. People are going to get scared of AI. They'll start blaming AI for more and more of their problems — and for many of the world's long-standing, deep-rooted problems.

Those problems will surface and be attributed to AI. It's likely that some politicians, or people on social media — influencers — will start weaponizing this anti-AI sentiment to attack others.

Look at the comments under some news articles. Sam Altman's house had a Molotov cocktail thrown at it twice in two weeks, and people in the comment sections were cheering. This is just the beginning. So I think within three months we'll see massive protests against AI.

Patrick O'Shaughtnessy:
What's the countervailing force? How should the AI industry get ahead of this?

Dylan Patel:
First, Sam Altman and Dario should stop doing interviews. They're too unlikeable. I don't know what they're doing — every interview makes ordinary people hate them more. Sam Altman going on Tucker Carlson, for example, probably made every Republican hate OpenAI more. Dario is the same. They're genuinely not likeable. That's the first point.

Second, they need to start showing the positive, inspiring things AI can do.

Third, they need to stop talking about how "AI capabilities will change the world." People can only be afraid when they hear that — especially when they have no real connection to the technology.

Patrick O'Shaughtnessy:
They don't know how to use it.

Dylan Patel:
And they're not connected to it. Ordinary people don't know any Anthropic or OpenAI employees. They don't know who these people are or what they're aiming for. They just see these companies as some kind of shadowy clique: thousands of people gathered in one company to change the world, automate all the jobs and destroy society. That's what many people see.

On top of that, these companies are financing and promoting the construction of huge numbers of data centers and power plants, which in the public's eyes pollute the world. People don't really understand what's happening. So these companies have to stop talking about the great transformation coming in the future and focus on the present: how AI is making a positive difference right now. I think that requires a huge organizational and brand overhaul.

Patrick O'Shaughtnessy:
I love having these conversations with you. Thank you for your time.

Dylan Patel:
Great, thanks.

[Chuckles]
