Three years later: back in 2023, my judgment on ChatGPT

2026/06/01 01:10

Author: Wang Jian-seok


On March 6, 2023, when ChatGPT had just come out and GPT-4 had not yet been released, Sarah and I recorded an interview about ChatGPT for Traders' Talk, in "The Big White Story Series." It was released as a ChatGPT podcast episode, and you are welcome to listen.

When ChatGPT came out, very few people really understood it. The interview ran three hours and was, at the time, "in first place in the universe." I threw out 20 or so judgments and predictions in one breath, all based on intuition and limited information, with little data. The full verbatim transcript of the interview is still on my public account.

Now it is the end of May 2026, three years later, and AI has developed far beyond what it was back then.

I would like to do one thing: take those 20 judgments out one by one and reconcile them objectively against the latest data available today. See what the world has become in three years, and see where I was right and where I was wrong three years ago.

To be as neutral as possible, I handed this reconciliation over to AI: I fed the verbatim transcript of that year's interview into a workflow in which 41 Opus 4.8 agents were programmed to break down the twenty judgments, retrieve updated data online, cross-check it, and score the Wang of three years ago. The group of agents spent about 20 minutes and burned 1.4 million tokens (approximately $35) to produce the following report. The verdicts come from them, not me. The reference date is May 2026.

I. Scoreboard

Verdict symbols: ✅ Correct... Essentially Correct... Partially Correct... Error

Overall, most of Wang Jian-seok's directional calls held up, and there was only one really hard miss: passing along the claim that GPT-4 had 100T parameters. But the devil hides in the details: behind almost every "right" trails a detail that was wrong. None of the 20 items is purely "still uncertain"; three years is long enough that most things now lean toward an answer. Let's go through them in groups.

II. What he got right

The common denominator of this group is that the direction, the mechanism, and even the timing of Wang Jian-seok's judgments held up, and the errors lie only in "degree" and "absolute wording."

RAG and the search architecture (views 2 and 3)

> In 2023 Wang Jian-seok said: the mainstream way to fix knowledge gaps and hallucinations is not to change the model, but to use vector search to hand the model the relevant knowledge as a "cheat sheet"; the right architecture is for a search engine to retrieve results and feed them to the LLM.

Today this is standard in nearly all AI products. RAG has become the default architecture for enterprise AI, and OpenAI, Google, and Anthropic have made it a platform-level capability; ChatGPT Search literally works by "retrieving from the Bing index, feeding the results to GPT, and generating answers with citations." Google AI Overviews uses grounding to serve about 2 billion monthly active users, and one company built solely on this architecture has grown to about $20 billion.

At a time when GPT-4 had not been published and the industry consensus was "fine-tuning," his bet was "leave the model parameters alone, bolt on external retrieval," and both the mechanism and the timing were right.

To be honest about the limits: he imagined a "static, one-shot retrieval," and reality is more complex, with longer context windows, GraphRAG, and the like. The "RAG is dead" argument of 2026 actually shows that the general direction is not dead: what died is only naive one-shot retrieval, and the conclusion is an upgrade to hybrid retrieval rather than a return to model parameters. One more thing: the term RAG came from a 2020 Meta paper, so he did not coin it; he just bet, within that window, that it would become mainstream.
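
The "retrieve, then feed the results to the LLM" structure can be sketched in a few lines. This is only a toy illustration under stated assumptions: `DOCS` is a hypothetical three-line knowledge base, bag-of-words cosine similarity stands in for a real embedding model, and the final call to a model is left out entirely.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; a real system would index far more text.
DOCS = [
    "GPT-3 has 175B parameters.",
    "RAG retrieves documents and feeds them to the language model as context.",
    "MCP is a protocol for connecting agents to tools.",
]

def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words; a stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    return sorted(docs, key=lambda d: cosine(q, tokenize(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str], k: int = 1) -> str:
    """Paste the retrieved notes in front of the question (the "cheat sheet")."""
    notes = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Answer using only these notes:\n{notes}\n\nQuestion: {query}"
```

The model's parameters never change here; only the prompt does, which is the whole point of the bet described above.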

LUI is the new continent

> In 2023 Wang Jian-seok said: the greatest thing about ChatGPT is not AIGC but the opening of the LUI (natural language user interface), which reconstructs human-computer interaction as the GUI once did and creates a new industry much larger than the large model itself.

"New continent" has almost entirely come true. Natural language has become a dominant mass-market interaction layer (ChatGPT has about 900 million weekly active users) and has spawned an independent new industry: agents, coding agents, and a whole family of protocols. The most specific phrase, "greater than the model itself," was strongly confirmed: the MCP protocol became the "operating system standard" of the LUI era, was fully adopted by OpenAI, Google, and Microsoft in 2025, and was transferred to the Linux Foundation at the end of that year; Claude Code, a single product, reached about $2.5 billion in annualized revenue.

But he used the strong wording "reconstruct, replace the GUI," and three years later what we see is coexistence, not substitution. Three kinds of counter-evidence are hard to dismiss: the MIT report shows that 95% of enterprise GenAI pilots have no measurable ROI; computer-use agents that operate the interface directly score about 78% with the top model on the benchmark, only just touching the human baseline; and screenless voice hardware has been almost completely wiped out (the Humane Pin was permanently discontinued in 2025). More precisely, LUI is a new interaction layer superimposed on the GUI.

Robot network and new addressing (view 9)

> In 2023 Wang Jian-seok said: over the next decade or so there will be a "robot network," in which agents automatically handshake and call each other in natural language, with no more need for traditional APIs; a new addressing system, like domain names, will emerge. This set of things "can be finished in two or three years."

The direction is strikingly accurate. MCP and A2A (donated to the Linux Foundation, supported by over 150 organizations) solve agent interoperation; Agent Network Protocol builds "decentralized agent addressing" directly on W3C DID and aims at "collaboration networks of billions of agents," which closely matches the "new domain name system."

Two corrections are needed. First, "APIs no longer needed" is wrong: underneath, the mainstream protocols are structured schemas, essentially a layer of standards on top of APIs. Second, "finished in two or three years" did not hold: Gartner data indicates that only about 17% of organizations had actually deployed agents as of 2026. Interestingly, his wording actually had two layers: "two or three years" to take shape, "about ten years" to mature. The timing for taking shape was spot on; the maturity cycle is on the order of ten years. Viewed as two separate layers, this prediction is of higher quality than it first appears.
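
To make "structured schema underneath" concrete, here is a minimal sketch of what an agent-to-tool call looks like once the natural language is stripped away. Everything in it is an assumption for illustration: `WEATHER_TOOL`, its field names, and `validate_call` are hypothetical and do not reproduce any protocol's actual specification.

```python
# Hypothetical tool description, loosely in the spirit of schema-based agent
# protocols. The field names are illustrative, not taken from any real spec.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "input_schema": {
        "required": ["city"],
        "properties": {"city": str, "units": str},
    },
}

def validate_call(tool: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the call fits the schema."""
    schema = tool["input_schema"]
    problems = []
    # Every required field must be present.
    for field in schema["required"]:
        if field not in arguments:
            problems.append(f"missing required field: {field}")
    # Every supplied field must be declared and have the declared type.
    for field, value in arguments.items():
        expected = schema["properties"].get(field)
        if expected is None:
            problems.append(f"unknown field: {field}")
        elif not isinstance(value, expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    return problems
```

The agent may phrase its intent in natural language, but the call that actually crosses the wire still has to pass a check like this, which is why these protocols read as standards layered over APIs rather than a replacement for them.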

China will certainly be able to build large models

> In 2023 Wang Jian-seok said: China will be able to build large models, and the gap with the frontier will close quickly, in about three years (like Red Flag Browser chasing Netscape).

This timeline fits unexpectedly well. The Stanford 2026 AI Index measured the benchmark gap between the top Chinese and American models narrowing from 17.5 to 31.6 percentage points in May 2023 to 2.7%; and US private AI investment is about 23 times China's, so the gap was bridged with far smaller inputs. DeepSeek, Qwen, Kimi, and GLM became global mainstream models, with the open-source ecosystem even taking the lead.

But the word "quickly" was optimistic: reaching real maturity took about 14 months, not a few months. And it is a matter of catching up, not overtaking: as of early 2026, no Chinese model had exceeded OpenAI o3. He was clearly wrong in judging that "the door will not close": in July 2024 OpenAI itself cut off API access for China, so the door was shut from the supply side. The famous names he pointed to fell behind instead, and the ones that truly made it were DeepSeek, Doubao, and Qwen.

No consciousness; the Turing test only measures appearance

> In 2023 Wang Jian-seok said: ChatGPT has no consciousness; it is a case of "the speaker has no intent, the listener reads one in," and the Turing test itself only measures "whether it makes you think it does," not whether it really does.

The core point about what the test "measures" is solid, and it has been confirmed experimentally: in a 2025 Turing test at UC San Diego, GPT-4.5, when prompted to "play a human," was judged human 73 percent of the time, more often than real people, yet this was purely performance skill. It is the best footnote to "it only measures whether you think it is."

But the absolute claim that "the machine must be unconscious" has been pushed into a gray zone over three years. Anthropic set up a model welfare program, and Claude was given the ability to "actively end abusive conversations." These turn "never" into "low probability, but not ruled out." Still, they rest on "possible, hypothetical" rather than "proven," so the core claim was not overturned; the wording was just too absolute.

The rest that were right (views 6, 11, 12, 16, 18, 19)

  • Not AGI, but a big step
    : Both halves hold. Altman himself said in the GPT-5 era that it was "not AGI, lacking continuous learning," while with IMO gold medals and ARC-AGI going from nearly zero to 85%, "a big step" is uncontroversial.
  • No wave of unemployment
    : In April 2026 the US unemployment rate was only 4.3 percent. The blind spot is in the distribution: a Stanford study shows that it is young workers aged 22-25 who are being squeezed off the first rung of the career ladder, and for them the "absorption" mechanism has failed.
  • The world won't be flooded by AI content
    : The net-benefit direction is right, but he seriously underestimated the scale: AI already accounts for about 52% of new web pages, and "AI slop" became a word of the year.
  • Year one of entrepreneurship
    : The wave was right: xAI (founded in March 2023) has reached 230 billion. But he pinned the great companies to 2023, whereas the truly trillion-scale OpenAI and Anthropic were founded earlier.
  • The 1994 browser moment
    : The analogy holds: in 2025 OpenAI actually launched the Atlas browser, turning the metaphor into literal reality. It is just that ChatGPT spread faster than the browser did, so the metaphor was, if anything, too conservative.
  • Adding facts to the prompt reduces hallucination
    : The direction is confirmed: the hallucination rate of GPT-5 without web access is 47%, which conversely shows that "facts" are the key variable. He only underestimated that the root cause lies in training incentives, not in the prompt.

III. What he got wrong

GPT-4 has 100T parameters (view 4) - completely wrong

> In 2023 Wang Jian-seok said: (relaying a rumor) GPT-4 has 100T parameters, approximately 600 times the 175B of GPT-3.

Both numbers are wrong. GPT-3 is 175B; the best estimate for GPT-4, from the July 2023 leak, is about 1.8T, a mixture of 16 experts (MoE), only about 10 times GPT-3. 100T differs from the actual figure by about 55 times. The only source of "100T" is a second-hand retelling of a 2021 remark by the Cerebras CEO, which Sam Altman had already dismissed in January 2023 as "complete bullshit."

He did flag it as hearsay and retained some uncertainty. At a deeper level, the "multiply the parameters" framing itself is out of date: OpenAI simply does not publish parameter counts for the later GPT-4.5 and GPT-5. It is the only hard error, wrong in both its numbers and its outdated perspective.
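
The multiples quoted above are easy to check by hand. The sketch below uses only the three figures from the text (175B, the rumored 100T, and the leaked estimate of about 1.8T); the variable names are mine.

```python
# Figures as quoted in the text; the 1.8T value is a leak-based estimate.
GPT3_PARAMS = 175e9           # 175B
RUMORED_GPT4_PARAMS = 100e12  # the 100T rumor
LEAKED_GPT4_PARAMS = 1.8e12   # about 1.8T

rumor_vs_gpt3 = RUMORED_GPT4_PARAMS / GPT3_PARAMS        # the "600 times" claim
leak_vs_gpt3 = LEAKED_GPT4_PARAMS / GPT3_PARAMS          # the "about 10 times"
rumor_vs_leak = RUMORED_GPT4_PARAMS / LEAKED_GPT4_PARAMS  # the "about 55 times"

print(f"rumor / GPT-3: {rumor_vs_gpt3:.0f}x")
print(f"leak  / GPT-3: {leak_vs_gpt3:.1f}x")
print(f"rumor / leak : {rumor_vs_leak:.1f}x")
```

Note that even on the rumor's own terms, 100T over 175B comes to roughly 571 times, so "about 600 times" was already a rounding up.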

LLM mathematics (view 1) - correct diagnosis, wrong capping conclusion

> In 2023 Wang Jian-seok said: the weakness of LLMs at mathematics is fundamental; it is neither possible nor necessary for them to learn mathematics themselves, and the right approach is external tools.

The "diagnosis plus tool route" is entirely right: the root cause is token-by-token generation, which makes place-value handling unreliable (a 2025 mechanism paper precisely confirmed the pattern of "exact on the last digit, intuition for the middle"), and external tools deliver the lift (o4-mini with Python reaches 99.5% on AIME 2025).

The error is in the "impossible, unnecessary" capping. "Impossible" has been falsified: in July 2025, Gemini Deep Think and an OpenAI model won IMO gold medals in pure natural language, without tools. The key turning point is the "reasoning model," which emerged only in 2024-2025 and was not foreseeable in March 2023, so the prediction should be read with tolerance, not harshness.
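
The "external tool" route he recommended amounts to this: the model emits an arithmetic expression as text, and a deterministic program computes the answer. A minimal sketch, assuming the expression arrives as a plain string; `calculate` is a hypothetical helper, not any vendor's tool API.

```python
import ast
import operator

# Only these arithmetic operators are allowed; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def calculate(expression: str):
    """Evaluate a plain arithmetic expression exactly, without using eval()."""
    def walk(node):
        if isinstance(node, ast.Constant):
            # Accept numbers only (bool is a subclass of int, so exclude it).
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
            raise ValueError("unsupported constant")
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)
```

For example, `calculate("123456789 * 987654321")` returns the exact integer product, digit for digit, which is exactly where token-by-token generation tends to slip.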

Value capture - half right, the core call reversed

> In 2023 Wang Jian-seok said: value ultimately settles at the application layer, and companies that build the base layer (the model makers) will not necessarily make money in the end.

Money did start to flow to the application layer (Cursor reached 2 billion in annualized revenue within three years), so that half is right. But "the base layer does not make money" was directly refuted by Nvidia: FY2026 net income of about $120 billion and a market value above $5 trillion make it the only clear large-scale profit-maker in the whole market. The model layer he had in mind (OpenAI lost about $14 billion in 2026) is in fact what most resembles his "base layer that burns money without making any."

He did not distinguish between the "compute" layer and the "model" layer, or between "revenue" and "profit." Value in 2026 is captured even more extremely by the compute layer than in 2023, rather than shifting to the application layer. Incidentally, the ones overbuilding are the cloud providers that buy the chips, not Nvidia, which sells them. This is precisely where his "railway overbuilding" analogy goes wrong.

Copyright (view 14) - right on registration, wrong on circumvention

> In 2023 Wang Jian-seok said: AI-generated content may circumvent copyright (which protects expression, not ideas); such creations may neither infringe nor be registrable.

"Cannot be registered" became settled law (in 2025 the United States Copyright Office stated explicitly that merely entering a prompt is not enough to claim authorship). But "avoiding infringement" was clearly wrong: courts have repeatedly held that AI output still infringes if it is substantially similar to the original, and Anthropic's $1.5 billion settlement over a pirated corpus is the largest copyright payout in the history of the United States. Far from circumventing copyright, AI paid the biggest price in history.

World homogenization - mechanism right, trend reversed

> In 2023 Wang Jian-seok said: ChatGPT turns the human perspective into a "weighted average," which would counter the Douyin-style information cocoon and open up the possibility of "one shared world."

At the level of mechanism he was right: several 2025 studies confirmed that LLMs lean toward majority views and systematically underrepresent minorities. But the social judgment is reversed: his added qualifier, "at least for now it is not a thousand faces for a thousand people," was overturned within three years. By April 2025 OpenAI had turned cross-conversation memory and personalization into a default capability, and AI is heading for a thousand faces. More crucially, he imagined the "weighted average" as a neutral common denominator of the world's views, but measurements show it is skewed in a particular direction and powerful enough to be used to manipulate positions, which points to "making new cocoons" rather than "diluting polarization."

Local war and cost (view 17) - qualitative calls all right, quantitative call falsified

> In 2023 Wang Jian-seok said: large models will quickly turn into a "local war"; the cost is known (capped at about $5-1 billion), and many players will enter.

The qualitative direction is striking: large numbers of players poured in, commercialization was rapid, and open source closed the gap, all of it confirmed. But the hard number, "capped at 5-1 billion," is wrong at both ends: the frontier end is severely underestimated (GPT-5-level training at 200-500 million dollars in 2026, plus data centers costing hundreds of billions and the $500 billion Stargate); the low end is overestimated (DeepSeek put marginal training cost at the million-dollar level). The "cost" of the same model can differ by 200 times depending on how it is measured, and it falls outside the range he gave.

Emergent capabilities (view 5) - right direction, wrong numbers and framing

> In 2023 Wang Jian-seok said: above approximately 60B parameters, new capabilities emerge in language models that researchers cannot explain.

The intuitive direction holds, but two formulations do not stand. First, there is no single "60B threshold": the actual threshold for chain-of-thought is about 100B, and different abilities emerge at scales ranging from 13B to 540B. Second, "unexplainable" was challenged by a NeurIPS outstanding paper at the end of 2023: many of the "abrupt changes" are artifacts of the choice of metric, and behind them continuous metrics show a smooth, predictable curve. To be fair, what he repeated that year was the absolute mainstream narrative; what really needs correcting is treating "60B" as a hard threshold and "unexplainable" as a settled conclusion.
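
The metric-artifact argument can be illustrated with a toy calculation. This is a sketch under simplifying assumptions, not the paper's actual experiment: the per-token accuracies are hypothetical, and tokens are assumed to be independent, so an all-or-nothing exact-match score on an answer of length L is simply the per-token accuracy raised to the power L.

```python
def exact_match_rate(per_token_accuracy: float, answer_length: int) -> float:
    """Chance that every token of an answer is right, assuming independent tokens."""
    return per_token_accuracy ** answer_length

# Hypothetical per-token accuracies that improve smoothly as models get larger.
per_token = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95]

# The same models scored with an all-or-nothing metric on 10-token answers.
exact = [exact_match_rate(p, answer_length=10) for p in per_token]

for p, e in zip(per_token, exact):
    print(f"per-token {p:.2f} -> exact match {e:.4f}")
```

The underlying skill rises in even steps, yet the exact-match column sits near zero for the first few models and then shoots up. Read off the second column alone, that looks like a sudden "emergence" at some size threshold.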

IV. Looking back over three years: a few patterns

After the item-by-item reconciliation, stepping back, Wang Jian-seok's 20 judgments contain several patterns more worth remembering than any single item.

i. Direction is far more reliable than numbers and degrees. Of the 20 items, almost all of those that judged mechanism and direction (RAG, LUI, robot network, Turing test) were right, and almost all of those that gave an exact number or a capping phrase (100T parameters, 60B threshold, 5-1 billion cost, math "impossible") were wrong. In fast-changing fields, bet on direction and mechanism, bet less on precise numbers, and beware of words such as "impossible, certainly, capped, never": they are the high-risk zone where time strikes hardest.

ii. On timing, he tends to overestimate speed and underestimate magnitude. When he said "quick, two or three years," maturity generally came slower; but the ceiling of capability leaps was underestimated: math went from "impossible" to IMO gold medals, and frontier costs rose to levels unimaginable that year. In one sentence: too optimistic in the short term, too conservative in the long term.

iii. The most hidden mistakes recur in distribution. Not wrong in direction, but wrong in conflating aggregates with distributions. "No wave of unemployment" is true in aggregate, but the harm is highly concentrated among young newcomers; "value settles at the application layer" is half right, but fails to distinguish the compute layer from the model layer. A correct aggregate that masks a distributional disaster: this is the lesson most worth keeping.

iv. Where the wording left room, the judgment stood the test three years later. "At least for now," "greatly reduced, not eliminated," "two or three years to take shape, about ten years to mature": the judgments that came with qualifiers and layers stand up even better today. The absolute sentences, by contrast, were the easiest to overturn. The honesty of a prediction lies in daring to say where the uncertainty is.

v. Some questions are not settled in three years. Who the value ultimately belongs to, whether the truth is changing, whether the machine is conscious, whether long context will eat RAG: these debates from that year are still contested in 2026. Distinguishing the questions that already have answers from those that must wait matters more than rushing to conclusions about everything.

Three years ago, in the fog before GPT-4 had come out, Wang Jian-seok pointed by intuition in 20 directions. Perhaps what is most worth remembering today is that getting the big direction right is not the hard part; the hard part is admitting where you were wrong last time about numbers, speed, and distribution. These 20 accounts are less a scorecard on the past than a set of rules for the next three years. In three more years, in 2029, let's reconcile once more.
