Original title: How I'd Become a Quant If I Had to Start Over Tomorrow

original by: gemchanger, founder

Translation and commentary: Mr. Ryan Chi, insiders.bot

In 2026, quantitative trading is a foundation for every trader

Last week I was invited by the University of Hong Kong's Association of Artificial Intelligence and Management (@camo hku) to share ways of making money in the AI age. My biggest takeaway from the whole experience is this:

AI AGE = AGE OF TECHNOLOGY PARITY

In the past, quantitative trading was the exclusive preserve of a small number of institutions. Today, numerous studios and even individuals build quantitative strategies and reap sustained gains. In other words, if you do not understand the nature of quantitative trading, you will be at a great disadvantage in the market.

In today's OpenClaw era, anyone can make money with quantitative trading. But this rests on two prerequisites.

First, infrastructure. That is exactly what we are building at @inindersdotbot: databases and Skills, delivered through Agent and algorithmic trading platforms. The official version, built around Agent backtesting, will be part of this ecosystem.

Second, and most important for individuals, is the ability to design architectures and strategies. A strategy doesn't need 100% accuracy, but it has to be unique and clever, capable of capturing big opportunities that others don't know about.

As long as you have your own strategy, plus solid underlying infrastructure, you're not far from financial freedom with the power of Vibe Coding.

When it comes to learning strategy and architecture, the article by @gemchange ltd is the most complete "quantitative trading knowledge map" I have ever seen. It uses prediction markets as the entry point, and it lays out every puzzle piece required to become a top quantitative trader (Quant) in the right learning order.

After reading it, even a complete beginner will know how to get started with quantitative trading and how to design a strategy of their own.

If you're a prediction market trader, this is the article you have to read.

If you trade other assets, many of the ideas in this article carry over, and you will still get a lot out of it.

The original text is very hard-core and academic. I've done a lot of rewriting and added material so that it is clear to any user who has just come across Polymarket, even without any mathematical background. I assume you know nothing about advanced mathematics, I've added 20 Chinese-language graphics, and I use plain language, everyday analogies and practical examples to break down every concept.

If you want to make money in prediction markets over the long run instead of being a gambler, this article is your starting point.

By the way, this article has been structured to be Agent-friendly, just as the insiders.bot platform is optimized for both human and AI traders. So you're welcome to feed this article to your OpenClaw, Manus, Claude, or any other AI, and start building your quantitative model immediately.

Preamble: Are you trading or gambling?

Let me ask you a question.

You see a contract on Polymarket for "Trump wins the election" at $0.52. You think he's more likely to win, so you spend $520 to buy 1,000 YES shares.

You think you're trading. But actually, you're just gambling, because you haven't answered these questions:

* How did you arrive at that 52%?

* Do you have better sources of information than the other participants in the market?

* How should your probability estimate be updated if a news story breaks tomorrow?

* How many shares should you buy so that you don't blow up if you're wrong?

These questions cannot be answered by gut feeling. They need math.

In 2025, entry-level quant compensation at top quantitative firms (Jane Street, Citadel, HRT) ranged from $300K to $500K per year. Financial-sector hiring for AI and machine learning increased by 88 percent over the same period. It's not because these companies like mathematicians. It's because math really does make money, through more accurate valuation models.

Polymarket, for its part, happens to be a trading venue where all the core concepts of quantitative finance come together: probability theory, information theory, calibration, integer programming. All of it.

Chapter 1: Probability, the only language of an uncertain world

Most people have a huge misunderstanding about quantitative trading. They think it is about "stock picking" and having a unique view on something.

Not at all.

The essence of quantitative trading = pure mathematics.

More specifically, what you're looking for is:

* Statistical correlations

* Inefficient pricing

* Structural advantages

These advantages exist because markets are complex systems made of human beings, who always make systematic mistakes.

In the world of quantitative finance, every problem eventually reduces to one question: what are the odds, and do those odds give me an edge?

So first, you have to understand the nature of probability.

Conditional thinking: say goodbye to absolute right and wrong

Ordinary people like to think in terms of absolute right and wrong. A thing either happens or it doesn't.

But quants think conditionally.

They ask: what is the likelihood of this happening, given that certain information is known?

"The probability given that certain information is known" is called conditional probability.

In plain language: when you get a new clue, how does the original probability change?

Sounds a little convoluted? Let's look at a practical example on Polymarket.

Suppose you're trading a contract on "whether a certain token will rise today." Historical data show that the probability of this token rising on any given day is 60 percent. That's the base rate. But if the token's trading volume today exceeds its historical average, the probability of it rising becomes 75%.

That 75% is the real signal. The isolated 60 percent is just noise-filled background data.

A more intuitive example: the probability of rain is 30%. But what if the sky is dark with clouds? The probability of rain could be 85%. "Dark clouds" is your conditioning information, and it makes your probability estimate jump from 30% to 85%. That is the essence of conditional probability.

Bayes' Theorem: how to update your beliefs in real time

Bayes' theorem is the soul of quantitative trading. The question it answers is: how should you update your original belief when new data arrives?

Here's the formula:

P(A|B) = P(A∩B) / P(B)

* P(A|B): the probability that A occurs, given that B is known to have occurred

* P(A∩B): the probability that A and B both occur

* P(B): the probability that B occurs

The logic of Bayes' theorem is essentially this:

* You have an estimate in your mind (e.g., "I think there's a 50% probability of this happening").

* Suddenly, you see new evidence (e.g., a piece of good news).

* You ask yourself two questions: if this event really is going to happen, how likely is it that I would see this news? And if it is not going to happen, how likely is it that I would see this news?

* Based on the answers to these two questions, you adjust your estimate (e.g. from 50% to 58%).

Let's work through a Polymarket scenario.

Your model calculates that the fair price of a particular contract should be $0.50. That's your prior belief.

Suddenly, breaking news comes out: economic data is 3% better than expected.

With Bayes' formula, you can calculate your new belief precisely. Suppose it comes out at 58%. Your new fair price is $0.58.
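The update in this scenario can be sketched in a few lines, using the expanded form of Bayes' theorem, P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|not H)(1 - P(H))]. The function name `bayes_update` and the two likelihood values (0.69 and 0.50) are illustrative assumptions, chosen only so that a 50% prior lands near the 58% used in the example; they do not come from the original article.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(H | E) given the prior P(H) and the two likelihoods."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator

# Hypothetical inputs: the news appears 69% of the time when the event
# will happen, and 50% of the time when it will not.
posterior = bayes_update(prior=0.50, p_evidence_if_true=0.69, p_evidence_if_false=0.50)
print(f"new fair price: ${posterior:.2f}")  # prints: new fair price: $0.58
```

The hard part in practice is not the arithmetic but estimating the two likelihoods; the formula only tells you what to do once you have them.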

In the market, those who can complete this probability update fastest and most accurately earn most of the money. That is why quant teams spend millions of dollars on low-latency systems. It's not because they like speed; it's because being 0.1 seconds faster can mean tens of thousands of dollars more.

If you want to build a foundation, read Harvard's free Introduction to Probability; the first six chapters are enough. Then try writing some Python code that simulates 10,000 coin flips, and see for yourself how the law of large numbers works.
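A minimal sketch of that coin-flip experiment, using only the Python standard library. The function name `running_heads_fraction` and the fixed seed are choices made here for illustration; plotting is left out so the snippet runs anywhere.

```python
import random

def running_heads_fraction(n_flips, seed=0):
    """Flip a fair coin n_flips times; return the running fraction of heads."""
    rng = random.Random(seed)
    heads = 0
    fractions = []
    for i in range(1, n_flips + 1):
        heads += rng.random() < 0.5   # True counts as 1
        fractions.append(heads / i)
    return fractions

fractions = running_heads_fraction(10_000)
for n in (10, 100, 1_000, 10_000):
    print(f"after {n:>6} flips: fraction of heads = {fractions[n - 1]:.4f}")
```

Early values can sit well away from 0.5, while the later ones settle close to it. That narrowing is the law of large numbers at work.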

Expected value and variance: your two best friends

There are two numbers in trading that matter more than anything.

Expected value, your confidence.

If the expected value of a trade is positive, it means that if you repeat it enough times, you'll make money in the long run.

Variance, your risk.

It tells you how much you're going to swing up and down before you reach the "long run" where you make money.

Take an example. Suppose you have a strategy whose expected gain per trade is $2, but whose standard deviation is $50. This means that while you make $2 per trade on average, the result of a single trade can swing sharply between about -$100 and +$100. If your capital is $200, you may well be wiped out before the "long run" arrives.
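A quick Monte Carlo sketch of this example. It assumes, purely for illustration, that each trade's profit is an independent normal draw with mean $2 and standard deviation $50; the 1,000-trade horizon, the path count and the function name `ruin_probability` are likewise assumptions made here.

```python
import random

def ruin_probability(edge=2.0, stdev=50.0, bankroll=200.0,
                     n_trades=1_000, n_paths=2_000, seed=1):
    """Fraction of simulated paths whose bankroll hits zero within n_trades."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(n_paths):
        capital = bankroll
        for _ in range(n_trades):
            capital += rng.gauss(edge, stdev)   # one trade's profit or loss
            if capital <= 0:
                ruined += 1
                break
    return ruined / n_paths

p_ruin = ruin_probability()
print(f"estimated probability of ruin within 1,000 trades: {p_ruin:.0%}")
```

Under these assumptions, a clear majority of paths go bust even though every single trade has positive expected value. That is the gap between a good strategy and a survivable position size.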

Kelly formula: size your bets scientifically

Now that you understand expected value and variance, how much should you buy when you face a good opportunity? Go all in?

Absolutely not. This is where the Kelly Criterion comes in.

The Kelly formula is designed to tell you: given the win probability and the payout odds, what percentage of your total capital should you stake so that your money compounds as fast as possible without going bankrupt?

If the formula gives 20 percent, that means you should bet only 20 percent of your total capital.

In practice, because our estimates of the win probability tend to be off (you think you have a 60% chance of winning, when it is really only 55%), top quants usually use half-Kelly, that is, half the size the Kelly formula suggests. This significantly reduces the swings in your capital while retaining most of the growth rate.
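For reference, the standard Kelly formula for a simple win/lose bet is f* = (b·p - q) / b, where p is the win probability, q = 1 - p, and b is the net payout per dollar staked. The sketch below applies it to a binary contract bought at a given price; the 60% probability is a hypothetical input, and the function names are introduced here for illustration.

```python
def kelly_fraction(p_win, net_odds):
    """Kelly fraction f* = (b*p - q) / b, with b = net odds and q = 1 - p."""
    q_lose = 1.0 - p_win
    return (net_odds * p_win - q_lose) / net_odds

def kelly_for_binary_contract(p_true, price):
    """Kelly fraction for buying a YES share at `price` that pays $1 if it wins."""
    net_odds = (1.0 - price) / price   # profit per dollar staked when YES wins
    return max(0.0, kelly_fraction(p_true, net_odds))   # never bet a negative edge

# Hypothetical: you believe the true probability is 60%, the market price is $0.52
full = kelly_for_binary_contract(p_true=0.60, price=0.52)
print(f"full Kelly: {full:.1%}, half-Kelly: {full / 2:.1%}")
# prints: full Kelly: 16.7%, half-Kelly: 8.3%
```

For a binary contract the expression simplifies to (p - price) / (1 - price), which makes the dependence on your probability estimate obvious: if the true probability is 55% rather than 60%, full Kelly drops to about 6%, which is why sizing down is the usual safeguard.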

Chapter 1 homework (2 hours per day, approximately 3-4 weeks):

1. Reading: read Introduction to Probability by Blitzstein & Hwang. (Harvard provides a free PDF version; links: http://probabilitybook.net and https://stat110.hsites.harvard.edu/)

2. Programming exercise 1: simulate 10,000 coin flips and visualize the law of large numbers with a chart.

3. Programming exercise 2: write a Bayesian updater that takes a prior probability and likelihoods as input and outputs the posterior probability.

Chapter 2: Statistics = your noise detector

Once you have learned the language of probability, the next step is learning to listen to data.

That's statistics.

The first lesson statistics teaches us is that the vast majority of things that look like signals are just noise.

Hypothesis testing and the multiple comparisons trap

Suppose you write a trading bot, and the backtest shows that it makes 15 percent a year. Is that real, or is it just luck?

This is when you compute a p-value: if the strategy is actually worthless, what is the probability that it would still produce a 15% return by chance? Statistics can tell you how small that probability is (e.g. less than 5%).

But here lies a huge trap called the Multiple Comparisons Problem.

Imagine you let 1,000 monkeys throw 100 darts each. It's pure luck, yet there will always be a few monkeys that hit the bullseye several times in a row and look like darts masters. But you wouldn't hire them as investment managers, would you?

The same goes for writing trading strategies. If you automatically generate 1,000 random strategies and run them on historical data, about 50 of them will appear to make a lot of money through pure luck.

Every beginner seriously overestimates the "effective strategy" they have found. I can tell you responsibly that the first 10 strategies you write are definitely the lucky monkeys.

What's the solution? Use the Bonferroni correction to tighten your significance threshold, or use FDR (false discovery rate) control. In short, if you've tested 100 strategies, your significance threshold is not 0.05 but 0.05/100 = 0.0005. That filters out the false signals produced by luck.
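The "lucky monkeys" effect is easy to reproduce. The sketch below generates 1,000 strategies whose daily returns are pure noise with zero true edge, tests each one for a positive mean, and counts how many pass at 0.05 versus the Bonferroni threshold. The return distribution (normal, 1% daily volatility, 252 days) and the normal-approximation p-value are simplifying assumptions made here.

```python
import math
import random
import statistics

def p_value_positive_mean(returns):
    """One-sided p-value for 'mean return > 0' (normal approximation)."""
    n = len(returns)
    t_stat = statistics.fmean(returns) / (statistics.stdev(returns) / math.sqrt(n))
    return 0.5 * math.erfc(t_stat / math.sqrt(2.0))

rng = random.Random(42)
n_strategies, n_days = 1_000, 252
p_values = [
    p_value_positive_mean([rng.gauss(0.0, 0.01) for _ in range(n_days)])
    for _ in range(n_strategies)
]

naive_hits = sum(p < 0.05 for p in p_values)
bonferroni_hits = sum(p < 0.05 / n_strategies for p in p_values)
print(f"'profitable' at p < 0.05: {naive_hits} of {n_strategies}")
print(f"'profitable' after Bonferroni: {bonferroni_hits}")
```

Roughly 5% of the worthless strategies clear the naive threshold, while almost none survive the corrected one. Bonferroni is conservative, so it can also reject real but weak signals; FDR control trades some of that strictness for power.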

Regression analysis: decomposing your sources of return

Linear regression is the workhorse tool of the financial world. In quantitative trading, you regress your strategy's returns against the returns of the broad market.

The intercept term is Alpha, your excess return. It is the part of your profit that cannot be explained by the market's ups and downs.

Take an example. Suppose your strategy earned 20% this year. But if the whole market went up by 18 percent, the part attributable to your skill is only about 2 percent.

Even worse, if your strategy merely rides the market up and down, then after removing the market's moves your Alpha could be zero or even negative. This means that your so-called "trading edge" is nothing more than market exposure in disguise.

Financial data also raises a particular concern: the data often exhibit autocorrelation (today's price is related to yesterday's) and heteroskedasticity (volatility is not constant). So you need to correct your regression with Newey-West standard errors; otherwise your statistical tests will be overly optimistic.
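As a small illustration of the alpha/beta split, the sketch below runs a one-factor ordinary least squares regression on synthetic data in which the "strategy" is nothing but 1.2x market exposure plus noise. All numbers and the function name `ols_alpha_beta` are hypothetical. It computes point estimates only and does not apply the Newey-West correction mentioned above.

```python
import random

def ols_alpha_beta(strategy_returns, market_returns):
    """Intercept (alpha) and slope (beta) of strategy returns on market returns."""
    n = len(market_returns)
    mean_m = sum(market_returns) / n
    mean_s = sum(strategy_returns) / n
    cov = sum((m - mean_m) * (s - mean_s)
              for m, s in zip(market_returns, strategy_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    return alpha, beta

rng = random.Random(7)
market = [rng.gauss(0.0007, 0.01) for _ in range(2_000)]
# Synthetic "strategy": pure market exposure (beta = 1.2) plus noise, zero true alpha
strategy = [1.2 * m + rng.gauss(0.0, 0.002) for m in market]

alpha, beta = ols_alpha_beta(strategy, market)
print(f"daily alpha = {alpha:.5f}, beta = {beta:.2f}")
```

The strategy outperforms the market in raw return terms whenever the market rises, yet the regression attributes essentially all of it to beta and leaves an alpha indistinguishable from zero.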

Maximum likelihood estimation (MLE): the art of reverse reasoning

When you hear a top quant say they are "calibrating a model," they almost always mean one thing: MLE.

The idea behind MLE is easy to understand. It is "reverse reasoning."

Take an example. You see a puddle 2 metres across at the side of the road. You want to know how much it rained last night. You have a rainfall model that tells you how big a puddle different amounts of rain produce.

What MLE does is reason backwards: since I have observed a 2-metre puddle, which of all the possible rainfall amounts is most likely to have created such a puddle?

Whether you are fitting a GARCH model for volatility or calibrating an option model to market quotes, MLE is a core tool.

The same is true in trading. You observe option prices in the market (the puddle) and you want to back out the market's expectation of future volatility (the rainfall). MLE helps you find the hidden parameters that best explain the current prices.

As an exercise, you can download some real asset price data (e.g. using the yfinance library in Python) and test whether the returns follow a normal distribution.

Spoiler: absolutely not. The real world is full of fat tails, i.e. extreme events are much more frequent than a normal distribution predicts. Try fitting a t-distribution with MLE to see what the real risk looks like.
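Here is a minimal, offline sketch of that reverse reasoning. Instead of downloading prices, it generates synthetic fat-tailed "returns" from a Student-t distribution with known parameters, then recovers them by maximizing the log-likelihood over a grid. The zero-mean assumption, the grids, and the function names are simplifications introduced here; a real fit would use a numerical optimizer and real data.

```python
import math
import random

def student_t_loglik(data, df, scale):
    """Log-likelihood of zero-centred Student-t data with given df and scale."""
    const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi) - math.log(scale))
    return sum(const - (df + 1) / 2 * math.log1p((x / scale) ** 2 / df)
               for x in data)

def fit_student_t(data, df_grid, scale_grid):
    """Grid-search MLE: return the (df, scale) pair with the highest likelihood."""
    return max(((df, s) for df in df_grid for s in scale_grid),
               key=lambda params: student_t_loglik(data, *params))

# Synthetic "returns": Student-t with 3 degrees of freedom and scale 0.01,
# built as normal / sqrt(chi-square / df)
rng = random.Random(3)
true_df, true_scale = 3.0, 0.01
data = [true_scale * rng.gauss(0, 1)
        / math.sqrt(rng.gammavariate(true_df / 2, 2) / true_df)
        for _ in range(2_000)]

df_grid = [1.5 + 0.25 * i for i in range(35)]          # 1.5 ... 10.0
scale_grid = [0.005 + 0.0005 * i for i in range(21)]   # 0.005 ... 0.015
df_hat, scale_hat = fit_student_t(data, df_grid, scale_grid)
print(f"MLE estimate: df = {df_hat:.2f}, scale = {scale_hat:.4f}")
```

A low fitted degrees-of-freedom value is the quantitative signature of fat tails: the smaller it is, the more often extreme moves occur relative to a normal distribution.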

Chapter 2 homework (approximately 4-5 weeks):

1. Reading: read chapters 1 to 13 of All of Statistics by Wasserman. (CMU open PDF version: https://www.stat.cmu.edu/~brian/valerie/617-2022/020-%20 %20books/2004%20-%20wasserman%20-%20all%20of%20statistics.pdf)

2. Programming exercise 1: download real stock return data with yfinance and test it for normality (spoiler: normality will very likely be rejected, indicating that returns do not follow a normal distribution). Then fit a Student's t distribution using maximum likelihood estimation (MLE) and compare the differences.

3. Programming exercise 2: run a Fama-French three-factor regression on a portfolio of stocks using the statsmodels library.

4. Programming exercise 3: run a permutation test: randomly shuffle the dates 10,000 times and compare the performance after shuffling with the actual performance.

Chapter 3: Linear algebra, the underlying engine of the quant world

A lot of people find linear algebra boring, just a pile of matrix calculations. But it is actually the machinery that runs the whole quant world. Portfolio construction, principal component analysis (PCA), neural networks, covariance estimation and factor models all depend on it.

There is even a rumor that the Medallion Fund, which really did beat Buffett with annualized returns of around 30%, is built on Markov models grounded in linear algebra.

If you can't think fluently in matrices, you can't be a quant.

Covariance matrix: understanding how assets move together

The covariance matrix Σ captures how each asset moves in relation to every other asset.

If you track 500 markets, this matrix is 500 x 500 and contains 125,250 unique entries. Each entry tells you: "when asset A rises, does asset B tend to rise or fall, and by how much?"

And the variance of the entire portfolio collapses into an extremely elegant mathematical expression:

$$\sigma_p^2 = w^T \Sigma w$$

* w is your vector of position weights

* Σ is the covariance matrix

This quadratic form is at the heart of Markowitz portfolio theory, at the heart of risk management, and at the heart of everything.

In other words, if you trade several related markets at the same time (e.g. "Trump wins the election" and "Republicans win the Senate"), your overall risk is not simply the sum of the risks of each market. You need to account for the correlation between them, and the covariance matrix is the tool that does this for you.
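To make the effect of correlation concrete, here is a small sketch that evaluates the portfolio variance w^T Σ w for two equally weighted positions, once with uncorrelated markets and once with highly correlated ones. The 20% volatilities and the 0.9 correlation are hypothetical numbers chosen for illustration.

```python
import math

def portfolio_variance(weights, cov):
    """Compute w^T Σ w for a weight vector and a covariance matrix (nested lists)."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))

vol_a, vol_b = 0.20, 0.20          # hypothetical volatilities of two positions
weights = [0.5, 0.5]
for corr in (0.0, 0.9):
    cov_ab = corr * vol_a * vol_b
    cov = [[vol_a ** 2, cov_ab], [cov_ab, vol_b ** 2]]
    vol_p = math.sqrt(portfolio_variance(weights, cov))
    print(f"correlation {corr:.1f}: portfolio volatility = {vol_p:.3f}")
# prints 0.141 for correlation 0.0 and 0.195 for correlation 0.9
```

With zero correlation, splitting the capital cuts volatility from 0.20 to about 0.14. With a correlation of 0.9, the two positions behave almost like one bet and nearly all of the diversification benefit disappears.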

Eigendecomposition and PCA: finding the hidden drivers

The first time you use eigendecomposition to run a principal component analysis, the way you look at the world changes.

Principal component analysis can be explained with an analogy: if you want to describe a person's build, you can record dozens of measurements such as height, weight, arm length, leg length, shoulder width and so on. In fact, many of these measurements are linked (tall people usually have long legs). The role of PCA is to condense these dozens of measurements into a few core "hidden labels" such as "overall body size" and "how heavy or lean the person is."

The same is true in financial markets. If you watch 500 tokens rise and fall, you'll find that just the top five "hidden labels" (eigenvectors) can explain 70% of the volatility of the entire market. Everything else is basically noise.

You don't need to understand what all 500 tokens are doing. All you have to do is understand these five "hidden drivers" (e.g., overall market sentiment, changes in interest rates, the popularity of a given sector, etc.). That's the magic of dimensionality reduction.

If you have enough time, I recommend Professor Gilbert Strang's linear algebra course at MIT. Then, in Python, run a PCA decomposition on the returns of the S&P 500 and see for yourself what the first few principal components are.

You'll find that the first principal component is almost equal to "the whole market rising and falling."
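That last observation can be checked on synthetic data without any external library. The sketch below builds 20 assets that share a single "market" factor plus their own noise, then extracts the first principal component with power iteration (a simple way to find the largest eigenvalue and eigenvector of the covariance matrix). The data-generating process and all function names are assumptions made here; on real data you would normally use a numerical library.

```python
import math
import random

def covariance_matrix(series):
    """Sample covariance matrix of a list of equal-length return series."""
    n_obs = len(series[0])
    means = [sum(s) / n_obs for s in series]
    centred = [[x - m for x in s] for s, m in zip(series, means)]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n_obs - 1) for cj in centred]
            for ci in centred]

def top_eigenpair(matrix, n_iter=200):
    """Largest eigenvalue and its eigenvector via power iteration."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(n_iter):
        product = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        vec = [x / norm for x in product]
    eigenvalue = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, vec

# Synthetic data: 20 assets driven by one shared "market" factor plus own noise
rng = random.Random(5)
n_assets, n_days = 20, 500
market = [rng.gauss(0, 0.01) for _ in range(n_days)]
returns = [[m + rng.gauss(0, 0.01) for m in market] for _ in range(n_assets)]

cov = covariance_matrix(returns)
eigenvalue, loadings = top_eigenpair(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(n_assets))
print(f"first component explains {explained:.0%} of total variance")
print("all loadings share one sign:", all(x > 0 for x in loadings))
```

Every asset loads on the first component with the same sign and similar size, which is exactly what "the whole market rising and falling" looks like in eigenvector form.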

Chapter 3 homework (approximately 4-6 weeks):

1. Watch the videos: watch all of Gilbert Strang's MIT 18.06 linear algebra lectures without skipping a single one. (Free on MIT OpenCourseWare: https://ocw.mit.edu/courts/18-06-linear-algebra-spring-2010/video galleries/video-lectures/)

2. Reading: read Strang's Introduction to Linear Algebra as the companion textbook. (Textbook website: https://math.mit.edu/~gs/linearalgebra/)

3. Programming exercise 1: run PCA on S&P 500 return data, plot the eigenvalue spectrum (i.e. how much variance each principal component explains), and identify the three most important principal components.

4. Programming exercise 2: implement Markowitz mean-variance optimization from scratch.

Chapter 4: Calculus and optimization, the language for capturing change

Calculus is the language of change. In financial markets, everything is changing: prices, volatility, correlations, and the entire probability distribution can shift within seconds.

Calculus is used to describe and exploit these changes.

Derivatives and Taylor expansion: approximating the complex with the simple

Derivatives (the mathematical kind, not financial derivatives) appear in the backpropagation of every neural network and in the calculation of every option "Greek."

In quantitative trading, Taylor expansions are often used for approximate calculations. Derivatives supply the inputs that a Taylor expansion needs.

What a Taylor expansion does is model the relationship between x (key factors) and y (asset value) by approximating a complicated function with a simple polynomial.

Suppose you need to draw a very complex curve, but all you have is a straight ruler. What do you do?

Step one: you fit a straight line to the curve at a single point. Near that point, the line is a good approximation of the curve.

Step two: if you want a better fit, you bend the line slightly into a parabola.

The more bending terms you add, the closer your drawing gets to the complex curve.

In trading, the change in an option's price follows an extremely complex formula. It is hard to work with directly, so we use a Taylor expansion to break it into a few simple pieces:

The effect of price direction (Delta) + the effect of price curvature (Gamma) + the effect of time passing (Theta) + the effect of volatility (Vega).
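Written out as a worked equation, this decomposition is a Taylor expansion of the option value, second order in the price and first order in time and volatility. The symbols V (option value), S (underlying price), t (time) and σ (volatility) are notation introduced here for illustration.

```latex
dV \approx \underbrace{\frac{\partial V}{\partial S}}_{\Delta}\, dS
 + \frac{1}{2}\,\underbrace{\frac{\partial^2 V}{\partial S^2}}_{\Gamma}\, (dS)^2
 + \underbrace{\frac{\partial V}{\partial t}}_{\Theta}\, dt
 + \underbrace{\frac{\partial V}{\partial \sigma}}_{\text{Vega}}\, d\sigma
```

The first term is the straight line from the ruler analogy, and the Gamma term is the first "bend" that turns the line into a parabola.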

Convex optimization: finding the best solution

In quantitative finance, almost every "what is the best way to do this" question can be framed as an optimization problem. For example: given a risk budget, how should capital be allocated to maximize return?

Imagine you are blindfolded in a valley and asked to walk to the lowest point.

* If the valley is full of pits, you may get stuck in a pit halfway down the slope and never get out.

* But if the valley is a perfect bowl, you just have to keep following the downhill direction, and you are guaranteed to reach the single lowest point even with your eyes closed.

If you can write a financial question as a bowl-shaped mathematical formula, a computer can quickly find the perfect answer for you. That is what convex optimization does. The original author mentions that Boyd and Vandenberghe of Stanford University wrote a free textbook, Convex Optimization, which is the bible of this field. Python's cvxpy library lets you solve complex optimization problems in a few lines of code.
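The "walk downhill in a bowl" idea is gradient descent on a convex function. Below is a minimal sketch on a hand-picked quadratic bowl whose lowest point is at (3, -1); the function, learning rate and starting points are arbitrary choices made here for illustration, not part of the original article.

```python
def gradient_descent(grad, start, learning_rate=0.1, n_steps=200):
    """Plain gradient descent: repeatedly step against the gradient."""
    point = list(start)
    for _ in range(n_steps):
        g = grad(point)
        point = [p - learning_rate * gi for p, gi in zip(point, g)]
    return point

# A bowl-shaped (convex) function: f(x, y) = (x - 3)^2 + 2 * (y + 1)^2
def bowl_gradient(point):
    x, y = point
    return [2.0 * (x - 3.0), 4.0 * (y + 1.0)]

for start in ([-10.0, 10.0], [50.0, -40.0]):
    x_min, y_min = gradient_descent(bowl_gradient, start)
    print(f"start {start} -> minimum near ({x_min:.4f}, {y_min:.4f})")
```

Both starting points end up at the same place, which is the practical meaning of convexity: there is only one bottom, so where you start does not matter. On a function with many pits, the same procedure can stop in whichever pit is nearest.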

Andrew Ng's AI course is also recommended here; its early lectures cover gradient descent and local versus global optima, which makes it easier to see why convexity matters. Link: https://www.youtube.com/watch?v=JPcx9qHzzgk

Chapter 4 homework (approximately 4-5 weeks):

1. Reading: read chapters 1 to 5 of Convex Optimization by Boyd & Vandenberghe. (Stanford provides a free PDF version: https://web.stanford.edu/~boyd/cvxbook/bv cvxbook.pdf, book homepage: https://stanford.edu/~boyd/cvxbook/)

2. Programming exercise 1: implement gradient descent from scratch and use it to find the minimum of the Rosenbrock function. (The Rosenbrock function is one of the most classic test functions in optimization: it looks simple but is in fact hard to optimize.)

3. Programming exercise 2: use cvxpy to solve a portfolio optimization problem, then add transaction cost constraints.

Chapter 5: Stochastic calculus, the leap from data scientist to genuine quant

Before learning stochastic calculus, you are just a "data scientist who likes finance." After learning it, you are a real quant.

This is where you learn how to model randomness over time. Here you will derive the famous Black-Scholes equation from first principles and truly understand why the trillion-dollar derivatives market operates the way it does.

*Note 5.1: For a better understanding of the Black-Scholes equation and what it means, see the earlier article, the Polymarket market-making bible. Link: https://x.com/MrRyanchi/status/2036346480067747844

*Note 5.2: Why is stochastic calculus different from ordinary calculus? Because in a stochastic process, the second-order term of the Taylor expansion does not vanish. In ordinary calculus, the second-order term is negligible as the time interval approaches zero. In a stochastic process, however, the special nature of Brownian motion, (dW)^2 = dt, turns the second-order term into a first-order quantity that cannot be ignored. Details are given below.

Brownian motion: the mathematical expression of pure randomness

Brownian motion (also known as the Wiener process, W_t) is a random walk in continuous time.

Imagine a drunk walking across a square. Every step he takes is completely random. The twisted, irregular path he traces out is Brownian motion. In mathematics, the swings in stock prices are treated as the steps of this drunk.

There are many real-world examples of Brownian motion, such as the random jitter of tiny particles suspended in air.

Here is the insight that determines everything: in Brownian motion, elapsed time and squared distance are of the same order (i.e. (dW)^2 = dt). It is precisely this property that makes stochastic calculus different from ordinary calculus.
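The rule (dW)^2 = dt can be checked numerically. The sketch below simulates Brownian increments over a total time T = 1 at finer and finer resolution and sums their squares; the step counts and seed are arbitrary choices made here.

```python
import math
import random

def brownian_increments(total_time, n_steps, rng):
    """Independent Gaussian increments dW, each with variance dt."""
    dt = total_time / n_steps
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]

rng = random.Random(11)
total_time = 1.0
for n_steps in (100, 10_000, 100_000):
    dW = brownian_increments(total_time, n_steps, rng)
    quadratic_variation = sum(x * x for x in dW)
    print(f"{n_steps:>7} steps: sum of (dW)^2 = {quadratic_variation:.4f}"
          f"  (T = {total_time})")
```

For a smooth function the sum of squared increments would shrink to zero as the steps get finer. For Brownian motion it converges to the elapsed time instead, which is why the second-order term survives in stochastic calculus.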

Itô's Lemma: the chain rule of the stochastic world

Stock prices are typically modelled as geometric Brownian motion (GBM):

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$$

Translation: change in price = trend from the expected rate of return (μ) + random shock from volatility (σ)

Itô's Lemma is the chain rule of the stochastic world.

* In ordinary calculus (e.g., calculating the path of a car moving smoothly), you only have to consider speed (the first derivative).

* But in stochastic calculus (e.g. calculating the path of a car on an extremely bumpy, bad road), the bumpiness of the road itself (the volatility) substantially changes the car's path.

So Itô's Lemma tells us that when calculating random changes, you can't just look at direction. You have to add the "bumpiness" into the formula. If you don't, you will get the price wrong.
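As a worked example, here is Itô's Lemma applied to f(S) = ln S when S follows geometric Brownian motion with drift μ and volatility σ (σ is the coefficient on the random-shock term of the GBM equation). The only non-standard step compared with the ordinary chain rule is keeping the second-order term and replacing (dW_t)^2 with dt.

```latex
df(S_t) = f'(S_t)\,dS_t + \tfrac{1}{2} f''(S_t)\,(dS_t)^2,
\qquad (dS_t)^2 = \sigma^2 S_t^2\,dt

f(S) = \ln S:\quad
d\ln S_t = \frac{1}{S_t}\,dS_t - \frac{1}{2}\,\frac{1}{S_t^2}\,\sigma^2 S_t^2\,dt
         = \left(\mu - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW_t
```

The extra -σ²/2 is the "bumpiness" correction: the rougher the road, the lower the average growth of the log price for the same drift μ.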

Black-Scholes and risk-neutral pricing

Something miraculous happens when you apply Itô's Lemma to an option price and build a hedged portfolio.

In the Black-Scholes equation that comes out of the derivation, the variable that represents the "expected rise or fall" (μ) cancels out and disappears.

What does that mean? It means that the price of an option does not depend on your expectation of whether the stock will rise or fall in the future.

Say you are buying a call option. You might think that the more bullish people are, the more expensive the call should be. Wrong! Under the idealized mathematical model, apart from observable inputs such as the current price, the strike, the interest rate and the time to expiry, the option price depends on only one thing: how volatile the stock will be. Whether it surges or crashes does not matter.

The first time you really understand this concept, the feeling is extremely shocking. It explains why an extremely bullish trader and an extremely bearish trader can happily trade with each other at the same option price. The essence of their trade is not direction but volatility.

The Greeks: decomposing risk into dimensions

With the Black-Scholes pricing model, risk can be broken down precisely into several separate dimensions. These dimensions are named after Greek letters, so they're called the Greeks:

* Delta (Δ) - price sensitivity: how much the option price changes when the underlying asset moves by $1. It tells you how much of the underlying you need to buy or sell to hedge the risk.

* Gamma (Γ) - curvature: the rate of change of Delta. It tells you how often you need to adjust your hedge. Gamma is largest, and the risk highest, when the probability of an event is close to 50%.

* Theta (Θ) - time decay: the value the option loses with each passing day. You can think of it as the "rent" you pay every day to hold the option.

* Vega (ν) - volatility sensitivity: how much the option price changes when volatility moves by 1%. This is where most Wall Street derivatives desks actually make (or lose) their money.

* Rho (ρ) - interest rate sensitivity: the impact of interest rate changes on the price. The impact is often small and negligible.
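For concreteness, here is a minimal sketch of the standard closed-form Black-Scholes price and Greeks for a European call on a non-dividend-paying asset. The input values (spot 100, strike 100, 5% rate, 20% volatility, one year) and the function names are illustrative choices made here. Note that the expected return of the stock is not a parameter anywhere in the function.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def black_scholes_call(spot, strike, rate, vol, maturity):
    """European call price and Greeks under Black-Scholes (no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * maturity)
    return {
        "price": spot * norm_cdf(d1) - strike * discount * norm_cdf(d2),
        "delta": norm_cdf(d1),
        "gamma": norm_pdf(d1) / (spot * vol * sqrt_t),
        "theta": -spot * norm_pdf(d1) * vol / (2.0 * sqrt_t)
                 - rate * strike * discount * norm_cdf(d2),   # per year
        "vega": spot * norm_pdf(d1) * sqrt_t,                 # per 1.00 change in vol
        "rho": strike * maturity * discount * norm_cdf(d2),   # per 1.00 change in rate
    }

greeks = black_scholes_call(spot=100.0, strike=100.0, rate=0.05, vol=0.20, maturity=1.0)
for name, value in greeks.items():
    print(f"{name:>5}: {value:.4f}")
```

With these inputs the call is worth about 10.45 with a delta of about 0.64. Vega and rho are quoted per unit change, so divide by 100 to get the effect of a one percentage point move, and divide theta by 365 for the daily "rent."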

Chapter 5 homework (approximately 6-8 weeks):

1. Reading: read Stochastic Calculus for Finance II (Continuous-Time Models) by Shreve, the recognized gold-standard textbook in this field. (PDF version: https://cms.dm.uba.ar/academico/meterias/2docuat2016/analisis cutitativo en finanzas/Steve ShreveStochastic Calculus for Finane II.pdf)

2. Alternative textbook: if Shreve is too tough to chew, try Arguin's A First Course in Stochastic Calculus, which is more recent and easier to read. (AMS official page: https://bookstore.ams.org/amstext-53)

3. Derivation exercise 1: apply Itô's Lemma to f(S) = ln(S), where S follows geometric Brownian motion (GBM), and derive the key -σ²/2 correction term. (This correction is central to understanding the relationship between log returns and continuous compounding.)

4. Derivation exercise 2: starting from the delta-hedging argument, derive the Black-Scholes equation in full.

5. Programming exercise: implement Black-Scholes pricing from scratch, then price the same option with a Monte Carlo simulation, and verify that the Monte Carlo result converges to the analytical solution as the number of simulations increases.

Chapter 6: Polymarket and LMSR, the math engine of prediction markets

Now, let's bring all our mathematical weapons back to the most interesting trading venue in the world today: Polymarket.

The mathematics behind Polymarket connects with everything mentioned in this article: probability theory, information theory, calibration and integer programming.

LMSR = Neural Network Softmax

In early prediction markets, the automated market maker (AMM) usually used a mechanism called LMSR (Logarithmic Market Scoring Rule). It was invented by the economist Robin Hanson.

Its cost function is:

$$C(q) = b \ln\left(\sum_i e^{q_i/b}\right)$$

where:

* q_i is the number of shares outstanding for outcome i

* b is a liquidity parameter (the larger b is, the deeper the market, and the harder it is to move the price).

Based on this cost function, we can calculate the current price of any outcome i:

$$p_i = \frac{e^{q_i/b}}{\sum_j e^{q_j/b}}$$

If you have learned a little machine learning and then see the LMSR price formula, you'll be shocked: isn't that the Softmax function?

What's Softmax? Suppose you have three apples weighing 100, 50 and 20. You want to turn their weights into percentage probabilities. Softmax is a probability converter. Not only does it make the probabilities add up to 100 percent, it also magnifies the differences: a slightly heavier apple gets a much bigger share of the probability.

The formula that drives prediction market pricing and the formula that lets large language models (e.g. ChatGPT) predict the next word are mathematically equivalent. It's not a coincidence. The underlying logic of both is the same: transform an arbitrary set of numbers into a valid probability distribution.

This mechanism guarantees several extremely elegant properties.

* The prices of all possible outcomes always sum to 1, exactly like probabilities. Each price is always between 0 and 1.

* It provides unlimited liquidity (there is always someone to trade with you).

* The market maker's maximum potential loss is strictly bounded by b x ln(n), where n is the number of possible outcomes.
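The LMSR cost and price formulas translate directly into code. The sketch below prices a two-outcome market before and after a trader buys 50 YES shares; the liquidity parameter b = 100 and the trade size are hypothetical values chosen here, and the function names are introduced for illustration.

```python
import math

def lmsr_cost(quantities, b):
    """C(q) = b * ln(sum_i exp(q_i / b))."""
    return b * math.log(sum(math.exp(q / b) for q in quantities))

def lmsr_prices(quantities, b):
    """p_i = exp(q_i / b) / sum_j exp(q_j / b), i.e. softmax(q / b)."""
    weights = [math.exp(q / b) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]

b = 100.0                       # hypothetical liquidity parameter
before = [0.0, 0.0]             # shares outstanding for [YES, NO]
after = [50.0, 0.0]             # a trader buys 50 YES shares
trade_cost = lmsr_cost(after, b) - lmsr_cost(before, b)

print("prices before:", [round(p, 4) for p in lmsr_prices(before, b)])
print("prices after: ", [round(p, 4) for p in lmsr_prices(after, b)])
print(f"cost of 50 YES shares: {trade_cost:.2f}")
print(f"max market-maker loss: {b * math.log(len(before)):.2f}")
```

What a trader pays is always the difference in the cost function before and after the trade, so buying pushes the YES price up from 0.50 along the softmax curve. A larger b would make the same 50 shares move the price less, at the expense of a larger worst-case loss for the market maker.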

The CLOB mechanism for Polymark: from theory to reality

It is worth noting that, although LMSR is the classic theoretical basis for predicting the market AMM, Polymark has evolved to use the CLOB mechanism。

More details can be found in this article last October: https://x.com/MrRyanchi/status/1977932511775760517

IN THE CLOB MODEL, PRICES ARE NO LONGER IMPOSED BY A FIXED MATHEMATICAL FORMULAIt's entirely generated by buyers and sellers in the market through the Bids and Asks gameI don't know. It's like the traditional stock exchange platform or the Binance contract market。

WHY DOES IT MATTER? BECAUSE UNDER THE CLOB MECHANISM, THE ROLE OF MARKETER HAS CHANGED DRAMATICALLY。

The core difference between LMSR (traditional AMM) and CLOB (Polymarket current):

:: Price formation: under LMSR, prices are computed automatically by a mathematical formula; under CLOB, they emerge from the contest between buyers and sellers.

:: Source of liquidity: under LMSR, the system pool provides it automatically; under CLOB, market makers must provide it.

:: Role of market makers: LMSR needs no professional market makers; in a CLOB, market makers are the lifeline of the market.

:: Spread control: the LMSR bid-ask spread is set by system parameters; the CLOB spread is set by cutthroat competition among market makers.

:: Hedging needs: under the CLOB model, market makers face very high one-sided exposure and have to run extremely complex cross-market hedges.

Put simply, in LMSR mode the AMM provides liquidity automatically and you only trade against the formula. Under the CLOB model, liquidity is provided entirely by market makers. You need to compute a fair probability (using the Bayesian updating and statistical models described above), then post buy and sell orders around that probability and earn the spread.
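The quoting step can be sketched in a few lines of Python. This is an illustration only: the 0.01 tick size, the 0.02 half spread, and the function names are assumptions, and no real exchange API is called. It also shows why a wrong fair value hurts: a filled bid is worth (true probability − bid) in expectation, so a stale estimate turns that fill into a loss.

```python
import math

def make_quotes(fair_prob, half_spread, tick=0.01):
    """Return (bid, ask) around fair_prob, snapped outward to the tick grid
    and kept strictly inside (0, 1)."""
    bid_ticks = math.floor((fair_prob - half_spread) / tick + 1e-9)
    ask_ticks = math.ceil((fair_prob + half_spread) / tick - 1e-9)
    bid = max(tick, round(bid_ticks * tick, 10))
    ask = min(1.0 - tick, round(ask_ticks * tick, 10))
    return bid, ask

def expected_pnl_per_share(true_prob, bid, ask):
    """Expected PnL at resolution for one share bought at the bid and,
    separately, one share sold at the ask."""
    long_yes = true_prob - bid    # bought YES at bid; it pays 1 with prob true_prob
    short_yes = ask - true_prob   # sold YES at ask
    return long_yes, short_yes

bid, ask = make_quotes(fair_prob=0.62, half_spread=0.02)

# If the 0.62 estimate is right, both sides earn the half spread.
edge_if_right = expected_pnl_per_share(0.62, bid, ask)

# If the true probability is 0.55, the bid is too high and is the side
# that better-informed traders will choose to hit.
edge_if_wrong = expected_pnl_per_share(0.55, bid, ask)

print("quotes:", bid, ask)
print("expected PnL (bid fill, ask fill) if right:", edge_if_right)
print("expected PnL (bid fill, ask fill) if wrong:", edge_if_wrong)
```

In the mispriced case the ask side looks profitable on paper, but informed counterparties will only trade against the losing side.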

If you misprice a Polymarket contract, or misjudge the correlation risk in your hedges, the orders you have resting on the book will be picked off immediately by smarter quant funds, and you become the one being harvested.

Chapter VII: Career map and toolbox for quants

If you want to turn this into your profession or build your own quant team, you need to understand the industry's ecosystem.

Four core roles

:: Quant Researcher: People who look for patterns in large data sets and build predictive models. They need very strong mathematics and statistics. In the Polymarket context, they build the probabilistic models that determine a contract's "fair price".

:: Quant Developer: People who build the infrastructure. They need to know C++, Rust, or Python to build low-latency trading systems. In the Polymarket context, they build the trading engine that talks to the API and make sure orders are submitted and executed within milliseconds.

:: Quant Trader: People who manage capital, control risk, and make real-time decisions. Their pay varies the most. On Polymarket, they are the ones making markets in several markets at once and adjusting spreads and positions in real time.

:: Risk Quant: The team's guardian. They are responsible for model validation, computing the potential loss in extreme scenarios (VaR), and stress testing.
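To make the Risk Quant's two calculations concrete, here is a minimal sketch of historical-simulation VaR and a crude stress test. The simulated daily PnL, the position list, and the function names are all made up for illustration; a real risk system would use actual PnL history and far richer scenarios.

```python
import math
import random

def historical_var(pnl, confidence=0.99):
    """Historical-simulation VaR: the loss level that is exceeded in only
    about (1 - confidence) of the observed periods."""
    losses = sorted(-x for x in pnl)              # positive numbers are losses
    index = math.ceil(confidence * len(losses)) - 1
    return losses[index]

def stress_loss(positions):
    """Crude stress test: every YES position resolves to 0, so the whole
    purchase cost (shares * entry price) is lost."""
    return sum(shares * price for shares, price in positions)

random.seed(7)
# Hypothetical daily PnL in dollars, drawn from a normal distribution.
daily_pnl = [random.gauss(50.0, 400.0) for _ in range(1000)]

var_95 = historical_var(daily_pnl, 0.95)
var_99 = historical_var(daily_pnl, 0.99)

# Hypothetical book: (shares held, entry price) per market.
book = [(1000, 0.62), (500, 0.30), (2000, 0.85)]
worst_case = stress_loss(book)

print("95% one-day VaR:", round(var_95, 2))
print("99% one-day VaR:", round(var_99, 2))
print("stress loss if every position resolves against us:", worst_case)
```

VaR describes a bad but ordinary day, while the stress number describes the scenario where correlated markets all go wrong together, which is the one-sided exposure discussed above.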

Salary levels in top institutions

:: Top-tier firms (e.g. Jane Street, Citadel, HRT): new graduates earn $300K to $500K a year; senior employees earn $1M to $3M; star traders can earn $3M to $30M.

:: Upper-mid-tier firms (e.g. Two Sigma, DE Shaw): new hires earn $250K to $350K; senior employees earn $575K to $1.2M.

*Note: In the first half of 2025, Jane Street's average compensation per employee came to roughly $1.4 million on an annualized basis.

Recommended reading list (in order of study)

:: Probability and statistics - Blitzstein & Hwang

:: Advanced statistics - Wasserman, All of Statistics: hypothesis testing, regression, MLE

:: Linear algebra - Strang

:: Optimization - Boyd & Vandenberghe, Convex Optimization: optimization theory and practice

:: Stochastic calculus - Shreve, Stochastic Calculus for Finance II: Brownian motion, Itô's lemma, the Black-Scholes model

:: Quantitative finance - Hull, Options, Futures, and Other Derivatives: a panorama of derivatives pricing

:: Trading strategy in practice - Ernest Chan, Quantitative Trading: a guide to avoiding pitfalls from backtesting to live trading

Three things I wish I had known earlier

At the end of the article, the original author shares three very deep insights. They are also the advice I would give to every Polymarket trader.

1. Your real enemy is estimation error

Many people like to use full Kelly sizing, unconstrained Markowitz optimization, or machine learning models stuffed with hundreds of features. They all end up failing for the same reason: they overfit noisy historical data.

The math is perfect when the parameters are perfect. But in reality you never get perfect parameters. The gap between theory and practice is always estimation error.

The best practitioners are not those who use the most sophisticated models, but those who stay wary of estimation error. They deliberately cut position sizes (half Kelly instead of full Kelly), simplify their models (3 core features instead of 30), and add constraints.
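A small worked example shows why half Kelly is the safer default. It assumes a binary contract bought at price c that pays 1 if YES, for which the Kelly stake is f* = (p − c) / (1 − c). The specific numbers (price 0.55, estimated probability 0.70, true probability 0.60) are invented for illustration.

```python
import math

def kelly_fraction(p, price):
    """Kelly stake (fraction of bankroll) for buying YES at `price` when the
    win probability is p. With no positive edge, the stake is 0."""
    return max(0.0, (p - price) / (1.0 - price))

def log_growth(f, p_true, price):
    """Expected log growth per bet when staking fraction f of the bankroll,
    evaluated at the true win probability."""
    odds = (1.0 - price) / price   # net payout per unit staked if YES wins
    return p_true * math.log(1.0 + f * odds) + (1.0 - p_true) * math.log(1.0 - f)

price = 0.55    # market price of YES
p_hat = 0.70    # our overconfident estimate
p_true = 0.60   # the probability we would use with perfect information

full = kelly_fraction(p_hat, price)     # full Kelly on the wrong estimate
half = 0.5 * full                       # half Kelly on the same estimate
ideal = kelly_fraction(p_true, price)   # Kelly with the true probability

growth_full = log_growth(full, p_true, price)
growth_half = log_growth(half, p_true, price)
growth_ideal = log_growth(ideal, p_true, price)

print(f"full Kelly  f={full:.3f}  growth={growth_full:+.5f}")
print(f"half Kelly  f={half:.3f}  growth={growth_half:+.5f}")
print(f"ideal Kelly f={ideal:.3f}  growth={growth_ideal:+.5f}")
```

With these numbers, full Kelly on the inflated estimate has negative expected log growth even though the trade has a real edge, while half Kelly on the same bad estimate still grows.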

2. Tools have been democratized, but edge has not

Today, anyone can download PyTorch for free. Anyone can access Polymarket's API. Technology is a necessary but no longer sufficient condition.

A real trading advantage (edge) lies in unique data, unique models, or unique execution capability, not in knowing more Python libraries than other people.

That is why we postponed the @insidersdotbot update for a whole month to prioritize refining our smart money database and better PnL algorithms (e.g. a Split revenue calculation that is more accurate than the official one). Only with unique data and models can you really make more or lose less.

On Polymarket, what does that mean?

It means you need to find information sources that others lack (e.g. a network of experts in a niche field), build models that others cannot (e.g. a pricing engine that handles cross-market correlation in real time), or have execution capabilities that others lack (e.g. a trading system that can complete a cross-market hedge within 10 milliseconds).

3. Mathematics is the true moat

AI can help you write code and even suggest trading strategies. But deriving why Itô's lemma has an extra term, proving that the discounted price is a martingale under the risk-neutral measure, and judging when a convex relaxation is tight in a combinatorial market are another matter.

This deep mathematical intuition is the fundamental dividing line between "quants who create edge" and "quants who borrow edge". And borrowed edge expires sooner or later.

Prediction markets are going through what the traditional options market went through in 1973. Those who are first to bring rigorous mathematical models, volatility pricing, and complex arbitrage algorithms into this market will capture the largest share of the gains.

Stop betting on instinct. Learn probability, write code, build your math moat.

Full Toolbox

Python tech stack

Data processing: pandas, polars (Polars processes large data sets 10 to 50 times faster than pandas)

Numerical computing: numpy, scipy

Machine learning (tabular data): xgboost, lightgbm, catboost

Machine learning (deep learning): pytorch

Optimization solver: cvxpy

Derivatives pricing: QuantLib (industrial-grade library with a C++ core and high performance)

Statistical analysis: statsmodels

Backtesting framework: NautilusTrader

Backtesting frameworks (simpler, easier options): backtrader, vectorbt

Quant research platform: Microsoft Qlib (over 17,000 stars on GitHub, AI-oriented)

Reinforcement learning for trading: FinRL (over 10,000 stars on GitHub)

C++ and Rust

C++ common libraries: QuantLib, Eigen, Boost

Rust: RustQuant can be used for options pricing, and NautilusTrader uses a hybrid Rust + Python architecture (the Rust core ensures speed; the Python API on top makes research convenient).

Data Sources

Free: yfinance, Finnhub (60 requests per minute), Alpha Vantage

Mid-tier: Polygon.io ($199 per month with latency under 20 ms), Tiingo

Enterprise: Bloomberg Terminal (approximately $32,000 per year), Refinitiv, FactSet

Blockchain data: Alchemy (the free tier supports access to historical archive data)

In addition, @insidersdotbot is about to open its API, which will include a ready-to-use smart money database and trading functionality. Feel free to turn on notifications (the little bell) to follow along.

Solvers

Gurobi: The fastest commercial mixed-integer programming solver; students and academic users can apply for a free license. It is well suited to combinatorial arbitrage problems.

Google OR-Tools: The strongest free solver.

PuLP / Pyomo: Python modeling interfaces that make it easy to define problems and call various solvers.

References

[1] How I'd Become a Quant If I Had to Start Over Tomorrow.

https://x.com/gemchange ltd/status/2028904166895112617

[2] Blitzstein, J.K., & Hwang, J. (2014).

https://projects.iq.harvard.edu/stat110

[3] Markowitz, H. (1952).

[4] Strang, G. MIT 18.06 Linear Algebra. MIT OpenCourseWare.

https://ocw.mit.edu/courses/18-06-linear-algebra-spring-2010/

[5] Boyd, S. & Vandenberghe, L. (2004).

https://web.stanford.edu/~boyd/cvxbook/

[6] Hanson, R. (2003).

[7] Polymarket Documentation.

https://docs.polymark.com/training/overview

[8] Black, F., & Scholes, M. (1973).

[9] Shreve, S. (2004).

Original Link

In 2026, how can ordinary people get started with quantitative trading