Token Factory Economics Is Reconstructing the Entire AI Industry

Author: Haishan

From the “vegetable price” Token price war of 2024 to the collective price increases by Alibaba Cloud, Tencent Cloud, and Baidu Smart Cloud in 2026, the Token industry has completed a staggering turnaround in just two years: from money-burning competition and overcapacity to supply shortages and simultaneous growth in volume and price.

Since the start of 2026, the A-share AI computing power sector has risen by more than 55% cumulatively. Leading large model companies such as Moonshot AI and Zhipu AI have broken 1 billion yuan in monthly revenue, and some companies have surpassed their entire 2025 annual revenue in just 20 days.

This industry revolution, which Jensen Huang (Huang Renxun) has defined as “Token Factory Economics,” has long since moved beyond technical hype. It has become a high-certainty trend driven by a genuine explosion in demand, supply-demand imbalance, and global competition for energy and computing power. The restructuring of its underlying logic is reshaping the rules of the game for the entire AI industry and overturning the basic logic by which the world operates.

01 The “Oil” of the New Era

The essence of this industry inflection point is the shift of the AI industry from a “model arms race” to a “Token capacity race.”

Before 2024, the core narrative was “whose model parameters are larger, who is smarter,” with major companies frantically burning money to train large models, offering free Tokens or dumping at low prices to capture market share, even leading to a distorted situation where “selling Tokens is less profitable than selling bottled water.”

However, in February 2026, the explosive popularity of the OpenClaw intelligent agent (commonly known as “Lobster”) completely shattered this logic.

Traditional large models work in a single-turn, “people seeking AI” interaction mode, consuming only 1,000-3,000 Tokens per interaction. In contrast, agents adopt a “planning-action-observation-reflection” cycle architecture that requires dozens to hundreds of model calls to handle a complex task. Medium tasks consume about 100,000 Tokens, and complex tasks can reach millions, earning agents the industry nickname “Token Crusher.”
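The gap between single-turn chat and agent workloads can be illustrated with simple arithmetic. The following is a minimal sketch, not a measurement: the function name `agent_task_tokens`, the specific call counts (1, 50, 500), and the flat 2,000 Tokens per call are all assumptions chosen to fall inside the ranges quoted above. Real agents typically carry a growing context, so per-call consumption would not be constant.

```python
def agent_task_tokens(model_calls: int, tokens_per_call: int) -> int:
    """Rough total Tokens for one task: number of model calls x Tokens per call.

    Hypothetical helper for illustration; it ignores context growth,
    caching, and retries.
    """
    if model_calls < 0 or tokens_per_call < 0:
        raise ValueError("inputs must be non-negative")
    return model_calls * tokens_per_call


# Illustrative scenarios. Call counts and per-call sizes are assumptions
# picked from the ranges in the text (1,000-3,000 Tokens per interaction,
# dozens to hundreds of calls per agent task).
scenarios = {
    "single-turn chat": (1, 2_000),
    "medium agent task": (50, 2_000),
    "complex agent task": (500, 2_000),
}

for name, (calls, per_call) in scenarios.items():
    total = agent_task_tokens(calls, per_call)
    print(f"{name}: {total:,} Tokens")
```

Under these assumed inputs, 50 calls land at 100,000 Tokens and 500 calls at 1,000,000 Tokens, which matches the orders of magnitude the article gives for medium and complex tasks.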

Data from the National Bureau of Statistics confirms this explosion: China’s daily Token call volume surged from 100 billion at the start of 2024 to 140 trillion in March 2026, an increase of over 1,000 times in two years, with the first quarter of 2026 alone growing 40% compared to the end of 2025.
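The “over 1,000 times” figure can be checked directly from the two daily call volumes quoted above. The sketch below also derives an implied average monthly growth rate; the 26-month span (start of 2024 to March 2026) and the smooth-compounding model are assumptions for illustration, and the function names are hypothetical.

```python
def growth_multiple(start: float, end: float) -> float:
    """How many times larger `end` is than `start`."""
    if start <= 0:
        raise ValueError("start must be positive")
    return end / start


def implied_monthly_growth(start: float, end: float, months: int) -> float:
    """Average compounded monthly growth rate that takes `start` to `end`."""
    if months <= 0:
        raise ValueError("months must be positive")
    return growth_multiple(start, end) ** (1 / months) - 1


START_DAILY_TOKENS = 100_000_000_000      # 100 billion, start of 2024 (from the text)
END_DAILY_TOKENS = 140_000_000_000_000    # 140 trillion, March 2026 (from the text)
MONTHS_ELAPSED = 26                       # assumption: January 2024 to March 2026

multiple = growth_multiple(START_DAILY_TOKENS, END_DAILY_TOKENS)
monthly = implied_monthly_growth(START_DAILY_TOKENS, END_DAILY_TOKENS, MONTHS_ELAPSED)

print(f"Growth multiple: {multiple:,.0f}x")  # 1,400x
print(f"Implied average monthly growth: {monthly:.1%}")
```

The multiple works out to 1,400, consistent with the article's “over 1,000 times.” The monthly rate is only an average under the compounding assumption and says nothing about how growth was actually distributed over the period.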

The industry narrative has shifted completely: no longer competing on the “IQ ceiling” of models, but on who can produce massive amounts of Tokens at lower cost and more stably, and who can seize the initiative in intelligent supply.

Against this massive demand, a rigid mismatch between supply and demand is the core support for persistently strong Token prices. The imbalance is not short-term volatility but a structural contradiction determined by the long cycles of the industry chain.

There are three insurmountable bottlenecks on the supply side: first, core hardware capacity is monopolized, and expansion cycles are long.

High-bandwidth memory (HBM) is the “heart” of AI servers, and Samsung, SK Hynix, and Micron together account for over 95% of global capacity. Their expansion cycle is 24-36 months, leaving an HBM supply gap of more than 40% in 2026.

As a result, ordinary DDR5 memory prices have risen 300% over six months, with 256 GB server memory modules costing over 40,000 yuan each. The delivery cycle for AI servers has stretched from 3 months to 12 months.

Second, electricity and energy have become the largest hidden bottleneck. Intelligent computing centers consume 10-20 times the power of traditional data centers, and electricity accounts for over 60% of Token production costs. Power infrastructure for large data centers takes 3-5 years to build, and in eastern China quotas for computing capacity are already hard to obtain.

Third, infrastructure and operational capabilities cannot keep up with surging demand. The penetration of liquid-cooled data centers rose from 15% in 2024 to 45% in 2026, but shortages of technical talent and construction capacity have left many completed clusters unable to run at full capacity.

Supply-side capacity is insufficient, while demand shows a “three-stage rocket” explosive growth, with strong and sustained momentum.

The first stage is the popularization of consumer-side (C-end) intelligent agents: individual users are shifting from simple chat and entertainment to using AI assistants for email, coding, and planning. Per-user daily Token consumption has skyrocketed from dozens to thousands, and could exceed tens of thousands in the future.

The second stage is the full deployment of business-side (B-end) production-level applications: enterprises no longer treat AI as an embellishment but incorporate Tokens as a core production factor. Companies like Kunlun Wanwei and 58.com consume over 1 trillion Tokens monthly, and AI transformations in manufacturing, finance, and healthcare are releasing trillion-level Token demand.

The third stage is explosive demand from global expansion. Token prices for domestic large models are only 1/5 to 1/3 of those for overseas Claude and GPT models, and this cost-performance advantage is quickly capturing markets in Southeast Asia, the Middle East, and Latin America. In the first quarter of 2026, Chinese cloud providers’ overseas Token revenue grew 320% year-on-year, becoming a new growth pole.

On a deeper level, Tokens are becoming the fundamental bulk commodity of the AI era, reconstructing the entire value system of the digital economy. Just as electricity was the core energy in the industrial age and traffic was the core asset in the internet age, Tokens are the core production material in the intelligent era, with measurable, tradable, and priceable attributes, serving as a universal value anchor connecting computing supply and intelligent demand.

This shift has brought a complete revolution in business models: the industry is moving away from the old internet path of “burning money for scale” to a new stage of “pay-as-you-go, profit-driven.”

Major companies generally adopt a strategy of “C-end subsidies to cultivate habits, B-end scaled harvesting,” offering limited-time free Tokens to individual users and charging enterprises precisely based on consumption. In the first quarter of 2026, the gross profit margin of leading cloud AI businesses generally exceeded 35%, achieving scale profitability for the first time.

For China, this Token industry revolution presents a historic opportunity to leapfrog. China has the lowest green electricity costs globally, the most complete computing infrastructure (accounting for over 60% of global server capacity), the broadest application scenarios, and the most cost-effective large models, making it fully capable of becoming the “world’s Token factory.”

Just as China once became the “world’s factory” with cost advantages, today it is leading global Token production and supply through its energy, computing power, and scenario advantages.

In the short term, supply-demand mismatch will persist until the end of 2027, keeping Token prices high, with industry concentration rapidly increasing.

In the long term, as chip capacity is released and model efficiency improves, Tokens will enter a “vegetable price” era, penetrating every corner of the national economy and becoming the core engine of digital economic growth.

02 How Are the Industry Segments Faring?

As the Token industry shifts from “low-price competition” to “supply shortage,” its segments have diverged structurally.

Upstream players control pricing, midstream players are improving profits, and downstream players are monetizing. The three major sectors (upstream hardware manufacturing, midstream Token hub scheduling, and downstream scenario applications) each have distinct barriers to entry, levels of prosperity, and value distribution logic.

First, upstream hardware is the core capacity of the Token factory and a rigid need under monopoly conditions.

It covers four sub-sectors: AI chips, computing servers, liquid cooling, and intelligent computing center operations. The sector as a whole is an oligopoly.

AI chips are the core engine of Token production, and Nvidia holds over 90% of the high-end GPU market.

Meanwhile, leading domestic A-share companies are accelerating breakthroughs: Cambricon’s Siyuan 590 chip has reached mass production and is suited to large model inference and training, with the company’s AI chip revenue in the first quarter of 2026 up 320% year-on-year.

Hygon Information’s DCU products have penetrated over 30% of domestic intelligent computing centers, deeply binding with top firms like Sugon and Inspur. Jingjia Micro’s JM9 series GPUs have been deployed in government and financial scenarios, becoming a core supplier of domestic general-purpose GPUs.

Computing servers are the carriers of Token capacity, with A-share leaders occupying nearly half of the global market.

Inspur remains the top global AI server supplier, with a 180% increase in shipments in the first quarter of 2026, and Sugon’s liquid-cooled servers leading domestically, supporting over 80% of national-level intelligent computing centers.

Liquid cooling is a rigid demand for high-power intelligent computing centers, with penetration rising from 15% in 2024 to 45% in 2026.

Inveric is the absolute leader in liquid cooling, with core clients including Nvidia, Inspur, and Huawei, and liquid cooling orders increasing by 210% year-on-year in 2026.

Shenling Environment’s liquid-cooled data center solutions have been implemented in multiple national intelligent computing centers, with order growth exceeding 150%.

In the operation segment of intelligent computing centers, companies like Baoxin Software, Halo New Network, and Runze Zhishuan leverage core locations and green energy resources to become the largest third-party intelligent computing operators domestically, with their computing rental income in the first quarter of 2026 increasing over 100% year-on-year.

Next, midstream Token hubs are shifting from price wars to value wars.

Midstream players handle computing power scheduling, model services, and Token standardization output, mainly divided into large model vendors and cloud service providers.

Currently, leading large model companies in A-shares have established clear Token commercialization paths.

For example, daily Token calls to Kunlun Wanwei’s Tiangong large model exceed 1.2 trillion, and the company has over 100 B-end paying customers. Its enterprise Token services are priced at only a quarter of overseas models, with AI business revenue in the first quarter of 2026 up 450% year-on-year.

iFlytek’s Spark large model focuses on education, healthcare, and office scenarios, with 70% of Token consumption coming from B-end production applications.

Cloud service providers like Alibaba Cloud, Tencent Cloud, and Volcengine are not listed on the A-share market, but companies in their ecosystems benefit: UFIDA and Kingdee International (listed in Hong Kong) build enterprise AI applications on Alibaba Cloud, becoming important channels for Token consumption.

Finally, downstream application scenarios, as the ultimate outlet for Token value, are penetrating into inclusive C-end and essential B-end needs.

Downstream scenarios are divided into three categories: C-end personal applications, B-end enterprise services, and vertical industry digitalization, with significant differences in Token consumption scale and commercialization pace.

C-end scenarios focus on inclusivity, mainly personal AI assistants, content generation, and creative design.

For example, in A-shares, Wanshing Technology’s AI creative software (Miao Ying Factory, Wanshing AI Painting) has over 5.5 million paying users worldwide. After AI features are fully implemented, user willingness to pay and usage time have greatly increased, with Token consumption in the first quarter of 2026 up over 320% year-on-year.

Caisun’s AI email and smart office assistants have over 300 million users, with daily Token calls exceeding 50 billion.

B-end enterprise services are the main driver of Token consumption, accounting for over 65% of the total.

For example, Tonghuashun’s AI investment advisory service covers over 100 million investors, with daily Token calls exceeding 80 billion, and AI-related revenue in the first quarter of 2026 up 190% year-on-year.

Zhongkong Technology’s industrial AI platform provides intelligent operation and maintenance services for industries like chemicals and power, with an average annual Token consumption per factory exceeding 5 million.

Runze Medical’s AI medical diagnosis system has been deployed in over 3,000 hospitals nationwide, with daily processing of medical text Tokens exceeding 20 billion.

Overall, B-end vertical industry scenarios are expected to be the long-term growth poles of the Token industry, with AI transformations in autonomous driving, smart manufacturing, and fintech releasing trillion-level Token demands.

03 Which stocks are on the rise?

In terms of industry dynamics, the Token industry has shifted from “model competition” to “capacity and monetization competition.” The supply-demand mismatch, combined with the accelerating release of commercial value, has allowed six leading A-share companies to establish footholds across three major tracks (upstream hardware, midstream models, and downstream applications), making them the most promising core targets in this trillion-Token economy.

First, Inspur, as the absolute leader in AI servers, is the backbone of Token capacity. As the company with the largest global AI server market share, Inspur is a core hardware provider supporting the operation of Token factories worldwide. Its deep ties with Nvidia give it priority access to high-end GPU allocations, and its supply chain and scale barriers are hard to replicate.

In the first quarter of 2026, AI server shipments increased over 150% year-on-year, with a global market share exceeding 25%. Unfulfilled orders are close to 40 billion yuan, with delivery schedules extending to the end of 2027, making it one of the most certain performance stocks in the industry chain.

Second is liquid cooling leader Inveric, the cooling heart of Token factories. As power density in intelligent computing centers soars, liquid cooling has become a rigid requirement for large-scale Token production, with industry penetration rising from 15% in 2024 to 45% in 2026. In the first quarter of 2026, the company’s liquid cooling revenue increased over 210% year-on-year, with order visibility extending to 2027, making it the most elastic upstream stock.

Kunlun Wanwei, as a pioneer in large model commercialization, is a benchmark for Token monetization. It was among the earliest in A-shares to achieve large-scale profit from Tokens. Its enterprise Token services are priced at only a third to a quarter of overseas models, rapidly capturing small and medium-sized enterprise markets.

In the first quarter of 2026, daily Token calls exceeded 1.2 trillion, with over 80 B-end paying customers and AI revenue up over 450% year-on-year. Gross margin stayed above 42%, making it the purest-play Token monetization stock in A-shares.

iFlytek, as a leader in vertical large models, is the core carrier of industry Tokens. The company is deeply engaged in the education, healthcare, and industrial sectors, and over 70% of the Spark model’s Token consumption comes from B-end production applications, where demand is extremely rigid.

Leveraging years of industry experience, scenario and data barriers, the company’s customized Token services for government and enterprise clients are rapidly growing, with AI-related revenue expected to surpass 60% of total in 2026. As vertical industry AI penetration continues, the company will fully benefit from the long-term Token demand driven by industry digitalization.

Wanshing Technology, a leader in overseas C-end AI applications, is a core player in personal Token consumption. Its video editing and AI painting products have over 5.5 million paying users globally. After full deployment of AI features, user willingness to pay and usage duration have greatly increased, with Token consumption in the first quarter of 2026 up over 320% year-on-year.

Overall, the current Token boom is a long-term demand-driven opportunity. In the short term, it is advisable to focus on upstream hardware leaders like Inspur and Inveric, mid-term on commercialization benchmarks like Kunlun Wanwei, and long-term on vertical scenario leaders like iFlytek. High-quality companies are expected to see both performance and valuation improvements during this high-growth cycle.
