DeepSeek V4 Debuts with a Bang: Ten First Principles in Competition Against Rivals
3:30 PM, the sun on California Avenue has already started to tilt toward Stanford.
The dog at the entrance of Zombie Café is again lying under the white chair at the door.
On the table are three printed copies: the Opus 4.7 release draft from Anthropic on April 16, the verbatim transcript of OpenAI’s GPT-5.5 launch event with Greg Brockman on April 23, and the DeepSeek V4 technical report just released early this morning, still wet with ink.
In eight days, three companies have laid all their cards for Q2 2026 on the table.
Before the coffee was finished, Silicon Valley’s Alan Walker had already explained it clearly.
No hype about benchmarks, no claims of which model “feels better,” no PR releases.
Just breaking it down from first principles—technology, chips, cost, audience, strategy, ecosystem—where each of the three stands in 2026, who’s leading, who’s defending, who’s shaking the table.
01 Open Source vs Closed Source—The Fundamental Route of This Battle
DeepSeek synchronized the open release of V4-Pro and V4-Flash models, under a standard MIT License, with weights directly uploaded to Hugging Face, accessible for download, fine-tuning, and commercial use by anyone.
Claude Opus 4.7 and GPT-5.5 are purely closed source—only providing an API endpoint, with model weights forever out of reach.
Many think this is a business model dispute.
Wrong.
This is a trust structure dispute.
The moat of closed source is “You can only come to me”—locking users at my doorstep in line.
The moat of open source is “You can’t leave my ecosystem”—building AI infrastructure for developers, enterprises, and entire nations on my architecture.
One is a toll booth, the other a highway.
DeepSeek has proven this with four consecutive open-source generations: V3, R1, V3.2, V4.
Today, any company in the world that wants local deployment, that aims to run large models in finance, healthcare, government, or the military, thinks of DeepSeek first.
Chinese state-owned enterprises, Middle Eastern sovereign funds, European banks wary of handing data to US clouds—these will never use closed API models, never.
Anthropic and OpenAI are betting the opposite: cutting-edge intelligence gaps always exist, and the smartest clients are willing to spend the most.
But this bet has a time window.
Since R1’s release, the gap between open and closed capabilities has shrunk from a year to three months.
Once that shrinks to one month, the closed-source line begins to crack.
02 Model Architecture—The Fundamental Divergence of the Three
V4-Pro 1.6T parameters / 49B activations; V4-Flash 284B parameters / 13B activations.
Default context length is around 1 million tokens.
Core architecture: hybrid attention (CSA + HCA interleaved) + Manifold-Constrained Hyper-Connections + Muon optimizer + FP4 training.
In 1M token scenarios, V4-Pro’s single-token inference FLOPs are only 27% of V3.2’s; KV cache is only 10%.
V4-Flash is even more aggressive—FLOPs down to 10%, KV cache down to 7%.
The essence of this architecture’s bet: long context isn’t a capability issue, it’s an efficiency issue.
V3 relied on MoE to cut training costs; V4 relies on hybrid attention to reduce inference costs.
Each step targets the most expensive part.
GPT-5.5 is different.
OpenAI explicitly states—this is the first core model retrained from scratch after GPT-4.5.
Previous versions 5.1, 5.2, 5.3, 5.4 are iterative post-training on the same base.
5.5 reworked architecture, retrained on new pretraining data, and redesigned agent-oriented training objectives.
Pachocki said at the launch—“Model progress over the past two years has been surprisingly slow”—which actually means their previous base can no longer scale with new curves; they need a new engine.
Claude Opus 4.7 is a precise upgrade over 4.6.
Anthropic’s positioning is clear: notable improvement, not a paradigm shift.
SWE-bench Verified scores rose from 80.8 to 87.6; visual resolution went from 1568px to 2576px; overall throughput is 3.3 times higher than before; and the tokenizer was replaced (the same text now consumes 1 to 1.35 times as many tokens).
Mythos Preview is their true next-gen monster, but it’s still under wraps, only available to 12 partners for testing, consumer release pending.
03 Underlying Chips—The Most Underrated News Today
Mainstream English media headlines focus on V4 benchmarks.
Wrong.
The real game-changer today is: part of V4’s training was done on Huawei’s Ascend chips.
On the same day, Huawei announced full support for the Ascend SuperPoD series for V4 Pro and Flash.
Cambricon announced compatibility simultaneously.
SMIC’s Hong Kong stock jumped 10% that day.
Reading these three together is the real news—China’s AI training and inference stack now runs entirely on domestically produced hardware, with no Nvidia chips in the critical path.
This is more impactful than all benchmark scores combined.
Over the past three years, the most effective US leverage against China has been export controls on advanced GPUs.
The logic is simple—without H100 or B200, you can’t train the strongest models.
V4’s release cuts that leverage in half.
Top-tier open-source models can now be trained and deployed on non-Nvidia hardware.
Once this is widely validated in the market, sanctions in AI become essentially ineffective.
Claude and GPT-5.5 run entirely on Nvidia H100/H200/B200 chips, Google TPUs, and Amazon’s Trainium2.
No alternative path, no second supplier.
This is a barrier, a single point of failure—if Nvidia’s prices rise or capacity can’t keep up, both will suffer.
DeepSeek now has an independent supply chain—another card in hand.
04 Training Cost Structure—How Muon, FP4, and 32T tokens add up to today’s prices
DeepSeek’s technical report clearly states:
Using Muon optimizer (faster convergence, more stable training), FP4 precision (memory usage halved), two-stage post-training (domain experts independently fine-tune + RL, then distilled into a unified model), and 32 trillion tokens of pretraining data.
These aren’t just gimmicks—they’re the real machinery lowering training costs.
The result:
V4-Pro’s API price can be below V3.2’s;
V4-Flash is at the lowest range among open-source small models.
GPT-5.5’s approach is to openly raise prices.
$5 per million input tokens, $30 per million output tokens: double GPT-5.4.
OpenAI claims—“token efficiency improved by 40%, overall cost only increased by 20%.”
Nice words.
But run a real prompt in production, and you’ll see—long prompts, short outputs, bills double.
OpenAI bets that “cutting-edge intelligence remains scarce enough” to sustain this for a cycle, so they dare to double prices.
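OpenAI’s framing and the article’s rebuttal can be checked with a few lines of arithmetic. A minimal sketch: the GPT-5.4 prices of $2.50/$15 are inferred from “double,” and the claimed 40% token efficiency is interpreted, as an assumption, as 40% fewer output tokens on the same task.

```python
# Back-of-the-envelope check of the GPT-5.5 pricing claim.
# Prices from the article: GPT-5.5 at $5 in / $30 out per million tokens,
# i.e. double GPT-5.4's inferred $2.50 / $15. "40% token efficiency" is
# modeled (an assumption) as 40% fewer OUTPUT tokens; the prompt is unchanged.

def cost(inp_tokens, out_tokens, in_price, out_price):
    """Dollar cost of one request at per-million-token prices."""
    return inp_tokens / 1e6 * in_price + out_tokens / 1e6 * out_price

# A typical production shape: long prompt, short answer.
inp, out = 200_000, 2_000

old = cost(inp, out, 2.50, 15.0)        # GPT-5.4
new = cost(inp, out * 0.6, 5.00, 30.0)  # GPT-5.5, 40% fewer output tokens

print(f"GPT-5.4: ${old:.2f}  GPT-5.5: ${new:.2f}  ratio: {new/old:.2f}x")
```

With a 200K-token prompt and a 2K-token answer, the output savings barely register against the doubled input price: the bill still roughly doubles, which is exactly the long-prompt, short-output pattern most production workloads have.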
Claude Opus 4.7’s approach is to quietly raise prices.
Official price remains the same—$5/$25, identical to Opus 4.6.
But Anthropic’s own docs say the new tokenizer consumes up to 1.35 times as many tokens for the same text.
In other words, the list price stays the same, but bills can rise by up to 35%.
A tactfully executed price hike, but engineering teams running at high volume will spot it immediately in their monthly bills.
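The size of this quiet hike is easy to work out. A minimal sketch: the monthly token volumes are hypothetical, and 1.35x is the worst-case inflation stated in Anthropic’s docs.

```python
# Effective cost of Claude Opus 4.7 vs 4.6 under the new tokenizer.
# List prices are identical ($5 in / $25 out per million tokens); the only
# change is that the same text now tokenizes into up to 1.35x as many tokens.

def monthly_bill(in_tokens, out_tokens, inflation=1.0,
                 in_price=5.0, out_price=25.0):
    """Monthly spend in dollars, with token counts scaled by `inflation`."""
    return (in_tokens * inflation / 1e6 * in_price
            + out_tokens * inflation / 1e6 * out_price)

# A hypothetical team pushing 2B input / 200M output tokens a month.
opus_46 = monthly_bill(2e9, 2e8)                  # old tokenizer
opus_47 = monthly_bill(2e9, 2e8, inflation=1.35)  # worst-case new tokenizer

print(f"Opus 4.6: ${opus_46:,.0f}  Opus 4.7: ${opus_47:,.0f}")
```

At unchanged list prices, the worst-case tokenizer inflation turns a $15,000 month into a $20,250 one: a 35% increase with no line item explaining it.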
DeepSeek takes the opposite route—lower prices.
V3.2 is already low; V4-Pro is below that.
Once Huawei’s Ascend 950 mass production begins in a few months, prices will drop again.
This is China’s traditional strategy: use scale and efficiency to crush competitors, then lock users into the ecosystem.
05 API Pricing—How Much AI Can You Buy for One Dollar
Look at the price list.
Third-party evaluation site Artificial Analysis provided an equivalence:
At the same Intelligence Index score, GPT-5.5 (medium) ≈ Claude Opus 4.7 (max),
the former costs about $1,200 for full testing, the latter about $4,800.
V4-Pro at similar intelligence level costs only one-third to one-tenth of those.
This isn’t “cheaper.”
It’s bringing the unit cost of high-end intelligence down by an order of magnitude.
What does this mean for a company spending a million dollars a month on tokens?
Previously, they could run only 10 agent lines; now, 80.
Experiments that were too expensive before are now affordable.
Once validated by three or four leading firms (e.g., one cut 70% of core customer service costs switching from Opus to V4-Pro without quality loss), everyone will follow.
It’s reflexive—each migration reduces the psychological barrier for the next.
OpenAI and Anthropic’s counter-strategies are twofold:
Either re-expand the gap in closed-source frontier models (Mythos to be released ASAP),
or increase switching costs through enterprise relationships, compliance, and reliability.
The former takes time and money; the latter requires customer patience.
06 The Real Economics of a Million-Token Context
All three have reached 1M context length.
On the surface, it’s a dead heat.
But—being able to do it and doing it cheaply are two different things.
V4-Pro scored 83.5 on MRCR long document retrieval, surpassing Gemini-3.1-Pro’s 76.3, but behind Claude Opus 4.6’s 92.9.
On CorpusQA with 1M tokens, 62%, beating Gemini 3.1 Pro’s 53.8%.
Retrieval accuracy: 94% at 128K, 82% at 512K, 66% at 1M.
The absolute scores aren’t first place, but among open-source models they’re the top, and V4 is the first to make 1M tokens the default.
Claude Opus 4.7’s 1M context carries no long-context price premium; that is real engineering strength on Anthropic’s part.
GPT-5.5 is the same.
But the key point: inference unit costs differ tenfold among the three, and in long-context scenarios, that gap is magnified ten times.
A simple calculation:
A 500K token legal document analyzed once—Opus 4.7 costs about $2.50 just for input, total $3–4 including output;
GPT-5.5 roughly the same;
V4-Pro about $1.
If run 10k times a day, annual costs differ by several million to over ten million dollars.
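The example above can be scripted to see where the annual figure comes from. A rough sketch: the 20K-token output per run and the flat ~$1 per run for V4-Pro are assumptions layered on the article’s numbers.

```python
# The 500K-token legal-document example, scaled to a year.
# Input/output prices are the article's list prices; V4-Pro's effective
# per-run cost of roughly $1 is taken from the text, not a price sheet.

DOC_IN = 500_000        # input tokens per run
OUT = 20_000            # assumed output tokens per run
RUNS_PER_DAY = 10_000

def annual_cost(in_price, out_price):
    """Yearly spend in dollars at per-million-token prices."""
    per_run = DOC_IN / 1e6 * in_price + OUT / 1e6 * out_price
    return per_run * RUNS_PER_DAY * 365

opus = annual_cost(5.0, 25.0)       # Claude Opus 4.7: $5 / $25
gpt = annual_cost(5.0, 30.0)        # GPT-5.5: $5 / $30
v4_pro = 1.0 * RUNS_PER_DAY * 365   # ~$1 per run, per the article

print(f"Opus 4.7: ${opus/1e6:.1f}M  GPT-5.5: ${gpt/1e6:.1f}M  "
      f"V4-Pro: ${v4_pro/1e6:.1f}M")
```

At 10,000 runs a day, the gap between Opus 4.7 (about $11M a year) and V4-Pro (about $3.7M) lands squarely in the “several million dollars” range the text describes.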
For medium-sized enterprises, the biggest bottleneck in agentic analysis workloads is long-context cost—V4-Pro essentially eliminates this bottleneck.
07 Coding and Agent Capabilities—Three Companies Guard Their Turf
Opening the benchmark table makes this very clear.
This data doesn’t show who is stronger; it shows that the three companies are heavily invested in different agent forms.
Anthropic focuses on “solving real problems within actual codebases.”
Clients like Cursor, Devin, Factory, Ramp use Opus—not toy tasks like “write a todo app,” but “fix a race condition buried three weeks ago in 2 million lines of code.”
NVIDIA deployed Codex to 10,000 employees and cut debug cycles from days to hours; Anthropic can point to similar numbers of its own.
Opus 4.7 scores 64.3% on SWE-Bench Pro—truly tested in production.
OpenAI’s focus is “controlling the entire computer through agents.”
Terminal-Bench 2.0, OSWorld, Codex running shell commands—pointing to a future where AI not only writes code but directly operates your terminal, types commands, and controls your Mac.
Brockman’s “agentic computing at scale” isn’t just rhetoric; it’s OpenAI’s next decade’s slogan.
DeepSeek’s focus is “public intelligent assets for open-source developers.”
It may not top SWE-Bench Pro, but it lifts open-source models to a Codeforces rating of 3206.
This means any startup can run a near-frontier coding model on its own hardware without paying Anthropic or OpenAI a dime.
08 Audience Demographics—Three Companies Target Three Completely Different Wallets
Anthropic’s client list makes the direction clear at a glance:
PayPal, Hex, Devin, Factory, Ramp, Notion, GitHub Copilot, Stripe, Block—all fintech and enterprise SaaS.
These companies share two traits: lots of money, zero tolerance for errors.
Opus 4.7’s $5/$25 pricing, security audits, compliance narratives, and multi-cloud deployment via Bedrock/Vertex AI/Foundry: all of it targets clients with six-month procurement cycles, three-year contracts, and annual spends in the millions of dollars.
On Forge Global, Anthropic’s valuation exceeds $1 trillion, surpassing OpenAI’s $880 billion—capital is betting on this “enterprise customer density” story.
OpenAI’s foundation is consumer + developer + enterprise triad.
ChatGPT’s nearly 1 billion weekly active users is its real moat.
GPT-5.5 is pushed across Plus/Pro/Business/Enterprise, with doubled API prices—costs borne by consumer traffic.
The Codex developer community grew from tens of thousands to millions in half a year; companies like Nvidia, Stripe, Shopify deploy internally at scale.
OpenAI’s game is scale—unit costs are diluted by huge denominators.
DeepSeek’s audience is entirely different.
Chinese state-owned enterprises, banks, hospitals, government agencies;
Middle Eastern sovereign funds wary of data leaving US clouds;
European pharma companies with strict GDPR;
Southeast Asian and Latin American governments pursuing sovereign AI.
Plus a hardcore group of Silicon Valley developers and startups who just want to run models themselves without paying API fees.
This group isn’t the scale of 1 billion consumers, but a different scale—geopolitical and sovereignty-based.
Three very different wallets, three very different sales logics.
09 Security and Cyber Defense Posture—Three Companies’ Attitudes Toward “Models as Weapons”
Anthropic released Project Glasswing in early April.
Opus 4.7 is the first production model with built-in “automatic detection and rejection of high-risk cybersecurity requests.”
Their technical report states plainly—they deliberately suppressed offensive cyber capabilities during training.
CyberGym score: 73.1, nearly tied with Opus 4.6’s 73.8, which shows this is a policy choice, not a capability ceiling.
Mythos Preview scores 83.1 on the same benchmark but is available only to 12 partners.
The partner list is supposed to be confidential, yet it recently leaked via Discord (a community guessed the URL), and Anthropic issued an incident report.
OpenAI’s approach is different.
GPT-5.5’s system card explicitly states:
“High” level cyber risk in the Preparedness Framework, not Critical.
Their solution isn’t to reduce model capability but to add stricter input classifiers, verify identities, and push a “cyber-permissive access program”—if you want offensive capabilities, you must verify your identity first.
Mia Glaese called it the “first identity-verified release,” the implication being: capabilities are granted, but the responsibility is yours.
DeepSeek’s V4 technical report is mostly blank on this.
Open-source community’s tradition: “Open code, you take it, you’re responsible.”
This is a regulatory nightmare but a developer’s paradise.
The real risk: anyone can run a model close to Opus 4.7 on their own GPU, with no interception layer.
How regulators handle this between late 2026 and 2027 will be a critical window.
10 Market Strategy—Three Very Different Bets, But Only One Will Be the Biggest
DeepSeek aims to be the Linux of AI.
Open source + extreme cost efficiency + domestically produced chips to democratize global AI infrastructure.
Once every country, enterprise, and developer runs on your architecture—no license fees, just ecosystem taxes.
Today, Hugging Face downloads, tomorrow every domestic chip SDK defaults to DeepSeek, and soon every new AI developer’s first line of code is “from deepseek import…”.
Linus Torvalds ran this playbook twenty years ago; now Liang Wenfeng is running it.
The difference: LLMs cost a thousand times more than an operating system, attract a thousand times more hot money, and carry a hundred times more geopolitical weight.
Anthropic’s goal: build the world’s leading AI operation engine for top-tier enterprises.
Target clients aren’t a billion consumers but the top 10k companies’ IT and compliance budgets.
Opus 4.7’s “narrow but deep” positioning, Mythos Preview’s scarcity, multi-cloud deployment via Bedrock/Vertex/Foundry, and a $1 trillion valuation on Forge—all tell a story:
Legal, finance, R&D, customer service: every key function runs on their models and never goes down.
This is the logic of law firms and investment banks, not Facebook.
Fewer clients, higher prices, near-infinite switching costs.
OpenAI’s plan: the next Windows + Office + Google super app.
ChatGPT is their distribution moat (nearly 1 billion weekly active users),
Codex locks in developers,
Operator is their computer interface,
New Mac App is their desktop placeholder.
All three paths lead to Rome.
But only one will become the biggest—and that will determine the wealth distribution of the next decade’s AI industry.
All three revealed their cards in the same week.
Claude Opus 4.7 is steady—narrow but deep, reliable enough for enterprises to sign three-year contracts and pay millions.
Its strength: any mid-sized or larger company wanting to use AI as a productivity tool but afraid of errors will find no more reliable choice than Opus.
GPT-5.5 is expensive—double pricing, ambitious super app, leading agent capabilities in command line and computer control.
Its strength: if “AI controls your entire computer” becomes reality by 2027, OpenAI will be the Microsoft of this revolution.
If not, the $5/$30 pricing will be an expensive footnote.
DeepSeek V4 is fierce—open source, low cost, domestically produced chips, gradually eroding the other two’s moats.
Its strength: if geopolitical fragmentation continues, splitting global AI infrastructure into China and US ecosystems, DeepSeek becomes China’s Linux.
The probability isn’t 50%, but it’s far higher than the 5% it was a year ago.
Once the rules are changed, they can’t be undone.
4:30 PM, the wind on California Avenue begins to cool.
Zombie Café’s cup is empty.
Alan folds the three printed copies and tucks them into his backpack.
As he walks out the door, the dog lifts its head briefly, then lies down again.