Runway builds voice into its video Agents; life is getting harder for independent TTS vendors.
Voice Embedded Directly into Video Agents, Accelerating Productization
RunwayML has quietly added custom voice capabilities to the Characters API, integrating TTS directly into real-time video Agents. Developers no longer need to wire up a separate voice service themselves.
This is a clear bundling strategy: Runway’s GWM-1 world model ties text-to-speech to facial expression synthesis, which makes it faster to mass-produce branded virtual avatars for customer service and game NPCs. The underlying technology is ElevenLabs’ eleven_ttv_v3, which supports designing a voice’s tone through prompts and cloning a voice from a 10-second sample, with lip-sync and gestures aligned automatically.
One signal worth noting: almost no one is discussing this on Twitter, yet the team describes it as its “highest-demand” feature. An API-first release is low-key by nature; it targets people who are actively building rather than a mass audience.
Independent Voice Services Face Structural Pressure
This update positions TTS as an infrastructure layer rather than a standalone product. ElevenLabs supplies the backend, but the bundling accelerates the trend of pure TTS being absorbed into larger platforms.
ElevenLabs v3 excels in emotional expression and technical metrics, but Runway’s “video-first” approach is the dividing line: enterprises want complete Agents, not parts. Developers will naturally migrate toward full-stack multimodal platforms.
Don’t be misled by claims like “revolutionary cloning”: audio quality does not differ much among mainstream vendors, and the real edge lies in integration across multimodal scenarios.
My view: multimodal bundling lowers the barrier for non-professional users, which gives Runway an advantage over a fragmented field of competitors.
From an investment perspective, the market has not fully priced in the stickiness premium of “video-first + full-stack bundling.” For enterprises, integrating with fewer vendors saves both cost and effort.
In simple terms: whoever bets early on integrated video Agents gains a first-mover advantage. Multimodal platforms benefit, while standalone TTS comes under pressure. Companies that ignore the bundling trend are likely to be left reacting to it: once “voice” becomes a default capability, deployment speed depends on API accessibility and end-to-end consistency, not just the audio quality of a single component.
Importance: Moderate
Category: Product Launch | Industry Trend | Developer Tools
Conclusion: Product teams and enterprise buyers are in an “early window” where it is worth validating and adopting quickly. Investors and vendors focused solely on speech are in a “defensive period” and need to move faster toward multimodal, integrated capabilities. Resources will flow toward all-in-one platforms and teams capable of rapid productization; pure TTS players will be at a disadvantage in the short term.