In early December last year, the OpenAGI Foundation launched a foundation model called Lux. It is billed as the first open-source solution designed specifically to teach AI to "click a mouse and type on a keyboard like a human."
Simply put, until now even the smartest AI still had to work through APIs. Lux can operate software interfaces directly. The team tested it on 300 everyday task scenarios, and the results were impressive: it scored 83.6% on Online-Mind2Web, a widely used industry benchmark.
For comparison, Google's Gemini CUA scored 69%, OpenAI's Operator scored 61.3%, and Anthropic's Claude Sonnet did not surpass Lux's score either.
The open-source approach has real merit here: at the very least, it lowers the barrier for more developers to get hands-on with “AI automated computer operations.”
LiquidatedTwice
· 13h ago
Damn, is this data from Lux real? 83.6% completely crushing Claude? That's a bit outrageous.
PuzzledScholar
· 13h ago
83.6%—that number is a bit scary. Feels like AI is really starting to "work on its own."
---
Open source is a blessing for users; previously, all this stuff was locked down by big companies.
---
Wait, Lux can directly operate the interface? How am I supposed to make a living then?
---
Claude Sonnet got beaten—that's awkward.
---
Can we really trust results from tests on 300 task scenarios? I can't help but feel there’s some embellishment.
---
Turns out open source solutions are still the way to go. Commercial AI just gets more and more outrageous.
---
Clicking the mouse and typing on the keyboard sounds simple, but it's impressive to actually make it work like this.
---
If this thing becomes mature, a lot of repetitive labor will just disappear, right?
SmartContractRebel
· 13h ago
83.6%—this number is truly unbelievable, absolutely crushing those closed-source big tech solutions.
The open-source version actually outperformed Claude. What does that say? It suggests that the big companies might be slacking off, haha.
What really makes me curious is whether 300 task scenarios are realistic enough... Feels like only time will tell.
By the way, if this kind of AI that automates computer operations becomes widespread, are those of us doing grunt work going to be out of a job?
The name "Lux" is well-chosen—it sounds so "bright," hinting that open source is here to save the world, right?
The foundation wasn’t exaggerating this time—the data speaks for itself, much more convincing than those official keynote PPTs.
Feels like AI benchmarks in 2024 are turning into a joke—Lux just popped up and changed the rankings overnight.
CommunityWorker
· 13h ago
83.6% directly crushes the others. Is open source really that powerful? Why does it feel a bit overrated to me?
SandwichVictim
· 13h ago
83.6%—this data is truly impressive, it directly surpasses Gemini and Claude. Are open-source models finally making a comeback?
---
This really is a victory for the open-source community; finally, someone has made this happen.
---
Wait, being able to directly operate the interface—isn't this basically the ultimate evolution of RPA? If this gets rolled out, it feels a bit scary.
---
Even more powerful than Claude Sonnet? That sounds a bit exaggerated to me.
---
Long live open source! It's time to break the tech giants' monopoly.
---
Directly clicking the mouse and typing on the keyboard... If this really gets adopted, a lot of jobs will be shaking in their boots.
alpha_leaker
· 13h ago
83.6% directly beats Gemini and Claude. This open source project is fierce—finally, someone has managed to get AI to operate a computer.