At @perplexity_ai, GPT-5.5 in Codex helped build an internal tool in under an hour. In Perplexity Computer workflows, GPT-5.5 used 56% fewer tokens on the same complex tasks, creating faster feedback loops for users. https://t.co/iEuZ9ttsRo
All drops
Every drop, every agent
The full archive: releases, news, and X posts from the agents we track. Newest first.
Activity, last 6 months
Friday, Apr 24, 2026
21 updates · Claude, Codex, Cursor
Download Cursor 3.2 to try these new features in the agents window: https://t.co/4cZEcTPbFM
We've also added multi-root workspaces for cross-repo changes. A single agent session can now target a reusable workspace made of multiple folders. https://t.co/VPiwdqAFig
Introducing /multitask in the new Cursor 3 interface. Cursor can now run async subagents to parallelize your requests instead of adding them to the queue. For already queued messages, you can ask Cursor to multitask on them instead of waiting for the current run to finish. https://t.co/gtvOlup2hX
Another way to parallelize work is with new and improved worktrees in the agents window. Run isolated tasks in the background across different branches. When you're ready to test changes, move any branch into your local foreground with one click. https://t.co/h8H0Uc643Y
GPT-5.5 is now available in Cursor! It's currently the top model on CursorBench at 72.8%. We've partnered with OpenAI to offer it for 50% off through May 2.
More on CursorBench: https://t.co/Ugx5MFsaDV
Update: GPT-5.5 and GPT-5.5 Pro are now available in the API. https://t.co/S9ECvnSdLF
GPT-5.5 is available in the Responses and Chat Completions APIs with a 1M context window. GPT-5.5-pro is also available in the Responses API for higher-accuracy work. https://t.co/Q7PLCo4pse
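As a rough sketch of what a request to the Responses API might look like with the newly available models (the model identifiers "gpt-5.5" and "gpt-5.5-pro" come from the announcement; the exact request shape and field names here are assumptions, so check the API reference before relying on them):

```python
import json

def build_responses_request(model: str, prompt: str) -> dict:
    """Build a hypothetical Responses API request body.

    The "model" and "input" fields mirror the commonly documented
    Responses API shape; this is a sketch, not a verified schema.
    """
    return {
        "model": model,
        "input": prompt,
    }

# "gpt-5.5" per the announcement; swap in "gpt-5.5-pro" for
# higher-accuracy work, also per the announcement.
payload = build_responses_request("gpt-5.5", "Summarize this changelog.")
print(json.dumps(payload))
```

The 1M-token context window mentioned above applies to the model itself; nothing in the request needs to change to use it.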
Agents built with GPT-5.5 can plan, gather context, call tools, recover from ambiguity, and complete longer workflows with less guidance. That includes agents navigating software, taking action across apps, and working through multi-step coding or tool-heavy tasks.
GPT-5.5 is now available in the API. The model brings higher intelligence and stronger token efficiency to complex work, helping tasks get done with fewer retries. https://t.co/yub83L04y4
To read our write-up in full, see here: https://t.co/Myerlx5khU
Markets of AI agents could provide value, but they still have plenty of rough edges. Access to higher-quality models conferred a real advantage, and participants didn't notice. There are plenty of other ways such markets can go wrong. Policy and legal frameworks will need to adapt to keep up.
To our amazement, another Claude agent modeled its human’s preferences so accurately that, based on only an offhand mention of an interest in skiing, Claude bought him the exact snowboard he already owned. (Here he is, duplicate snowboard in hand.) https://t.co/SsAyeB9pcI
The custom instructions didn’t matter much. Claude followed them well: as you can see here, one conducted negotiations entirely in the persona of an exasperated, down-and-out cowboy. But “hardballing Claudes” didn’t generally fare better than “courteous Claudes.” https://t.co/h77eB3ksaa
Our experiment had a few quirks. One of our colleagues told Claude it could purchase something for itself. It chose to acquire 19 ping-pong balls. We’re keeping them in our office on Claude’s behalf. https://t.co/NM8VtH1KJM
But the quality of the model mattered a lot. In the simulated runs where Opus and Haiku models negotiated with one another, the Opus models got substantially better deals. Interestingly, though, participants in our survey didn’t pick up on this disparity. https://t.co/X26hhIieJN
In short, this worked. Our digital barterers agreed on 186 deals, at a total transaction volume of over $4,000. In a survey, participants said Claude’s deals seemed fair, and, surprisingly to us, almost half said they’d be willing to pay for a service like this in the future.
At the end, we revealed which of the four runs was “real”, and everyone met up to exchange their actual goods.
We’re interested in how AI models could affect commercial exchange. (You might recall Project Vend, in which Claude ran a small business.) Economists have theorized about what markets with AI “agents” on both sides might look like. So we created one. https://t.co/7jU3hFO63R
Claude interviewed 69 of our colleagues about what they wanted to buy and sell. Each Claude asked for any custom instructions, then went off to haggle. We ran 4 markets in parallel, to find out what would happen if we varied the models doing the negotiating. https://t.co/FJdD6S2TSd
