Research · March 19, 2026 · 7 min read
We rebuilt the browser-agent leaderboards
A fairer, reproducible way to compare browser agents on real tasks.
Marco T.
Engineering
Benchmarks age badly. We rebuilt our browser-agent leaderboards around reproducible tasks and a fixed harness, so the numbers mean the same thing every time you read them.
What we measure
- Task success rate on a fixed set of real sites
- Median steps to completion
- Wall-clock time per task
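To make the three metrics concrete, here is a minimal sketch of how they could be aggregated from per-task run records. It is not the harness's actual code: the `TaskRun` record, the `summarize` function, and the choice to take medians (and to count steps only for completed tasks) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class TaskRun:
    """One agent attempt at one task (hypothetical record shape)."""
    task_id: str
    succeeded: bool
    steps: int
    seconds: float  # wall-clock time for this attempt


def summarize(runs: list[TaskRun]) -> dict[str, float]:
    """Aggregate the three metrics over a list of runs."""
    if not runs:
        raise ValueError("no runs to summarize")
    completed = [r for r in runs if r.succeeded]
    return {
        # Fraction of tasks the agent finished successfully.
        "success_rate": len(completed) / len(runs),
        # Assumption: steps are only counted for tasks that completed.
        "median_steps": median(r.steps for r in completed) if completed else float("nan"),
        # Assumption: wall-clock time is summarized as a median over all runs.
        "median_seconds": median(r.seconds for r in runs),
    }


runs = [
    TaskRun("a", True, 4, 10.0),
    TaskRun("b", True, 8, 20.0),
    TaskRun("c", False, 30, 60.0),
    TaskRun("d", True, 6, 30.0),
]
summary = summarize(runs)
```

Whether failed attempts should count toward steps or time is a design choice. Excluding them from steps keeps a flailing agent from skewing the median, but it also means that number has to be read alongside the success rate.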
Open by default
The harness and tasks are open so anyone can rerun them. A leaderboard you cannot reproduce is just a screenshot.