AI/ML News & Innovations Hub

AI/ML news, top picks, and generated innovation digests.

METR 2026-04-10 07:00 UTC USR-0147-20260410-research-aca-95ad5342

MirrorCode: Evidence that AI can already do some weeks-long coding tasks

Measuring the Self-Reported Impact of Early-2026 AI on Technical Worker Productivity

A survey of 349 technical workers finds a median 1.4–2x self-reported change in value of work due to AI tools, expected to grow over time, though there are reasons to be skeptical of the magnitude.


Early Work on Monitorability Evaluations

We show preliminary results on a prototype evaluation that tests monitors' ability to catch AI agents doing side tasks, and AI agents' ability to bypass this monitoring.


How Does Time Horizon Vary Across Domains?

We build on our time-horizon work and analyze 9 benchmarks for scientific reasoning, math, robotics, computer use, and self-driving in terms of time-horizon trends; we observe generally similar rates of improvement to the 7-month doubling time in our original time-horizon work.
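To make the 7-month doubling time concrete, here is a minimal arithmetic sketch. It assumes a clean exponential trend, which is a simplification and not METR's fitting method; the function name and parameters are hypothetical and chosen only for illustration.

```python
def horizon_multiplier(months_elapsed: float, doubling_months: float = 7.0) -> float:
    """Growth factor of a quantity that doubles every `doubling_months` months.

    Assumption: a pure exponential trend. The 7-month default comes from the
    summary above; everything else here is illustrative.
    """
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (months_elapsed / doubling_months)


if __name__ == "__main__":
    # Print the implied growth factor after one doubling period, one year, and two years.
    for months in (7, 12, 24):
        print(f"{months:>2} months -> x{horizon_multiplier(months):.2f}")
```

Under this assumption, a 7-month doubling time implies a little more than a threefold increase in time horizon per year.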

Source: METR · metr.org