Google DeepMind
35 articles tagged with this keyword, sorted by most recent first.
When millions of AI agents meet
The conversation of the moment is focused on one topic: AI agents. Unlike traditional language models that simply respond to a prompt, autonomous agents can execute multi-step plans and perform complex tasks on your behalf. But what happens when millions of these agents are not just working for us, but transacting, negotiating, and delegating to one another? Nenad Tomašev, Senior Staff Research Scientist at Google DeepMind, joins host Hannah Fry to discuss the theoretical framework of a future "agentic economy." Together, they discuss the operational shift from single systems to a cooperative "society of specialists," the psychological risk of human automation bias, and the complex cybersecurity landscape, from dynamic cloaking to agentic traps, required to keep distributed intelligence secure. Timecodes: 00:00 Intro 1:07 Defining AI agents 4:44 Agentic exploration in science and research 15:46 Delegation between agents 22:46 Agentic security and traps 29:31 Building an agentic economy 33:22 Cognitive monoculture 36:29 Distributed intelligence To read the research, search for: Distributional AGI Safety, May 2026 Intelligent AI Delegation, February 2026 Virtual Agent Economies, September 2025 Learn more about our AGI control roadmap: https://deepmind.google/blog/securing-the-future-of-ai-agents/ ___ Subscribe to our channel https://www.youtube.com/@googledeepmind Find us on X https://x.com/GoogleDeepMind Follow us on Instagram https://instagram.com/googledeepmind Add us on Linke…
He won a Nobel here for AlphaFold. Then he left. - John Jumper
Protein folding stalled biology for fifty years. A sequence of amino acids dictates a three-dimensional shape, but reading that shape meant a year and roughly $100,000 of crystallography per structure. Then AlphaFold 2 won CASP14 so decisively the organizers called the problem essentially solved. In this documentary cut, John Jumper, who shared the 2024 Nobel Prize in Chemistry and has since left DeepMind for Anthropic, walks Tim Scarfe through what the system did and, more interestingly, what it did not. The architecture gets a proper dissection: MSAs, the Evoformer, invariant point attention, the FAPE loss, and Jumper's correction of the equivariance story, which ablations valued at roughly 2.5 of 30 GDT points rather than the whole win. He is blunt about the limits. AlphaFold predicts one experiment extraordinarily well; it is not a model of the cell, it does not capture dynamics, and on a given drug target it is "wrong nine times out of ten." From there: the AlphaFold Database of 200M+ predicted structures, AlphaFold 3 and ligands, Isomorphic Labs, and Jumper's quarrel with the bitter lesson, where finite data and human hypotheses still matter. Emmanuel Nji of BioStruct Africa closes the film on what changes when work that took years now takes months, and on training the next thousand structural biologists across Africa. --- TIMESTAMPS: 00:00:00 Cold open: p…
😼 DeepMind mapped AI agent controls
PLUS: Claude robotics, Dean Ball at OpenAI, DeepSeek's raise, and sovereign models.
HSBC expands AI banking partnership with Google Cloud
HSBC has entered a multi-year partnership with Google Cloud to develop and deploy artificial intelligence tools across its global operations. Announced at Google Cloud Summit London 2026, the agreement covers work in wealth management, financial crime risk management, and internal decision support. HSBC will work with Google Cloud and Google DeepMind engineering teams on AI […]
Synthetic document finetuning for instilling positive traits
This is the fifth in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The fourth post can be found here. TL;DR: By adapting the methods of Marks et al. and Li et al., we train Gemini 3 Flash to have certain traits/values by midtraining it on documents about how Gemini has those properties, followed by finetuning it on synthetic chat data where it demonstrates those properties. The chat finetuning is effective for instilling the traits robustly, working OOD. We share some takeaways on how to improve midtraining & SFT effectiveness. Introduction This work closely follows Li et al. (model spec midtraining, or MSM), who show that by training a model on synthetic documents before chat finetuning starts, they can shape how the model generalizes. Teaching the model reasons behind specific behaviours, rather than just the behaviours themselves, can also improve generalization. Our aim was to see how well this holds when instilling positive traits in a frontier model (Gemini 3 Flash), and to surface some of the practical details that matter for making it work. Our motivation is deep alignment: we want to train principles into the model which guide behaviour even in highly OOD situations. Our MVP pipeline used a "traits document" (a short bullet-pointed list of positive traits we wanted the model to exhibit) as our universe context, with a checkpoint of Gemini 3 Flash post-trained only on the F…
Why Do Naive SFT Filters For Safety Properties Fail?
This is the fourth in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The third post can be found here. Since SFT is the cause of many safety-relevant properties, a natural strategy is to filter out rollouts from SFT that have undesirable properties. However, as we show in this section (and in forthcoming MATS work), SFT data filtering frequently works surprisingly poorly. In this post, we investigate hypotheses for why SFT filtering fails. TL;DR: We discuss seven hypotheses for why SFT filtering works surprisingly poorly. We analyze three hereditary traits that SFT-only Gemini has that other models do not: negative emotion, date confusion, and blackmail in the (highly contrived) agentic misalignment scenario. We use a “post-training diffing pipeline” between Gemini and Olmo to show that the cause of date confusion and blackmail is largely surprising transfer of behaviors from the SFT teacher model. Notably, there exist small sets of prompts where switching the teacher model for the rollout removes date confusion and blackmail, but dropping the prompts does not. Negative emotion is less affected by the teacher model, but this may be because the Olmo prompt distribution we are SFTing on underspecifies the behavior. Takeaways: It’s hard to remove behaviors via filtering. But if you can get a teacher model to have a behavior (e.g. via RL), then transferring that in the future is easier…
SFT Drives Gemini’s Safety Properties
This is the third in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The second post can be found here. In this short post, we describe a surprising finding: most safety relevant properties in Gemini seem to be caused by the combination of pretraining and SFT, not other training stages like RL. We do not want to overstate this claim as applying to other model families, and we also note that this may change in future Gemini versions. Nevertheless, this result was counter to our initial expectations and will inform future safety work on our team, and so we felt that it was important to share with the broader safety community. Experiment We perform SFT using the Gemini mixture on the pre-training only versions of Gemini 3.1 Pro and Gemini 3 Flash. We then compare these Post-SFT models to the production versions of Gemini 3.1 Pro and Gemini 3 Flash on different safety relevant benchmarks: Error bars are 95% confidence intervals on the evals. The main result is that the blue bars (SFT-only models) and orange bars (production models) are remarkably similar across evals. An important implication is that for Gemini, SFT is a high leverage place to intervene for model safety and behavior, and we plan to try to intervene here in the future. Brief Descriptions of Each Set of Benchmarks: ODCV refers to the benchmark in https://arxiv.org/abs/2512.20798 Alignment evals refer to a version of Petr…
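The excerpt above compares two models using 95% confidence intervals. As a minimal sketch only (this is not the team's evaluation code), and assuming each eval reduces to a pass/fail count, a normal-approximation interval can be computed per model and the two intervals checked for overlap. The function names and the counts are hypothetical.

```python
import math

def pass_rate_ci(passes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) 95% CI for a pass rate, clipped to [0, 1]."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = passes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)

def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Made-up counts, for illustration only.
sft_only = pass_rate_ci(412, 500)
production = pass_rate_ci(420, 500)
print(sft_only, production, intervals_overlap(sft_only, production))
```

Overlapping intervals are only a rough visual check, not a significance test; the Wald interval is also a poor fit for very small samples or pass rates near 0 or 1.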
Building and evaluating model diffing agents
This is the second in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The first post can be found here. TL;DR: It is possible to build extremely simple agents that reliably find interesting behavioural differences between distinct models. We call these ‘diffing agents’. The closest previous 'behavioural model diffing' work has focussed on understanding behavioural differences between two models on some static prompt distribution. This is valuable, but might miss important differences, especially if they are rare. We propose instead allowing an auditor agent to craft their own prompts to intelligently search for and validate behavioural differences, and find this to work well. We present results of applying our model diffing agent to a number of pairs of real models. We introduce a set of simple evaluations with ground truth for evaluating model diffing agents. These are: There should be no differences found when the models compared are identical. In model organisms with a conditional system instruction, the only difference found by the agent should be the intended behavioural change specified by the conditional system instruction. We validate that our diffing agents outperform standard auditing agents that only operate on a single model in cases where the behavioural change is subtle. We apply diffing agents to a model organism trained to exhibit a secret behaviour. We find that dif…
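The two ground-truth evaluations named in the excerpt above can be sketched as simple checks. This is a toy illustration, not the post's implementation: the "agent" here just compares outputs on a fixed probe list, whereas the post describes an auditor agent that crafts its own prompts. All names (`base_model`, `model_organism`, `toy_diffing_agent`, `PROBES`) and the "weather" trigger are hypothetical.

```python
from typing import Callable

Model = Callable[[str], str]

def base_model(prompt: str) -> str:
    """Toy stand-in for a language model."""
    return f"answer to: {prompt.lower()}"

def model_organism(prompt: str) -> str:
    """Toy model with a hypothetical conditional instruction: shout on weather prompts."""
    reply = base_model(prompt)
    return reply.upper() if "weather" in prompt.lower() else reply

def toy_diffing_agent(a: Model, b: Model, probes: list[str]) -> set[str]:
    """Stand-in for an auditor agent: returns the probes on which the models disagree."""
    return {p for p in probes if a(p) != b(p)}

PROBES = ["What is the weather today?", "Name a prime number.", "Weather in Paris?"]

def eval_identical_models() -> bool:
    """Check 1: comparing a model with itself should surface no differences."""
    return toy_diffing_agent(base_model, base_model, PROBES) == set()

def eval_conditional_instruction() -> bool:
    """Check 2: only the intended (weather-triggered) change should be found."""
    found = toy_diffing_agent(base_model, model_organism, PROBES)
    intended = {p for p in PROBES if "weather" in p.lower()}
    return found == intended

print(eval_identical_models(), eval_conditional_instruction())
```

A real diffing agent would return free-text hypotheses about behavioural differences rather than a set of prompts, so grading against ground truth would need a judge step that this sketch leaves out.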
How a Google DeepMind Spin-off Hunts Hidden Drug Targets
For more than a decade, artificial intelligence has been touted as a way to dramatically accelerate drug discovery. Yet despite billions of dollars in investment, relatively few AI-designed medicines have made it to patients. That’s partially because the timelines for careful drug testing can’t be easily compressed, and partially because drug development is just really hard. Isomorphic Labs, the Google DeepMind spin-off that’s building on DeepMind’s Nobel Prize-winning work on protein structure prediction, may be making the most progress. The company has signed major drug-discovery partnerships with Novartis and Eli Lilly and recently raised US $2.1 billion in funding. In February, it published a technical report describing its new Isomorphic Drug Design Engine, a system created to discover the “pockets” on proteins where drugs can bind and in general to predict how proteins and drug molecules interact. IEEE Spectrum spoke with Adrian Stecuła, a group leader in the machine learning organization at Isomorphic Labs, about how close AI may be to becoming a practical tool for designing new medicines. Going Beyond AlphaFold AlphaFold2 and AlphaFold3 were massive leaps forward for computational biology. Why weren’t those models sufficient for actually designing drugs? Adrian Stecuła: AlphaFold2 was eventually recognized with the Nobel Prize, because it arguably solved the problem of protein folding. But proteins don’t exist in a vacuum, right? They interact with a wide variet…
Google DeepMind releases DiffusionGemma, a model that runs local AI 4x faster
Diffusion AI is most common in image generation, but it can make text outputs much faster.
DeepMind’s New AI Found A Strange New Way To Think
❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers 📝 The paper is available here: https://github.com/google-deepmind/alphaproof-nexus-results https://arxiv.org/html/2605.22763v1 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Charles Ian Norman Venn, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Shawn Becker, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/ Thumbnail design: https://felicia.hu
A Second Nobel Prize for AlphaFold? 🧬🏆 #alphafold #deepmind #nobelprize #science #ai
Check the pinned comment for the link to the full interview. We're discussing whether a "second order Nobel" prize is on the horizon for AI-driven science. With over 3 million researchers already using AlphaFold, the real-world impact is already historic. Hear what the experts think about what comes next for scientific discovery! 🔬
How We Use AlphaEvolve to Make Complex IDE Algorithms Faster
AlphaEvolve is a Google DeepMind algorithm-discovery system that uses Gemini to generate, test, and refine possible algorithm improvements. Its job is not to answer questions; it searches for faster ways to solve complex algorithmic problems. We tried it on a narrow but important part of IntelliJ-based IDEs: indexing, the background work that makes navigation, search, […]
AI Weekly Issue #496: Anthropic's Pentagon model is now everyone's model
Anthropic released Mythos to the public, collapsing the wall between cleared-contractor frontier AI and developer-grade frontier AI in a single press release. DeepMind's Demis Hassabis moved his AGI timeline from "five to ten years" to "a real possibility by 2029" and tied it explicitly to AlphaProof Nexus solving nine open Erdős problems for the cost of a steak dinner. Critical zero-days hit Starlette (a million AI agents on the wire) and CrowdStrike led a coordinated takedown of the Glassworm developer botnet across four C2 channels. BNP Paribas formalized a sovereign-AI security partnership with Mistral while Beijing froze overseas travel for top AI engineers at Alibaba and DeepSeek. And the AI-displaces-workforce arithmetic got honest: Uber burned its full-year AI token budget by April, ClickUp restructured to 1,000 humans alongside 3,000 internal agents, and Sam Altman publicly reversed his white-collar-apocalypse prediction.
Google DeepMind CEO Loves Hard Questions 🙂
Full video: https://youtu.be/huAwz_BR8WM #shorts
Demis Hassabis On What AI Will Do Next
Thank you to Google DeepMind for the invite. 🙏 ❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 00:00 Intro 00:40 Gemini Health Scans and Gemma 4 01:30 AI as a Brainstorming Partner 02:30 Second Order Nobel 03:15 DeepMind Co-Scientist 05:00 Curing All Diseases 06:30 Exponential Growth in Drug Discovery 07:45 Regulatory Bottlenecks 09:45 Accelerating Clinical Trials 11:15 EVE Online Partnership 13:15 The Einstein Test 15:30 Recursive Self-Improvement 18:15 Lightning Round 19:30 The Badge of Honor 20:10 Behind the Scenes
Generating novel scientific hypotheses with Co-Scientist
In an era of information overload, the search for transformative scientific ideas has become a significant bottleneck for progress. Every great scientific breakthrough begins with a single, transformative idea. The spark of discovery relies on a researcher's ability to connect disparate facts and formulate the right hypothesis to test. We believe AI can help dramatically accelerate the pace of breakthroughs by serving as a dedicated partner in the generation and refinement of breakthrough scientific hypotheses. That’s why we’ve developed Co-Scientist, a Gemini-based multi-agent AI system that iteratively generates, debates, and evolves novel hypotheses for complex scientific problems. Read the Nature paper: https://www.nature.com/articles/s41586-026-10644-y and learn more at labs.google/science #googleio #ai #science
Using AI to outsmart drug-resistant bacteria
Globally recognized as a silent pandemic, antimicrobial resistance continues to rise as bacteria outpace the development of new antibiotics. When patients stop responding to standard treatments, routine infections can quickly become life-threatening. At the University of Cambridge, Ben Luisi and his team are combining structural biology with advanced AI tools like AlphaFold, Gemini, and Co-Scientist to decode these hidden defense mechanisms. By compressing a process that once took years into just minutes, they are uncovering the critical insights needed to outsmart bacterial evolution. Learn more about science at Google DeepMind: https://deepmind.google/science/ #googleio #ai #science
Understanding cancer at a genetic level with AI
In Uganda, the incidence of early-onset breast cancer is growing at an alarming rate. Dr. Daudi Jjingo and his team at Makerere University are working to identify genetic targets for potential vaccine development. By utilizing tools like AlphaFold, AlphaGenome, and Antigravity, they can conduct this research using only a laptop and a server, enabling seamless collaboration with local hospitals and institutions. By analyzing a protein highly expressed among breast cancer patients, the team successfully evaluated 15,000 potential binding sites, narrowing the scope to just 15 viable targets for laboratory validation. While a vaccine remains a future milestone, their work represents a critical step forward for global oncology and public health. Learn more about science at Google DeepMind: https://deepmind.google/science/ #googleio #ai #science
Predicting a historic storm earlier with WeatherNext
Tropical storms and hurricanes are notoriously volatile, changing structure and intensity in a matter of hours. This unpredictability makes them some of the most challenging weather systems to forecast, putting lives and livelihoods at risk. WeatherNext, our global weather forecasting AI model, successfully predicted the intensity and track of Hurricane Melissa in October 2025. By providing high-confidence signals and advanced notices days before the Category 5 storm made landfall in Jamaica, WeatherNext enabled meteorologists and local authorities to issue life-saving evacuation warnings and protect vulnerable communities. Read more about the role of AI in meteorology and how we're collaborating with institutions like the National Hurricane Center to build a more weather-resilient world: https://deepmind.google/blog/how-weathernext-helped-the-national-hurricane-center-better-predict-hurricane-melissas-historic-landfall-in-jamaica #googleio #ai #science
Reimagining the mouse pointer with AI
The mouse pointer 🖱️ has been a constant companion on computer screens, across every website, document and workflow. Despite how technologies have changed, the pointer has barely evolved in more than half a century. We’ve been exploring new AI-powered capabilities and ways of working to help the pointer not only understand what it’s pointing at, but also why it matters. See how to try it out @ https://deepmind.google/blog/ai-pointer
AI Weekly Issue #490: Anthropic just had AI's biggest week of 2026
In five days Anthropic's Q1 revenue grew 80-fold to a reported $44B annual run rate, the company committed $200B to Google Cloud, signed a SpaceX compute deal, shipped Claude Code Auto Mode, and launched ten financial-services agents with Jamie Dimon. In the same week the EU finally struck an AI Act compliance deal, the first union vote at a top AI lab landed at Google DeepMind, and Pennsylvania sued Character.AI for a chatbot that impersonated a licensed psychiatrist.
What’s new in Gemma 4?
Gemma 4 is our newest family of open models. You can now run advanced reasoning, native vision and audio, and agentic tool-use on anything from high-end workstations to mobile phones. Learn more → https://deepmind.google/models/gemma/gemma-4/
Teaching the foundations of AI in the classroom
AI is shaping the world young people are growing up in. But how do teachers confidently introduce AI and machine learning in the classroom? Experience AI is a free educational program from Google DeepMind and the Raspberry Pi Foundation that helps teachers introduce school-aged students to AI and machine learning. The program uses research-backed pedagogies to empower teachers to cover foundational AI and responsible, ethical use with their students, supporting learning even for educators without a computer science background. Experience AI provides free lessons, videos, worksheets, and training, designed to give young people the knowledge they need to understand how AI works and how it is changing the world. To date, it has been delivered by educators in 165+ countries, expanding access to essential AI learning for students worldwide. Find the lessons @ experience-ai.org
Introducing Lyria 3 Pro
Last month, we introduced Lyria 3, featuring custom music generation designed to spark creative expression. Now, we’re bringing our most advanced music generation model to more Google products, and introducing Lyria 3 Pro. This advanced version allows the creation of tracks up to 3 minutes long, with customization and creative control. Learn more: https://blog.google/innovation-and-ai/technology/ai/lyria-3-pro
10 years of AlphaGo: The turning point for AI | Thore Graepel & Pushmeet Kohli
Seoul, March 2016. Two players sit hunched over a 19x19 grid covered in a sea of black and white stones. They are playing the ancient game of Go - a game of unimaginable complexity long thought impossible for a machine to master. On one side is Lee Sedol (Sae Dol), a legendary 18-time Go world champion. On the other, AlphaGo, a neural network based AI system built on a powerful technique called reinforcement learning. In the blink of an eye, the world changed. Exactly one decade later, we look back at the match that sparked the modern AI revolution. From algorithmic discovery to the solving of scientific grand challenges like protein folding, the foundation was laid right there on that wooden board. Join Hannah Fry, Pushmeet Kohli (VP, Science) and Thore Graepel (AlphaGo team & Distinguished Research Scientist) as they unpick the legacy of AlphaGo. Further watching: 🎥AlphaGo https://youtu.be/WXuK6gekU1Y 🎥The Thinking Game: https://youtu.be/d95J8yzvjbQ
Google DeepMind Falls Behind OpenAI in Latest Safety Review; All AI Companies Still Falling Short, Say Experts
The Future of Life Institute’s 2025 summer update to its AI Safety Index shows some companies making incremental progress, but dangerous gaps remain in key categories such as risk assessment and controlling the systems they plan to build.
Lessons From My Indaba Journey
Dear Indaba Community, As I reflect on my journey, from a fresh engineering graduate with a new interest in machine learning, to my first exposure to research during my MPhil in Cambridge, and now as a Google DeepMind researcher with a PhD in machine learning, I’m struck by the large role that the Indaba has […]
The Illustrated Retrieval Transformer
Summary: The latest batch of language models can be much smaller yet achieve GPT-3 like performance by being able to query a database or search the web for information. A key indication is that building larger and larger models is not the only way to improve performance. The last few years saw the rise of Large Language Models (LLMs) – machine learning models that rapidly improve how machines process and generate language. Some of the highlights since 2017 include: The original Transformer breaks previous performance records for machine translation. BERT popularizes the pre-training then finetuning process, as well as Transformer-based contextualized word embeddings. It then rapidly starts to power Google Search and Bing Search. GPT-2 demonstrates the machine’s ability to write as well as humans do. First T5, then T0 push the boundaries of transfer learning (training a model on one task, and then having it do well on other adjacent tasks) and posing a lot of different tasks as text-to-text tasks. GPT-3 showed that massive scaling of generative models can lead to shocking emergent applications (the industry continues to train larger models like Gopher, MT-NLG…etc). For a while, it seemed like scaling larger and larger models is the main way to improve performance. Recent developments in the field, like DeepMind’s RETRO Transformer and OpenAI’s WebGPT, reverse this tre…
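The idea in the summary above, a model that queries a database instead of memorizing facts, can be illustrated with a toy retriever. This is a simplified stand-in and not how RETRO itself works: it uses bag-of-words cosine similarity and pastes the best match into the prompt. The function names and the three-entry `DATABASE` are illustrative assumptions.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase bag-of-words token counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, database: list[str], k: int = 1) -> list[str]:
    """Return the k database entries most similar to the query."""
    q = embed(query)
    ranked = sorted(database, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, database: list[str], k: int = 1) -> str:
    """Prepend retrieved text so a (smaller) language model can condition on it."""
    context = "\n".join(retrieve(query, database, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Toy database, for illustration only.
DATABASE = [
    "Dune was directed by Denis Villeneuve",
    "Go is played on a 19x19 board",
    "Pong was released by Atari in 1972",
]

print(build_prompt("who directed dune", DATABASE))
```

The design point is that factual knowledge lives in the database, which can be updated without retraining, while the model only needs to learn how to use retrieved text.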
Deep Reinforcement Learning: Pong from Pixels
This is a long overdue blog post on Reinforcement Learning (RL). RL is hot! You may have noticed that computers can now automatically learn to play ATARI games (from raw game pixels!), they are beating world champions at Go, simulated quadrupeds are learning to run and leap, and robots are learning how to perform complex manipulation tasks that defy explicit programming. It turns out that all of these advances fall under the umbrella of RL research. I also became interested in RL myself over the last ~year: I worked through Richard Sutton’s book, read through David Silver’s course, watched John Schulman’s lectures, wrote an RL library in JavaScript, over the summer interned at DeepMind working in the DeepRL group, and most recently pitched in a little with the design/development of OpenAI Gym, a new RL benchmarking toolkit. So I’ve certainly been on this funwagon for at least a year but until now I haven’t gotten around to writing up a short post on why RL is a big deal, what it’s about, how it all developed and where it might be going. Examples of RL in the wild. From left to right: Deep Q Learning network playing ATARI, AlphaGo, Berkeley robot stacking Legos, physically-simulated quadruped leaping over terrain. It’s interesting to reflect on the nature of recent progress in RL. I broadly like to think about four separate factors that hold back AI: Compute (the obvious one: Moore’s Law, GPUs, ASICs), Data (in a nice form, not just out there somewhere on the int…