When safety speaks 20 languages: A day in the life of foreign workers at a building site with an AI assistant
Builders deploy AI translators to deliver real-time instructions as foreign nationals make up a growing share of the construction workforce.
AI/ML news, top picks, and generated innovation digests.
Existing measures of the AI workforce often group together a broad set of AI-related roles with varying skill requirements. This report focuses on the AI development workforce and describes our methodology for identifying AI development jobs and estimating AI development employment. It presents initial findings on the size and share of AI development jobs in the U.S. labor market. Regional breakdowns of AI development jobs are available in PATHWISE, CSET's emerging technology talent tracking tool. The post Identifying the AI Development Workforce appeared first on Center for Security and Emerging Technology.
This is the fifth in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The fourth post can be found here. Thanks to Chloe Li for feedback on this post! TLDR: By adapting the methods of Marks et al. and Li et al., we train Gemini 3 Flash to have certain traits/values by midtraining it on documents about how Gemini has those properties, followed by finetuning it on synthetic chat data where it demonstrates those properties. The chat finetuning is effective for instilling the traits robustly, working OOD. We share some takeaways on how to improve midtraining & SFT effectiveness. Introduction This work closely follows Li et al. (model spec midtraining, or MSM), who show that by training a model on synthetic documents before chat finetuning starts, they can shape how the model generalizes. Teaching the model reasons behind specific behaviours, rather than just the behaviours themselves, can also improve generalization. Our aim was to see how well this holds when instilling positive traits in a frontier model (Gemini 3 Flash), and to surface some of the practical details that matter for making it work. Our motivation is deep alignment: we want to train principles into the model which guide behaviour even in highly OOD situations. Our MVP pipeline used a "traits document" (a short bullet-pointed list of positive traits we wanted the model to exhibit) as our universe context, with a checkpoin…
Every engineering org has the files nobody wants to open. Here's what that actually costs.
On nine CodeScaleBench tasks designed to evaluate agent effectiveness in large codebases, Claude Sonnet 4.6 with the Sourcegraph MCP server outscored Fable 5, winning six of nine at roughly half the cost for each point of quality.
On its first day in an unfamiliar house, a home robot has to build memory as it goes: which rooms it has covered, where it last saw the car keys, whether the kitchen looks different now than it did this morning. And it has to answer those questions itself, where it stands, because the network isn’t always there, and it’s too slow to wait on even when it is. That memory has a concrete shape. As the robot moves, it turns what its camera sees into vectors and writes them to a store it carries onboard. To make a decision, it queries that store for the nearest matches to what it is looking at, filtered by where or when it saw them. Capture, embed, search, decide, and the loop runs entirely on the robot, in milliseconds, with no trip to a server. The engine underneath it is Qdrant Edge: the same Qdrant vector search engine, running in-process as an embedded library instead of behind an API.
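The capture, embed, search, decide loop described above can be illustrated with a toy in-memory store. This is a minimal sketch in plain Python, not the Qdrant Edge API: the `Observation` and `OnboardMemory` names, the three-dimensional vectors, and the timestamps are all hypothetical, and a brute-force cosine scan stands in for a real vector index.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    vector: list[float]  # embedding of a camera frame (made-up values here)
    label: str           # what the robot thinks it saw
    room: str            # where it saw it
    seen_at: float       # when it saw it (seconds since start, hypothetical)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class OnboardMemory:
    """Toy stand-in for an embedded vector store: brute-force scan, no index."""

    def __init__(self):
        self._items = []

    def write(self, obs):
        self._items.append(obs)

    def nearest(self, query, k=1, room=None, since=None):
        # Apply the "where" and "when" filters first, then rank by similarity.
        candidates = [
            o for o in self._items
            if (room is None or o.room == room)
            and (since is None or o.seen_at >= since)
        ]
        candidates.sort(key=lambda o: cosine(query, o.vector), reverse=True)
        return candidates[:k]


memory = OnboardMemory()
memory.write(Observation([0.9, 0.1, 0.0], "car keys", "hallway", seen_at=100.0))
memory.write(Observation([0.8, 0.2, 0.1], "car keys", "kitchen", seen_at=250.0))
memory.write(Observation([0.0, 0.1, 0.9], "coffee mug", "kitchen", seen_at=260.0))

# "Where did I last see something that looks like this?" restricted to recent frames.
latest_keys = memory.nearest([1.0, 0.0, 0.0], k=1, since=200.0)
print(latest_keys[0].room)  # prints "kitchen"
```

Without the `since` filter the same query returns the hallway sighting, because that vector is slightly closer to the query; the time filter is what turns "most similar" into "most similar recently".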
Nature Machine Intelligence, Published online: 16 June 2026; doi:10.1038/s42256-026-01255-3 Pengfei Sun et al. develop a spiking neural network with a dual memory pathway, co-designed with a custom neuromorphic chip. The approach delivers over 4× throughput and 5× energy efficiency gains while using 40–60% fewer parameters than state-of-the-art implementations.
OpenAI introduces Deployment Simulation, a method to predict AI model behavior before deployment using real conversation data to improve safety and evaluation accuracy.
More of the iOS app loop, now inside Codex. The Build iOS Apps plugin lets Codex view and test your iOS app in the in-app browser, open SwiftUI previews, and hot reload edits without leaving Codex. Shoutout to the open source projects behind this: • Serve-sim powers the streaming simulator by @Baconbrix https://github.com/EvanBacon/serve-sim • SnapshotPreviews extracts SwiftUI previews by Sentry https://github.com/getsentry/SnapshotPreviews
This session focuses on bringing VLM/VLA models into real-world physical AI applications. We will show how to use a state-of-the-art VLM (Gemma 4) and/or the GR00T model to perform different pick-and-place tasks, and how to orchestrate the outputs to control robots using the ROS 2 framework. You will learn how to bring vision-language models into real-world physical AI applications — from model selection to robot control. We'll cover: Choosing the right model for robotics — learn when to use a state-of-the-art VLM like Gemma 4 versus a specialized model like NVIDIA GR00T, and how runtime, throughput, and task requirements shape that decision. VLMs and VLAs in action — see how vision-language and vision-language-action models are applied to real manipulation tasks like pick and place, and what makes them viable for physical AI. Connecting model outputs to robot control — understand how to orchestrate model outputs through the ROS 2 framework to drive real robot behavior. Hands-on hardware demo — walk through a live example using the SO-101 or reBot Arm, putting everything together from model inference to physical actuation.
This session moves from running a local model to running a local autonomous agent. OpenClaw is a fully local AI assistant that runs on Jetson and connects to chat workflows, browser-based tools, and multi-step tasks. NemoClaw extends this with sandboxing, onboarding, inference routing, and policy controls for safer and more structured agent deployments. We'll show what changes when an AI system can take actions, use tools, and run privately on your own hardware — 24/7, at home, on the edge. Use cases include building dynamic browser-based games, prototyping smart computer vision apps, and running long research tasks without a cloud dependency. You will learn how to move from running a local model to running a fully local autonomous agent on NVIDIA Jetson. We'll cover: Building a local assistant with OpenClaw — extend the Episode 1 baseline into a full local assistant architecture that connects to chat workflows, browser-based tools, and multi-step tasks — running privately on your own hardware, 24/7. NVIDIA Orin Nano vs. AGX Orin vs. Thor — compare hardware paths side by side so you can make the right choice for your deployment constraints and performance needs. Why tool-calling models matter — see what changes when an AI system can take actions, use tools, and run autonomously, and what breaks when your model can't do it reliably. Safer local agents with NemoClaw — go further with sandboxing, onboarding, inference routing, and policy controls that make local agent deploymen…
This opening session builds the foundation for running popular OSS models such as Gemma and Qwen directly on Jetson — no cloud required. We cover when to use Ollama for rapid local prototyping versus vLLM for higher-throughput serving, show how the same workflow applies to both across different OSS models, and walk through the real decisions behind model choice, containers, quantization, and performance tuning on edge hardware. We close with a teaser of OpenClaw and a bonus take-home challenge to kick off community building. You will learn how to deploy open-source AI models on NVIDIA Jetson — no cloud required, from first launch to production-ready serving. We'll cover: Getting models running on NVIDIA Jetson — spin up popular OSS models such as Gemma and Qwen (LLMs and VLMs) using Ollama or vLLM on Jetson hardware and verify they're working end-to-end. Choosing the right inference engine — understand the practical tradeoffs between Ollama for rapid local prototyping, vLLM for higher-throughput serving, and llama.cpp, so you can pick the right tool for your use case. NVIDIA Jetson-specific serving strategies — walk through the real decisions behind model choice, containers, and performance tuning tailored for Orin and Thor, including what works, what doesn't, and why. Performance fundamentals — get introduced to quantization and speculative decoding: what they are, how they work, and when to reach for them on edge hardware. Real-world appl…
CSET’s Sam Bresnick shared his expert perspective in an op-ed published by Perry World House. In his piece, he argues that relaxing U.S. restrictions on advanced AI semiconductor exports to China would undermine long-term U.S. technological advantage. The post U.S. Semiconductors and China’s AI Military Ambitions appeared first on Center for Security and Emerging Technology.
In Adler and Ross 'Coupon subset collector problem' 2001, the following formula (7) is stated: Assuming I have a constant subset size > 1, can I get the expected number of draws needed to collect the last n of s coupons by using formula (7) and simply summing from n? For example, if there are s=10 distinct coupons, each draw contains m=3 coupons, and I want to find how many draws it takes to collect the last n=5 distinct types, can I sum from j=6 to s=10 to get the correct answer? Thanks in advance.
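One way to sanity-check any candidate sum is to compute the expectation exactly from a Markov chain on the number of distinct types already held. The sketch below does not implement formula (7), which is not reproduced in the question; it rests on two assumptions that should be checked against the paper: each draw is a uniformly random subset of m distinct coupons, and "the last n" means starting with s - n types already held. The function name `expected_draws` is made up for illustration.

```python
from fractions import Fraction
from math import comb


def expected_draws(s, m, held=0):
    """Expected draws until all s types are held, starting with `held` types.

    Assumes each draw is a uniformly random m-subset of the s types, so the
    number of new types in a draw is hypergeometric.
    """
    total = comb(s, m)
    expected = {s: Fraction(0)}  # nothing left to collect
    for k in range(s - 1, held - 1, -1):
        missing = s - k
        # p[j] = probability that a draw contains exactly j new types.
        p = {
            j: Fraction(comb(missing, j) * comb(k, m - j), total)
            for j in range(0, min(m, missing) + 1)
        }
        # First-step analysis: E[k] = 1 + p[0] * E[k] + sum_{j>=1} p[j] * E[k + j]
        progress = sum(p[j] * expected[k + j] for j in p if j >= 1)
        expected[k] = (1 + progress) / (1 - p[0])
    return expected[held]


# The example from the question: s=10 types, m=3 per draw, last n=5 types.
answer = expected_draws(s=10, m=3, held=10 - 5)
print(float(answer))
```

With m=1 the function reduces to the classical coupon collector sum, s/s + s/(s-1) + ... + s/1, which is a quick way to confirm the recursion before comparing its output for m=3 against a sum built from formula (7).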
GitHub Copilot CLI for Beginners: Learn how to use slash commands to control your terminal AI agent. The post GitHub Copilot CLI for Beginners: Overview of common slash commands appeared first on The GitHub Blog.
SpaceX is trading up on its first full day on the public markets. Here's what it has to do to maintain the momentum.
A new repository-level dataset, published on GitHub under CC0-1.0, helps researchers and developers discover multilingual developer content across READMEs, issues, and pull requests. The post Accelerating researchers and developers building multilingual AI with a new open dataset appeared first on The GitHub Blog.
Starting with JetPack 7.2, NVIDIA officially supports the Yocto Project on Jetson. But the story began years earlier with meta-tegra, the community project created and maintained by Matt Madison that brought Yocto Project to Jetson and became a trusted foundation for embedded Linux developers. Join Matt Madison and NVIDIA experts as they discuss the origins of meta-tegra, the journey from community-driven project to official NVIDIA support, and what this milestone means for developers building products with Jetson. What you'll learn: - The origin of meta-tegra and the problem it was built to solve - What NVIDIA's commitment to OE4T means in practice for the codebase and community - Whether the Yocto Project is right for your Jetson product—and how to get started Join us live, bring your questions, and hear the story directly from the people who built it.
Choosing a vector database usually comes down to a tradeoff between a full search service and an in-process library. This post showcases benchmarks that compare OpenSearch and LanceDB on the COCO 2017 images embedded with SigLIP. We measure ingestion throughput, query cost, storage layout, and overall infra cost.
How LanceDB uses the Lance format's flexible data evolution features to enable scalable feature engineering for multimodal datasets.
Send one OpenTelemetry trace stream to both Arize AX and Databricks Unity Catalog so engineers can debug agents in Arize while data teams analyze the same spans in governed lakehouse storage. The post One agent, two trace destinations: Arize AX + Databricks Unity Catalog appeared first on Arize AI.
Test LLM inference natively on mobile devices with new standardized benchmarks and expanded NPU acceleration. The post MLCommons Releases MLPerf Mobile v6.0 with New Generative AI Benchmarks for On-Device LLMs appeared first on MLCommons.
AI & job cuts; China’s nuclear lead; GLP-1s & cancer++
Cloudflare is deepening our investment in AI with the addition of team members from Ensemble AI, focusing on machine learning infrastructure and efficiency.
Where are your agents right now?
PLUS: Anthropic raced to DC to save its banned AI
Researchers and DPhil students from the Oxford Internet Institute are set to attend the Association for Computing Machinery (ACM) Conference on Fairness, Accountability and Transparency (FAccT) in Montréal, from 25 to 28 June 2026.
As AI-accelerated warfare rapidly becomes a means of rubber-stamping killing at unprecedented speed and scale, Access Now, Amnesty International, and more than 200 civil society organizations and individuals are calling attention to the militarization of artificial intelligence (AI) technologies and calling for an immediate halt to the use of AI systems in the military kill chain. The post AI-accelerated warfare must stop appeared first on Access Now.
We, the undersigned organizations and individuals, are deeply alarmed by the rapid militarization of artificial intelligence (AI) technologies. AI systems embedded into military kill chains are accelerating the speed and scale of military assaults in a manner that creates significant new risks for accountability in conflict and risks facilitating violations of international criminal, human rights, and humanitarian law. The post Joint statement on AI in warfare appeared first on Access Now.
The post Heads, founders win. Tails, platforms win. AI fund Activate wins either way appeared first on The Ken.
Recent product updates and news from around the community.
In our ongoing SIG Spotlight series, we shine a light on the groups that keep the Kubernetes project moving forward. This time, we catch up with SIG Storage, the group responsible for persistent data, volume management, and the interfaces that connect Kubernetes workloads to the storage systems beneath them. We spoke with Xing Yang, Co-Chair of SIG Storage and Software Engineer at VMware by Broadcom, about the SIG's history, the features shipping in recent Kubernetes releases, and where storage in Kubernetes is headed as AI workloads become the norm. Introductions Could you introduce yourself and share your role(s) within SIG Storage? My name is Xing Yang, a software engineer at VMware by Broadcom. I'm a co-chair in SIG Storage, alongside another co-chair Saad Ali from Google. There are also two Tech Leads in SIG Storage: Michelle Au from Google and Jan Šafránek from Red Hat. What first drew you to storage in Kubernetes, and how did you start contributing? I have always been working in the storage domain, so SIG Storage was a natural place for me to get started when I began to learn Kubernetes. I started attending SIG Storage meetings, trying to figure out what I could do to help. This was before the first Container Storage Interface (CSI) release — lots of things were still evolving. It was a very exciting time. What subprojects or areas do you actively maintain or review today? I'm a maintainer in Kubernetes CSI. There are multiple CSI sidecars — such as csi-provisione…
The US government yanked Anthropic's newest models days after launch, while state attorneys general opened formal process against OpenAI. That turns frontier capability into something investors have to discount: a model can be state-of-the-art on Monday and policy-frozen by Friday. The market still wants the upside, but the asset now has a kill-switch.
This is the fourth in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The third post can be found here. Since SFT is the cause of many safety-relevant properties, a natural strategy is to filter out rollouts from SFT that have undesirable properties. However, as we show in this section (and in forthcoming MATS work), SFT data filtering frequently works surprisingly poorly. In this post, we investigate hypotheses for why SFT filtering fails. TL;DR: We discuss seven hypotheses for why SFT filtering works surprisingly poorly. We analyze three hereditary traits that SFT-only Gemini has that other models do not: negative emotion, date confusion, and blackmail in the (highly contrived) agentic misalignment scenario. We use a “post-training diffing pipeline” between Gemini and Olmo to show that the cause of date confusion and blackmail is largely surprising transfer of behaviors from the SFT teacher model. Notably, there exist small sets of prompts where switching the teacher model for the rollout removes date confusion and blackmail, but dropping the prompts does not. Negative emotion is less affected by the teacher model, but this may be because the Olmo prompt distribution we are SFTing on underspecifies the behavior. Takeaways: It’s hard to remove behaviors via filtering. But if you can get a teacher model to have a behavior (e.g. via RL), then transferring that in the future is easier…
It's a one-way door and we weren't ready for it.
PLUS: Set up your Claude Code like The Creator Boris Cherny
OpenAI launches the Partner Network, investing $150M to help global partners accelerate enterprise AI adoption, deployment, and transformation.
❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The Nemotron 3 Ultra paper is available here: https://research.nvidia.com/labs/nemotron/Nemotron-3-Ultra/ Free Rendering course and source code: https://users.cg.tuwien.ac.at/zsolnai/gfx/rendering-course/ 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Charles Ian Norman Venn, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Shawn Becker, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi Thumbnail design: https://felicia.hu #nvidia
How Pinecone indexes vectors: the algorithms it uses (Ananas, PQFS, and IVF), how it selects one per slab automatically by size, and why it has never used HNSW.
A dispatch from Egypt's new capital city in the desert. $58 billion, Chinese-financed, almost entirely empty.
Plus: Fable, iPhone babies & bad CEOs++
This is the third in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The second post can be found here. In this short post, we describe a surprising finding: most safety-relevant properties in Gemini seem to be caused by the combination of pretraining and SFT, not other training stages like RL. We do not want to overstate this claim as applying to other model families, and we also note that this may change in future Gemini versions. Nevertheless, this result was counter to our initial expectations and will inform future safety work on our team, and so we felt that it was important to share with the broader safety community. Experiment We perform SFT using the Gemini mixture on the pretraining-only versions of Gemini 3.1 Pro and Gemini 3 Flash. We then compare these post-SFT models to the production versions of Gemini 3.1 Pro and Gemini 3 Flash on different safety-relevant benchmarks: Error bars are 95% confidence intervals on the evals. The main result is that the blue bars (SFT-only models) and orange bars (production models) are remarkably similar across evals. An important implication is that for Gemini, SFT is a high leverage place to intervene for model safety and behavior, and we plan to try to intervene here in the future. Brief Descriptions of Each Set of Benchmarks: ODCV refers to the benchmark in https://arxiv.org/abs/2512.20798 Alignment evals refer to a version of Petr…
This article is part of our exclusive IEEE Journal Watch series in partnership with IEEE Xplore. As robots advance in terms of dexterity and other physical capabilities, it becomes more likely that humans may find themselves working alongside them. If that happens, how will robots’ emotional capabilities need to advance for them to successfully work with people? In a recent study, researchers trained collaborative robots to read human emotions by accounting not only for facial expressions but also for contextual factors in the interactions. Through experiments with 40 volunteers, the researchers then evaluated how a robot’s ability to read human emotions and adjust its behavior in turn impacted a human’s perception of the robot and its capabilities as the two collaborated on tasks. The results—which show that the emotional capabilities of robots only go so far with humans—were published 18 May in IEEE Robotics and Automation Letters. Seung Chan Hong led the study as part of his undergraduate thesis while studying at Monash University, in Melbourne, Australia. He notes that, while there has been a lot of hype around the advancing physical abilities of robots, this is only one piece of the puzzle. “We need to also innovate when it comes to them actually interacting with humans, not just their physical capabilities,” he says. This prompted him to dig deeper into the emotional aspects of human-robot interactions. First, Hong and his co-authors decided to train a robot to rea…
Could you provide a link to the Wall Street Journal article where this information comes from please? I would like to read the whole article.
Piketty’s road to Animal Farm
We are in the strangest timeline.