AI/ML News & Innovations Hub

AI/ML news, top picks, and generated innovation digests.

★ Visit ai-karthik.com
422 Sources
8459 News Items
8 Top Picks
68 Blogs
Last Run: success

Latest AI/ML News


EU AI Office 2026-06-17 07:07 UTC Score 36.0 AI-165-20260617-regional-ai--034af26b Full article

Digital Decade 2026: eGovernment Benchmark 2026

The eGovernment Benchmark 2026 evaluates the availability and functionality of selected digital public service websites and portals at both Member State and European Union (EU) level. It covers 96 key services across nine life events: moving, transport, starting a small claims procedure, family, career, studying, health, starting a business and running a business. The Benchmark supports the monitoring framework of the Digital Decade Policy Programme (DDPP) by assessing the two key performance indicators (KPIs) for online public services for citizens and businesses. Using a user-centred approach, the eGovernment Benchmark applies the Mystery Shopper methodology: trained evaluators act as ordinary users when interacting with portals and webpages, generating objective, first-hand evidence on service availability and functionality. Deliverables: insight report, executive summary, background report, factsheets and method paper. Downloads: final results and a machine-readable format.

NVIDIA Developer YouTube 2026-06-17 06:54 UTC Score 62.0 AI-144-20260617-podcasts-and-91029169 Full article

Nemotron 3 Ultra and the Open Model Landscape | Nemotron Labs

Nemotron 3 Ultra is NVIDIA's latest frontier-intelligence open model: 5x faster inference, up to 30% lower cost, and fully open, with weights, training datasets, and fine-tuning recipes included. In this livestream, we're joined by Nathan Lambert, ML researcher and open model advocate, to dig into what Ultra means for developers building on open models today. We'll cover what sets Ultra apart technically (the hybrid Mamba-Transformer backbone, Multi-Teacher On-Policy Distillation (MOPD), and how it fits into a system-of-models pattern). Nathan brings a researcher's perspective on post-training for agentic systems, and we'll get into where the open frontier model landscape is heading and what it takes to build models worth building on.

What you'll learn:
- How Ultra's post-training approach compares to what the open model ecosystem has seen at scale
- What the hybrid Mamba-Transformer architecture means for long-context, multi-turn agent workflows
- How open weights, datasets, and recipes enable domain-specific fine-tuning from day one
- Where open frontier models are heading for agentic applications, and what tradeoffs matter most

Have questions about Ultra, post-training, or the open model landscape? Drop them live; Nathan and the team will answer them in real time.

Data and Society AI 2026-06-17 04:00 UTC Score 30.0 USR-0143-20260617-research-aca-1c0ec4de Full article

Greening AI in the Public Sector: An Introductory Handbook for Procurement

Based on 10 months of fieldwork and five months of co-design interviews and focus groups with city staff and green software practitioners, this guide introduces staff and officials to the rapidly developing field of greener computing and suggests entry-level, low-effort actions that can augment a standard procurement process.

Asia News Network AI 2026-06-17 02:30 UTC Score 18.0 AI-158-20260617-regional-ai--2f55a895 Full article

World Cup predictions become new battleground for AI

The 23rd edition of the World Cup, featuring 48 teams, is being hosted by the United States, Canada and Mexico. It opened on Thursday and runs through July 19.

Weaviate Blog 2026-06-17 00:00 UTC Score 23.0 USR-0073-20260617-ai-specialis-32e3e752 Full article

Weaviate Cloud is now free to start

Weaviate Cloud is now free to start across the entire product suite.

AI Weekly 2026-06-17 00:00 UTC Score 19.0 AI-133-20260617-newsletters-8ec8e640 Full article

AI Weekly Issue #504: America blocked its best AI. China just raised $7.4 billion.

Four days after Washington cut foreign access to Anthropic's top models, the fallout is clear — and it's flowing to everyone but Anthropic. Cohere says it's drowning in government inbounds, DeepSeek just closed a record $7.4B round, and China's labs are slashing token prices up to 99%. The export control meant to protect America's AI lead is fast-tracking the alternatives. Also this week: 144 poisoned npm packages turn the AI supply chain into an open credential heist.

OpenAI News 2026-06-17 00:00 UTC Score 43.0 AI-044-20260617-official-ai--916e534a

Introducing LifeSciBench

Introducing LifeSciBench, an expert-authored, expert-reviewed benchmark for evaluating how AI systems handle real-world life science research tasks and decisions.

AI Now Institute 2026-06-16 22:51 UTC Score 35.0 USR-0135-20260616-ai-specialis-01698592 Full article

AI Now is Hiring a Senior Fellow, Global Programs

The Senior Fellow, Global Programs, will lead a tightly scoped, policy-responsive workstream at the intersection of AI, industrial policy, and global political economy, building on AI Now’s existing research on AI nationalism and translating it into a directed research and policy agenda. The terrain around “AI sovereignty” is rapidly being reshaped by an aggressive US […]

LatAm Journalism Review AI 2026-06-16 22:12 UTC Score 20.0 AI-176-20260616-regional-ai--2d16d11c Full article

As Brazilian media embrace prediction markets, experts warn of election distortion

News outlets are citing unregulated betting platforms alongside survey-based polls. Critics say the markets are easily manipulated and do not reliably reflect voter sentiment.

NVIDIA Blog 2026-06-16 22:10 UTC Score 32.0 AI-055-20260616-official-ai--73f0fe71 Full article

Coherent Breaks Ground on Expanded Texas Facility, Scaling AI’s Optical Backbone

AI runs at the speed of light. More and more, that light is made in Texas. Coherent broke ground today on an expanded manufacturing building in Sherman, Texas. The company makes the lasers, optical components and compound semiconductors that wire AI systems together — and runs what it calls the world’s first 6-inch indium phosphide […]

TWIML AI Podcast 2026-06-16 22:10 UTC Score 51.0 AI-148-20260616-podcasts-and-8979913e Full article

Why AI Agents Break the GenAI Security Model with Devvret Rishi - #770

In this episode, Sam talks with Dev Rishi, GM of AI at Rubrik, about what happens when agents move beyond answering questions and start taking action across tools, systems, and business processes. We explore why the enterprise playbook of static guardrails plus human approval starts to break down in the agent era. Agents are useful because they can plan, call tools, update systems, write code, send messages, and operate across workflows at machine speed, but those same capabilities make them difficult to govern with rules written in advance or approval prompts reviewed one at a time. Dev explains why tool access increases blast radius, why agents can route around controls in surprising ways, and why human-in-the-loop review can become security theater when agents operate at scale. We also discuss what enterprises need instead: better visibility, runtime enforcement, policy-aware governance, agent observability, and recovery mechanisms for when something goes wrong. Along the way, we dig into MCP and tool sprawl, small language models for policy enforcement, defense in depth, agent rewind, and why AI may be needed to help secure AI. 🗒️ Full show notes: https://twimlai.com/go/770.

GitHub AI Blog 2026-06-16 20:58 UTC Score 29.0 USR-0061-20260616-ai-specialis-51ed7c9b Full article

What are git worktrees, and why should I use them?

Git worktrees have been around since 2015, but it wasn't until recently that they became popular. Learn what they are, how to use them, and why you might want to.

AI Alignment Forum 2026-06-16 19:55 UTC Score 67.0 USR-0151-20260616-community-fo-1b774dbe

Predicting LLM Safety Before Release by Simulating Deployment

Before releasing a new model, labs need to understand not just what it can do, but how it is likely to behave in real-world use, including where it might introduce new risks. This becomes even more important as capabilities increase. As part of our pre-deployment safety review, we leverage targeted evaluations, red-teaming, and other checks to understand model behavior. We’ve now started using a method for simulating model deployments before they happen, which adds a complementary signal: a deployment-like preview of how a candidate model may behave before it reaches users. Deployment Simulation is a method for simulating a future deployment before it happens. We do so by replaying previous conversations in a privacy-preserving manner with a new candidate model. By doing so, we can study how the new model responds in realistic contexts before release, including whether new undesired behaviors emerge and how often they may appear. In our GPT-5.4 study, these forecasts were informative. For categories whose production rates changed by at least 1.5x, deployment simulation predicted the direction of change 92% of the time, compared with 54% for a baseline built from challenging prompts. Simulated deployments also looked much closer to real production traffic on evaluation-awareness measures: traditional evals often visibly have stage lights; production prefixes mostly do not. The hardest case is agentic tool use, where realistic behavior depends on external state: fil…
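The direction-of-change figure quoted in this summary can be read as a simple agreement rate. The sketch below is one plausible way to compute such a score, not the paper's evaluation code: the function name, the tuple layout and every number are hypothetical, and rates are assumed to be strictly positive.

```python
def direction_accuracy(categories, min_ratio=1.5):
    """Share of large-change categories whose direction the simulation predicted.

    Each category is a (baseline_rate, simulated_rate, actual_rate) tuple,
    where baseline_rate is the rate observed for the previous model.
    Rates are assumed to be strictly positive.
    """
    hits = 0
    total = 0
    for baseline, simulated, actual in categories:
        ratio = actual / baseline
        # Keep only categories that moved by at least min_ratio, up or down.
        if not (ratio >= min_ratio or ratio <= 1 / min_ratio):
            continue
        total += 1
        # A hit means the simulated and actual changes point the same way.
        if (simulated > baseline) == (actual > baseline):
            hits += 1
    return hits / total if total else float("nan")


# Hypothetical rates per 10,000 conversations, invented for illustration.
example = [
    (10.0, 18.0, 20.0),  # rose 2x, simulation predicted a rise
    (8.0, 3.0, 4.0),     # fell 2x, simulation predicted a fall
    (5.0, 4.0, 9.0),     # rose 1.8x, simulation predicted a fall (miss)
    (6.0, 6.5, 6.6),     # changed 1.1x, excluded from the score
]
accuracy = direction_accuracy(example)
```

With these invented rates, three categories pass the 1.5x filter and two of them are predicted in the right direction.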

Roboflow Blog 2026-06-16 18:40 UTC Score 27.0 USR-0088-20260616-ai-specialis-f55ee605 Full article

Reduce Class Flickering: Introducing Track Class Lock

Fix class flickering on video with Track Class Lock, a Roboflow Workflow block that freezes a tracked object's label once the detector agrees.
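The general idea of locking a track's label can be sketched in a few lines. This is an illustration only and not Roboflow's implementation: the `ClassLock` class, the `required_agreement` parameter and the rule "lock after N consecutive identical labels" are all assumptions about what "once the detector agrees" could mean.

```python
from collections import defaultdict


class ClassLock:
    """Freeze a track's label after N consecutive identical detections (hypothetical rule)."""

    def __init__(self, required_agreement=3):
        self.required = required_agreement
        self.streak_label = {}               # track_id -> label of the current streak
        self.streak_len = defaultdict(int)   # track_id -> length of the current streak
        self.locked = {}                     # track_id -> frozen label

    def update(self, track_id, detected_label):
        """Return the label to display for this track on the current frame."""
        if track_id in self.locked:
            # Once locked, later disagreeing detections are ignored.
            return self.locked[track_id]
        if self.streak_label.get(track_id) == detected_label:
            self.streak_len[track_id] += 1
        else:
            # A different label restarts the streak.
            self.streak_label[track_id] = detected_label
            self.streak_len[track_id] = 1
        if self.streak_len[track_id] >= self.required:
            self.locked[track_id] = detected_label
        return detected_label


# Hypothetical per-frame detections for a single tracked object (track id 7).
lock = ClassLock(required_agreement=3)
frames = ["car", "car", "truck", "car", "car", "car", "truck", "truck"]
shown = [lock.update(7, label) for label in frames]
```

In this run the label flickers once before the lock engages on the sixth frame; the two later "truck" detections are then displayed as "car".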

Cornell AI Initiative 2026-06-16 17:06 UTC Score 30.0 USR-0014-20260616-research-aca-bad62da3 Full article

ILR School dean to help NYS shape, protect the AI workforce

Alexander Colvin, Ph.D. ’99, will serve on a blue-ribbon commission charged with developing recommendations on how New York state can protect workers’ economic security while harnessing the economic benefits of AI. The post ILR School dean to help NYS shape, protect the AI workforce appeared first on Cornell AI Initiative .

OpenAI YouTube 2026-06-16 17:00 UTC Score 42.0 AI-146-20260616-podcasts-and-07dda109 Full article

Why Tejal Patwardhan stopped underestimating the models - Episode 21

The old tests are getting too easy. Tejal Patwardhan leads OpenAI’s frontier evals team, which is finding new ways to measure and forecast progress as models become more capable. She and host Andrew Mayne discuss why evals matter for research, how benchmarks can break or get gamed, and what models need to be judged on next.

Chapters:
00:00:24 Growing up at OpenAI
00:03:10 Why reasoning changed everything
00:06:28 What made o1 surprising
00:11:20 Why old benchmarks stopped working
00:14:45 What makes a good benchmark
00:17:35 Why evals are getting harder
00:22:09 Measuring voice and vision models
00:24:48 Testing models on real science
00:33:23 How OpenAI tracks frontier progress
00:40:47 What AI means for work

Roboflow Blog 2026-06-16 16:39 UTC Score 31.0 USR-0088-20260616-ai-specialis-09dd376e Full article

Contact Lens Defect Inspection

Train a Roboflow object detection model, detect defects on each contact lens, and sort results into pass, review, and fail with a Custom Python Block.
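The pass/review/fail sorting step can be pictured as a small thresholding function. The sketch below is a plain Python function, not the Custom Python Block interface, and the `sort_lens` name, the detection dictionary layout and both thresholds are hypothetical.

```python
def sort_lens(detections, fail_conf=0.80, review_conf=0.40):
    """Sort one lens into 'pass', 'review' or 'fail' from its defect detections.

    detections: list of dicts with a 'confidence' value between 0 and 1,
    one dict per detected defect (hypothetical layout).
    """
    # The most confident defect decides the outcome; no detections means 0.0.
    top = max((d["confidence"] for d in detections), default=0.0)
    if top >= fail_conf:
        return "fail"      # a confident defect: reject the lens
    if top >= review_conf:
        return "review"    # an uncertain defect: send to a human inspector
    return "pass"          # nothing above the review threshold


# Hypothetical detections, invented for illustration.
clean = sort_lens([])
unsure = sort_lens([{"class": "scratch", "confidence": 0.55}])
bad = sort_lens([{"class": "scratch", "confidence": 0.55},
                 {"class": "tear", "confidence": 0.91}])
```

Using the single most confident detection keeps the rule easy to audit; a real line would likely tune the thresholds per defect class.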

NVIDIA Blog 2026-06-16 16:30 UTC Score 51.0 AI-055-20260616-official-ai--a8a99da0 Full article

HPE AI Factory With NVIDIA Expands for the Era of Agents

Enterprises are moving agentic AI from proof of concept to production, and the next generation of AI factories is built for the era of agents. At HPE Discover Las Vegas, running through Thursday, June 18, NVIDIA and HPE are expanding the HPE AI Factory with NVIDIA, including NVIDIA Vera CPU and NVIDIA Agent Toolkit […]

AI Now Institute 2026-06-16 16:17 UTC Score 30.0 USR-0135-20260616-ai-specialis-bcfac33d Full article

AI Now Co-Executive Director Sarah Myers West Testifies Before Senate Banking Committee

On Thursday, June 11, 2026, AI Now Co-Executive Director Dr. Sarah Myers West testified at a Hearing before the U.S. Senate Banking Committee on “AI and the American Dream: Promoting Innovation, Affordability, and American Dominance”. In her testimony, Dr. West highlighted the risks the AI industry poses to the US economy and broader public – […]

Two Minute Papers 2026-06-16 15:53 UTC Score 42.0 AI-139-20260616-podcasts-and-23a619a4 Full article

They Looked Inside Claude’s AI's Mind. It Got Weird

📝 The paper is available here: https://www.anthropic.com/research/natural-language-autoencoders https://transformer-circuits.pub/2026/nla/index.html

MLPerf / MLCommons Benchmarks 2026-06-16 14:56 UTC Score 49.0 AI-102-20260616-model-datase-9c7b49da Full article

MLCommons Releases MLPerf Training v6.0 Results

New benchmarks and increased diversity of submissions reflect important changes in the AI ecosystem.

Arize AI Blog 2026-06-16 14:00 UTC Score 46.0 USR-0079-20260616-ai-specialis-65dc26e6 Full article

What is agent orchestration? Frameworks, runtimes, and observability explained

Agent orchestration is not one problem. It spans expression, runtime, and observability, and separating those layers clarifies how teams should build, run, and improve production agents.

Gradient Flow 2026-06-16 13:00 UTC Score 40.0 USR-0119-20260616-ai-specialis-c7a54541 Full article

Your AI bill is a tax on scale

The Hybrid AI Stack Is Coming for the Pricing Power of OpenAI and Anthropic: OpenAI and Anthropic are going public while still capturing much of the money spent on foundation-model usage. But deployment patterns are starting to tell a more complicated story. Companies are building hybrid model portfolios, using proprietary models where convenience, […]

Machine Learning Mastery 2026-06-16 12:00 UTC Score 27.0 AI-039-20260616-ai-specialis-e9483392 Full article

Building an End-to-End Sentiment Analysis Pipeline with Scikit-LLM

Traditional machine learning pipelines for predictive tasks like text classification usually rely on extracting structured, numerical features from raw text — for instance, TF-IDF frequencies or token embeddings — to feed into classical models such as logistic regression, ensembles, or support vector machines.

Gradient Flow 2026-06-16 11:00 UTC Score 33.0 USR-0119-20260616-ai-specialis-b8375784 Full article

Tokenomics: AI’s New Design Constraint

The Cost Reality of Running AI at Scale: Budget shock is already happening. Multiple major players have pulled back on AI features or subscriptions due to unexpectedly high token costs. Amazon removed its token leaderboard and Microsoft cancelled Claude Code subscriptions. These are early signals that the deploy-everywhere approach is hitting hard financial limits, not […]

Stack Overflow AI Blog 2026-06-16 07:40 UTC Score 44.0 USR-0063-20260616-ai-specialis-949ca187 Full article

If context is king, architecture is the castle

Recorded live at the AI Agent Conference, Ryan sits down with Apollo GraphQL CEO Matt DeBergalis to discuss how enterprises can leverage GraphQL and MCP as a structured semantic architecture to feed clean data to autonomous agents, safeguard internal microservices against unprecedented "east-west" data exfiltration risks, and rein in skyrocketing token spend by explicitly querying only the exact context required.

Analytics Vidhya 2026-06-16 07:30 UTC Score 21.0 AI-034-20260616-ai-specialis-bc6634de Full article

Autoregressive Models: Predicting the Future Using the Past

Autoregressive models are one of the most important ideas in time series forecasting and sequence modeling. The name may sound technical at first, but the concept is surprisingly intuitive. An autoregressive model predicts the next value by looking at previous values. That is the core idea. For example, tomorrow’s temperature may depend on the temperatures […]
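The core idea can be made concrete with the simplest case, an AR(1) model, which predicts the next value as a constant plus a multiple of the most recent one. The sketch below fits that model by ordinary least squares in plain Python; the function names and the temperature series are hypothetical, and the series is constructed to follow next = 11 + 0.5 * previous exactly so the fit is easy to check by hand.

```python
def fit_ar1(series):
    """Least-squares fit of x[t+1] = c + phi * x[t]; returns (c, phi)."""
    x = series[:-1]   # previous values
    y = series[1:]    # the values that followed them
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def predict_next(series, c, phi):
    """One-step-ahead forecast from the last observed value."""
    return c + phi * series[-1]


# Made-up daily temperatures that follow next = 11 + 0.5 * previous exactly.
temps = [20.0, 21.0, 21.5, 21.75, 21.875]
c, phi = fit_ar1(temps)
forecast = predict_next(temps, c, phi)
```

Here the fit recovers a constant of 11 and a coefficient of 0.5, so the forecast for the next day is 11 + 0.5 * 21.875 = 21.9375. Real series are noisy, so the fit will be approximate, and higher-order models use several past values instead of one.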

MERICS China AI 2026-06-16 07:27 UTC Score 28.0 USR-0207-20260616-research-aca-62de34d9 Full article

MERICS Data Insight: EU-China trade

In this edition of MERICS Data Insights, MERICS Visiting Fellow Esther Goreichy looks at the European Union's trade deficit with China. She finds that the deficit is widening despite the bloc's trade defense measures.

Stack Overflow Machine Learning Tag 2026-06-16 04:01 UTC Score 34.0 AI-112-20260616-social-media-a747ea81 Full article

Uncertainty Estimation vs Oversampling

I am currently working with a fraud detection dataset as part of research on leveraging uncertainty to improve the results of a neural network ensemble of experts. First, I had to take the dataset's training data and split it into 6 different domains (5 in-distribution and 1 OOD). The goal is to train 5 different experts on different fraud patterns and then compute the uncertainty of each expert (using Monte Carlo dropout) and the uncertainty of the ensemble of experts. In fraud detection we expect the data to be heavily imbalanced in favor of the non-fraud class; in this case the ratio is 1:90. Given this, it makes sense to use oversampling when training the neural networks, and so I did. I made a sampler that increases the ratio to 1:10, not by creating new transactions (because of the data split, some domains have too few transactions to simulate reliable new ones) but by having the fraudulent transactions seen x times more often. Now, with this, I'm having a problem with the uncertainty signal. In a perfect scenario, fraudulent transactions would be more uncertain than legitimate ones, but with oversampling the experts seem to be more uncertain about the legitimate transactions than the fraudulent ones. I have tried different percentages of oversampling, and the uncertainty signal is correct only without oversampling, but then the overall predictive results underperform. So what should I do? Should I try different metho…
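For readers following the question, the arithmetic behind the sampler it describes can be sketched as follows. This is not the asker's code: the function names are hypothetical, label 1 is assumed to mean fraud, and the 91-transaction dataset is invented to match the stated 1:90 ratio.

```python
def repeat_factor(minority, majority, target_minority, target_majority):
    """How many times each minority sample must be seen to reach the target ratio."""
    current = majority / minority                # e.g. 90 legitimate per fraud
    target = target_majority / target_minority   # e.g. 10 legitimate per fraud
    return current / target


def sample_weights(labels, factor):
    """Per-sample weights: fraud (label 1) is drawn `factor` times as often."""
    return [factor if y == 1 else 1.0 for y in labels]


# Moving from 1:90 to 1:10 means each fraud case is seen 9 times as often.
factor = repeat_factor(1, 90, 1, 10)

# Invented toy dataset with 1 fraud and 90 legitimate transactions.
labels = [1] + [0] * 90
weights = sample_weights(labels, factor)
fraud_share = sum(w for w, y in zip(weights, labels) if y == 1) / sum(weights)
```

The resulting fraud share is 9 / 99, which is 1 fraud for every 10 legitimate draws. Weights of this kind are what a weighted sampler such as PyTorch's `WeightedRandomSampler` consumes. Because no new transactions are created, each expert sees the same few fraud cases repeatedly.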

Data and Society AI 2026-06-16 03:03 UTC Score 26.0 USR-0143-20260616-research-aca-40475750 Full article

AI in Science
