are live models making a comeback?
AI/ML news, top picks, and generated innovation digests.
Vector Institute awards 100 scholarships to Ontario’s top AI graduate students. TORONTO, May 12, 2026 – Today, the Vector Institute awarded scholarships to 100 exceptional graduate students pursuing studies across […] The post Vector Institute awards 100 scholarships to Ontario’s top AI graduate students appeared first on Vector Institute for Artificial Intelligence.
A deep dive on Modal's deep tech for fast boots.
Masashi Sugiyama, Center Director, delivered an ELLIS Distinguished Lecture at Aalto University in Finland on May 11, 2026. Title: Machine Learning from Imperfect Information: Foundations of Robust Intelligence in the Era of Foundation Model
Sapu is an early-stage biopharmaceutical company developing treatments for hard-to-treat cancers. From its San Diego facility, the team is pioneering a nanomedicine pipeline that takes existing FDA-approved drugs and re-engineers them at the nanoscale, making them smaller, more effective, and less toxic. Building on already-approved compounds gives Sapu a stronger and faster path to therapeutic success in an industry where most candidates never reach patients. Behind the lab work sits an AI tooling suite that does the reading, searching, and synthesis that would otherwise take researchers thousands of hours. Sapu’s internal AI platform supports research paper authorship, references standard operating procedures, and lets the team query its document corpus with the precision biotech R&D requires. As the company grew, so did the volume of documents, the variety of use cases, and the demands placed on the underlying retrieval infrastructure.
Big testimony is expected this week in a trial that's already produced major revelations. Here's what to look out for + the biggest news so far.
At a time of global pressure on journalism, the advance of artificial intelligence (AI) and on the eve of Brazil’s 2026 elections, the 3i Festival is returning for its seventh edition with discussions on the challenges facing digital journalism and the future of information. The event will take place May 29-31 at Porto Maravalley in […] The post Tickets now available for Brazil’s 3i Festival 2026 on journalism innovation appeared first on LatAm Journalism Review by the Knight Center.
As enterprise AI agent adoption scales, the absence of centralized, organization-level tool infrastructure is producing compounding costs. When adoption is built around optimizing for deployment speed, enterprises expose themselves to a combination of risks: duplicated engineering effort, security exposure, and operational opacity. Every enterprise needs its own shared tool registry, one that reflects its specific regulatory environment, security posture, and operational conventions. To be clear, this is not an argument for a public package manager, something like npm, PyPI, or Maven. The infrastructure each enterprise needs is internal; scoped to its own teams, its own data, its own policies, its own domain. Trying to expand the scope beyond the confines of individual organizations would be premature standardization in a fast-moving, nascent space. A shared enterprise tool registry is not an optimization or a nice-to-have. It is foundational infrastructure as agent deployments scale beyond early experiments. The case for it rests on two pillars: reducing coordination cost and enabling risk management, both for the humans building with agents and for the agents themselves. AI agents depend on tools that retrieve data, write records, trigger workflows, and call external APIs. According to McKinsey, in most large organizations, these tools are built by individual teams in an ad hoc fashion: undocumented, ungoverned, and invisible to the rest of the organization. This pattern i…
We’re excited to announce that the Early Access Program (EAP) for ReSharper and .NET Tools 2026.2 is now underway! While our EAP announcements usually cover a wide range of new features, performance updates, and bug fixes, this release is different. We are dedicating this first preview entirely to a singular, game-changing initiative: bringing true AI […]
Learn how Pinecone full-text search uses BM25 scoring and Lucene syntax for exact match, boolean, and phrase queries — and how to combine it with vector search.
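For readers unfamiliar with BM25, the following is a minimal sketch of the textbook Okapi BM25 formula with a Lucene-style IDF term. It is not Pinecone's implementation: the function name `bm25_scores`, the whitespace tokenizer, the sample documents, and the parameter values `k1=1.2` and `b=0.75` (common defaults) are all assumptions made for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with the textbook BM25 formula.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Lucene-style IDF, which stays non-negative for common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Made-up sample corpus.
docs = [
    "vector search finds semantically similar passages",
    "full-text search matches exact terms and phrases",
    "hybrid search combines full-text and vector results",
]
scores = bm25_scores("full-text search", docs)
```

Documents that contain both query terms score higher than the one that contains only "search". A hybrid setup would then combine a lexical score like this with a vector-similarity score; how the two are combined is left open here.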
What laws does superintelligence demand?
The ELLIS Institute Finland and RIKEN AIP Joint Workshop was held on May 11, 2026 at Aalto University in Finland. Researchers from both sides participated and engaged in active discussions on AI and machine learning. For more information, please se
The ELLIS Unit Milan and RIKEN AIP Joint Workshop was held on May 7-8, 2026, at the University of Milan. Over the two days, researchers from both sides participated both in person and online, and engaged in active discussions on AI a
Qdrant 1.18.0 is out! Let’s look at the main features for this version: TurboQuant: A new quantization method that, at twice the compression ratio of scalar quantization, delivers similar recall and speed. Memory Monitoring: Inspect a collection’s disk, RAM, and page cache usage broken down by component (vectors, payload, indexes, and more) via a new Web UI view and API endpoint. Adding and Removing Named Vectors: Add or remove named vectors to an existing collection’s schema without having to recreate it.
Artificial Analysis uses Ai2’s open IFBench eval because it captures a stubborn, real-world capability many benchmarks miss: whether models can reliably follow complex, multi-part user instructions.
Summary: In February–April 2026, we ran a survey of 349 technical workers (including 87 software engineers, 71 researchers, 129 academics and PhD students, and 48 founders and managers) about their usage of AI tools. Compared to previous work, our survey is one of the more detailed surveys of technical workers’ self-reported gains from frontier AI tools. We attempt to capture gains due to AI in terms of ‘value’ (how much more value are you creating with AI), rather than ‘speed’ (how long would it have taken you to do these tasks without AI). These can give different answers in principle, in particular if using AI changes the distribution of tasks you work on. For example, researchers could use AI to quickly build an interactive dashboard for their data, which would have taken significantly longer without AI but isn’t that important for their project. We provide more detail on the distinction between value and speed gains in our previous research. We think that the distinction between ‘value’ and ‘speed’ gains is important because value is closer to the idea that survey designers typically care about, whereas our sense is that it is common for respondents to think in terms of speed, and we expect that speed changes would typically overstate value changes. See methodological details here. Participants self-reported a median 1.4–2x change in the value in their work due to AI tools. The median self-reported speed change (which we expect to be higher than value change) is 3x.…
Image captioning is one of the most fundamental tasks in computer vision. Owing to its open-ended nature, it has received significant attention in the era of multimodal large language models (MLLMs). In pursuit of ever more detailed and accurate captions, recent work has increasingly turned to reinforcement learning (RL). However, existing captioning-RL methods and evaluation metrics often emphasize a narrow notion of caption quality, inducing trade-offs across core dimensions of captioning. For example, utility-oriented objectives can encourage noisy, hallucinated, or overlong captions that…
This is 100 Years From Now. Once a week we skip a century and try to picture what life actually looks like when the stuff we're building now has had time to settle in. This week: the last vote.
They have some challenges 😅
By Adam Wolf. Zero trust is an architectural principle, not a product. It means assuming breach, verifying every connection explicitly, and granting the minimum access required for each interaction. This post covers how those principles apply to Kubernetes AI infrastructure and specifically how ClearML’s security model slots into each layer: network segmentation, workload identity, access […]
Volume group snapshots were introduced as an Alpha feature with the Kubernetes v1.27 release, moved to Beta in v1.32, and to a second Beta in v1.34. We are excited to announce that in the Kubernetes v1.36 release, support for volume group snapshots has reached General Availability (GA). The support for volume group snapshots relies on a set of extension APIs for group snapshots. These APIs allow users to take crash-consistent snapshots for a set of volumes. Behind the scenes, Kubernetes uses a label selector to group multiple PersistentVolumeClaim objects for snapshotting. A key aim is to allow you to restore that set of snapshots to new volumes and recover your workload based on a crash-consistent recovery point. This feature is only supported for CSI volume drivers. An overview of volume group snapshots: Some storage systems provide the ability to create a crash-consistent snapshot of multiple volumes. A group snapshot represents copies made from multiple volumes that are taken at the same point-in-time. A group snapshot can be used either to rehydrate new volumes (pre-populated with the snapshot data) or to restore existing volumes to a previous state (represented by the snapshots). Why add volume group snapshots to Kubernetes? The Kubernetes volume plugin system already provides a powerful abstraction that automates the provisioning, attaching, mounting, resizing, and snapshotting of block and file storage. Underpinning all these features is the Kubernetes goal of workl…
New EU reporting rules for data centre energy and water use may look like progress, but loopholes risk undermining genuine environmental accountability.
Overview of adaptive parallel reasoning. What if a reasoning model could decide for itself when to decompose and parallelize independent subtasks, how many concurrent threads to spawn, and how to coordinate them based on the problem at hand? We provide a detailed analysis of recent progress in the field of parallel reasoning, especially Adaptive Parallel Reasoning. Disclosure: this post is part landscape survey, part perspective on adaptive parallel reasoning. One of the authors (Tony Lian) co-led ThreadWeaver (Lian et al., 2025), one of the methods discussed below. The authors aim to present each approach on its own terms. Motivation: Recent progress in LLM reasoning capabilities has been largely driven by inference-time scaling, in addition to data and parameter scaling (OpenAI et al., 2024; DeepSeek-AI et al., 2025). Models that explicitly output reasoning tokens (through intermediate steps, backtracking, and exploration) now dominate math, coding, and agentic benchmarks. These behaviors allow models to explore alternative hypotheses, correct earlier mistakes, and synthesize conclusions rather than committing to a single solution (Wen et al., 2025). The problem is that sequential reasoning scales linearly with the amount of exploration. Scaling sequential reasoning tokens comes at a cost, as models risk exceeding effective context limits (Hsieh et al., 2024). The accumulation of intermediate exploration paths makes it challenging for the model to disambiguate amon…
EMO is a new mixture-of-experts model trained so modular expert groups emerge from data, enabling users to select small task-specific expert subsets while preserving near full-model performance.
We reviewed the “Risks from automated R&D” section of Anthropic’s February 2026 Risk Report, producing two corresponding review documents: our original review and our updated review. We recommend that readers refer to our original review, which represents our review of the report as originally received. The following is the executive summary of our original review. The full documents are available as PDFs (original, updated). Executive summary: This document is METR’s external review of the “Risks from automated R&D” section in the Anthropic Risk Report: February 2026 (henceforth ‘the report’), which makes the argument that catastrophic risk from Claude Opus 4.6 or a less capable Anthropic model automating R&D in any domain is very low. Anthropic shared additional non-public materials with us for our review, and we used some non-public information shared as part of a previous review. We further detail this process in an appendix. We lay out our findings in two sections: (1) Synopsis of Anthropic’s case. (2) Our assessment: We do not think the report adequately supports its conclusion. We note significant issues in a few key areas: Analytical rigor: We have a number of significant issues with the analytical rigor in the overall argument and interpretation of the results of the model use survey. We think that the cited results of the survey provide little evidence about the level of overall risk, due to issues including sample size, question granularity, survey framing, and…
Summary: We describe three different definitions of the productivity impact of AI (AKA uplift), and show there’s reason to expect: \[\text{uplift on old tasks} \leq \text{uplift in value} \leq \text{uplift on new tasks}\] Three Measures of Uplift One complication in measuring AI’s effect on productivity is that it has different effects on different tasks, and this causes people to change how they allocate their time between tasks. This makes it more difficult to talk about the effect of AI on overall productivity. We use “old tasks” to mean the set of tasks you’d do in a typical day before AI is available – your average workday in 2021, say. “New tasks” means the set of tasks you’d do in a typical day after AI is available. Not all new tasks necessarily use AI; they’re just the tasks you choose knowing AI is an option. We have found it important to distinguish between three measures of AI’s uplift: Uplift on old tasks: The factor by which pre-AI time exceeds post-AI time to complete the old tasks. Uplift on new tasks: The factor by which pre-AI time exceeds post-AI time to complete the new tasks. Uplift in value: The factor by which post-AI value exceeds pre-AI value, allowing for reshuffling of tasks between the pre-AI and post-AI cases. In some cases value has a natural definition; in others, it can be operationalized using related definitions discussed more in the accompanying note. This note discusses the distinction and its implications for interpreting AI productivity…
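The two time-based measures defined above can be made concrete with a small worked example. This is only a sketch: the task names, per-task hours, and task bundles are invented numbers, and uplift in value is left out because it needs a value model that the summary does not specify.

```python
# Hypothetical hours needed per unit of each task, without and with AI.
HOURS_PRE_AI = {"coding": 4.0, "dashboard": 6.0, "meetings": 2.0}
HOURS_POST_AI = {"coding": 2.0, "dashboard": 1.0, "meetings": 2.0}

# Hypothetical task bundles: units of each task done in a typical day.
OLD_TASKS = {"coding": 1, "meetings": 2}                  # chosen before AI existed
NEW_TASKS = {"coding": 1, "dashboard": 2, "meetings": 2}  # chosen knowing AI is an option

def time_uplift(bundle):
    """Factor by which pre-AI time exceeds post-AI time for a task bundle."""
    pre = sum(units * HOURS_PRE_AI[task] for task, units in bundle.items())
    post = sum(units * HOURS_POST_AI[task] for task, units in bundle.items())
    return pre / post

uplift_old = time_uplift(OLD_TASKS)  # 8 hours pre-AI vs 6 hours post-AI
uplift_new = time_uplift(NEW_TASKS)  # 20 hours pre-AI vs 8 hours post-AI
```

In this toy case the new bundle shifts toward dashboards, the task AI speeds up most, so uplift on new tasks (2.5) exceeds uplift on old tasks (about 1.33), matching the direction of the inequality in the summary.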
An interview with Takayuki Osa, Team Director of the Robot Learning Team, was published in Nikkei Tech Foresight on May 7, 2026. The article introduces recent advances in robot
Data from 1,281 agent runs across 40+ large open source repos reveals five repeatable failure patterns in coding agents, and the infrastructure fixes for each.
At Apple, we believe privacy is a fundamental human right. As AI capabilities increase and become more integrated into people’s daily lives, advancing research in privacy-preserving techniques is increasingly important to ensure privacy is protected while users enjoy innovative AI experiences. Apple’s fundamental research has consistently pushed the state-of-the-art in this domain, and earlier this year, we hosted the Workshop on Privacy-Preserving Machine Learning & AI. This two-day event brought together Apple researchers and members of the broader research community to discuss the…
Current critic-less RLHF methods aggregate multi-objective rewards via an arithmetic mean, leaving them vulnerable to constraint neglect: high-magnitude success in one objective can numerically offset critical failures in others (e.g., safety or formatting), masking low-performing “bottleneck” rewards vital for reliable multi-objective alignment. We propose Reward-Variance Policy Optimization (RVPO), a risk-sensitive framework that penalizes inter-reward variance during advantage aggregation, shifting the objective from “maximize sum” to “maximize consistency.” We show via Taylor expansion…
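The constraint-neglect problem described above can be illustrated with a toy comparison. The sketch below is not the paper's RVPO objective, which the truncated abstract does not spell out; it assumes the simplest possible variance penalty (mean minus `lam` times the population variance), and the function names, `lam`, and the reward vectors are hypothetical.

```python
from statistics import mean, pvariance

def mean_aggregate(rewards):
    """Plain arithmetic-mean aggregation of per-objective rewards."""
    return mean(rewards)

def variance_penalized_aggregate(rewards, lam=1.0):
    """Mean reward minus a penalty on the spread across objectives.

    Assumed form for illustration only, not the RVPO formula.
    """
    return mean(rewards) - lam * pvariance(rewards)

# Two hypothetical responses scored on three objectives,
# e.g. helpfulness, formatting, safety.
lopsided = [1.0, 1.0, 0.1]   # strong on two objectives, fails the third
balanced = [0.7, 0.7, 0.7]   # consistent across all objectives

mean_gap = mean_aggregate(lopsided) - mean_aggregate(balanced)
penalized_lopsided = variance_penalized_aggregate(lopsided)
penalized_balanced = variance_penalized_aggregate(balanced)
```

Under the arithmetic mean the two responses are indistinguishable (both about 0.7), so the failure on the third objective is masked. With the variance penalty the lopsided response drops to about 0.52 while the balanced one stays at 0.7.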
We propose HeadsUp, a scalable feed-forward method for reconstructing high-quality 3D Gaussian heads from large-scale multi-camera setups. Our method employs an efficient encoder-decoder architecture that compresses input views into a compact latent representation. This latent representation is then decoded into a set of UV-parameterized 3D Gaussians anchored to a neutral head template. This UV representation decouples the number of 3D Gaussians from the number and resolution of input images, enabling training with many high-resolution input views. We train and evaluate our model on an…
In this episode, Scott Clark, co-founder and CEO of Distributional, joins us to explore how teams can reliably operate and improve complex LLM systems and agents in production. Scott introduces a Maslow’s hierarchy of observability: telemetry for logging, monitoring for known signals, and post-production or online analytics to surface unknown unknowns. We dig into examples of real-world failures Scott’s team has seen in production systems, such as “lazy” tool-use hallucinations that standard evals miss, and how mapping traces into vector fingerprints enables clustering and topic discovery to uncover emergent behaviors. Scott explains how analytics can feed the data flywheel by generating evals, guardrails, and training data, and why online, adaptive approaches are essential for non-stationary models. We also touch on practical how-to’s such as instrumentation with OpenTelemetry, the GenAI semantic conventions, and the role of dedicated analytics tools. The complete show notes for this episode can be found at https://twimlai.com/go/767.
A knowledge engine is the data infrastructure category that lets agents query trusted, compiled knowledge instead of brute-forcing retrieval over raw data. How one is built, how agents query it, and how it compares to RAG, vector databases, and semantic layers.
CSET’s Julie George shared her expert perspective in an op-ed published by Bulletin of the Atomic Scientists. In the piece, she argues that while the Defense Department’s decision to narrow its list of critical technologies is a positive step, the Pentagon must also improve how it prioritizes and funds emerging technologies to address overlooked capability gaps and strengthen long-term military innovation. The post Beyond AI: What the Pentagon is missing with its trimmed ‘critical technologies’ list appeared first on Center for Security and Emerging Technology.
Read our translation of a Chinese government plan that calls for making data more plentiful and accessible in industries such as manufacturing, agriculture, transportation, finance, scientific research, and healthcare. The post Three-Year Action Plan for “Data Factor of Production ×” appeared first on Center for Security and Emerging Technology.
Dynamic Resource Allocation (DRA) has fundamentally changed how platform administrators handle hardware accelerators and specialized resources in Kubernetes. In the v1.36 release, DRA continues to mature, bringing a wave of feature graduations, critical usability improvements, and new capabilities that extend the flexibility of DRA to native resources like memory and CPU, and support for ResourceClaims in PodGroups. Driver availability continues to expand. Beyond specialized compute accelerators, the ecosystem includes support for networking and other hardware types, reflecting a move toward a more robust, hardware-agnostic infrastructure. Whether you are managing massive fleets of GPUs, need better handling of failures, or simply looking for better ways to define resource fallback options, the upgrades to DRA in 1.36 have something for you. Let's dive into the new features and graduations! Feature graduations The community has been hard at work stabilizing core DRA concepts. In Kubernetes 1.36, several highly anticipated features have graduated to Beta and Stable. Prioritized list (stable) Hardware heterogeneity is a reality in most clusters. With the Prioritized list feature, you can confidently define fallback preferences when requesting devices. Instead of hardcoding a request for a specific device model, you can specify an ordered list of preferences (e.g., "Give me an H100, but if none are available, fall back to an A100"). The scheduler will evaluate these requests in…
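The fallback behaviour of a prioritized list can be sketched in a few lines. This is a plain-Python illustration of ordered fallback, not the Kubernetes scheduler or the DRA API; the function `pick_device` and the inventory dictionary are hypothetical.

```python
def pick_device(preferences, available):
    """Return the first preferred device model with free capacity, else None.

    preferences: device model names, most preferred first.
    available:   mapping of model name to count of free devices.
    """
    for model in preferences:
        if available.get(model, 0) > 0:
            return model
    return None

# Hypothetical cluster inventory: every H100 is busy, some A100s are free.
inventory = {"H100": 0, "A100": 4}
choice = pick_device(["H100", "A100"], inventory)
```

Here `choice` is `"A100"`, because the first preference has no free devices. If no listed model is available the function returns `None`, which in this sketch stands for a request that cannot be satisfied yet.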
Lessons from my trip to talk to most of the leading AI labs in China.
The post Inside Porsche Cup Brasil’s AI-powered race operations appeared first on Source.
Today, we announced at .local London that MongoDB 8.3 is built for the speed AI demands, and our customers can't afford to wait. The data layer has to move at AI speed. The old contract between databases and the applications on top of them was simple: databases improve slowly, and architectures evolve around them. AI has changed that contract. The workloads our customers are shipping today (agents retrieving at sub-100ms, retry storms hitting in milliseconds, multi-region deployments that can't trade compliance for latency) were edge cases 18 months ago. Now they're the baseline. MongoDB 8.3, generally available today, is our fourth significant release in 19 months. These releases compound. Customers running on 8.0 have seen 36% faster reads and 59% higher throughput for updates. 8.3 adds another 35% to write throughput, 45% to reads, and 15% to ACID transactions over 8.0, without changing a line of application code. Enterprises like Adobe, running the most demanding AI in production, have made the requirements clear: sub-100ms retrieval, sub-second context updates, zero downtime. That's what MongoDB Atlas is built for. That's the commitment: when the data platform keeps pace, our customers can focus on shipping. Run anywhere. Stay secure. Where you run your agents isn't just an infrastructure decision anymore. Now, it's a critical compliance and security decision as well. While most platforms force a trade-off between global…
The post Submission on amendments to the Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules, 2021 (“IT Rules, 2021”) appeared first on Access Now.
How MLCommons engineered a stable, accessible Mixture-of-Experts (MoE) pretraining benchmark for MLPerf Training v6.0 that runs on a single 8-GPU node. The post GPT-OSS 20B: A Sparse MoE Pretraining Benchmark for MLPerf Training v6.0 appeared first on MLCommons.
Free ChatGPT got instantly better.
China resumes fuel exports + US oil sanctions + Emission quotas for local cadres (MERICS China Essentials, May 07, 2026; posted by c.groth). Top Story: China resumes fuel exports as national supply worries ebb – and regional ones rise. China is moving to prevent the worst for Asian economies by resuming exports of jet and motor fuels to some regional countries in May. Having suspended shipments from refineries shortly after the US and Israel attacked Iran at the end of February, China will allow 500,000 metric tons of fuel to be exported this month. This is still much lower than its pre-war average of more than double that amount, but a sign that Beijing’s persistent caution about its own energy supply is ebbing – and that its worries about compounding pressure on regional supply chains and markets are increasing. China is a major importer of oil and gas, but a major exporter of fuel, with its many refineries providing gasoline, diesel, and jet fuel to countries from nearby Vietnam to far away Australia. Asian economies were deeply disrupted by the energy shock triggered by the closure of the Strait of Hormuz, a critical global energy artery between Iran and Oman, and were hit again when Beijing stopped shipments of its refined petroleum products. China’s partial reversal should help ease the fuel crunch…
In this fully connected episode, Dan and Chris break down one of the biggest questions in AI today: do open vs. closed models still matter? From the rise of physical AI and edge devices to the shifting landscape of open-source models like LLaMA, they explore whether the “model wars” are becoming irrelevant. The conversation then dives into a bigger transformation, the rise of agentic systems, workflows, and AI-driven infrastructure. Featuring: Chris Benson and Daniel Whitenack. Upcoming events: Midwest AI Summit 2026.
Ai2 is bringing NSF OMAI compute online to power a fully open AI research ecosystem, turning national infrastructure investment into reusable models, data, methods, and tools that can accelerate scientific discovery.
Full text search in Pinecone, built for agents and RAG. Lucene queries, BM25, 17-language tokenization, and text-match filters in a single query alongside vectors.