Is SaaS dead?
MCP comeback in works
AI/ML news, top picks, and generated innovation digests.
What AI-driven miracles will happen this year?
with Rob Lee
Position title: Project Manager. Reports to: Research Director. Location: Cape Town, South Africa (Hybrid). Duration: Full-time contract. Overview: Research ICT Africa is seeking a highly organised and proactive Project Manager to support the […] The post Vacancy: Project Manager appeared first on Research ICT Africa.
As an engineering leader, you don’t need to be told your codebase needs attention. The issue isn’t awareness – it’s the rational risk calculation that follows. For four teams, that calculation kept producing the same answer: defer. They found a way out not by avoiding the calculation, but by changing what went into it. To […]
Google unveils AI model Gemini 3.5 and AI agent Gemini Spark, Omni turns images, audio, and text into video, Musk loses OpenAI court battle
The post Fake Academic Journals Are Publishing AI-Generated Papers Under Real Professors’ Names appeared first on Data & Society.
The post AI Job Losses Are Increasing. Are Training Programs the Answer? appeared first on Data & Society.
The post Women Are Leading the Rebellion Against AI Data Centers appeared first on Data & Society.
Sign up for Qiskit Global Summer School 2026: A decade on the cloud — a free, virtual program for learning quantum computing with Qiskit.
We present HELM Arabic Enterprise, a leaderboard for transparent, reproducible evaluation of large language models on Arabic-language benchmarks designed around enterprise use cases. The leaderboard was developed in collaboration with Arabic.AI and builds on the HELM evaluation methodology: standardized prompting, fully logged requests and responses, and reproducible scoring through the open-source HELM framework.
Thank you to Google DeepMind for the invite. 🙏 ❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Charles Ian Norman Venn, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Shawn Becker, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/ Thumbnail design: https://felicia.hu 00:00 Intro 00:40 Gemini Health Scans and Gemma 4 01:30 AI as a Brainstorming Partner 02:30 Second Order Nobel 03:15 DeepMind Co-Scientist 05:00 Curing All Diseases 06:30 Exponential Growth in Drug Discovery 07:45 Regulatory Bottlenecks 09:45 Accelerating Clinical Trials 11:15 EVE Online Partnership 13:15 The Einstein Test 15:30 Recursive Self-Improvement 18:15 Lightning Round 19:30 The Badge of Honor 20:10 Behind the Scenes
Mohamad Moosavi, Assistant Professor, Chemical Engineering, University of Toronto | Vector Institute Faculty Member. The path to breakthrough climate technologies often moves at a frustrating pace. Consider metal-organic frameworks – […] The post Mohamad Moosavi: Accelerating the search for climate solutions with AI appeared first on Vector Institute for Artificial Intelligence.
GovAI's Annual Report 2024.
GovAI's Annual Report 2025.
Boo 👻 it's the ghost token
Team Sakura, including Adam Nohejl, Postdoctoral Researcher, and Hitomi Yanaka, Team Director of the Explainable AI Team, won the Open Track of the BEA 2026 Shared Task “Vocabulary Difficulty Prediction for English Learners,” held in conjunction w
Over the weekend: Musk, Zuckerberg, and Sacks killed Trump's draft AI safety executive order in three Wednesday-night phone calls. Anthropic closed a $30B+ round the same Saturday — while Microsoft quietly cancelled its internal Claude Code pilot after token billing ate the entire annual AI budget, redirecting developers to Copilot. CISA logged 15,000 attacks on a same-week Drupal SQL flaw. The first cross-registry supply chain attack — TrapDoor — hit npm, PyPI, and Crates.io at once, using .cursorrules and CLAUDE.md config files as the carrier. And the White House personally overrode the Pentagon to keep Claude inside the NSA.
45 papers have been accepted at the International Conference on Machine Learning (ICML) 2026, a major conference on Artificial Intelligence (July 6-11, 2026, Seoul, South Korea). For more details, please refer to the link below.
“Humans were not put on this earth to maintain Excel models.”
Short note on a DeepSeek Sparse Attention from-scratch implementation added to the LLMs-from-scratch repository.
Humans are angry
I am developing a Flutter app called Talk to Deaf, which aims to enable real-time two-way communication between deaf and hearing users. The app will let hearing users input text or voice, and the deaf user will respond in sign language, which the app will convert back into text or speech. I am unsure which type of dataset to use for training my machine learning model: a dataset of individual letters (A-Z) or a dataset of complete words and phrases. I want to ensure accurate and smooth communication. Which type of dataset would be more suitable for building a robust real-time sign language interpreter, and what are the trade-offs of each approach? Any guidance on dataset selection or best practices for training a model for this type of two-way communication app would be highly appreciated.
This talk by Cline's Ara Khan explains why they went from "evals are useless" to using them as a core part of their agent improvement loop. It shares practical heuristics for interpreting, running, and creating evals, and explains why doing them anyway is better than pure "vibes".
The importance of independent evaluation
“Budget” and “financials” are different words, but embeddings understand they’re related. That’s the foundation behind semantic search and one of the core building blocks of modern multimodal systems. Learn how embeddings power retrieval across text, audio, images, and video in Building Multimodal Data Pipelines: https://hubs.la/Q04hJ9w10
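To make the "related but different words" idea concrete, here is a minimal sketch of cosine similarity, the comparison most semantic search systems apply to embeddings. The three-dimensional vectors below are hand-written toy values chosen for illustration, not output from any real embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings"; a real model would produce hundreds of dimensions.
toy_embeddings = {
    "budget": [0.9, 0.8, 0.1],
    "financials": [0.8, 0.9, 0.2],
    "giraffe": [0.1, 0.0, 0.9],
}

related = cosine_similarity(toy_embeddings["budget"], toy_embeddings["financials"])
unrelated = cosine_similarity(toy_embeddings["budget"], toy_embeddings["giraffe"])

print(f"budget vs financials: {related:.3f}")
print(f"budget vs giraffe:    {unrelated:.3f}")
```

A retrieval system would rank stored items by this score against the query vector, so "financials" documents surface for a "budget" query even though the strings share no characters.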
This "Values & Interests" panel discussion, held in partnership with PBS and moderated by acclaimed journalist Ann Curry, is available to view in full.
By Levi Boxell, Tilman Drerup, and Alexandr Lenk. The Economics Team at Instacart is an applied science team that operates at the intersection of machine learning engineering and economics. Similar to other applied science teams, our work involves a good chunk of engineering, steeped in statistics, math, theory, and strategy. And while that is still at the heart of what we do today, the surprisingly rapid emergence of artificial intelligence has also fundamentally altered our work in ways that we did not see coming. With this post, we want to provide a brief check-in and share an analysis of the patterns we are seeing from a distinctly economic perspective. To do so, we analyze the empirical dynamics of our project portfolio between 2023 and today, looking at the evolution of both the nature and quantity of our work over time. To start, let’s have a quick refresher of what economists at Instacart do and provide a theoretical framework to think about the impact of technological change through AI. Background & Theoretical Framework: At Instacart, economists spend their day-to-day on a diverse portfolio of tasks and activities. Similar to other applied science teams within the company, our work relies on a blend of skills, including economics, statistics, math, machine learning, data manipulation, coding, and AI. Due to this versatility in tasks, the team’s work provides a particularly rich testing ground for predictions derived from economic theories concerning the impact of technologi…
AI agents fail in unpredictable ways that traditional testing can't catch — hallucinations, wrong tool calls, policy violations, and more. Teams only discover these failures after users hit them in production. A simulation sandbox gives you a controlled environment with realistic users, tools, and workflows where you can run hundreds of scenarios against your agent before it ships, catching edge cases and adversarial inputs that would be impossible to test manually. This talk by Veris AI's Andi Partovi covers why simulation-driven development is becoming essential infrastructure for any team building production AI agents, and how it closes the gap between "works in demos" and "works at scale."
Modern enterprises don't struggle to experiment with AI — they struggle to operationalize it reliably. In this talk, CrewAI's CEO outlines how leading organizations are moving beyond one-off automations to build recurring, governed, and deeply embedded workflows that drive real business outcomes. Drawing on lessons from production deployments, João explores how to design systems that are auditable, scalable, and aligned with enterprise controls — without sacrificing speed.
From centralized to distributed: In the old world, organizations relied on one centralized data and AI platform. In the new world of AI agents, every agent needs its own sandboxed, secure, and modern data stack. In this 20-minute talk with live demo by Spice AI's Luke Kim, he explores why this architectural shift is critical and the key patterns required to give agents reliable, real-time data.
The next major shift in enterprise AI is underway: enterprises are moving from generic AI they rent to specialized AI they own. The benefits are clear: higher quality, dramatically lower costs, full control, and a quality improvement flywheel while in production. But building specialized AI models has been prohibitively hard; each use case requires months of effort and deep AI expertise. Well, it used to. VibeML is enabling engineers to build specialized AI models automatically from a prompt, in minutes. An AI agent builds your AI model end-to-end: evaluation, data synthesis, training, and repeat. This talk by OUMI's Manos Koukoumidis & Stefan Webb demonstrates how VibeML can give deep AI experts superpowers while enabling non-experts as well.
At AI Dev 26 x San Francisco, Flower Lab's Daniel Beutel talked about Flower SuperGrid, the industry standard for Federated AI. With SuperGrid Agents, you can now build and run context-rich agents that learn from interactions, access sensitive data and (soon) collaborate with other SuperGrid Agents.
Most agentic systems rely on hardcoded heuristics to navigate execution decisions (e.g. which models, tools, and test-time compute scaling approaches to use) leading to efficiency leakage across cost, latency and accuracy. AI21 Maestro optimizes agents by learning to predict success, cost and latency probabilities across diverse actions and contexts, and driving runtime orchestration that intelligently navigates the full agentic action space. In this session, AI21's Or Dagan demonstrated how this approach yields state-of-the-art results and Pareto frontier on challenging agentic benchmarks, as well as the process required to optimize production agents.
In this talk by Zencoder's Andrew Filev, attendees learned how decomposing tasks into pipelines and dynamically routing them across models improves quality, reduces cost, and makes AI systems more reliable.
Building your first agent is exciting. Building a platform that can evolve into an office where dozens of teams can safely deploy their own agents is a different beast entirely. In this talk, Diamond Bishop from Datadog shared lessons learned building production agents, then turning this into an agent office/platform made to power the next-gen enterprise with diverse agent workloads.
The 2026 JournalismAI Skills Lab is a 14-week, free, virtual program designed for professionals to learn how to practically implement LLMs, GenAI and agents in their work. The programme helps individuals upskill in using AI technologies in a hands-on manner. It equips participants to develop their own AI-based tools, prototypes or proofs-of-concept. The ultimate outcome […] The post Latin American journalists invited to apply for 2026 JournalismAI Skills Lab appeared first on LatAm Journalism Review by the Knight Center.
More code, fewer staff — the industry is on a bender. But what about quality? At AI Dev 26 x San Francisco, Paul Everitt from JetBrains discussed the rise of agentic engineering and how old lessons can be adapted to build new professional practices.
A generation is being told AI is their enemy. And they’re starting to believe it.
Introducing Mistral Medium 3.5, remote coding agents in Vibe, plus new Work mode in Le Chat for complex tasks.
Connect enterprise data to your AI applications with reusable connectors, direct tool calling, and human-in-the-loop approval controls.
The public sector moves slowly by design. That might actually help it get AI right.
PETs offer U.S. critical-infrastructure AI a path beyond patchwork security. Why Attribution-Based Control should be the standard. The post Moving Fast Doesn’t Have to Break Things: The U.S. Must Stop Compromising Critical Infrastructure with Patchwork AI Security Approaches appeared first on OpenMined.
I am developing a spatiotemporal tree-based ensemble framework (utilizing LightGBM, XGBoost, and CatBoost) to forecast dengue outbreaks based on climate variables (temperature, precipitation, humidity) and lagged historical case counts. While tree-based algorithms are theoretically invariant to monotonic feature scaling, I am implementing scaling primarily because: I am calculating SHAP (Shapley Additive Explanations) values for post-hoc model interpretability and global feature importance. I am applying forward aggregation across temporal slices to prevent data leakage, meaning the range and variance of features dynamically shift across training validation windows. I am debating between StandardScaler (Z-score normalization) and MinMaxScaler (0-1 normalization). Given the spatiotemporal and epidemiological nature of the data, StandardScaler appears to behave more robustly, but I want to ensure my architectural justification is sound. Here is a minimal visualization of how the choice impacts extreme climate outliers (e.g., a massive monsoon rainfall anomaly): import numpy as np import pandas as pd from sklearn.preprocessing import MinMaxScaler, StandardScaler # Simulating a climate feature with a severe anomaly (monsoon spike) np.random.seed(42) weekly_rainfall = np.random.normal(loc=150, scale=30, size=100) weekly_rainfall = np.append(weekly_rainfall, [650]) # Extreme outlier event df = pd.DataFrame({"Rainfall": weekly_rainfall}) # Applying both scalers df["MinMax"] = MinMa…
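The comparison the question sets up can be sketched independently of the truncated snippet above. The following is not the poster's code: it is a minimal illustration, assuming a synthetic rainfall series of 100 typical weeks plus one 650 mm anomaly, of how each scaler places typical values relative to the outlier. The variable names and the use of `np.random.default_rng` are choices made for this sketch.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic weekly rainfall: 100 typical weeks plus one extreme monsoon spike.
rng = np.random.default_rng(42)
typical = rng.normal(loc=150, scale=30, size=100)
rainfall = np.append(typical, 650.0).reshape(-1, 1)  # scalers expect 2-D input

minmax = MinMaxScaler().fit_transform(rainfall).ravel()
zscore = StandardScaler().fit_transform(rainfall).ravel()

# MinMax pins the outlier at 1.0 and squeezes typical weeks into a narrow band.
print("MinMax   typical range:", minmax[:-1].min(), "to", minmax[:-1].max())
print("MinMax   outlier:      ", minmax[-1])

# Z-scores keep typical weeks near 0 and express the outlier in standard deviations.
print("Standard typical range:", zscore[:-1].min(), "to", zscore[:-1].max())
print("Standard outlier:      ", zscore[-1])
```

In this sketch the single anomaly defines the MinMax upper bound, so the typical weeks occupy only a fraction of the 0-1 interval, and that fraction changes whenever a training window does or does not contain such an event. With StandardScaler the anomaly still inflates the fitted standard deviation, so it is less sensitive to one extreme value than MinMaxScaler but not immune to it.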
❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers 📝 The paper is available here: https://github.com/ailuntx/Thinking-with-Visual-Primitives https://huggingface.co/datasets/NodeLinker/deepseek-ai-Thinking-with-Visual-Primitives-deleted-repo/blob/main/Thinking_with_Visual_Primitives.pdf Our Patreon if you wish to support us: https://www.patreon.com/TwoMinutePapers 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Charles Ian Norman Venn, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Shawn Becker, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi My research: https://cg.tuwien.ac.at/~zsolnai/ Thumbnail design: https://felicia.hu #deepseek
Dependency prefixes like ^ and ~ make updates easy, but the version ranges they create widen the path a compromised package can take into production.
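To show how wide those ranges are, here is a small sketch of the npm-style rules for `^` and `~` on plain `major.minor.patch` versions. It is a simplified illustration rather than a full semver implementation: pre-release tags, partial versions such as `^1.2`, and other range operators are deliberately ignored, and the helper names are made up for this example.

```python
from typing import Tuple

Version = Tuple[int, int, int]

def parse(version: str) -> Version:
    """Parse 'major.minor.patch' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)

def upper_bound(spec: str) -> Version:
    """Exclusive upper bound of a '^x.y.z' or '~x.y.z' range."""
    prefix = spec[0]
    major, minor, patch = parse(spec[1:])
    if prefix == "~":
        return (major, minor + 1, 0)      # patch-level updates only
    if prefix == "^":
        if major > 0:
            return (major + 1, 0, 0)      # anything below the next major
        if minor > 0:
            return (0, minor + 1, 0)      # 0.y.z: anything below the next minor
        return (0, 0, patch + 1)          # 0.0.z: that exact patch only
    raise ValueError(f"unsupported range prefix: {prefix!r}")

def satisfies(version: str, spec: str) -> bool:
    """True if `version` falls inside the range described by `spec`."""
    if spec[0] not in "^~":
        return parse(version) == parse(spec)  # exact pin
    return parse(spec[1:]) <= parse(version) < upper_bound(spec)

for spec in ("1.2.3", "~1.2.3", "^1.2.3"):
    accepted = [v for v in ("1.2.3", "1.2.9", "1.9.0", "2.0.0") if satisfies(v, spec)]
    print(f"{spec:>7} accepts {accepted}")
```

Under these rules an exact pin accepts one release, `~1.2.3` accepts every later 1.2.x patch, and `^1.2.3` accepts every later 1.x release, so a malicious 1.9.0 would be installable under the caret range without any change to the manifest unless a lockfile holds the resolved version.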
Streaming vision-language models (VLMs) continuously generate responses given an instruction prompt and an online stream of input frames. This is a core mechanism for real-time visual assistants. Existing VLM frameworks predominantly assess models in offline settings. In contrast, the performance of a streaming VLM depends on additional metrics beyond pure video understanding, including proactiveness, which reflects the timeliness of the model’s responses, and consistency, which captures the robustness of its responses over time. To address this limitation, we propose VSAS-Bench, a new…