Latest open artifacts (#22): Zyphra, Cohere, and Poolside are expanding the breadth of the ecosystem
An assessment of the open ecosystem and the motivations behind releasing models
AI/ML news, top picks, and generated innovation digests.
19 articles tagged with this keyword, sorted by most recent first.
Work conducted with Yujun Zhou (yzhou25@nd.edu) and supported by SPAR. TL;DR: In paired-choice paradigms, LLMs report consistent preferences over outcomes (e.g., types and number of lives saved, types of policies enacted). Some have suggested that this indicates that LLMs have human-like value systems. We design an experimental framework where LLMs are able to modulate their output quality based on prompt context. We find that LLMs modulate their output quality in response to effort exhortations, role-play instructions, and harmfulness cues, but NOT to opportunities to achieve the outcomes they report preferring in the paired-choice experiments. We suggest that paired-choice paradigms do not provide evidence that LLMs have human-like (i.e., behavior-motivating) value systems, and that our paradigm offers a way to measure the degree to which LLMs have desires. Paper describing the work in detail here. LLMs report that they prefer some things to others. In paired-choice experiments, where they are repeatedly presented with two options and asked to select the one that they prefer, coherent utility structures emerge: LLMs consistently report preferring certain types of things, and their choices reveal the ability to make quantitative tradeoffs between things and exhibit transitivity (e.g., if they choose A over B and B over C, they will also choose A over C). Human choices exhibit the same properties, which has led some to conclude that LLMs have goals, value systems, and even…
State of the Digital Decade 2026 - Factsheet (dumimar, Wed, 06/17/2026 - 10:22). This factsheet outlines the key findings of the 2026 State of the Digital Decade Report. It highlights the progress the EU has made towards the 2030 targets and names three key points to help the EU's digital transformation move forward: Scale (coordination and cofinancing across Member States and EU instruments); Speed (implementation, simplification, recalibration of policy); and Coherence (simultaneous deployment and uptake of strategic technologies). It also summarises the main concerns for Europeans in 2026, based on the Special Eurobarometer survey.
Four days after Washington cut foreign access to Anthropic's top models, the fallout is clear — and it's flowing to everyone but Anthropic. Cohere says it's drowning in government inbounds, DeepSeek just closed a record $7.4B round, and China's labs are slashing token prices up to 99%. The export control meant to protect America's AI lead is fast-tracking the alternatives. Also this week: 144 poisoned npm packages turn the AI supply chain into an open credential heist.
AI runs at the speed of light. More and more, that light is made in Texas. Coherent broke ground today on an expanded manufacturing building in Sherman, Texas. The company makes the lasers, optical components and compound semiconductors that wire AI systems together — and runs what it calls the world’s first 6-inch indium phosphide […]
Short note on North Mini Code, Cohere's 30B total and 3B active open-weight MoE model for agentic coding tasks.
Modern large language models (LLMs) extend context lengths to millions of tokens, enabling coherent, personalized responses grounded in long conversational history. However, the Key-Value (KV) cache grows linearly with the extended dialogue history, causing the model’s memory footprint to quickly exceed device limits. While recent KV cache compression methods attempt to reduce memory usage, most apply cache eviction after processing the entire context, incurring unbounded peak memory usage. Additionally, query-dependent eviction narrows the cache semantics to a single query, leading to failure…
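The linear growth of the KV cache described in the abstract above can be made concrete with a little arithmetic. The sketch below is illustrative only: the layer count, number of KV heads, head dimension, and 2-byte (fp16) element size are hypothetical values chosen for the example, not figures from the paper, and `kv_cache_bytes` is a made-up helper name.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=8, head_dim=128,
                   bytes_per_elem=2):
    """Estimate KV cache size for one sequence.

    All default values are hypothetical (a generic mid-sized model in fp16),
    not taken from the paper.
    """
    # Factor 2: every layer stores one key and one value vector per token.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len


per_token_bytes = kv_cache_bytes(1)  # 131072 bytes under these assumptions

# Memory grows in direct proportion to the number of cached tokens.
for tokens in (1_000, 100_000, 1_000_000):
    gib = kv_cache_bytes(tokens) / 2**30
    print(f"{tokens:>9} tokens -> {gib:.2f} GiB")
```

Under these assumed settings each cached token costs 128 KiB, so a history of a million tokens would need well over 100 GiB for the cache alone, which is the kind of footprint the abstract says exceeds device limits.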
TL;DR LLM evals, automated judges that assess relevance, coherence, and quality at scale, are a powerful new... The post Better Experiments with LLM Evals — A funnel, not a fork appeared first on Spotify Engineering .
By combining State-Space Models (SSMs) for efficient long-range dependency modeling with dense local attention for coherence, and using training strategies like diffusion forcing and frame local attention, researchers from Adobe Research successfully overcome the long-standing challenge of long-term memory in video generation. The post Adobe Research Unlocking Long-Term Memory in Video World Models with State-Space Models first appeared on Synced.
Here are eight observations I’ve shared recently on the Cohere blog and videos that go over them: Article: What’s the big deal with Generative AI? Is it the future or the present? Article: AI is Eating The World
Discovering systematic errors with cross-modal embeddings In this blog post, we introduce Domino, a new approach for discovering systematic errors made by machine learning models. We also discuss a framework for quantitatively evaluating methods like Domino. Links: 📄 Paper (ICLR 2022) 🌍 Longer Walkthrough 💻 GitHub 📘 Docs 📒 Google Colab Machine learning models that achieve high overall accuracy often make systematic errors on coherent slices of validation data. What is a slice? A slice is a set of data samples that share a common characteristic. As an example, in large image datasets, photos of vintage cars comprise a slice (i.e. all images in the slice share a common subject). The term slice has a number of synonyms that you might be more familiar with (e.g. subgroup, subpopulation, stratum). These terms are largely interchangeable, but we’ll stick with “slice” throughout this post. We say that a model underperforms on a slice if performance on the data samples in the slice is significantly worse than its overall performance. The search for underperforming slices is a critical, but often overlooked, part of model evaluation. When practitioners are aware of the slices on which their models underperform, they can make more informed decisions around model deployment. This is particularly important in safety-critical settings like medicine: a diagnostic model that underperforms on younger patients should likely not be deployed at a pediatric hospital. Slice awareness can also he…
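The definition of an underperforming slice given above can be sketched in a few lines. This is not Domino itself, which discovers slices automatically; it only checks slices that are already labelled. The function name `underperforming_slices`, the toy data, and the fixed `margin` standing in for "significantly worse" are all assumptions made for illustration.

```python
from collections import defaultdict


def underperforming_slices(labels, preds, slice_ids, margin=0.1):
    """Return overall accuracy and the slices that trail it by more than `margin`.

    `margin` is a hypothetical stand-in for "significantly worse"; a real
    evaluation would more likely use a statistical test.
    """
    correct = [int(y == p) for y, p in zip(labels, preds)]
    overall = sum(correct) / len(correct)

    # Group per-sample correctness by the slice each sample belongs to.
    per_slice = defaultdict(list)
    for c, s in zip(correct, slice_ids):
        per_slice[s].append(c)

    flagged = {}
    for s, hits in per_slice.items():
        acc = sum(hits) / len(hits)
        if overall - acc > margin:
            flagged[s] = acc
    return overall, flagged


# Toy data: the model is right on every "modern" photo and wrong on both "vintage" ones.
labels = [1, 1, 0, 0, 1, 0, 1, 1]
preds = [1, 1, 0, 0, 1, 0, 0, 0]
slices = ["modern"] * 6 + ["vintage"] * 2

overall, flagged = underperforming_slices(labels, preds, slices)
print(overall, flagged)
```

In the toy data the overall accuracy looks acceptable while the "vintage" slice fails completely, which is the situation the post warns that aggregate metrics can hide.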
A little less than a year ago, I joined the awesome Cohere team. The company trains massive language models (both GPT-like and BERT-like) and offers them as an API (which also supports finetuning). Its founders are Google Brain alums, including co-authors of the original Transformers paper. It’s a fascinating role where I get to help companies and developers put these massive models to work solving real-world problems. I love that I get to share some of the intuitions developers need to start problem-solving with these models. Even though I’ve been working very closely on pretrained Transformers for the past several years (for this blog and in developing Ecco), I’m enjoying the convenience of problem-solving with managed language models, as it removes the burden of model loading/deployment and memory/GPU management. These are some of the articles I wrote and collaborated on with colleagues over the last few months: Intro to Large Language Models with Cohere: This is a high-level intro to large language models for people who are new to them. It establishes the difference between generative (GPT-like) and representation (BERT-like) models and gives example use cases for them. This is one of the first articles I got to write. It's extracted from a much larger document that I wrote to explore some of the visual language to use in explaining the application of these models. A visual guide to prompt engineering: Massive GPT models open the door for a new way of programming. If yo…
When a neural network layer is divided into multiple branches, neurons self-organize into coherent groupings.