RAG
198 articles tagged with this keyword, sorted by most recent first.
Bajaj Finserv Ventures leads $10 Mn pre Series B round in Kapture CX
Verticalized full stack agentic AI platform Kapture CX has raised $10 million in a pre Series B funding round led by Bajaj Finserv Ventures (BFSV), part of Bajaj Finserv, with participation from its existing investors Cactus Venture Partners and India Alternatives. Prior to this, the Bengaluru based company had secured $4 million led India Alternatives extended Series A round in December 2023 and $4 million in a Series A round led by Cactus Venture Partners (CVP) in July 2023. The fresh proceeds will be utilized for expansion into multiple global markets and continued investment in R&D and product development, Kapture CX said in a press release. Co-founded in 2014 by Sheshgiri Kamath and Vikas Garg, Kapture CX is a verticalized, full stack agentic AI platform built to orchestrate high stakes workflows for large enterprises. Through its deep tech capabilities, it brings AI agents, operational intelligence, and human oversight into one system, allowing enterprises to run complex operations at scale. Kapture CX said that enterprises face a fragmented market with point products from multiple providers, making AI adoption a high effort exercise. According to the company, enterprises need a full stack agentic AI platform that understands industry specific requirements and delivers customized solutions for complex workflows. This is the gap Kapture aims to address. By owning and optimizing the full technology stack, from the models to the agentic layer and the user interface, Kaptu…
CHANI: Correlation-based Hawkes Aggregation of Neurons with bio-Inspiration
The present work aims at proving mathematically that a neural network inspired by biology can learn a classification task thanks to local transformations only. In this purpose, we propose a spiking neural network named CHANI (Correlation-based Hawkes Aggregation of Neurons with bio-Inspiration), whose neurons activity is modeled by Hawkes processes. Synaptic weights are updated thanks to an expert aggregation algorithm, providing a local and simple learning rule. We were able to prove that our network can learn on average and asymptotically. Moreover, we demonstrated that it automatically produces neuronal assemblies in the sense that the network can encode several classes and that a same neuron in the intermediate layers might be activated by more than one class, and we provided numerical simulations on synthetic datasets. This theoretical approach contrasts with the traditional empirical validation of biologically inspired networks and paves the way for understanding how local learning rules enable neurons to form assemblies able to represent complex concepts.
A Symplectic Analysis of Alternating Mirror Descent
Motivated by understanding the behavior of the Alternating Mirror Descent (AMD) algorithm for bilinear zero-sum games, we study the discretization of continuous-time Hamiltonian flow via the symplectic Euler method. We provide a framework for analysis using results from Hamiltonian dynamics and symplectic numerical integrators, with an emphasis on the existence and properties of a conserved quantity, the modified Hamiltonian (MH), for the symplectic Euler method. We compute the MH in closed-form when the original Hamiltonian is a quadratic function, and show that it generally differs from the other conserved quantity known previously in the literature. We derive new error bounds on the MH when truncated at orders in the stepsize in terms of the number of iterations, $K$, and use these bounds to show an improved $\mathcal{O}(K^{1/5})$ total regret bound and an $\mathcal{O}(K^{-4/5})$ duality gap of the average iterates for AMD. Finally, we propose a conjecture which, if true, would imply that the total regret for AMD scales as $\mathcal{O}\left(K^{\varepsilon}\right)$ and the duality gap of the average iterates as $\mathcal{O}\left(K^{-1+\varepsilon}\right)$ for any $\varepsilon>0$, and we can take $\varepsilon=0$ upon certain convergence conditions for the MH.
DCatalyst: A Unified Accelerated Framework for Decentralized Optimization
We study decentralized optimization over a network of agents, modeled as an undirected graph and operating without a central server. The objective is to minimize a composite function $f+r$, where $f$ is a (strongly) convex function representing the average of the agents' losses, and $r$ is a convex, extended-value function (regularizer). We introduce DCatalyst, a unified black-box framework that injects Nesterov-type acceleration into decentralized optimization algorithms. At its core, DCatalyst is an inexact, momentum-accelerated proximal scheme (outer loop) that seamlessly wraps around a given decentralized method (inner loop). We show that DCatalyst attains optimal (up to logarithmic factors) communication and computational complexity across a broad family of decentralized algorithms and problem instances. In particular, it delivers accelerated rates for problem classes that previously lacked accelerated decentralized methods, thereby broadening the effectiveness of decentralized methods. On the technical side, our framework introduces inexact estimating sequences--an extension of Nesterov's classical estimating sequences, tailored to decentralized, composite optimization. This construction systematically accommodates consensus errors and inexact solutions of local subproblems, addressing challenges that existing estimating-sequence-based analyses cannot handle while retaining a black-box, plug-and-play character.
China racks up advantages from Iran energy crisis
China’s relative insulation from supply and price shocks will “deepen economic dependencies” that Beijing could leverage for geopolitical advantage, a new report argued.
I always keep these 3 devices plugged into my power station - here's why
Here's how to leverage your power station's capabilities when it's not during an emergency.
WSJ Article Claiming China Has Matched Anthropic Is Obvious Nonsense
The Wall Street Journal printed an outright false headline and heavily misleading story claiming this, which of course was uncritically amplified by the usual suspects. I post this now on its own so that we have a place to link to, to explain the situation. Headline News WSJ Headline (Obvious Nonsense): China Has Matched Anthropic in Cybersecurity, Resetting AI Race. That. Did. Not. Happen. The post even claims, explicitly, that Claude Opus 4.8 similarly ‘matches’ Claude Mythos, a claim which is even more obviously false. Shame upon the Wall Street Journal. I fear Gell-Mann Amnesia. If they can get something as important as this so completely wrong, what about everything else? I am skipping over the parts that involve accurate reporting, or minor quibbles. It seems important to focus on clearly debunking the central false claims. Alas, the mistakes made here very much rhyme with mistakes being made throughout all this by the White House, and that get latched onto by certain bad actors, who have played a large part in leaving us unprepared for the Mythos Moment. For a full understanding of GLM-5.2, which is indeed an impressive open model, here is my full coverage of that release , placing it in proper context. It is important to understand what makes Mythos special. This is not it. What Makes Mythos Special What makes mythos special is not that only the chosen one can identify any given vulnerability in code. What makes Mythos special is that it can identify vulnerabilities…
The Role of Static Code Analysis in Fintech Compliance
What’s at stake with every commit Security incidents in financial services are both frequent and costly. The average data breach in the financial industry cost USD 6.08 million in 2024. That includes incident response, customer notification, legal work, and reputational damage, but it doesn’t include the months of engineering that went into writing well-intentioned but […]
'Extremely unusual' heat with 'no end' in sight, says Copernicus director
Carlo Buontempo, Copernicus Climate Change service director, says heatwaves have become "more intense, lasting longer and starting earlier in the season" in Europe as the continent "is warming faster than the global average".
I went to the World Cup opener at SoFi Stadium in Los Angeles as a first-time attendee. I'd do it again in a heartbeat.
I went to my first World Cup game, the US opener against Paraguay in Los Angeles' SoFi Stadium. It was worth it, especially from my suite seats.
Your RAG Pipeline Is Probably Useless. Here’s a Better Alternative
Learn what to reach for when retrieval-augmented generation fails in production.
The Future of Retail Marketing Is First-Party Data
This post was created in partnership with Sam’s Club Retail media that looks beyond ROAS and leverages customer insights can unlock long-term value and customer loyalty. During an ADWEEK House […]
How Marketers Can Deliver Outcomes Across Fragmented Journeys
This post was created in partnership with Teads Key takeaways Audience journeys have become fragmented and unpredictable, making it harder for marketers to reach, attribute, and measure consumers across channels. […]
Kemi Badenoch tries out her Andy Burnham attack lines
The Tory leader — and her rivals in Nigel Farage's Reform UK — are still working out the best way to swipe at the PM-in-waiting.
KI-Engpass: Google kann Metas Nachfrage nach Gemini nicht mehr decken
Die Nachfrage nach KI-Rechenleistung übersteigt selbst bei den größten Tech-Konzernen das Angebot. Meta ist wohl besonders betroffen und muss intern umsteuern.
The great cloud rebalance
For years, the enterprise narrative focused on moving to the public cloud for flexibility and leaving behind old infrastructure. While the public cloud remains a powerful platform for burst capacity, global reach, and modern application development, leaders now evaluate where each workload can achieve the best financial performance, operational efficiency, and risk. Cloud repatriation is back on the CIO’s agenda. Cloud repatriation does not always mean dragging workloads back into a company-owned data center. In many cases, enterprises are moving applications and data from hyperscale public cloud platforms into colocation environments, hosted private clouds , or MSP-operated infrastructure. The common thread is not nostalgia for on-premises IT. It is the desire for a more suitable workload placement. Enterprises are deciding that some systems belong in public cloud while others are better served in environments with more predictable economics, tighter control, and fewer architectural compromises. Cost is the loudest signal The most common reason enterprises repatriate workloads is cost. Public cloud pricing works extremely well when demand is variable, when teams need rapid provisioning, or when a business wants to avoid upfront capital spending. But not every enterprise workload behaves that way. Many core systems are steady, always-on, data-intensive, and relatively predictable. For those workloads, usage-based pricing can become less attractive over time. Compute charges,…
Snupit Leverages Locally Built AI to Connect South Africans With Trusted Service Providers
Artificial intelligence continues to reshape industries worldwide, and South Africa’s online services sector is no exception, with AI increasingly used to help customers find the right businesses faster and more efficiently. Online services marketplace Snupit has been leveraging AI-powered technology for years to improve how customers connect with trusted local service professionals. While many companies [...]
GraphRAG vs Vector RAG: Which Retrieval Method is Best?
GraphRAG and Vector RAG address different retrieval needs. Vector RAG splits documents into chunks, embeds them, retrieves semantically similar passages, and sends them to an LLM. It is simple, fast to build, and works best when answers sit within one or two relevant chunks. GraphRAG adds structure by extracting entities, relationships, and communities, making it […] The post GraphRAG vs Vector RAG: Which Retrieval Method is Best? appeared first on Analytics Vidhya .
Here's what will happen to Rhaenyra Targaryen on 'House of the Dragon,' if it follows her fate in the book
Rhaenyra Targaryen (Emma D'Arcy) has finally become queen in "House of the Dragon." How will she die in the show? Here's what happens in the book.
The 13 saddest deaths in 'House of the Dragon,' ranked
"House of the Dragon" doesn't shy away from killing major characters. But some deaths — like Jace's in season three — are more tragic than others.
7 deaths in 'House of the Dragon' that were completely changed from the book
"House of the Dragon" season three has already killed major characters, including one whose death was described differently in "Fire & Blood."
Paraguay eye 'life-changing' Germany World Cup upset
Paraguay eye 'life-changing' Germany World Cup upset
Anthropomorphic Misalignment research needs stronger evidence
This is a distillation of our ICML 2026 Oral position paper, Position: Anthropomorphic Misalignment Research Needs Stronger Evidence . Joint work by Vansh Gupta, Peter Nutter, Samuel Stante, Andreas Krause, Florian Tramèr, Lukas Fluri, Xin Chen, and Anna Hedström at ETH Zurich. Code is here . TL;DR AI safety research increasingly studies behaviors that sound human: deception, scheming, sycophancy, shutdown resistance, and emergent misalignment. We refer to this family of work as anthropomorphic misalignment research (AMR) . Anthropomorphic language is useful, as it points to the risks we are worried about. Yet it also tacitly introduces assumptions about models having intent or other human-like properties, which can lead to misclassified phenomena, mistaken conclusions, and misallocated resources. These behaviors are important to study, but doing so requires stronger and more rigorous evidence than the field currently provides. In the paper, we argue that AMR requires a clearer match between claims and evidence. Specifically, we: describe a shared AMR pipeline: target behavior framing, data construction, experimental design, and causal or mechanistic attribution; identify recurring failure points: vague concepts, narrow datasets, fragile evaluations, unreliable LLM judges, missing controls, and correlation being treated as causation; propose three evidence levels: L1 behavioral evidence, L2 functional evidence, and L3 causal-mechanistic evidence; offer 12 recommendations and…
Swiss glaciers melting at alarming rate in June as Europe faces extreme heat
All snow and ice accumulated over the past winter is expected to have melted by Monday. Over the past century, this tipping point usually only arrives in mid-August on average.
Govee’s smart nugget ice maker makes every iced drink feel like a luxury
Govee says that the modern-design gadget delivers nugget ice in as little as six minutes.
Feature Request: Bring Project-Scoped Retrieval to ChatGPT
Background ChatGPT has evolved from a conversational assistant into a tool that many users rely on for long-term projects, including software development, research, writing, game development, and creative work. Today, responses are generated primarily from: Shared long-term memory Current conversation history User-provided prompts and uploaded files This works well for general conversations, but becomes increasingly difficult for large, long-lived projects. The Current Problem Long-term memory is shared across all projects. When users switch between unrelated projects, memories from previous work may unintentionally influence responses. To avoid this, users must repeatedly search through their own documents, retrieve the relevant information, and paste it into every new conversation. Effectively, users become the retrieval layer in the RAG pipeline—acting as “organic RAG hardware.” The issue is not the context window size. The issue is that project knowledge exists, but ChatGPT cannot automatically retrieve it. Existing Precedent OpenAI has already demonstrated the value of project-aware retrieval through tools such as Codex. Instead of relying solely on conversation history, these tools understand an entire project by retrieving only the files relevant to the current task. This greatly improves long-term collaboration without requiring extremely large context windows. Proposed Solution Extend this idea beyond software development. Allow every ChatGPT project to own a dedica…
Nest’s quest to fix your thermostat
The founding story of Nest is pretty much a perfect tech myth. A legendary product maker (in this case, Tony Fadell) helps create one of the most successful products ever (the iPhone) and then rides off into the sunset to enjoy the rest of his life, only to have an experience that drags him back […]
Portuguese bank sign's storage is about to cash out
Time to switch back to paper and harvest that suddenly valuable RAM
Building a Stable Fable 5 Traces Workflow in Colab: Parsing Tool Calls, Auditing Data, and Training Baselines
In this tutorial, we build a stable workflow around the Fable 5 Traces dataset from Hugging Face. We avoid fragile dependencies and manually parse the merged JSONL file to keep Colab reliable. We inspect repository files, normalize tool calls, audit structure, redact secrets, and visualize key distributions. We also export safe no-CoT chat datasets and train pure-Python Naive Bayes baselines on the traces. The post Building a Stable Fable 5 Traces Workflow in Colab: Parsing Tool Calls, Auditing Data, and Training Baselines appeared first on MarkTechPost .
US and Iran trade strikes amid claims of ceasefire violations
The US also carried out strikes on Iran on Friday after a drone attack on the M/V Ever Lovely ship. US aircraft targeted Iranian missile and drone storage locations as well as coastal radar sites, CENTCOM said.
Margaret Atwood says the problem with AI is ‘garbage in, garbage out’
Margaret Atwood, the storied author of The Handmaid's Tale and The Blind Assassin, was interviewed as part of the Babell Literary and Cultural Festival in Porto, Portugal. As it usually does at these things, the issue of AI came up, and Atwood didn't mince words. According to Deadline's recap, Atwood said she'd used an AI […]
Apple wants permission to buy memory from a blacklisted Chinese supplier
Apple is looking to alleviate some of the pressure on its supply chain by seeking an exception from the Trump administration to buy RAM chips from CXMT, a company blacklisted by the Pentagon over ties to the People's Liberation Army, according to the Financial Times. The skyrocketing prices of RAM and storage have driven Apple […]
Comment on Does Fitness Data Make the Average Person Healthier? by Barbara Campbell
This is a very relevant point about fitness trackers. The data itself is useful, but it only really matters when it is turned into personal guidance that fits someone’s actual health situation. A simple step goal can motivate one person but be completely wrong for another, especially with medical limits. That idea of meaningful feedback also reminds me of Speed Stars - where progress feels useful because it is tied to timing, repetition, and personal improvement rather than just numbers on a screen.
Some of the best Amazon Prime Day SSD and storage deals are still live - Samsung, WD, and more
I track SSD deals, and found huge markdowns from top brands for Amazon Prime Day.
J.P. Morgan sees a pile of red flags in the AI market
J.P. Morgan warns that there are "signs of investor exuberance" in AI markets. Just 42 AI companies in the S&P 500 account for 65 to 80 percent of the index's total profits. The semiconductor rally is flashing technical patterns last seen during the dotcom bubble, and leveraged chip ETFs have quintupled their market influence since early 2024. The bank sees multiple layers of concentration risk across markets, infrastructure, and the economy. The article J.P. Morgan sees a pile of red flags in the AI market appeared first on The Decoder .
Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on
New models are launching in Asia that promise Mythos-like capabilities without fear of an export ban. U.S. AI labs may never recover this enormous market.
Anthropic’s Mythos 5 is back
After a rollercoaster negotiation process with the Trump administration that dragged on for two weeks, Anthropic's Mythos 5 is finally back in action - at least, somewhat, for a select group of organizations, according to a letter from the government to Anthropic that was viewed by The Verge. Fable 5, however - the public-facing Mythos-class […]
Universal Is Skipping Influencer Screenings for ‘The Odyssey.’ Film Critics Are Thrilled
With discourse about Christopher Nolan’s epic already raging, first viewer reactions will come from mainstream press, not content creators. Some critics are gloating over this break from the norm.
Deployment Awareness Matters More Than Evaluation Awareness
TL;DR Evaluation awareness — an AI recognizing it's being evaluated — is a widely discussed concept in AI safety. But there is a closely related concept that we claim is more important: deployment awareness , the AI's ability to recognize when it is not being evaluated and when its actions matter. A misaligned AI with deployment awareness can game evaluations without any evaluation awareness at all, with a simple strategy: act aligned by default, and deviate only when confident you're in real deployment and your actions matter for your goals. This requires two ingredients — occasionally recognizable deployment situations, and enough self-reflective and strategic reasoning for the AI to anticipate and plan around this. We think "deployment awareness" better identifies what makes evaluations fragile, and we develop this idea below. Concept Explanation Comments Evaluation awareness AI is being tested and confidently believes that this is so This only becomes a problem if most evaluations trigger evaluation awareness, and if the AI knows that. Or if the AI has good self-locating reasoning. Deployment awareness AI is not being tested and confidently believes it is not being tested This is a problem even if it happens rarely (if some of those rare cases are high stakes). Accurate self-locating beliefs AI has (roughly correct) beliefs about the sequence of situations it will face This allows for strategic planning. It makes deployment awareness and probabilistic strategies more eff…
‘Botsitting’: The AI time-savings killer only governance can stop
One of AI’s biggest selling points is all the high-value tasks employees will be free to accomplish with the time saved using AI. Reality, however, remains far from that. While IT workers and other employees do save several hours each week thanks to AI, more than half of that time is burned up babysitting the technology, a new study reveals. According to a survey from the Work AI Institute , digital workers save an average of 11 hours a week through AI, but the net time savings is much less, because they spend 6.4 hours a week “botsitting.” Botsitting involves activities such as feeding AI tools missing context, checking AI outputs, debugging AI mistakes , rerunning prompts, and cleaning up the confident-but-wrong answers they leave behind, as defined by the Work AI Institute, a research group founded by AI copilot and search provider Glean. The botsitting problem is real, several IT leaders agree, and it has serious implications for IT organizations. In many cases, organizations aren’t training their employees to effectively use AI, says Tal Carmi , CIO at digital adoption platform provider WalkMe. WalkMe’s 2026 State of Digital Adoption report found similar results, with employees losing nearly eight hours a week to botsitting, Carmi notes. At the same time, most employees use AI for shallow tasks like writing emails because they don’t trust it for more complex activities, WalkMe found. As a result, enterprises aren’t getting the full ROI of their AI purchases, Carmi says,…
Open source maintainership in the age of AI
AI has really changed the game around software development. More people are leveraging AI than ever to contribute patches to projects they use. To me, this is a good thing as more folks will contribute patches rather than fork or not fix them. The main problem is that AI has made generating code fast but there has been very little improvement in maintaining code bases. In this post, we will highlight the ways the Kubernetes community is adapting to the world of AI assisted coding. The first step of this journey was to develop an AI policy. This seems mundane and bureaucratic but there were many PRs that derailed into discussions around AI usage. The AI policy helps steer the conversation around the project's stance on AI and provides a clear signal to contributors on how to use these tools responsibly. Kubernetes AI policy The Kubernetes project has established clear guidelines for AI-assisted contributions that balance innovation with accountability. These policies are designed to maintain code quality and ensure human oversight while acknowledging that AI tools can be valuable aids in the development process. Transparency first Contributors must disclose when AI tools have been used to assist with a pull request. A simple statement in the PR description such as "This PR was written in part with the assistance of generative AI" is sufficient. This transparency helps reviewers understand the context and apply appropriate scrutiny. Human accountability While AI tools can assi…
You can’t build sovereign infrastructure with Broadcom, says CISPE
Broadcom’s claims that it can support European cloud service providers building competitive sovereign solutions are exaggerated, according to the Cloud Infrastructure Service Providers Europe (CISPE). The US company is promoting its VMware Cloud Foundation (VCF) software as the enabling technology for the European Union’s Sovereign Cloud, but Broadcom is not the solution to Europe’s technology sovereignty problems, according to CISPE secretary-general Francisco Mingorance . “VCF is a proprietary product with limited interoperability and substitutability, controlled by a foreign vendor that has behaved like a bully towards customers and channel partners. If Europe needs an example of the dangers of over-reliance on dominant overseas players, Broadcom is it,” Mingorance said, according to a post on CISPE’s website . CISPE has cited several reasons why VCF doesn’t fit the bill, in particular highlighting its lack of portability. This means that it doesn’t qualify as resilient under CISPE’s Sovereign and Resilient Cloud Framework . Earlier this month, the EU unveiled proposals for its Cloud and AI Development Act (CADA) to strengthen Europe’s digital economy. CADA will encourage investment in European research, lay down conditions for European data centers, and provide a single EU-wide assessment framework for cloud and AI sovereignty. CISPE said that Broadcom is a long way short of fulfilling the conditions proposed for CADA. Broadcom would fail to meet anything but a Level 1 c…
heise+ | KI-Überwachung in der Straßenbahn: Mehr Sicherheit im Fahrgastraum?
Aggression frühzeitig erkennen: Die Bremer Straßenbahn AG testet ein KI-Überwachungssystem. Ein Videobeitrag zu „AI Watch“.
Water Cooler Small Talk, Ep. 11: Overfitting in RAG evaluation
Why memorizing for the exam doesn't mean you understand the subject The post Water Cooler Small Talk, Ep. 11: Overfitting in RAG evaluation appeared first on Towards Data Science .
pgEdge joins rush to merge OLTP and OLAP storage to support AI
For years, enterprises have maintained separate systems for processing transactional (OLTP) and analytical (OLAP) data, even if that meant moving data between them. However, the rise of autonomous agents and AI applications needing immediate access to data while generating volumes of operational data themselves, has exposed the cost and complexity of maintaining those separate systems. The industry’s response has been quick, with data warehouse and database vendors proposing a wave of competing approaches to collapsing those data silos. In the past few weeks Databricks unveiled LTAP and EDB introduced converged analytics , while late last year Snowflake launched pg_lake , all of which offer different blueprints for bringing transactional, analytical and AI workloads closer together. Now it’s the turn of distributed PostgreSQL provider pgEdge, which has introduced a beta version of ColdFront , a PostgreSQL-native hot-and-cold data tiering architecture that automatically moves older data into Apache Iceberg object storage while keeping PostgreSQL as the only database that applications need to interact with. In ColdFront’s architecture, hot and cold refer to newer and older data, respectively. The approach of keeping PostgreSQL as the primary interface is what sets ColdFront apart from the other architectures emerging in this space, differing in where the center of gravity for data lies, according to analysts. Databricks’ LTAP keeps operational applications connected to a lakeh…
How Cara pioneers domain-specific AI for enterprise insurance brokerages with AWS
In this post, we explore how Cara, built in cooperation with AWS, addresses these challenges. We walk through the technical design decisions and the AWS services that support the solution. We also share measurable outcomes Cara has delivered for enterprise brokerages.
pgEdge joins rush to merge OLTP and OLAP storage to support AI
For years, enterprises have maintained separate systems for processing transactional (OLTP) and analytical (OLAP) data, even if that meant moving data between them. However, the rise of autonomous agents and AI applications needing immediate access to data while generating volumes of operational data themselves, has exposed the cost and complexity of maintaining those separate systems. The industry’s response has been quick, with data warehouse and database vendors proposing a wave of competing approaches to collapsing those data silos. In the past few weeks Databricks unveiled LTAP and EDB introduced converged analytics , while late last year Snowflake launched pg_lake , all of which offer different blueprints for bringing transactional, analytical and AI workloads closer together. Now it’s the turn of distributed PostgreSQL provider pgEdge, which has introduced a beta version of ColdFront , a PostgreSQL-native hot-and-cold data tiering architecture that automatically moves older data into Apache Iceberg object storage while keeping PostgreSQL as the only database that applications need to interact with. In ColdFront’s architecture, hot and cold refer to newer and older data, respectively. The approach of keeping PostgreSQL as the primary interface is what sets ColdFront apart from the other architectures emerging in this space, differing in where the center of gravity for data lies, according to analysts. Databricks’ LTAP keeps operational applications connected to a lakeh…
Schöne neue Suchwelt: Warum Google und Co. bald für KI-Fehler haften sollen
Wer fragt, kriegt fertige KI-Antworten statt Linklisten. Laut einem Rechtsgutachten für die Landesmedienanstalten sollen die Tech-Riesen dafür voll einstehen.
Anthropic’s Mythos mess is only getting worse
It's been two weeks since Anthropic took its Mythos-class models offline after a Friday evening ultimatum from the Trump administration. The company sprang into action immediately, sending a barrage of executives to Washington, DC. But updates have been suspiciously lacking, with no resolution in sight. Anthropic declined to comment multiple times this week about the […]
Amplify the Expert: A Philosophy for Building Enterprise RAG
Enterprise Document Intelligence [Vol.1 #M1] - The thesis behind every architectural choice in this series The post Amplify the Expert: A Philosophy for Building Enterprise RAG appeared first on Towards Data Science .
China’s transnational interference threatens digital rights globally
China’s transnational interference threatens digital rights globally H.Seidl Fri, 06/26/2026 - 14:56 picture alliance / NurPhoto | Jaap Arriens Comment Jun 26, 2026 4 min read China’s transnational interference threatens digital rights globally Beijing’s coercive use of digital tools and economic leverage undermines international efforts to regulate digital technologies, say Daria Impiombato and Wendy Chang. Signs are mounting that the Chinese government is expanding its transnational repression both in terms of tools and targets. The first half of 2026 has seen evidence of online and offline attempts to silence overseas critics that cross its political red lines. Only in May, an AI-generated harassment campaign against Europe-based human rights researcher Laura Harth, known for her work exposing China’s overseas police stations, was made public. The campaign, which relied on misogynistic and sexualized images, shows how Beijing is incorporating generative AI into its transnational repression efforts, allowing new forms of scalable, personalized attacks aimed at damaging the reputation of critics abroad. But attempts to silence individuals have also widened to target global civil society collectively. Another recent victim of a reported Chinese government campaign was an entire conference dedicated to advancing digital rights for all – the rights people should enjoy online, including privacy, freedom of expression, access to information and protection from unlawful surveilla…
SAP aligns commerce data for AI personalisation
SAP aligns fragmented commerce data structures to enable operational AI personalisation at the execution layer. Enterprise leadership routinely establishes objectives to anticipate customer requirements and deliver relevant interactions across digital touchpoints. However, the actual infrastructure running inside these enterprises fails to support systematic execution at the required volume. Recommendation engines display generic product listings because […] The post SAP aligns commerce data for AI personalisation appeared first on AI News .
ZTE presents full-stack intelligent energy storage solutions at the Smarter E Europe 2026, powering Europe's energy transition
PARTNER CONTENT: ZTE introduces full-stack, AI-driven energy storage solutions in Munich to address the rising power demands of the digital infrastructure era
European heat wave pressures electricity grids
Germany and the UK are on track to see their highest June average electricity prices since the 2022 energy crisis.
Finnovate secures $2 Mn in pre-Series A funding from angel investors
Finnovate Financial Services, a financial planning platform, has secured approximately $2 million in a pre Series A funding round from a group of angel investors, including Ramakant Deshpande and others. The latest round comes on the back of the company's successful $1 million raise in 2023. The proceeds will be deployed towards scaling operations, enhancing the mobile application and technology infrastructure, expanding marketing efforts, and strengthening business development initiatives, Finnovate said in a press release. Co-founded by Nehal Mota and Naveen Singh, Finnovate aims to help professionals achieve their financial goals through structured and purposeful investing. The firm recently received its Portfolio Management Services (PMS) license, expanding its presence in the wealth management space and paving the way for the launch of its upcoming Multi Asset PMS Strategy. Designed to provide diversified exposure across multiple asset classes, the new offering aims to help investors navigate market cycles more effectively while pursuing long term wealth creation. The strategy is expected to expand the firm's investment solutions suite and cater to the evolving needs of India's growing investor community. India's wealth management sector continues to witness strong momentum, driven by rising financial awareness, increasing participation in capital markets, and growing demand for professional advisory solutions. The company aims to leverage its personalized advisory capa…
Italy: Biopharma, automotive and telecoms sectors are facing China’s technological power
Italy: Biopharma, automotive and telecoms sectors are facing China’s technological power H.Seidl Fri, 06/26/2026 - 12:13 picture alliance / Long Wei / Costfoto Download (pdf - 3.72 MB) Jun 30, 2026 16 min read Italy: Biopharma, automotive and telecoms sectors are facing China’s technological power You are reading the Italy chapter of the 2026 report of the European Think Tank Network on China (ETNC) "Fragmented Europe: Dealing with China as a technology and innovation power". Go back to the main page . By Aurelio Insisa , MERICS (formerly Istituto Affari Internazionali), and Francesca Maremonti Research Fellow, IAI China’s capacity to innovate and lead in high-added-value industrial sectors is a critical element of its technological power, alongside its unparalleled strength in ensuring high-volume and cost-competitive manufacturing in traditional industrial sectors. 1 According to Goldman Sachs forecasts, Italy is one of the countries that will suffer most from the impact of Chinese industrial plans for the 2026-30 period, together with Mexico, the CEE-4 group, and Germany. 2 Recent trends: China’s tech power shapes Italy’s industrial future China’s rise as a tech power poses two fundamental questions to Italian institutions and private actors. First, should Italy continue cooperating with Chinese players in sectors where China’s innovation capabilities could further diminish Italy’s already declining global share in high-tech manufacturing? Second, should Rome grant access…
China’s unicorns rise to 381 as ByteDance ranks in top 3 globally: Hurun index
China is breeding a new generation of unicorns – start-ups currently valued at US$1 billion or more – at an accelerating pace, cementing its place alongside the United States in a major global index. The world’s second-largest economy saw 381 Chinese unicorns on the 2026 Hurun Global Unicorn Index, an increase of 38 from last year. The annual list, released on Tuesday, ranks the world’s most valuable start-ups. China now mints a new unicorn every five days on average, double last year’s pace of...
QSR and beverage brand Alienkind raises $3.2 Mn in pre-Series A round
QSR and beverage brand Alienkind has raised $3.2 million in pre-Series A funding round from existing investors like Prakash Sikaria, Flipkart senior VP Ravi Iyer, Arpan Sheth, and others. Prior to this, the Bengaluru-based startup had secured $1.2 million in a Seed funding round at the valuation of $10 million in April last year. The proceeds will be used to support its next phase of growth, including expansion into new markets, Alienkind said in a press release. Co founded in 2024 by Vikram Kakkireni and Abhishek Kumar, Alienkind is a next generation Quick Service Restaurant (QSR) chain specialising in 100% fresh, functional wellness beverages, and food offerings such as Starship burgers. The company aims to redefine the QSR experience through immersive spaces, category defining products, and community driven experiences. It targets Gen Z and health conscious urban consumers by combining minimalist, sci fi inspired store designs with preservative free beverages and plant forward meals. Since its launch, Alienkind claims to have gained nationwide attention with its differentiated positioning, competitive pricing, and brand strategy. The company says it is on track to achieve an annual recurring revenue (ARR) of $10 million by the end of FY27 and plans to expand to 100 stores across major Indian cities by FY28.
China keeps an eye on AI smart glasses as privacy concerns come into focus
China has issued the first industry code of conduct for smart glasses powered by artificial intelligence, following public outrage over videos taken by users covertly filming strangers with the increasingly popular devices. The voluntary code calls on smart eyewear manufacturers to adopt a “minimum data collection” approach, provide clear indicators when cameras or microphones are active, and obtain explicit user consent before recording. The guidelines were released on Thursday by the China...
Vector RAG Isn’t Enough — I Built a Context Graph Layer for Multi-Agent Memory
I benchmarked raw chat history, vector-only RAG, and a context graph on the same multi-agent conversations. The results exposed a surprising weakness in relational retrieval. The post Vector RAG Isn’t Enough — I Built a Context Graph Layer for Multi-Agent Memory appeared first on Towards Data Science .
Retrofit, don’t rebuild: Agentic overlays for transforming legacy enterprise services
In this technical collaboration between AWS and the authors, we present a pragmatic solution: agentic overlays. Agentic overlays are thin wrapper layers that transform traditional REST-based services into agents capable of participating in A2A interactions. They also expose REST APIs as tools compatible with the Model Context Protocol (MCP). Together, they let enterprises add A2A capabilities to existing REST services without rewriting business logic, without duplicating code, and without running parallel infrastructures. This reduces agent sprawl in the infrastructure by reusing existing services as agents. We provide reference architectures and sample code that show how to build agentic overlays.
Sofina Ventures offloads Rs 177 Cr worth stake in Mamaearth parent
Sofina Ventures has sold its 1.28% stake in Honasa Consumer, the parent company of Mamaearth through a Rs 177 crore bulk deal on NSE on Thursday. According to exchange data, the Belgium-based investment firm sold 41.78 lakh shares, representing a 1.28% stake in the company, at an average price of Rs 424.07 apiece, valued at Rs 177.2 crore As per the company's March 2026 shareholding data, Sofina Ventures owned a 3.29% stake in Honasa Consumer. Following the sale of 1.28% shares through the bulk deal, its stake is estimated to have reduced to nearly 2%. The deal comes at a time when several venture capital and private equity firms are monetising their investments in listed startups. Recent transactions include Actis’ Rs 371 crore bulk deal in Pine Labs , Alpha Wave’s stake sale in Delhivery , and partial exits by SoftBank and ADIA in Lenskart . In Q4 FY26, Mamaearth reported strong 23% year-on-year growth in its revenue from operations to Rs 657 crore from Rs 534 crore in Q4 FY25, while posting a profit of Rs 69.4 crore during the quarter. Earlier this week, the Gurugram-based company acquired a majority stake in Fluence Pharma , its second acquisition in six months after the purchase of Reginald Men , which marked its entry into the men’s grooming category. Honasa’s shares closed at Rs 421.7 on the NSE on Thursday, valuing the company at a market capitalization of approximately Rs 13,595 crore ($1.51 billion).
Pocket FM’s AVP Content Ankit Singh exits amid leadership changes
Ankit Singh, Assistant Vice President of Content at Pocket FM, has announced his departure from the company after a two year stint. In a LinkedIn post, Singh said he moved from working on retention, revenue and analytics to leading Pocket FM’s global content marketing function. He said his team managed content marketing across international markets and adopted generative AI for content production, brand campaigns, the Discover platform launch and other initiatives. "In the next chapter, I'm working on something of my own with a close friend for people in the middle of a job search," said Singh. His exit came on the same day the shutdown of Pocket TV, Pocket FM's microdrama vertical, came to light. Responding to Entrackr's queries, the company said Pocket TV had been launched as a beta experiment and was concluded around eight months ago. It also reiterated its focus on its core audio business and global expansion ahead of a potential IPO. Singh's departure also comes amid a series of senior leadership exits at Pocket FM in recent months. Last month, Chief Financial Officer Anurag Sharma stepped down after nearly three years with the company to pursue entrepreneurial opportunities. During the same month, Senior Vice President Mayank Sancheti also stepped down from his role. Pocket FM has also begun discussions to shift its holding company back to India through a reverse flip as it eyes a public listing in the country. Update at 6:10 PM, June 26 : The story has been updated to…
An LLM as arbiter in RAG retrieval: picking the right candidate with reasons
Enterprise Document Intelligence [Vol.1 #7C] - One LLM call ranks the candidates with reasons. The output is one typed object your auditor can defend The post An LLM as arbiter in RAG retrieval: picking the right candidate with reasons appeared first on Towards Data Science .
Exclusive: LucidLink launches MCP server to give AI agents shared access to distributed files
LucidLink Corp., the maker of a cloud network-attached storage system based on object storage technology, today extended its distributed file system technology into agentic artificial intelligence with the public beta release of a Model Context Protocol server that lets AI agents access shared files across clouds, on-premises systems and edge environments. The company said its […] The post Exclusive: LucidLink launches MCP server to give AI agents shared access to distributed files appeared first on SiliconANGLE .
Executive Summary: Fragmented Europe: Dealing with China as a technology and innovation power
Executive Summary: Fragmented Europe: Dealing with China as a technology and innovation power H.Seidl Thu, 06/25/2026 - 11:32 picture alliance / Long Wei / Costfoto Download (pdf - 3.72 MB) Jun 30, 2026 15 min read Executive Summary: Fragmented Europe: Dealing with China as a technology and innovation power You are reading the Executive Summary of the 2026 report of the European Think Tank Network on China (ETNC) "Fragmented Europe: Dealing with China as a technology and innovation power". Go back to the main page . By Claudia Wessling and Bernhard Bartsch China’s drive to become a global leader in science, technology and innovation has huge implications for the EU and its member states. On the one hand, China is becoming a strong competitor in industrial high-tech sectors and innovative science that used to be the stronghold of European actors. Advanced digital technologies made in China also increasingly pose risks to infrastructures in Europe. On the other hand, China offers itself as a resourceful counterpart for collaboration in research and development (R&D) and keeps attracting European scientists and businesses alike. This report, the 12 th compiled by the European Think-tank Network on China (ETNC), analyses how Europe is affected by China’s rise to a technological power and its increasing clout in shaping and creating innovation. Authors from 22 European countries have contributed to this study. The goal is to provide a nuanced picture of how those states interact…
Building a state-of-the-art development platform with Backstage
Key takeaways Backstage solved the portal problem, not the platform problem. A portal organizes catalogs, documentation, and templates. A platform owns deployments, environments, policies, and runtime operations. Backstage assumes that the execution layer exists beneath it. Point-to-point integrations become a maintenance burden. Many organizations end up with a “messy middle” where Backstage is connected directly to CI/CD , GitOps , Kubernetes , and observability tools through custom wiring that’s fragile and hard to evolve. Abstractions are the interface between developers and infrastructure. Developers work with components, endpoints, and dependencies. Platform engineers work with environments, pipelines, and component types. The platform compiles both into Kubernetes resources. A control plane bridges the gap. It sits between the portal and runtime, compiling abstractions into infrastructure, enforcing policies consistently, reconciling drift, and aggregating runtime state back to the portal. Good abstractions enable advanced capabilities. Unified observability, automated guardrails, and AI agents that can reason about and act on your platform. All becomes possible when you have well-defined concepts and a control plane that understands both sides. … Start with Backstage If you’re building an internal developer platform , Backstage is certainly part of your architecture. It solved the discovery problem and became the default choice for developer portals. Before Backstage…
Germany: From mutual benefit to existential competition with China
Germany: From mutual benefit to existential competition with China H.Seidl Thu, 06/25/2026 - 10:27 picture alliance / Long Wei / Costfoto Download (pdf - 3.72 MB) Jun 30, 2026 14 min read Germany: From mutual benefit to existential competition with China You are reading the Germany chapter of the 2026 report of the European Think Tank Network on China (ETNC) "Fragmented Europe: Dealing with China as a technology and innovation power". Go back to the main page . By Claudia Wessling and Bernhard Bartsch After decades of highly lucrative technology cooperation with China, Germany finds itself in an existential crisis. China has been catching up in high-tech and research areas where Germany has traditionally been a leader. Fears of losing key industries and potentially hundreds of thousands of jobs have led to soul-searching about reviving German competitiveness. But diverging strategies emerge along the fault lines of politics and business. While the German government tries to shift its focus on security politics and geoeconomics, many companies and research institutions consider the risks of not cooperating with China on innovation as being higher – and are doubling down on their engagement with Chinese partners. Recent trends: Balancing more conscious risk management and staying at the competitive edge in science and tech In the spring of 2026, German debates on China are shaped by concerns of a looming existential crisis. The German export industry appears to be in free fall…
AI coding token costs are on track to rival human payroll
Enterprises may soon be paying as much for their developers’ AI token usage as they do for their salaries. According to Gartner , these costs will meet, or even exceed, the typical software engineer’s monthly salary within the next two years. This is not only because developers are increasingly adopting generative AI and agentic tools , it reflects a trend toward consumption-based licensing models as vendors balance infrastructure investments with profitability. Rather than the flat per-seat SaaS model of the past, enterprises now pay for developer token use as well. Gartner senior principal analyst Nitish Tyagi explained that it’s important to note that Gartner’s prediction is based on a global average salary of $2,000 per month; it doesn’t mean AI token usage will exceed all salaries. For instance, in the US, yearly pay rates can be six digits or more. However, that kind of spend is not out of the realm of possibility, Tyagi emphasized. “I have heard scary numbers like ‘My developer consumed $20K last month,’ or ‘A business user consumed $32K’.” If these amounts sound shocking, that’s the point. “The goal is to alarm the industry about the impact of token cost if it is not governed and controlled,” he said. Lack of visibility, immature oversight Enterprises are quickly moving from experimentation to scaled deployment of AI coding agents , but many still underestimate token costs, Tyagi noted. This is because cost structures for software engineering workloads are “highly va…
AI-powered BI with Snowflake and Amazon Quick
In this post, you will learn how to build an end-to-end integration between Snowflake semantic views and Amazon Quick. The sample data is user review data for a media company. You start by loading movie review data from Amazon Simple Storage Service (Amazon S3) into Snowflake, define a semantic view in SQL to add business meaning, explore it with natural-language queries through Cortex Analyst, and then generate an Amazon Quick dataset and dashboard. The dataset can be created manually or with a provided automation script. By the end, your BI team or AI team can ask natural-language questions against a governed data layer and trust that every response reflects the same business logic.
Three insights you may have missed from theCUBE’s coverage of Pure Accelerate
As enterprises advance their artificial intelligence initiatives, they’re discovering that the real constraint isn’t model sophistication — It’s data. AI outcomes now depend on whether organizations can access, mobilize and operationalize data as an active system rather than a passive repository. This shift was a defining theme at Pure Accelerate 2026. The challenge is not simply […] The post Three insights you may have missed from theCUBE’s coverage of Pure Accelerate appeared first on SiliconANGLE .
Ireland: Interlocking factors shape the approach to China in science-tech innovation
Ireland: Interlocking factors shape the approach to China in science-tech innovation H.Seidl Wed, 06/24/2026 - 16:15 picture alliance / Long Wei / Costfoto Download (pdf - 3.72 MB) Jun 30, 2026 14 min read Ireland: Interlocking factors shape the approach to China in science-tech innovation You are reading the Ireland chapter of the 2026 report of the European Think Tank Network on China (ETNC) "Fragmented Europe: Dealing with China as a technology and innovation power". Go back to the main page . By Alexander Davey , MERICS, and Niall Duggin , Department of Government and Politics, University College Cork Ireland’s approach to China in science, technology and innovation is shaped by several interlocking factors. At its core is Ireland’s open, FDI-driven economic model. This approach is further informed by a growing set of government strategies focused on security and competitiveness, designed both to bolster that model and to support a gradual shift toward industrial policy and deeper integration within the EU single market. It is also shaped by pressure to diversify amid uncertainty surrounding the current US administration, as well as by wider efforts to build resilience in response to geopolitical instability. Recent developments: Meeting in Beijing, open to collaboration In January 2026, Ireland and China’s leaders met for the first time in 14 years in Beijing. Both Taoiseach Micheál Martin and President Xi Jinping made speeches about the bilateral relationship. In terms…
Long-horizon agent benchmarks are fragmenting: a field guide to what each one actually measures
A field guide to the new wave of long-horizon agent benchmarks: what each one actually measures, the realism-versus-verifiability bargain it strikes, and the seam where its score leaks. The post Long-horizon agent benchmarks are fragmenting: a field guide to what each one actually measures appeared first on Arize AI .
World Cup coverage, brought to you by one of Brazil’s biggest influencers
Virginia Fonseca’s debut on TV Globo is the latest sign of a media shift that critics say is blurring the line between journalism and entertainment. The post World Cup coverage, brought to you by one of Brazil’s biggest influencers appeared first on LatAm Journalism Review by the Knight Center .
Finding the right anchors for RAG: keyword, embedding, and TOC signals in parallel
Enterprise Document Intelligence [Vol.1 #7B] - Retrieval is filtering on structured tables: keywords first, TOC second, embeddings last The post Finding the right anchors for RAG: keyword, embedding, and TOC signals in parallel appeared first on Towards Data Science .
Fragmented Europe: Dealing with China as a technology and innovation power
Fragmented Europe: Dealing with China as a technology and innovation power H.Seidl Wed, 06/24/2026 - 12:14 picture alliance / Long Wei / Costfoto Download (pdf - 3.72 MB) Jun 30, 2026 2 min read Fragmented Europe: Dealing with China as a technology and innovation power Report by the European Think-tank Network on China Please note that this report is embargoed until June 30, 2026, 10 a.m. CEST. Edited by: Bernhard Bartsch, Claudia Wessling Peer reviewers: Andreas B. Forsby, Nick Nieschalke, John Seaman, Tamás Matura, Francesca Maremonti, Aurelio Insisa, Matej Šimalčík, Filip Šebok, Anastas Vangeli, Katja Zajc Kejžar, Mario Esteban, and Miriam Tardell Recent years have seen the EU shift toward a policy of “de-risking” in relation to cooperation with China in the science and technology space, as concerns over economic and research security continue to grow. However, this ambition lacks cohesive implementation, and the current state of affairs is a patchwork of sometimes competing interests and approaches across member states. This year’s report by the European Think Tank Network on China (ETNC) examines national approaches to dealing with China as a technological power and research partner, reflecting the broad range of approaches implemented across Europe. The report features 24 national chapters and a dedicated EU chapter, written by China experts covering their own country’s relationship with China in relation to science, technology and innovation, along the same line of in…
Qatar boosts engineering pay to encourage local tech talent
Ministry of Education's new scholarship plan guarantees jobs after graduation
Dell/AMD partnership: Three insights you may have missed from theCUBE’s coverage of Dell Technologies World
With the AI factory becoming a key focus in enterprise IT, hybrid architecture has become equally important as organizations seek to generate workloads on-premises, in the cloud and at the edge. This is why enterprises are increasingly looking toward major players such as Dell Technologies Inc. and Advanced Micro Devices Inc. for production-scale deployment in […] The post Dell/AMD partnership: Three insights you may have missed from theCUBE’s coverage of Dell Technologies World appeared first on SiliconANGLE .
Xi visits Pyongyang and China rehabilitates North Korea
Xi visits Pyongyang and China rehabilitates North Korea H.Seidl Tue, 06/23/2026 - 17:27 picture alliance / KCNA via KNS via AP | Uncredited Comment Jun 24, 2026 5 min read Xi visits Pyongyang and China rehabilitates North Korea Russia’s renewed standing in northeast Asia and continued US strength in the region have encouraged Beijing to set aside reservations about its only official ally, says Andrew Scobell. Xi Jinping’s June 2026 two-day state visit to North Korea is an unmistakable sign that Beijing is working hard to bolster its typically tortuous relationship with Pyongyang. China’s ties with its sole official ally have changed remarkably – from fractured to friendly in less than ten years. The trip made Xi the first leader of the People’s Republic of China (PRC) in forty years to make two official trips to North Korea – his first, in 2019, correlated with a significant thaw in bilateral relations. Only one other Chinese leader had traveled to Pyongyang twice during his tenure: Deng Xiaoping visited North Korea in 1978 and 1982. Xi’s predecessors, Jiang Zemin and Hu Jintao, made only one trip each, while founding leader Mao Zedong made none, despite his momentous decisions to send Chinese troops to support a Pyongyang regime on the verge of collapse in 1950 and to sign a mutual defense treaty in 1961. (This list of travels does not include the many visits to China made by North Korean leaders .) Kim goes from “fatty” to friend For China, where symbolism and showing up m…
EU: Shepherded by Brussels, Europe awakens to Chinese technology
EU: Shepherded by Brussels, Europe awakens to Chinese technology c.groth Tue, 06/23/2026 - 16:23 picture alliance / Long Wei / Costfoto Download (pdf - 3.72 MB) Jun 30, 2026 14 min read EU: Shepherded by Brussels, Europe awakens to Chinese technology You are reading the EU chapter of the 2026 report of the European Think Tank Network on China (ETNC) " Fragmented Europe: Dealing with China as a technology and innovation power ". Go back to the main page . By Rebecca Arcesati EU-China innovation relations have turned more difficult in recent years. Although government policies cannot undo deep interdependencies in science and business overnight, cooperation no longer goes uncontested. The exclusion of Chinese entities from large parts of the Horizon Europe funding program and capitals’ increased scrutiny of Chinese investments into high-tech sectors are just two examples of a wider transformation: From largely unconditional openness to China in science and technology, the EU’s approach has shifted towards a logic of “de-risking” and economic security. 1 Technology and innovation in EU-China relations: from openness to economic security Amid intense geopolitical and technological rivalry between Beijing and Washington, Euro-peans increasingly find themselves in a position of relative disadvantage and dependency. To be sure, European companies control “chokepoints” on some of the world’s most critical technology value chains, such as advanced semiconductor fabrication, and occup…
AI encouraged in news production under Vietnam’s upcoming press regulations
Two decrees guiding Vietnam's upcoming Press Law No 126/2025/QH15 will take effect on July 1, alongside three ready‑to‑issue circulars, ensuring no legal gap when the law comes into force.
What Codex Unlocks for NTT Data
After introducing Codex, it grew to more than 10,000 active users and became one of the largest internal community in just a few months." We chatted with Hiroaki Sato, Head of the AI CoE at NTT DATA who gave us insight into all of the new ways his technical and non-tech teams are leveraging Codex. His Sales teams uses Codex to automate tasks like customer list maintenance, which has created net new workflows and report creation that used to take 2 days now takes 30 min 💥
IEEE Rolls Out Large Language Models Virtual Training Course
Large language models have moved out of the research lab and into engineers’ daily workflow. LLMs serve as reasoning engines that can orchestrate complex tasks including identifying vulnerabilities in source code and transforming fragmented project discussions into rigorous technical specifications. While the general public uses AI tools to write email and plan vacations, technical professionals use LLMs as core architectural elements that are fundamentally changing how digital infrastructures are built and maintained. As the AI models move into mainstream engineering practice, the demand for technical expertise is rising. The LLM technology market is expected to grow by about 33 percent every year through 2030 , according to MarketsandMarkets . The rapid expansion suggests that proficiency in implementing and securing the models is transitioning from a niche into a core requirement for technologists. More than just a better search engine To use LLMs effectively, technical professionals must move beyond treating them as conversational robots. At a fundamental level, the AI systems are built on the transformer architecture , a framework that replaced the older method of processing data in a fixed, sequential order. Unlike earlier models that analyzed information one step at a time, transformers use self-attention mechanisms to ingest vast datasets simultaneously. For technical professionals, LLMs are core architectural elements that are fundamentally changing how digital infr…
Datasette Apps: Host custom HTML applications inside Datasette
Today we launched a new plugin for Datasette, datasette-apps , with this launch announcement post on the Datasette project blog. That post has the what , but I'm going to expand on that a little bit here to provide the why . The TL;DR Datasette Apps are self-contained HTML+JavaScript applications that run in a tightly constrained sandbox hosted on your Datasette application. They can use JavaScript to run read-only SQL queries against data in Datasette, and can run write queries too if you configure them with some stored queries . Here's a very simple example and a more complex custom timeline example - the latter looks like this: Apps are allowed to run JavaScript and render HTML and CSS. They are limited in terms of access - the they run in prevents them from accessing cookies or localStorage and they also have an injected CSP header (thanks to this research ) which prevents them from making HTTP requests to outside hosts, preventing a malicious or buggy app from exfiltrating private data. Datasette Apps started out as my attempt at building a Claude Artifacts mechanism for Datasette Agent , but I quickly realised that the sandboxed pattern is interesting for way more than just adding custom apps in a chat interface and promoted it to its own top-level concept within the Datasette ecosystem. They're also a fun way to turn my multi-year experiment in vibe-coded HTML tools into a core feature of my main project! You can try out Datasette Apps by signing in with GitHub to the…
Les candidats draguent le monde de la Tech
Playbook Paris se délocalise au salon Vivatech, à Paris, où jamais autant de candidats à la présidentielle n’ont choisi de se rendre alors que le sujet de l’IA se place au premier plan. Gabriel Attal et Edouard Philippe jouent des coudes pour séduire le monde de la tech, en dégainant une série de propositions, avec […]
Case Study: How CodeRabbit Leverages LanceDB for AI-Powered Code Reviews
How CodeRabbit leverages LanceDB-powered context engineering turns every review into a quality breakthrough.
Voice for AI Agents and Applications
Learn more: https://bit.ly/4vPQ3HE Voice is one of the most natural human interfaces, but adding it to AI applications has historically forced a tradeoff: fast voice-to-voice models that sacrifice reliability, or accurate speech-to-text-to-LLM-to-speech pipelines that add latency. This course teaches you how to get both, using Vocal Bridge's architecture that pairs a real-time foreground agent with a reasoning background agent. Taught by Ashwyn Sharma, CEO and Co-Founder of Vocal Bridge (an AI Fund portfolio company), this course covers three practical integration patterns that meet you where you are: voice embedded in an application, voice layered onto an existing agent without touching its logic, and voice as a tool your LLM can call when it decides a conversation is the right modality. In detail, you'll survey the traditional voice stack and its tradeoffs, then explore three live integration patterns to understand when each one applies. Build a voice-interactive tic-tac-toe game where voice commands and mouse clicks work together over a single synchronized channel, then add a voice layer to an existing agent with minimal code, leaving your prompts, RAG pipeline, and tools untouched. Give your agent a make_phone_call tool so it can dial a real number, hold a conversation with a demo agent, and stream the transcript back live. Set up evaluation-driven development using Vocal Bridge's multimodal evaluator to score calls, catch regressions, and refine prompts before issues re…
A Metadata Benchmark of Lance, Delta Lake, and Iceberg on S3
A Rust benchmark comparison of Lance, Delta Lake, and Apache Iceberg on S3 and S3 Express, and why Lance is optimized for object storage metadata.
The 2026 Digital Decade eHealth Indicator Study
The 2026 Digital Decade eHealth Indicator Study dumimar Wed, 06/17/2026 - 09:15 This report presents the eHealth target monitoring results as at 31 December 2025 for the EU-27, Iceland and Norway under the Digital Decade Policy Programme. Access to electronic health records continues to increase across Europe In 2025, Member States made further progress towards achieving the eHealth target of “100% of EU citizens having access to their electronic health records by 2030”. The composite eHealth score for the EU-27 reached an average of 87%, an increase of 4 percentage points compared with 2024. All Member States have an online access service; four Member States operate regional services (Ireland, Italy, Spain and Sweden). Overall, 18 Member States, Iceland and Norway increased their maturity score compared with 2024, reflecting improvements such as wider availability of data types and a greater number of healthcare providers being connected and sharing data. Nine Member States maintained the same maturity level, including Belgium and Estonia, which reached a 100% composite maturity score in 2024. Some areas are advancing, while others require further effort Substantial progress has been made in improving the availability of electronic health record data. Data about identification (100%), personal information (98%), eDispenstion (91%) and ePrescription (89%) are the most widely available (see Figure 1). However, medical images (35%) and medical devices and implants (59%) remain…
Digital Decade 2026 - Connectivity Coverage in Europe 2025 report
Digital Decade 2026 - Connectivity Coverage in Europe 2025 report dumimar Wed, 06/17/2026 - 09:10 Fibre to the Premises (FTTP) becomes Europe's most widespread fixed technology, while 5G signal nears universal reach as Member States advance towards the Digital Decade 2030 targets. The Connectivity Coverage in Europe 2025 report provides a comprehensive overview of fixed and mobile broadband coverage across the 27 EU Member States (EU27), Norway, Iceland, Switzerland and the United Kingdom as of mid-2025. The study and its report are designed to monitor the progress of EU Member States towards the targets as set out in the Digital Decade Policy programme . The report analyses the deployment of ten connectivity technologies (DSL, VDSL, VDSL2 Vectoring, cable modem DOCSIS 3.0, DOCSIS 3.1 and higher, FTTP, FWA, 5G, 5G in the 3.4-3.8 GHz band, and satellite) and four aggregated coverage categories, at national, rural and regional (NUTS-3) level. EU27 fixed connectivity coverage reached 98.0% , Next Generation Access (NGA) coverage reached 95.3%, fixed Very High-Capacity Network (VHCN) coverage reached 85.6%, and Fibre to the Premises (FTTP) coverage reached 74.1%. Mobile coverage also continued to expand, with 5G signal reaching 96.8% of households , and coverage of 5G in the 3.4-3.8 GHz band reaching 74.8%. The report identifies growing reliance on Fixed Wireless Access (FWA) solutions in low-density and geographically challenging regions, as well as the increasing relevance of…
Digital Decade 2026 - 5G Observatory Report
Digital Decade 2026 - 5G Observatory Report dumimar Wed, 06/17/2026 - 09:10 The 22nd edition of the European 5G Observatory report is the Commission’s key policy repository for data related to 5G and its evolution towards 6G. This report is prepared by an external contractor for monitoring developments in the deployment of 5G in the EU and internationally, and assessing progress towards EU’s digital connectivity targets. Launched by the European Commission in 2018, the 5G Observatory has evolved into a tool for evidence-based policy-making under the Digital Decade Policy Programme (DDPP) . The 5G Observatory report provides on a yearly basis structured and policy-relevant data and analysis, which helps the Commission and the EU27 Member States assess progress, guide decision-making, and advance the EU’s digital objectives for 2030. Findings of the 2026 report The current edition shows that the EU is very close to achieving full basic 5G population coverage (96.8% of households) and slightly less close in terms of basic 5G rural coverage , which stands at 87.9%. However, mid-band coverage remains the main bottleneck in the EU. 5G coverage in the 3.4–3.8 GHz spectrum band is substantially lower than general 5G coverage in many countries (74.8% overall household coverage in this band). Rural mid-band coverage is particularly weak. Internationally, countries such as South Korea, Japan, China, and India (in order of performance), consistently outperform the EU27 in both household…
Predicting LLM Safety Before Release by Simulating Deployment
Paper link Before releasing a new model, labs need to understand not just what it can do, but how it is likely to behave in real-world use, including where it might introduce new risks. This becomes even more important as capabilities increase. As part of our pre-deployment safety review, we leverage targeted evaluations, red-teaming, and other checks to understand model behavior. We’ve now started using a method for simulating model deployments before they happen, which adds a complementary signal: a deployment-like preview of how a candidate model may behave before it reaches users. Deployment Simulation is a method for simulating a future deployment before it happens. We do so by replaying previous conversations in a privacy-preserving manner with a new candidate model. By doing so, we can study how the new model responds in realistic contexts before release, including whether new undesired behaviors emerge and how often they may appear. In our GPT-5.4 study, these forecasts were informative. For categories whose production rates changed by at least 1.5x, deployment simulation predicted the direction of change 92% of the time, compared with 54% for a baseline built from challenging prompts. Simulated deployments also looked much closer to real production traffic on evaluation-awareness measures: traditional evals often visibly have stage lights; production prefixes mostly do not. The hardest case is agentic tool use, where realistic behavior depends on external state: fil…
If context is king, architecture is the castle…
Recorded live at the AI Agent Conference, Ryan sits down with Apollo GraphQL CEO Matt DeBergalis to discuss how enterprises can leverage GraphQL and MCP as a structured semantic architecture to feed clean data to autonomous agents, safeguard internal microservices against unprecedented "east-west" data exfiltration risks, and rein in skyrocketing token spend by explicitly querying only the exact context required.…
Uncertainty Estimation vs Oversampling
I am currently doing some work with a fraud detection dataset as part of a research to leverage uncertainty to improve neural networks ensemble of experts' results. Firstly I had to take the dataset's training data and split it into 6 different domains (5 in-distribution and 1 ood). The goal is to train 5 different experts on different fraud patterns and then compute uncertainty of each expert (using Monte-Carlo dropout) and the uncertainty of the ensemble of experts. When we talk about fraud detection we expect the data to be heavily imbalance favoring the non-fraud class. In this case is 1:90. Given this it makes sense to use oversampling when training the neural networks and so I did. I made a sampler which increases the ratio to 1:10, not by creating new transactions (Because due to the data split, some domains have very few transactions to get reliable simulated transactions with oversampling) but by having the fraudulent transactions seen x amount of times more to increase the ratio. Now, with this, I'm having a problem with the uncertainty signal. In a perfect scenario, fraudulent transactions would be more uncertain than legit ones, but due to the oversampling the experts seem to be more uncertain about the legit transactions than the fraudulent ones. I have tried different percentages of oversampling and only without oversampling the uncertainty signal is correct, but then the overall predictive results underperform. So what should I do? Should I try different metho…
OpenSearch vs LanceDB for Vector Search: Query Cost and Infrastructure
Choosing a vector database usually comes down to a tradeoff between a full search service and an in-process library. This post showcases benchmarks that compare OpenSearch and LanceDB on the COCO 2017 images embedded with SigLIP. We measure ingestion throughput, query cost, storage layout, and overall infra cost.
One agent, two trace destinations: Arize AX + Databricks Unity Catalog
Send one OpenTelemetry trace stream to both Arize AX and Databricks Unity Catalog so engineers can debug agents in Arize while data teams analyze the same spans in governed lakehouse storage. The post One agent, two trace destinations: Arize AX + Databricks Unity Catalog appeared first on Arize AI .
Spotlight on SIG Storage
In our ongoing SIG Spotlight series, we shine a light on the groups that keep the Kubernetes project moving forward. This time, we catch up with SIG Storage , the group responsible for persistent data, volume management, and the interfaces that connect Kubernetes workloads to the storage systems beneath them. We spoke with Xing Yang , Co-Chair of SIG Storage and Software Engineer at VMware by Broadcom, about the SIG's history, the features shipping in recent Kubernetes releases, and where storage in Kubernetes is headed as AI workloads become the norm. Introductions Could you introduce yourself and share your role(s) within SIG Storage? My name is Xing Yang , a software engineer at VMware by Broadcom. I'm a co-chair in SIG Storage, alongside another co-chair Saad Ali from Google. There are also two Tech Leads in SIG Storage: Michelle Au from Google and Jan Šafránek from Red Hat. What first drew you to storage in Kubernetes, and how did you start contributing? I have always been working in the storage domain, so SIG Storage was a natural place for me to get started when I began to learn Kubernetes. I started attending SIG Storage meetings , trying to figure out what I could do to help. This was before the first Container Storage Interface (CSI) release — lots of things were still evolving. It was a very exciting time. What subprojects or areas do you actively maintain or review today? I'm a maintainer in Kubernetes CSI. There are multiple CSI sidecars — such as csi-provisione…
SFT Drives Gemini’s Safety Properties
This is the third in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The second post can be found here . In this short post, we describe a surprising finding: most safety relevant properties in Gemini seem to be caused by the combination of pretraining and SFT, not other training stages like RL. We do not want to overstate this claim as applying to other model families, and we also note that this may change in future Gemini versions. Nevertheless, this result was counter to our initial expectations and will inform future safety work on our team, and so we felt that it was important to share with the broader safety community. Experiment We perform SFT using the Gemini mixture on the pre-training only versions of Gemini 3.1 Pro and Gemini 3 Flash. We then compare these Post-SFT models to the production versions of Gemini 3.1 Pro and Gemini 3 Flash on different safety relevant benchmarks: Error bars are 95% confidence intervals on the evals. The main result is that the blue bars (SFT-only models) and orange bars (production models) are remarkably similar across evals . An important implication is that for Gemini, SFT is a high leverage place to intervene for model safety and behavior, and we plan to try to intervene here in the future. Brief Descriptions of Each Set of Benchmarks: ODCV refers to the benchmark in https://arxiv.org/abs/2512.20798 Alignment evals refer to a version of Petr…
AutoClimDS: Climate data science agentic AI — A knowledge graph is all you need
Climate data science faces persistent barriers stemming from the fragmented nature of data sources, heterogeneous formats, and the steep technical expertise required to identify, acquire, and process datasets. These challenges limit participation, slow discovery, and reduce the reproducibility of scientific workflows. In this paper, we present a proof of concept for addressing these barriers through the integration of a curated knowledge graph (KG) with AI agents designed for cloud-native scientific workflows. The KG provides a unifying layer that organizes datasets, tools, and workflows, while AI agents—powered by generative AI services—enable natural language interaction, automated data access, and streamlined analysis. Together, these components drastically lower the technical threshold for engaging in climate data science, enabling non-specialist users to identify and analyze relevant datasets. By leveraging existing cloud-ready API data portals, we demonstrate that 'a knowledge graph is all you need' to unlock scalable and agentic workflows for scientific inquiry. The open-source design of our system further supports community contributions, ensuring that the KG and associated tools can evolve as a shared commons. Our results illustrate a pathway toward democratizing access to climate data and establishing a reproducible, extensible framework for human–AI collaboration in scientific research.
Matching first names, full names and pronouns
I am working on a graph store of entities and relationships extracted from a factual test document of around 500 words. The first pass (NER) extracts named entities, the second extracts relationships (RE). For a given person, there are different references in the text: Maria, Maria Gotthard, Dr. Maria Gotthard and can also be referred to by 'she', for example 'she was rewarded by the company'. The goal is to merge all these references into one entity so that the relationship graph is not fragmented into different contexts. I have seen a few posts on different forums saying this is a very difficult problem, but hopefully someone out there has some insights or experience to share 🙂 To make things interesting, references to the same entity can occur in different chunks of text, making it impossible for the LLM (currently Ollama/Mistral) to process the cross-chunk context in one call. To address this, I have added a pass across all extracted entities, including exact text matching and a Levenshtein similarity check, but this does not handle first name v full name and comes with a host of other issues. It has a high risk of over-merging, for example if a set of entities consist of incrementally numbered items they will all be merged into one entity. I am wondering if there is a particular architecture for this problem, for example pre-processing a document to link related entities before extracting. Doesn't have to be LLM-based, heuristics and algorithms sometimes do the trick as…
Bring production agent traces from Arize into Databricks Unity Catalog
Arize Data Fabric now supports Databricks, helping teams sync production agent traces, evaluations, and annotations into customer-owned storage for governed analysis in Unity Catalog. The post Bring production agent traces from Arize into Databricks Unity Catalog appeared first on Arize AI .
Vector Space Hackathon 2026
Wow. So many cool and creative submissions for this year’s hackathon; we really had a tough time picking only 3 winners! The submissions ranged from early mental health detection to crowd-reaction simulators, tactical football search, and infrastructure stress-testing. We’re excited to share the results with you. The Hackathon Qdrant’s 2026 “Think Outside the Bot” hackathon pushed the creative boundaries of vector search. Participants from around the world were challenged to create innovative uses of Qdrant, without the use of RAG or simple chatbots. Submissions were judged on the criteria of Innovation, Creativity, and Technical Depth. The hackathon ran for 5 weeks with winners announced at Vector Space Day 2026 with a total of $10k in prizes. Keep reading to learn about the winning submissions.
Is RAG Dead? Lessons from Building AI for Tax Law with Alex Bowcut - #769
As context windows grow into the millions of tokens, many AI practitioners are questioning whether retrieval-augmented generation (RAG) is still necessary. If modern models can ingest entire libraries of documents, why bother with retrieval at all? In this episode, Alex Bowcut, Head of Engineering at Sphere, explains why the answer depends on the application. Sphere uses AI to automate global tax compliance—an environment where getting the answer right isn’t enough. Every conclusion must be backed by the correct legal citation, and every decision must withstand expert review. We explore how Sphere built TRAM (Tax Review and Assessment Model), a production AI system that combines retrieval, reasoning models, legal review workflows, reinforcement learning, and deterministic systems to help tax experts move nearly two orders of magnitude faster while maintaining accuracy. Along the way, we discuss why RAG remains critical in high-stakes domains, how Sphere processes legal and regulatory documents from jurisdictions around the world, retrieval architectures, semantic chunking, dense versus sparse retrieval, expert feedback loops, and the challenges of building AI systems that people can actually trust. 🗒️ Full show notes: https://twimlai.com/go/769.
EKKA: Automated diagnosis of silent errors in LLM inference
LLM serving frameworks are quickly evolving with a complex software stack and a vast number of optimizations. The rapid development process can introduce silent errors where output quality silently degrades without any explicit error signals. Diagnosing silent errors is notoriously difficult due to the substantial semantic gap between the high-level symptoms and the low-level root causes. We observe that diagnosis of silent errors can be effectively framed as a differential debugging problem by leveraging the existence of semantically correct reference implementations. We propose EKKA, an automated diagnosis system that identifies root causes by systematically aligning and comparing intermediate execution states between a target and a reference framework. We constructed a benchmark of real-world silent errors from popular serving frameworks, where EKKA shows 80% pass@1 diagnosis accuracy and 88% pass@5 diagnosis accuracy, outperforming state-of-the-art systems. EKKA also diagnoses 4 new silent errors from serving frameworks, all of which have been confirmed by the developers.
Russian propaganda abounds in Chinese social media debate on Ukraine
Russian propaganda abounds in Chinese social media debate on Ukraine c.groth Thu, 06/04/2026 - 10:12 picture alliance / Da Qing/HPIC/dpa | Da Qing Comment Jun 08, 2026 8 min read Russian propaganda abounds in Chinese social media debate on Ukraine This series looks at how China debates the issues the country faces at home and abroad. Covering domestic policy, social change, technology, geopolitics and economics and focusing mainly on expert debates, each article draws on analysis from universities, think tanks, government-linked research institutes, business associations and investment groups. A new analysis suggests that Beijing may be tolerating or even encouraging toxic discourse to undermine the moral and political authority of Kyiv and its Western supporters, says Yurii Poita, MERICS Senior Associate Fellow and Head of the Asia Section at Kyiv-based New Geopolitics Research Network (NGRN). While the Chinese government likes to present itself as “ objective and impartial ” regarding the Russia-Ukraine war, the tone on Chinese social media is radically different. Influencers on Chinese online platform Weibo, many with millions of followers, regularly attack the Ukrainian government, attempt to drive a wedge between Kyiv and its European and US partners, portray Ukrainian armed forces and recruitment centers as “Nazi,” and justify documented Russian war crimes. Given the links between these opinion leaders and the Chinese state, and the characteristics of China’s media sys…
What would be the best way to analyze the relationship between a chemical reaction network graph and a tuple using a GNN?
So, for an ongoing research project, I've been analyzing the topology of the chemical reaction network (CRN) of a planet's atmosphere. What I'd like to do is see if anything about the CRN can be inferred directly from the atmosphere's spectra (which is usually in the form of an n-tuple, where n is the number of spectral radiance values (in W/sr/m2/um) as a function of wavelength) using machine learning. I've simulated a large (>100,000) number of planetary atmospheres and their associated spectras to create data set for analysis. As it stands, I'd just been measuring several topological metrics of the graphs (e.g., mean degree, average shortest path length, clustering coefficient, etc), and then using that and the spectral data to train a simple linear, 3-layer regression model I created in PyTorch. However, it was recently pointed out to me that, since I'm working graphs, it would be an excellent use case for graph neural networks, since they take graphs as their input. While I'm intrigued by this idea, I'm not really sure where to start. While I have a lot of experience with modeling atmospheric chemistry and analyzing network topology, I have very little with machine learning (the above mentioned PyTorch regression model was my first real foray into ML). I do have quite a lot of experience coding in Python in general, however. So, what would be the best way to approach this problem? I know PyTorch has an add-on, torch-geometric, that can handle graph neural networks, but…
The Classical Advances Needed to Make Quantum Computers Tick
Quantum computers promise to one day solve problems beyond the most powerful supercomputers imaginable. But it’s often underappreciated how much classical computing it takes just to operate these machines. As qubit counts rise, innovations in this supporting infrastructure will be essential if they’re to live up to their promise. To prepare for the scale of quantum computers the industry is working toward, many companies are also gearing up the classical hardware, and software, required to support them. In April, Nvidia announced new AI-based software to accelerate the classical tasks that enable quantum computers. Sydney-based quantum software company Q-CTRL has developed an automatic calibration algorithm for quantum computers, and is now leveraging Nvidia’s agent-based system. Other companies, including IBM Quantum , Cambridge, England–based Riverlane , which develops quantum-error correction, and Google Quantum AI , are developing similar tools. The Role of Classical in Quantum Digital computer chips are marvels of engineering, operating flawlessly out of the box and capable of trillions of operations without error. The quantum bits, or qubits, at the heart of a quantum computer, by contrast, are temperamental and unreliable, requiring regular calibration and complex error-correcting schemes to keep them on track. Calibration and error-correction are fundamentally classical, not quantum, problems, and they require dedicated classical hardware to solve. As quantum compute…
7 Ways New Engineers Can Flourish in the Age of AI
New graduates’ careers are unfolding in an era when AI is not optional. The most successful engineers treat artificial intelligence as leverage, not competition. Here are seven tips to help keep young professionals in demand no matter how quickly the field’s tools evolve. 1. Master the fundamentals first. AI tools can help you code, but you still need strong fundamentals in: Data structures and algorithms for problem-solving. Operating systems, databases, and networking for system-level understanding. Core programming languages such as C++ , Java , and Python . AI can autocomplete syntax, but if you don’t understand how things work under the hood, you’re likely to struggle to debug or optimize. 2. Learn how to work with AI, not against it. The best engineers will not try to out-code AI. Instead, they will learn to: Write clear prompts to generate better code snippets. Review and debug AI-generated code for accuracy, performance, and security. Use AI for productivity boosts while still exercising judgment. Think of AI as a teammate. The real skill is knowing when to trust it and when not to. 3. Build projects that showcase end-to-end thinking. Employers increasingly look for engineers who can design and build systems, not just solve problems. Create projects that show you can: Define requirements clearly. Use AI tools responsibly within the workflow. Deliver a product that scales and is maintainable. 4. Sharpen your system design skills early. Even junior engineers are now as…
The Ex-Pentagon Chief Sounding the Alarm on AI Weapons — Brad Carson
Brad Carson was the Army's General Counsel, served two terms in Congress and was Acting Under Secretary of Defense for Personnel and Readiness. He now heads Americans for Responsible Innovation, the AI-policy advocacy group he co-founded. Keith Duggar spends roughly eighty minutes pushing back. SPONSOR: --- Cyber Fund built the Monastery to help founders ship products that were impossible a year ago. Applications for Batch 1 are now open. Apply now: https://cyber.fund --- Carson's whole case rests on one line: the genie is not out of the bottle. We have pulled dangerous tech back before. Asilomar halted recombinant DNA in 1975, and the West still controls the chips AI runs on. Calling it unstoppable, he says, is the most dangerous idea in the room. Then Keith drags him somewhere darker. A Palantir heat map scores you 0.73 on whether you are a combatant, and a strike follows. The model is wrong some accepted share of the time, and when it is, nobody answers for it. You cannot court-martial a model, and not even the interpretability researchers can say why it picked you. — Note: after recording, we learned that Americans for Responsible Innovation is backed by EA-aligned philanthropy (not sponsored) --- TIMESTAMPS: 00:00:00 From the Pentagon to AI governance 00:04:52 Regulatory capture vs Silicon Valley networks 00:07:56 Transparency and the Claude tier changes 00:09:40 Tort liability when AI tools cause harm 00:13:40 AI is a product, not a person 00:16:01 Children, suicide, a…
When AI Infrastructure Meets Enterprise Data: ClearML on the Dell AI Data Platform
By Adam Wolf Dell Technologies has published a validated integration of ClearML with the Dell AI Data Platform (AIDP), pairing ClearML’s AI infrastructure capabilities with Dell’s enterprise-managed storage and search engines. The result is a reference architecture that lets AI teams keep moving fast while platform teams keep the data foundation enterprise-grade. Here is what […]
Turn Azure Data into an AI-Ready Knowledge Base
Learn how to turn Azure Blob Storage data into an AI-ready knowledge base using Pinecone. A deployable template automates the full ingestion pipeline—parsing, chunking, embedding, and indexing—so your documents are searchable in minutes.
Build a Coding Assistant with Weaviate MCP: RAG over Code & Docs
Use Weaviate's built-in MCP server to give Claude Code, Cursor, and VS Code hybrid search over your codebase and docs. No glue code.
What Held Up at 3 AM: One Engineer’s RAG Case Study
Most AI demos work. Most AI products don’t. This series is a collection of interviews with engineers who shipped AI agents to production, covering the stacks they chose, the architectures they regretted, and what actually held up at 3 am. This is an interview with Michael Maximilien, former CTO and Distinguished Engineer at IBM and […] The post What Held Up at 3 AM: One Engineer’s RAG Case Study appeared first on Comet .
How GoPerfect Built an Agentic Recruiting Workforce with Qdrant Cloud
GoPerfect mission is to use an AI recruiting workforce that replaces the manual, low-leverage parts of recruiting. Instead, an agent decomposes recruiter intent and runs the work end to end to find top talent. Their agentic platform handles sourcing, scanning, reviewing, outreach, admin work as well as candidate conversations for recruiters, hiring managers, agencies, and CEOs who hire at volume. Recruiting is a needle-in-a-haystack problem with two complications: the haystack is massive (200M+ profiles enriched with 1B+ data points drawn from professional networks, code repositories, company data, and AI-derived signals), and the definition of the “needle” is more nuanced than any keyword filter can express. A product manager is not a product marketer, even though the two sit close together in any reasonable embedding space.
Kubernetes v1.36: Mixed Version Proxy Graduates to Beta
Back in Kubernetes 1.28, we introduced the Mixed Version Proxy (MVP) as an Alpha feature (under the feature gate UnknownVersionInteroperabilityProxy ) in a previous blog post . The goal was simple but critical: make cluster upgrades safer by ensuring that requests for resources not yet known to an older API server are correctly routed to a newer peer API server, instead of returning an incorrect 404 Not Found . We are excited to announce that the Mixed Version Proxy is moving to Beta in Kubernetes 1.36 and will be enabled by default! The feature has evolved significantly since its initial release, addressing key gaps and modernizing its architecture. Here is a look at how the feature has evolved and what you need to know to leverage it in your clusters. What problem are we solving? In a highly available control plane undergoing an upgrade, you often have API servers running different versions. These servers might serve different sets of APIs (Groups, Versions, Resources). Without MVP, if a client request lands on an API server that does not serve the requested resource (e.g., a new API version introduced in the upgrade), that server returns a 404 Not Found . This is technically incorrect because the resource is available in the cluster, just not on that specific server. This can lead to serious side effects, such as mistaken garbage collection or blocked namespace deletions. MVP solves this by proxying the request to a peer API server that can serve it. sequenceDiagram parti…
mimalloc: A new, high-performance, scalable memory allocator for the modern era
mimalloc is an open-source, modern, scalable memory allocator that is a drop-in replacement for malloc and free. It is relatively small (~12K lines), with clear internal data structures, and is easy to build and integrate into other projects. It provides bounded worst-case allocation times (up to OS primitives), bounded space overhead, low internal fragmentation, and minimal contention by relying almost exclusively on atomic operations. The post mimalloc: A new, high-performance, scalable memory allocator for the modern era appeared first on Microsoft Research .
Media Briefing on the Trump-Xi Meeting
Media Briefing on the Trump-Xi Meeting J.Heller Wed, 05/13/2026 - 11:31 Video May 13, 2026 1 min read Media Briefing on the Trump-Xi Meeting US President Donald Trump is set to meet Xi Jinping in his first visit to China since 2017. The meeting, originally delayed due to the US-Israeli conflict with Iran, arrives at a moment of deep structural tension between the world's two largest economies, with disputes over trade tariffs, rare earth access, Taiwan, and AI technology still unresolved. What makes this summit uniquely complex is the Iran war looming over the agenda. The ongoing conflict has inadvertently strengthened China's negotiating hand, and Washington is pressing Beijing to use its leverage with Tehran to reopen the Strait of Hormuz. MERICS Heads of Program Jacob Gunter (Economy and Industry) and Helena Legarda (Foreign Relations) assess what this summit means for the trajectory of US-China relations, the outlook for global trade, and the broader implications of an Iran conflict that is reshaping great power dynamics. The session is moderated by Claudia Wessling , Director of Communications & Publications at MERICS. Related content about US-China Trump's remarks after China Summit compromise Taiwan's security Tracker Jun 18, 2026 China in 26: Diplomatic strength, economic weakness, investment increase Podcast May 22, 2026 US-China summit: Xi warns Trump about Taiwan (ZDF) External publication May 15, 2026
Kubernetes v1.36: PSI Metrics for Kubernetes Graduates to GA
Since its original implementation in the Linux kernel in 2018, Pressure Stall Information (PSI) has provided users with the high-fidelity signals needed to identify resource saturation before it becomes an outage. Unlike traditional utilization metrics, PSI tells the story of tasks stalled and time lost, all in nicely-packaged percentages of time across the CPU, memory, and I/O. With the recent release of Kubernetes v1.36, users across the ecosystem have a stable, reliable interface to observe resource contention at the node, pod, and container levels. In this post, we will dive into the improvements and performance testing that proved its readiness for production. Beyond utilization: why PSI? Monitoring CPU or memory usage alone can be misleading. A node may report XX% (below 100%) CPU utilization while certain tasks are experiencing severe latency due to scheduling delays. PSI fills this gap by providing: Cumulative Totals : Absolute time spent in a stalled state. Moving Averages : 10s, 60s, and 300s windows that allow operators to distinguish between transient spikes and sustained resource tension. Proving stability: performance testing at scale A common concern when graduating telemetry features is the resource overhead required to collect and serve the metrics. To address this, SIG Node conducted extensive performance validation on high-density workloads (80+ pods) across various machine types. Our testing focused on two primary scenarios to isolate the impact of the Ku…
China Seeks A.I. Independence, Weakening Trump’s Leverage
CSET’s Jacob Feldgoise shared his expert insight in an article published by The New York Times. The article examines how China is accelerating efforts to build a domestic A.I. ecosystem as companies like DeepSeek and Huawei develop alternatives to American chips amid ongoing U.S. export controls. The post China Seeks A.I. Independence, Weakening Trump’s Leverage appeared first on Center for Security and Emerging Technology .
BalCapRL: A Balanced Framework for RL-Based MLLM Image Captioning
Image captioning is one of the most fundamental tasks in computer vision. Owing to its open-ended nature, it has received significant attention in the era of multimodal large language models (MLLMs). In pursuit of ever more detailed and accurate captions, recent work has increasingly turned to reinforcement learning (RL). However, existing captioning-RL methods and evaluation metrics often emphasize a narrow notion of caption quality, inducing trade-offs across core dimensions of captioning. For example, utility-oriented objectives can encourage noisy, hallucinated, or overlong captions that…
Kubernetes v1.36: Moving Volume Group Snapshots to GA
Volume group snapshots were introduced as an Alpha feature with the Kubernetes v1.27 release, moved to Beta in v1.32, and to a second Beta in v1.34. We are excited to announce that in the Kubernetes v1.36 release, support for volume group snapshots has reached General Availability (GA) . The support for volume group snapshots relies on a set of extension APIs for group snapshots . These APIs allow users to take crash-consistent snapshots for a set of volumes. Behind the scenes, Kubernetes uses a label selector to group multiple PersistentVolumeClaim objects for snapshotting. A key aim is to allow you to restore that set of snapshots to new volumes and recover your workload based on a crash-consistent recovery point. This feature is only supported for CSI volume drivers. An overview of volume group snapshots Some storage systems provide the ability to create a crash-consistent snapshot of multiple volumes. A group snapshot represents copies made from multiple volumes that are taken at the same point-in-time. A group snapshot can be used either to rehydrate new volumes (pre-populated with the snapshot data) or to restore existing volumes to a previous state (represented by the snapshots). Why add volume group snapshots to Kubernetes? The Kubernetes volume plugin system already provides a powerful abstraction that automates the provisioning, attaching, mounting, resizing, and snapshotting of block and file storage. Underpinning all these features is the Kubernetes goal of workl…
Task Substitution and Uplift
Summary: We describe three different definitions of the productivity impact of AI (AKA uplift), and show there’s reason to expect: \[\text{uplift on old tasks} \leq \text{uplift in value} \leq \text{uplift on new tasks}\] Three Measures of Uplift One complication in measuring AI’s effect on productivity is that it has different effects on different tasks, and this causes people to change how they allocate their time between tasks. This makes it more difficult to talk about the effect of AI on overall productivity. We use “old tasks” to mean the set of tasks you’d do in a typical day before AI is available – your average workday in 2021, say. “New tasks” means the set of tasks you’d do in a typical day after AI is available. Not all new tasks necessarily use AI; they’re just the tasks you choose knowing AI is an option. We have found it important to distinguish between three measures of AI’s uplift: Uplift on old tasks: The factor by which pre-AI time exceeds post-AI time to complete the old tasks. Uplift on new tasks: The factor by which pre-AI time exceeds post-AI time to complete the new tasks. Uplift in value: The factor by which post-AI value exceeds pre-AI value, allowing for reshuffling of tasks between the pre-AI and post-AI cases. In some cases value has a natural definition; in others, it can be operationalized using related definitions discussed more in the accompanying note. This note discusses the distinction and its implications for interpreting AI productivity…
How a Knowledge Engine Works: From Artifacts to Agent-Ready Answers
A knowledge engine is the data infrastructure category that lets agents query trusted, compiled knowledge instead of brute-forcing retrieval over raw data. How one is built, how agents query it, and how it compares to RAG, vector databases, and semantic layers.
China resumes fuel exports + US oil sanctions + Emission quotas for local cadres
China resumes fuel exports + US oil sanctions + Emission quotas for local cadres c.groth Thu, 05/07/2026 - 13:14 picture alliance / CFOTO | CFOTO Download (pdf - 546.11 KB) MERICS Briefs MERICS China Essentials May 07, 2026 10 min read China resumes fuel exports + US oil sanctions + Emission quotas for local cadres Top Story China resumes fuel exports as national supply worries ebb – and regional ones rise China is moving to prevent the worst for Asian economies by resuming exports of jet and motor fuels to some regional countries in May. Having suspended shipments from refineries shortly after the US and Israel attacked Iran at the end of February, China will allow 500,000 metric tons of fuel to be exported this month. This is still much lower than its pre-war average of more than double that amount, but a sign that Beijing’s persistent caution about its own energy supply is ebbing – and that its worries about compounding pressure on regional supply chains and markets are increasing. China is a major importer of oil and gas, but a major exporter of fuel, with its many refineries providing gasoline, diesel, and jet fuel to countries from nearby Vietnam to far away Australia. Asian economies were deeply disrupted by the energy shock triggered by the closure of the Strait of Hormuz, a critical global energy artery between Iran and Oman, and were hit again when Beijing stopped shipments of its refined petroleum products. China’s partial reversal should help ease the fuel crunch…
Full Text Search in Pinecone, Now in Public Preview
Full text search in Pinecone, built for agents and RAG. Lucene queries, BM25, 17-language tokenization, and text-match filters in a single query alongside vectors.
Your LLM Is Only as Good as What It Retrieves
A Researcher's Perspective on Retrieval Quality in RAG Systems
Introducing Pinecone Marketplace: Getting to Production in Minutes
Stop answering the same questions. Turn docs into a "system of knowledge" with Marketplace. No-code RAG for support, legal, and onboarding with cited answers.
⚡Vector Search at 10B Scale, 📊 Lance Format Benchmarks, 🚗 AV Pipelines at Scale
Distributed vector search at 10B scale, more efficient storage with Lance format v2.2, and production AV pipelines simplified, plus upcoming events and community updates.
Lance Blob V2: Making Multimodal Data a First-Class Citizen in the Lakehouse
How we redesigned blob storage in Lance to make multimodal data a first-class citizen, with four storage semantics (Inline, Packed, Dedicated, External) that automatically adapt to your workload.
Building A Storage Format For The Next Era of Biology
How Lance can serve as the foundation for AI on single-cell genomics atlases and a new generation for modeling in biology.
Expand Financing Support to Science and Technology Enterprises
Read our translation of a May 2025 press conference featuring several Chinese government finance officials, who discussed a recently issued policy encouraging greater capital market funding for tech companies. The post Expand Financing Support to Science and Technology Enterprises appeared first on Center for Security and Emerging Technology .
Leveraging commuting patterns and workplace charging to advance equitable EV charger access
Leveraging commuting patterns and workplace charging to advance equitable EV charger access robyn.cherinka… Tue, 04/21/2026 - 13:24 This study introduces a framework for improving accessibility to and quantifying social equity priorities in electric vehicle charging infrastructure through strategic workplace charger placement. We develop a customizable equity evaluation model that quantifies access disparities across demographic groups. This model is used to construct an optimization framework that informs charging infrastructure deployment decisions. Leveraging commuting patterns, we demonstrate in the case study of Oakland, California that strategically placing workplace charging can achieve, on average, a 1.8-fold reduction in accessible charging resource disparities compared to benchmark scenarios. Our analysis reveals that targeted workplace charger deployment in high-commuter zones can disproportionately improve citywide equity. The framework provides policymakers with quantifiable metrics to evaluate trade-offs between sometimes divergent equity considerations (e.g., income, housing type) and offers practical insights for achieving more equitable charging infrastructure distribution. Image Nov 15, 2025 Human-Centered AI Read More 1 Minute Read
Gradient-based Planning for World Models at Longer Horizons
GRASP is a new gradient-based planner for learned dynamics (a “world model”) that makes long-horizon planning practical by (1) lifting the trajectory into virtual states so optimization is parallel across time, (2) adding stochasticity directly to the state iterates for exploration, and (3) reshaping gradients so actions get clean signals while we avoid brittle “state-input” gradients through high-dimensional vision models. Large, learned world models are becoming increasingly capable. They can predict long sequences of future observations in high-dimensional visual spaces and generalize across tasks in ways that were difficult to imagine a few years ago. As these models scale, they start to look less like task-specific predictors and more like general-purpose simulators. But having a powerful predictive model is not the same as being able to use it effectively for control/learning/planning. In practice, long-horizon planning with modern world models remains fragile: optimization becomes ill-conditioned, non-greedy structure creates bad local minima, and high-dimensional latent spaces introduce subtle failure modes. In this blog post, I describe the problems that motivated this project and our approach to address them: why planning with modern world models can be surprisingly fragile, why long horizons are the real stress test, and what we changed to make gradient-based planning much more robust. This blog post discusses work done with Mike Rabbat, Aditi Krishnapriyan, Yann…
Volcano Engine LAS's Lance-Based PB-Scale Autonomous Driving Data Lake Solution
How Bytedance Volcano Engine LAS (Lake for AI Service) leverages Lance as the core storage format, rapidly constructing a next-gen AI data lake to efficiently store, manage, and process multimodal data (text, images, audio/video).
Lance JSON Support: Why You Might Not Really Need Variant
Lance's JSONB storage, scalar indexing, data evolution, and full-text search already deliver what most users want from Variant — with explicit control, schema consistency, and no vendor lock-in.
The Quest for One Million IOPS: Benchmarking Storage at LanceDB
Learn how LanceDB benchmarks storage and how we achieved one million disk reads per second.
#495 – Vikings, Ragnar, Berserkers, Valhalla & the Warriors of the Viking Age
Lars Brownworth is a historian, teacher, podcaster, and author specializing in Viking history, medieval Europe, and the Byzantine Empire. Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep495-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc. Transcript: https://lexfridman.com/lars-brownworth-transcript CONTACT LEX: Feedback – give feedback to Lex: https://lexfridman.com/survey AMA – submit questions, videos or call-in: https://lexfridman.com/ama Hiring – join our team: https://lexfridman.com/hiring Other – other ways to get in touch: https://lexfridman.com/contact EPISODE LINKS: Lars’s Website: https://larsbrownworth.com/ The Sea Wolves (book): https://www.amazon.com/Sea-Wolves-History-Vikings/dp/1909979120 Lars’s Books: https://amzn.to/4sHY0xw 12 Byzantine Rulers Podcast : https://12byzantinerulers.com/ Norman Centuries Podcast: https://apple.co/4sgSxNi
Transcript for Vikings, Ragnar, Berserkers, Valhalla & the Warriors of the Viking Age | Lex Fridman Podcast #495
This is a transcript of Lex Fridman Podcast #495 with Lars Brownworth. The timestamps in the transcript are clickable links that take you directly to that point in the main video. Please note that the transcript is human generated, and may have errors. Here are some useful links: Go back to this episode’s main page Watch the full YouTube version of the podcast Table of Contents Here are the loose “chapters” in the conversation. Click link to jump approximately to that part in the transcript: 0:00 – Episode highlight 1:17 – Introduction 2:37 – The start of the Viking Age
Realism vs. Pragmatism: Understanding America’s New Rhetorical Landscape
Amid Trump's fiery rhetoric and debates over "realism," Professor Jason Ralph writes that "pragmatism" may offer a better alternative for the American public.
Fine-tuning experiments on CoT controllability
Kei Nishimura-Gasparian is an Astra fellow and was the primary contributor to this work. Neev Parikh provided mentorship and feedback. Summary: We find that a small amount of fine-tuning on instruction following in the CoT generalizes to meaningful increases in CoT controllability on an out-of-distribution set of tasks (CoTControl eval suite). We fine-tune four reasoning models on small datasets (240 examples or ~100K-300K tokens of fine-tuning) of instruction-following reasoning data and OOD controllability rises from an average of 2.9% to 8.8% across four models. 1 We see the largest increases for instructions that request reasoning in a specified case, suppressing certain words, and adding provided sentences to the reasoning. While 8.8% remains low in absolute terms, this provides evidence that just a small amount of fine-tuning can increase controllability, suggesting that low CoT controllability may not be very robust to accidental optimization pressure. Limitations and caveats: It seems unlikely that frontier AI labs will do even a small amount of fine-tuning directly for controllability which makes our setup somewhat unrealistic. However, the fact that a slight improvement in these capabilities can be elicited with a small amount of fine-tuning suggests the capabilities are latent in the model rather than the fine-tuning teaching the model a new skill. We have not shown that this increase in controllability results in a decrease in monitorability, we will look at this…
Multimodal Embeddings and RAG: A Practical Guide
Multimodal embeddings allow AI systems to search and reason across text, images, audio, and video in their native formats. This blog covers the key intuitions behind how this all works and walks through three practical implementations using Weaviate and Gemini.
Your Code is Your Schema: Weaviate Managed C# Client
Use semantic search and RAG in C# with the Weaviate Managed .NET client — attribute-driven schema, type-safe queries, and safe migrations, all in idiomatic .NET.
A new GPT-OSS benchmark and DeepSeek R1 updates for latency-optimized reasoning
MLPerf Inference v6.0 expands open-weight LLM coverage with a new GPT-OSS 120B benchmark and a latency-constrained interactive scenario for DeepSeek-R1 — the first MLPerf standard for speculative decoding. The post A new GPT-OSS benchmark and DeepSeek R1 updates for latency-optimized reasoning appeared first on MLCommons .
Enhance Your In-IDE Data Browsing Experience With MongoDB
MongoDB is excited to announce the general availability of our enhanced data browsing experience in the MongoDB for Visual Studio (VS) Code extension. This new experience offers a unified workspace for developers to visually browse, query, and edit their data natively, streamlining workflows so they can manage their database right where they write their code. Evolving the developer workflow The modern developer’s workflow is incredibly fast-paced. With developers juggling an average of 14 different tools daily, the cognitive load of constantly jumping between applications can easily disrupt focus. When your application needs to evolve, working with your data shouldn’t force a break in your flow state. As the MongoDB for VS Code extension has grown to nearly 3 million downloads, we’ve seen firsthand how developers are pushing the boundaries of what an in-IDE (integrated development environment) database tool can do. While developers love accessing their data directly in the editor, we wanted to transform this experience to be even more visual, actionable, and seamless. Instead of switching to external terminals for quick tasks or taking the time to translate familiar MongoDB Shell commands into Extended JSON (EJSON), we are bringing a full-fledged, intuitive data management suite right to your VS Code sidebar. Exploring what’s new in the MongoDB for VS Code extension Here are the key improvements that transform the extension into a complete workflow solution: Paginated tree v…
Observability and OpenTelemetry: Introducing MongoDB Atlas Log Integration
In high-stakes enterprise environments, outages do not wait for business hours, and neither do IT/Network Operators. A latency spike hits the dashboard, and metrics signal that the database is under pressure. The cause? Indeterminate. Meanwhile, the business impact is immediate: orders fail to process, customers can’t access accounts, transactions stall, and critical records become temporarily unavailable. Every minute of uncertainty translates into lost revenue, frustrated users, and escalating pressure. Teams often fall back on a familiar—yet time-consuming—ritual: logging into their data platform, exporting large log files, extracting compressed archives, and manually searching through thousands of lines of entries to identify the issue. What should be a quick diagnosis becomes a manual context-switching investigation. By the time the problematic query, configuration issue, or audit event is identified, users have already experienced the disruption—and the business has absorbed the cost. MongoDB believes the database should be the heartbeat of a digital business. So we’re introducing a new log integration that brings MongoDB Atlas system and audit logs directly into external observability and storage platforms. This enhancement helps bridge the gap between metrics and meaning when it matters most. Flexible log delivery for modern observability workflows Now database operators, DevOps pros, and IT Operations teams alike can send MongoDB system and audit logs—including mong…
Towards Model-based Verification of a Key-Value Storage Engine
In our previous post, we talked about our process of specifying MongoDB’s distributed transactions protocol and how it enabled novel analysis of its performance characteristics. In this follow-up, we talk about how the modularity of our specification also enabled us to check that the underlying storage engine implementation actually conforms to the abstract behavior defined in our formal specification. That is, we are able to formalize the interface boundary between the sharded transaction protocol and WiredTiger, the underlying key-value storage engine, and develop an automated way to generate tests for checking conformance between the semantics of the underlying storage engine layer and this abstract model. As mentioned in the previous post, a deeper exploration of the concepts covered in this post is covered in our recently published VLDB ’25 paper, Design and Modular Verification of Distributed Transactions in MongoDB. Modular, Model-Based Verification As discussed in Part 1, we had developed a TLA+ specification of MongoDB’s distributed transactions protocol in a compositional manner, describing the high level protocol behavior while also formalizing the boundary between the distributed aspect of the transactions protocol and the underlying single-node WiredTiger storage engine component. As mentioned, the distributed transactions protocol can be viewed as running atop the lower level storage layer. When considering the correctness guarantees of the distributed transact…
Building A Legal RAG App in 36 Hours
Learn how we built a production-ready, end-to-end RAG application in just 36 hours using the Query Agent and the new Weaviate Agent Skills library.
Cognitive Synthesis and Neural Athletes
As AI accelerates innovation and adoption, leaders are facing rising cognitive load, shifting systems, and new emotional realities inside their organizations. In this episode, Deloitte’s Chief Innovation Officer Deborah Golden joins us to explore how AI is reshaping leadership, why vulnerability and empathy are critical in this moment, and how anti-fragility, not just resilience, will define the future of work. Featuring: Deborah Golden – LinkedIn Chris Benson – Website , LinkedIn , Bluesky , GitHub , X Daniel Whitenack – Website , GitHub , X Links: Deloitte Sponsor: Framer - The website builder that turns your dot com from a formality into a tool for growth. Check it out at framer.com/PRACTICALAI Upcoming Events: Register for upcoming webinars here !
Vision RAG: Enabling Search on Any Documents
Information comes in many shapes and forms. While retrieval-augmented generation (RAG) primarily focuses on plain text, it overlooks vast amounts of data along the way. Most enterprise knowledge resides in complex documents, slides, graphics, and other multimodal sources. Yet, extracting useful information from these formats using optical character recognition (OCR) or other parsing techniques is often low-fidelity, brittle, and expensive. Vision RAG makes complex documents—including their figures and tables—searchable by using multimodal embeddings, eliminating the need for complex and costly text extraction. This guide explores how Voyage AI’s latest model powers this capability and provides a step-by-step implementation walkthrough. Vision RAG: Building upon text RAG Vision RAG is an evolution of traditional RAG built on the same two components: retrieval and generation. In traditional RAG, unstructured text data is indexed for semantic search. At query time, the system retrieves relevant documents or chunks and appends them to the user’s prompt so the large language model (LLM) can produce more grounded, context-aware answers. Figure 1. Text RAG with Voyage AI and MongoDB. Text RAG with Voyage AI and MongoDB Enterprise data, however, is rarely just clean plain text. Critical information often lives in PDFs, slides, diagrams, dashboards, and other visual formats. Today, this is typically handled by parsing tools and OCR services. Those approaches create several problems:…
Intelligent Robots in 2026: Are We There Yet? with Nikita Rudin - #760
Today, we're joined by Nikita Rudin, co-founder and CEO of Flexion Robotics to discuss the gap between current robotic capabilities and what’s required to deploy fully autonomous robots in the real world. Nikita explains how reinforcement learning and simulation have driven rapid progress in robot locomotion—and why locomotion is still far from “solved.” We dig into the sim2real gap, and how adding visual inputs introduces noise and significantly complicates sim-to-real transfer. We also explore the debate between end-to-end models and modular approaches, and why separating locomotion, planning, and semantics remains a pragmatic approach today. Nikita also introduces the concept of "real-to-sim", which uses real-world data to refine simulation parameters for higher fidelity training, discusses how reinforcement learning, imitation learning, and teleoperation data are combined to train robust policies for both quadruped and humanoid robots, and introduces Flexion's hierarchical approach that utilizes pre-trained Vision-Language Models (VLMs) for high-level task orchestration with Vision-Language-Action (VLA) models and low-level whole-body trackers. Finally, Nikita shares the behind-the-scenes in humanoid robot demos, his take on reinforcement learning in simulation versus the real world, the nuances of reward tuning, and offers practical advice for researchers and practitioners looking to get started in robotics today. The complete show notes for this episode can be found at…
Where did it all go wrong? A hierarchical look into multi-agent error attribution
Error attribution in Large Language Model (LLM) multi-agent systems presents a significant challenge in debugging and improving collaborative AI systems. Current approaches to pinpointing agent and step level failures in multi-agent interaction traces—whether using all-at-once evaluation, step-by-step analysis, or binary search—fall short when analyzing complex patterns, struggling with both accuracy and consistency. We present ECHO (Error attribution through Contextual Hierarchy and Objective consensus analysis), a novel algorithm that combines hierarchical context representation, objective analysis-based evaluation, and consensus voting to improve error attribution accuracy. Our approach leverages a positional-based leveling of contextual understanding while maintaining objective evaluation criteria, ultimately reaching conclusions through a consensus mechanism. Experimental results demonstrate that ECHO outperforms existing methods across various multi-agent interaction scenarios, showing particular strength in cases involving subtle reasoning errors and complex interdependencies. Our findings suggest that leveraging these concepts of structured, hierarchical context representation combined with consensus-based objective decision-making, provides a more robust framework for error attribution in multi-agent systems.
Mixed Methods Scenario Development for Human-Vehicle Interaction Research: A Case Study on Winter Driving
Mixed Methods Scenario Development for Human-Vehicle Interaction Research: A Case Study on Winter Driving robyn.cherinka… Wed, 11/12/2025 - 14:59 Scenarios provide a fundamental link between driving simulators and real-world conditions, shaping the extent to which the findings of a user study can be applied to public roads. However, compared to other aspects of study design, scenario development in human–vehicle interaction research tends to receive less deliberate attention. To encourage more methodical scenario generation, this work introduces a mixed methods approach for extracting representative scenarios from an integration of three real-world data sources: aggregated crash statistics, interviews with experienced drivers, and naturalistic driving data. Through a case study on winter driving, we outline the derivation of a nighttime, two-lane road scenario from these data sources and conduct an initial driving simulator pilot study to assess its realism. We hope that this demonstration of scenario generation from quantitative and qualitative data inspires researchers to consider more rigorous methods for scenario design in future work. Read More Image Oct 8, 2025 Human Interactive Driving 1 Minute Read
From Dashboards to Dialogue: Evaluating a Conversational AI Coach for Performance Driving Skill Development
From Dashboards to Dialogue: Evaluating a Conversational AI Coach for Performance Driving Skill Development robyn.cherinka… Wed, 11/12/2025 - 14:50 Learning in domains involving complex motor skills, such as performance driving, often requires feedback that is timely, personalized, and actionable. Yet many drivers rely on video and telemetry data to review their performance without guidance. We explore how conversational AI can support post-drive reflection by integrating LLM-generated coaching into an interactive review interface. In an exploratory within-subjects simulator study (n=16), participants completed laps under two conditions: one with video and data visualizations alone, and another with the same tools augmented with a conversational interface that provided verbal feedback after each lap. Conversational feedback supported short-term improvements in lap time, average speed, and steering control, and was rated as more useful and satisfying—though it also elicited slightly higher nervousness. These results suggest that conversational AI can make post-drive feedback more interpretable and actionable, particularly for drivers reviewing performance data in high-skill contexts like performance driving. Read More Image Oct 4, 2025 Human Interactive Driving 1 Minute Read
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation robyn.cherinka… Wed, 11/12/2025 - 14:40 Off-policy evaluation (OPE) estimates the performance of a target policy using offline data collected from a behavior policy, and is crucial in domains such as robotics or healthcare where direct interaction with the environment is costly or unsafe. Existing OPE methods are ineffective for high-dimensional, long-horizon problems, due to exponential blow-ups in variance from importance weighting or compounding errors from learned dynamics models. To address these challenges, we propose STITCH-OPE, a model-based generative framework that leverages denoising diffusion for long-horizon OPE in high-dimensional state and action spaces. Starting with a diffusion model pre-trained on the behavior data, STITCH-OPE generates synthetic trajectories from the target policy by guiding the denoising process using the score function of the target policy. STITCH-OPE proposes two technical innovations that make it advantageous for OPE: (1) prevents over-regularization by subtracting the score of the behavior policy during guidance, and (2) generates long-horizon trajectories by stitching partial trajectories together end-to-end. We provide a theoretical guarantee that under mild assumptions, these modifications result in an exponential reduction in variance versus long-horizon trajectory diffusion. Experiments on the D4RL and OpenAI Gym benchmarks show substantial improveme…
Beyond detection: A multi-agent framework for root cause analysis of financial discrepancies in distributed environments
The increasing complexity and fragmentation of financial systems in large organizations have created significant challenges for financial teams, particularly in performing real-time, end-to-end validation, as existing validation methods relying on static rules or batch processing are often inadequate for today's dynamic financial environments. This paper introduces a novel approach using Large Language Model (LLM)-based browser agents within a multi-agent framework to enhance financial validation processes. The framework leverages domain-specific agents that autonomously navigate web-based financial platforms to validate data, interpret discrepancies, and perform root cause analysis, ensuring higher accuracy, transparency, and auditability compared to traditional systems. A synthetic dataset and controlled simulation environment were used to evaluate the framework's performance across 20 distinct financial scenarios, revealing significant improvements in validation accuracy (from 40% with a Vanilla agent to 65% with the proposed approach). The results indicate that the proposed multi-agent approach, by isolating validation tasks into specialized agents and orchestrating a coordinated investigation, provides a more reliable, scalable, and interpretable solution for high-stakes financial environments.
Leveraging Generative AI in Project Management
Introduction The project management landscape is undergoing a transformative shift, driven by the rapid advancements...
Why work at the EU AI Office?
It's probably not for everyone, but there are a lot of great reasons to consider, including the potential to have an impact on AI governance worldwide, leveraging the first-mover advantage, and more.
Developing Advanced RAG Systems with Qdrant Hybrid Cloud and LangChain
LangChain and Qdrant are collaborating on the launch of Qdrant Hybrid Cloud , which is designed to empower engineers and scientists globally to easily and securely develop and scale their GenAI applications. Harnessing LangChain’s robust framework, users can unlock the full potential of vector search, enabling the creation of stable and effective AI products. Qdrant Hybrid Cloud extends the same powerful functionality of Qdrant onto a Kubernetes-based architecture, enhancing LangChain’s capability to cater to users across any environment.
Red Hat OpenShift and Qdrant Hybrid Cloud Offer Seamless and Scalable AI
We’re excited about our collaboration with Red Hat to bring the Qdrant vector database to Red Hat OpenShift customers! With the release of Qdrant Hybrid Cloud , developers can now deploy and run the Qdrant vector database directly in their Red Hat OpenShift environment. This collaboration enables developers to scale more seamlessly, operate more consistently across hybrid cloud environments, and maintain complete control over their vector data. This is a big step forward in simplifying AI infrastructure and empowering data-driven projects, like retrieval augmented generation (RAG) use cases, advanced search scenarios, or recommendations systems.
Enhance AI Data Sovereignty with Aleph Alpha and Qdrant Hybrid Cloud
Aleph Alpha and Qdrant are on a joint mission to empower the world’s best companies in their AI journey. The launch of Qdrant Hybrid Cloud furthers this effort by ensuring complete data sovereignty and hosting security. This latest collaboration is all about giving enterprise customers complete transparency and sovereignty to make use of AI in their own environment. By using a hybrid cloud vector database, those looking to leverage vector search for the AI applications can now ensure their proprietary and customer data is completely secure.
STACKIT and Qdrant Hybrid Cloud for Best Data Privacy
Qdrant and STACKIT are thrilled to announce that developers are now able to deploy a fully managed vector database to their STACKIT environment with the introduction of Qdrant Hybrid Cloud . This is a great step forward for the German AI ecosystem as it enables developers and businesses to build cutting edge AI applications that run on German data centers with full control over their data. Vector databases are an essential component of the modern AI stack. They enable rapid and accurate retrieval of high-dimensional data, crucial for powering search, recommendation systems, and augmenting machine learning models. In the rising field of GenAI, vector databases power retrieval-augmented-generation (RAG) scenarios as they are able to enhance the output of large language models (LLMs) by injecting relevant contextual information. However, this contextual information is often rooted in confidential internal or customer-related information, which is why enterprises are in pursuit of solutions that allow them to make this data available for their AI applications without compromising data privacy, losing data control, or letting data exit the company’s secure environment.
New RAG Horizons with Qdrant Hybrid Cloud and LlamaIndex
We’re happy to announce the collaboration between LlamaIndex and Qdrant’s new Hybrid Cloud launch , aimed at empowering engineers and scientists worldwide to swiftly and securely develop and scale their GenAI applications. By leveraging LlamaIndex’s robust framework, users can maximize the potential of vector search and create stable and effective AI products. Qdrant Hybrid Cloud offers the same Qdrant functionality on a Kubernetes-based architecture, which further expands the ability of LlamaIndex to support any user on any environment.
Cutting-Edge GenAI with Jina AI and Qdrant Hybrid Cloud
We’re thrilled to announce the collaboration between Qdrant and Jina AI for the launch of Qdrant Hybrid Cloud , empowering users worldwide to rapidly and securely develop and scale their AI applications. By leveraging Jina AI’s top-tier large language models (LLMs), engineers and scientists can optimize their vector search efforts. Qdrant’s latest Hybrid Cloud solution, designed natively with Kubernetes, seamlessly integrates with Jina AI’s robust embedding models and APIs. This synergy streamlines both prototyping and deployment processes for AI solutions.
Qdrant Hybrid Cloud and Haystack for Enterprise RAG
We’re excited to share that Qdrant and Haystack are continuing to expand their seamless integration to the new Qdrant Hybrid Cloud offering, allowing developers to deploy a managed vector database in their own environment of choice. Earlier this year, both Qdrant and Haystack, started to address their user’s growing need for production-ready retrieval-augmented-generation (RAG) deployments. The ability to build and deploy AI apps anywhere now allows for complete data sovereignty and control. This gives large enterprise customers the peace of mind they need before they expand AI functionalities throughout their operations.
Elevate Your Data With Airbyte and Qdrant Hybrid Cloud
In their mission to support large-scale AI innovation, Airbyte and Qdrant are collaborating on the launch of Qdrant’s new offering - Qdrant Hybrid Cloud . This collaboration allows users to leverage the synergistic capabilities of both Airbyte and Qdrant within a private infrastructure. Qdrant’s new offering represents the first managed vector database that can be deployed in any environment. Businesses optimizing their data infrastructure with Airbyte are now able to host a vector database either on premise, or on a public cloud of their choice - while still reaping the benefits of a managed database product.
Leveraging Linux Internals to Supercharge Osquery Malware Detection
Using /proc to find fileless malware
The Haitian Times thrives by understanding its audience, making smart financial decisions and embracing AI
Despite the challenges faced by the media industry, the Haitian Times –a print and digital newspaper catering to Haitian immigrants in the United States– has managed to not only survive but thrive by adapting to the changing needs of its audience. Through a combination of smart financial decisions, leveraging technology like AI, and deeply understanding […] The post The Haitian Times thrives by understanding its audience, making smart financial decisions and embracing AI appeared first on LatAm Journalism Review by the Knight Center .
Fine-tuning LLMs for longer context and better RAG systems
Update June 2024: Anyscale Endpoints (Anyscale's LLM API Offering) and Private Endpoints (self-hosted LLMs) are now available as part of the Anyscale Platform. Click [here](https://console.anyscale.com/?utm_source=anyscale&utm_medium=blog&utm_campaign=blog_callout&utm_content=june2024_product_update_subheading) to get started on the Anyscale platform.
Comment on State-Of-The-Art Approaches to Attribution in Marketing by Bay tech media
In the realm of digital marketing, attribution methodologies have undergone significant advancements. State-of-the-art approaches include Multi-Touch Attribution (MTA) for holistic channel tracking, Algorithmic Attribution leveraging machine learning for precise credit assignment, Cross-Device Attribution capturing interactions across devices, Incrementality Testing to gauge true marketing impact, and AI-Powered Attribution for deep data analysis. Bay Tech Media implements these cutting-edge methods, empowering businesses with accurate insights to refine and optimize their marketing strategies effectively.
The Artificiality of Alignment
This essay first appeared in Reboot . Credulous, breathless coverage of “AI existential risk” (abbreviated “x-risk”) has reached the mainstream. Who could have foreseen that the smallcaps onomatopoeia “ꜰᴏᴏᴍ” — both evocative of and directly derived from children’s cartoons —
LLM Powered Autonomous Agents
Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT , GPT-Engineer and BabyAGI , serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver. Agent System Overview In a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components: Planning Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks. Reflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results. Memory Short-term memory: I would consider all the in-context learning (See Prompt Engineering ) as utilizing short-term memory of the model to learn. Long-term memory: This provides the agent with the capability to retain and recall (infinite) information over extended periods, often by leveraging an external vector store and fast retrieval. Tool use The agent learns to call external APIs for extra information that is missing from the model weights (often hard to change after pre-training), including current information, code execution capability, access to proprietary information sources and more. Overview of a LLM-powered autonomous agent syste…
Stanford AI Lab Papers and Talks at ICLR 2022
The International Conference on Learning Representations (ICLR) 2022 is being hosted virtually from April 25th - April 29th. We’re excited to share all the work from SAIL that’s being presented, and you’ll find links to papers, videos and blogs below. Feel free to reach out to the contact authors directly to learn more about the work that’s happening at Stanford! List of Accepted Papers Autonomous Reinforcement Learning: Formalism and Benchmarking Authors : Archit Sharma*, Kelvin Xu*, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn Contact : architsh@stanford.edu Links: Paper | Website Keywords : reinforcement learning, continual learning, reset-free reinforcement learning MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts Authors : Weixin Liang, James Zou Contact : wxliang@stanford.edu Links: Paper | Video | Website Keywords : benchmark dataset, distribution shift, out-of-domain generalization An Explanation of In-context Learning as Implicit Bayesian Inference Authors : Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma Contact : xie@cs.stanford.edu Links: Paper | Video Keywords : gpt-3, in-context learning, pretraining, few-shot learning GreaseLM: Graph REASoning Enhanced Language Models for Question Answering Authors : Xikun Zhang, Antoine Bosselut, Michihiro Yasunaga, Hongyu Ren, Percy Liang, Christopher D. Manning, Jure Leskovec Contact : xikunz2@cs.stanford.edu Award nominations: Sp…
Understanding Deep Learning Algorithms that Leverage Unlabeled Data, Part 1: Self-training
Deep models require a lot of training examples, but labeled data is difficult to obtain. This motivates an important line of research on leveraging unlabeled data, which is often more readily available. For example, large quantities of unlabeled image data can be obtained by crawling the web, whereas labeled datasets such as ImageNet require expensive labeling procedures. In recent empirical developments, models trained with unlabeled data have begun to approach fully-supervised performance (e.g., Chen et al., 2020 , Sohn et al., 2020 ). This series of blog posts will discuss our theoretical work which seeks to analyze recent empirical methods which use unlabeled data. In this first post, we’ll analyze self-training , which is a very impactful algorithmic paradigm for semi-supervised learning and domain adaptation . In Part 2, we will use related theoretical ideas to analyze self-supervised contrastive learning algorithms, which have been very effective for unsupervised representation learning . Background: self-training We will first provide a basic overview of self-training algorithms, which are the main focus of this blog post. The core idea is to use some pre-existing classifier \(F_{pl}\) (referred to as the “pseudo-labeler”) to make predictions (referred to as “pseudo-labels”) on a large unlabeled dataset, and then retrain a new model with the pseudo-labels. For example, in semi-supervised learning, the pseudo-labeler is obtained from training on a small labeled datase…
How to Improve User Experience (and Behavior): Three Papers from Stanford's Alexa Prize Team
Introduction In 2019, Stanford entered the Alexa Prize Socialbot Grand Challenge 3 for the first time, with its bot Chirpy Cardinal , which went on to win 2nd place in the competition. In our previous post , we discussed the technical structure of our socialbot and how developers can use our open-source code to develop their own. In this post we share further research conducted while developing Chirpy Cardinal to discover common pain points that users encounter when interacting with socialbots, and strategies for addressing them. The Alexa Prize is a unique research setting, as it allows researchers to study how users interact with a bot when doing so solely for their own motivations. During the competition, US-based Alexa users can say the phrase “let’s chat” to speak in English to an anonymous and randomly-selected competing bot. They are free to end the conversation at any time. Since Alexa Prize socialbots are intended to create as natural an experience as possible, they should be capable of long, open-domain social conversations with high coverage of topics. We observed that Chirpy users were interested in many different subjects, from current events (e.g., the coronavirus) to pop culture (e.g., the movie Frozen 2 ) to personal interests (e.g,. their pets). Chirpy achieves its coverage of these diverse topics by using a modular design that combines both neural generation and scripted dialogue, as described in our previous post . We used this setting to study three quest…
How do you solve strictly constrained optimization problems with pytorch?
I am trying to solve the following problem using pytorch: given a six sided die whose average roll is known to be 4.5, what is the maximum entropy distribution for the faces? (Note: I know a bunch of non-pytorch techniques for solving problems of this sort - my goal here is really to be better understand how to solve constrained optimization problems in general with pytorch. In real life I'm working on a much harder constrained optimization problem involving a neural model implemented in pytorch, and I'm hoping that if I can solve this problem then it will help with the harder problem.) In principle it should be possible to handle this by looking for critical points of the Lagrangian: $$L(p) = -\sum_i p_i \log p_i + \lambda\left(\sum_i p_i - 1\right) + \mu\left(\sum_i i p_i - 4.5\right)$$ Here's my attempt to do this with pytorch: class MaxEntropyDice(torch.nn.Module): def __init__(self, num_faces=6, mean_constraint=3.5): super().__init__() self.num_faces = num_faces self.mean_constraint = mean_constraint self.p = torch.nn.Parameter(F.normalize(torch.rand(num_faces), p=1, dim=0)) self.probability_multiplier = torch.nn.Parameter(torch.rand(1)) self.mean_multiplier = torch.nn.Parameter(torch.rand(1)) def forward(self): entropy = -torch.sum(self.p * torch.log(self.p)) probability_term = self.probability_multiplier * (torch.sum(self.p) - 1) mean_term = self.mean_multiplier * ( torch.sum(torch.tensor(range(1, self.num_faces + 1)) * self.p) - self.mean_constraint ) lagrangian = en…
Stanford AI Lab Papers at EMNLP/CoNLL 2021
The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021) will take place next week, colocated with CoNLL 2021. We’re excited to share all the work from SAIL that will be presented, and you’ll find links to papers, videos and blogs below. Feel free to reach out to the contact authors directly to learn more about the work that’s happening at Stanford! List of Accepted Papers Calibrate your listeners! Robust communication-based training for pragmatic speakers Authors : Rose E. Wang, Julia White, Jesse Mu, Noah D. Goodman Contact : rewang@stanford.edu Links: Paper | Video Keywords : language generation, pragmatics, communication-based training, calibration, uncertainty Cross-Domain Data Integration for Named Entity Disambiguation in Biomedical Text Authors : Maya Varma, Laurel Orr, Sen Wu, Megan Leszczynski, Xiao Ling, Christopher Ré Contact : mvarma2@stanford.edu Links: Paper | Video Keywords : named entity disambiguation, biomedical text, rare entities, data integration ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts Authors : Yuta Koreeda, Christopher D. Manning Contact : koreeda@stanford.edu Links: Paper | Website Keywords : natural language inference, contract, law, legal, dataset Venue : The Findings of EMNLP 2021 The Emergence of the Shape Bias Results from Communicative Efficiency Authors : Eva Portelance, Michael C. Frank, Dan Jurafsky, Alessandro Sordoni, Romain Laroche Contact : portelan@stanford.edu Links…
Selective Classification Can Magnify Disparities Across Groups
Selective classification, where models are allowed to “abstain” when they are uncertain about a prediction, is a useful approach for deploying models in settings where errors are costly. For example, in medicine, model errors can have life-or-death ramifications, but abstentions can be easily handled by backing off to a doctor, who then makes a diagnosis. Across a range of applications from vision 1 2 3 and NLP 4 5 , even simple selective classifiers, relying only on model logits, routinely and often dramatically improve accuracy by abstaining. This makes selective classification a compelling tool for ML practitioners 6 7 . However, in our recent ICLR paper, we find that despite reliably improving average accuracy, selective classification can fail to improve and even hurt the accuracy over certain subpopulations of the data . As a motivating example, consider the task of diagnosing pleural effusion, or fluid in the lungs, from chest X-rays. Pleural effusion is often treated with a chest drain, so many pleural effusion cases also have chest drains, while most cases without pleural effusion do not have chest drains 8 . While selective classification improves average accuracy for this task, we find that it does not appreciably improve accuracy on the most clinically relevant subgroup, or subpopulation, of the data: those that have pleural effusion but don’t yet have a chest drain, i.e. those that have pleural effusion but have not yet been treated for it. Practitioners, thus,…
A Gentle Introduction to Graph Neural Networks
What components are needed for building learning algorithms that leverage the structure and properties of graphs?
Minibatch Weighted Sampling for estimating log(q_z) for disentangled representation based on ELBO loss in VAE
I'm reading the paper "Isolating Sources of Disentanglement in VAEs" . Assuming $p(n)$ is a uniform distribution and that we have a model to get $q(z|n)$ for any input $n$ . Also, $q(z|n)$ represents a normal distribution, so the model predicts the mean and covariance matrix for $q(z|n)$ . Please consider the following minibatch-based estimation provided by the author. Question 1. I don’t understand how the third line follows from the second line, where $E_{p(B_M)}$ is introduced along with averaging over the values of $q(z|n_m)$ . It is somewhat intuitive, but I'd like to know concretely. Question 2. What happened to $E_r(B_M|n)$ in (S4)?
Teaching from Simple Abstractions
(You need to know programming to understand this post. If you know what linked lists are, that’s enough to get the general point, but more knowledge would be more helpful.) Within the Programming Languages community, there’s a subcommunity that thinks a lot about education, especially for introductory courses. Two main approaches are SICP approach and […]
Comment on Jumia Kenya Partners Postal Corporation of Kenya For Logistics by Monday News Round Up: Send it Like A Snail Mail | techcabal.com
[…] Jumia Kenya is leveraging on Posta Kenya logistics network […]