
Everyone said AI would get cheaper. The chips really did. So did the tokens. Somehow, enterprise bills didn’t get the memo.

Sanchayan Sinha saw that gap coming before most people had a name for it. In December 2024, he got together with Parag Jain and Praveen Jain to build a chip that would make AI cheaper to use. A chip built for inference, that is, for running an existing model instead of training one.

The idea wasn’t revolutionary. Google, Groq, and even Nvidia had been developing inference chips since 2015. However, nobody in India had touched it. The ecosystem was focused on LLMs, chip acquisition, and later, enterprise applications.

Today, Turiyam, the firm that Sinha built, is one of a handful of Indian chip companies focusing on inference, with a full stack, no Nvidia anywhere in the build, and a thesis that the next AI-infrastructure race will be fought over who builds the cheapest, not the smartest, model.

Turiyam’s bet rests on a contradiction.

Prices of tokens (the units of data that an AI model processes) have collapsed roughly 35% in over two years. However, annual enterprise-AI budgets have gone up nearly 6X in the same period, to $7 million in 2026 (DFI Insights).

“The cost of a single query has collapsed, but our total bill has exploded,” a former CTO at an Indian enterprise said. “Two years ago, we might have spent, say, $45 to generate a thousand responses. Today, we can do that for a fraction of the cost, almost $0.75. But we’re no longer generating just a thousand responses. We’re generating millions.”
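The arithmetic behind that quote can be sketched in a few lines of Python. The per-thousand prices ($45 and $0.75) come from the quote; the response volumes are illustrative assumptions, since the CTO says only "a thousand" and "millions".

```python
def total_bill(price_per_thousand: float, responses: int) -> float:
    """Total spend in dollars for a given number of responses."""
    return price_per_thousand * responses / 1_000

# Prices per thousand responses, taken from the quote above.
OLD_PRICE = 45.00
NEW_PRICE = 0.75

# Hypothetical volumes, chosen only to illustrate the effect.
old_volume = 1_000
new_volume = 100_000_000

old_bill = total_bill(OLD_PRICE, old_volume)
new_bill = total_bill(NEW_PRICE, new_volume)
price_drop = OLD_PRICE / NEW_PRICE  # how many times cheaper each response became

print(f"Unit price fell {price_drop:.0f}x")
print(f"Total bill: ${old_bill:,.2f} -> ${new_bill:,.2f}")
```

At these prices, the total bill rises whenever volume grows by more than 60 times, which is the point the CTO is making: usage has outrun the price cut.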

Turiyam’s answer is to stop charging for tokens altogether.

Most AI providers charge by the number of tokens used. Turiyam charges by the finished product instead, such as a generated image or a completed customer call, at a price it claims is 10% to 20% of the cost of running the same job on Nvidia.

A company spending Rs 2 crore a year on AI infrastructure today could, by that math, be looking at spending just about Rs 40 lakh on the same functions.
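As a quick check of those rupee figures, here is a minimal sketch. The two amounts are from the article; the crore and lakh conversions are the standard ones (1 crore = 100 lakh = 10 million), and the variable names are illustrative.

```python
LAKH = 100_000
CRORE = 100 * LAKH

current_spend = 2 * CRORE     # Rs 2 crore a year, from the article
projected_spend = 40 * LAKH   # Rs 40 lakh a year, from the article

# Integer arithmetic keeps the percentages exact.
share_pct = projected_spend * 100 // current_spend  # projected cost as % of current
savings_pct = 100 - share_pct                        # reduction in spend, in %

print(f"Projected spend is {share_pct}% of current spend")
print(f"That is a {savings_pct}% reduction")
```

In other words, Rs 40 lakh is one-fifth of Rs 2 crore.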

Sinha had earlier said he expected 95% of India’s AI market to eventually run on inference alone (Digitimes Asia, “Indian startup targets AI inference opportunity with full-stack compute platform”).

This Bengaluru startup isn’t building faster chips than Nvidia. It’s building cheaper AI