February 10, 2026
Mark Bjornsgaard
In AI circles, 2026 is being dubbed ‘the year of inference’. “80% [of AI spending] will be on inference, and 20% will be on training. That is our forecast,” said Lenovo CEO Yuanqing Yang. Deloitte predicts that inference workloads will account for two-thirds of AI compute in 2026.
But will these predictions actually come true? Or are AI experts getting ahead of themselves? Here’s why inference’s arrival might be slower than you think, what kind of UK data centre would support inference workloads, and why visual inference is on its way.
Inference, visual inference, and the difference between the two
Let’s back up a minute. What exactly is inference? Where training uses compute power to build an AI model, inference uses that trained model to process new data. This often takes the form of agentic AI: autonomous systems that run without continuous prompting, completing complex tasks across several tools. Inference also has a different hardware profile from training: rather than the large GPU clusters that training demands, it can run on a broader mix of hardware, from CPUs to purpose-built inference accelerators. As for visual inference, this sees AI models process new visual data, for example in autonomous vehicles or manufacturing quality control.
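To make the distinction concrete, here’s a minimal sketch in Python (PyTorch is used purely as an illustration, not a claim about any particular production stack): training runs a backward pass and updates the model’s weights, while inference is a single forward pass over new data with those weights frozen.

```python
import torch
import torch.nn as nn

# A toy network stands in for a large pretrained model.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# --- Training: backward pass, gradients, weight updates ---
optimiser = torch.optim.SGD(model.parameters(), lr=0.01)
x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
optimiser.zero_grad()
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()        # backpropagation: the compute-hungry step
optimiser.step()       # update the model's weights

# --- Inference: a forward pass over new data, weights frozen ---
model.eval()
with torch.no_grad():  # no gradient tracking needed
    prediction = model(torch.randn(1, 128)).argmax(dim=1)
```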
Why inference’s moment in the spotlight is further away than you think
You might have heard that Nvidia bought AI chip startup Groq, maker of inference-ready chips, for $20 billion. Some say this deal means inference is about to take off. But if inference were truly going to dominate the AI world, why would Groq settle for $20 billion instead of holding out for a bigger deal?
Deep Green’s Founder and Chief Innovation Officer, Mark Bjornsgaard, believes text-based inference isn’t ready to hit the mainstream. “Transformer-based large language models still aren’t up to the task. Their context windows are way too short,” he said. “When we talk to enterprises, they tell us, ‘We can’t use them. They’re too unreliable.’” So is all the hype around inference nothing more than a storm in a teacup? Not quite. Visual inference is living up to the hype with existing use cases that are taking off.
The rise of visual inference: why it’s taking off this year
Imagine hailing a taxi and stepping into the car, only to see that the driver’s seat is empty. No, the driver hasn’t popped out for a coffee: the car uses sensors and software to drive itself to your destination. This is the future of ride-hailing, already well-established in some US states, and due to arrive in London in 2026. Staying safe in a self-driving car relies on visual inference: real-time processing of visual data to detect approaching obstacles on the road.
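As a rough sketch of what that safety constraint looks like in software, consider a hypothetical perception loop; `capture_frame`, `detect_obstacles`, and `brake` are illustrative stand-ins, not any real autonomous-driving API:

```python
import time

FRAME_BUDGET_MS = 33  # ~30 frames per second leaves roughly 33 ms per frame

def control_loop(capture_frame, detect_obstacles, brake):
    """Process every camera frame within a hard deadline."""
    while True:
        start = time.perf_counter()
        frame = capture_frame()              # visual data from the car's sensors
        obstacles = detect_obstacles(frame)  # the visual-inference step
        if obstacles:
            brake()
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms > FRAME_BUDGET_MS:
            # A missed deadline here is a safety problem, not a UX problem.
            print(f"frame overran budget: {elapsed_ms:.1f} ms")
```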
From logistics to retail to manufacturing, there are countless other use cases for visual inference, where robots must assess visual data to make real-time decisions. And in the UK, fields where we excel, such as pharmaceuticals, advanced manufacturing, academia, and preventative health, will all drive the visual inference trend.
Which UK data centres are best suited for inference workloads?
Running inference workloads requires immediate answers. That self-driving car might have milliseconds to swerve around a hazard, and it can’t afford to wait for a distant data centre to process the data. So, when it comes to inference, low latency is non-negotiable, which means placing your compute close to where your enterprise generates and uses its data.
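The physics bears this out. A back-of-the-envelope calculation, assuming signals travel through fibre at roughly two-thirds the speed of light and ignoring routing, queuing, and compute time entirely, shows how distance alone eats into a millisecond-scale budget:

```python
SPEED_OF_LIGHT_KM_S = 299_792                    # in a vacuum
FIBRE_SPEED_KM_S = SPEED_OF_LIGHT_KM_S * 2 / 3   # roughly one-third slower in fibre

def best_case_round_trip_ms(distance_km: float) -> float:
    """Lower bound on network round-trip time over fibre."""
    return 2 * distance_km / FIBRE_SPEED_KM_S * 1000

for km in (5, 50, 500):
    print(f"{km:>3} km away: {best_case_round_trip_ms(km):.2f} ms round trip, minimum")
# 5 km ≈ 0.05 ms, 50 km ≈ 0.5 ms, 500 km ≈ 5 ms, and that is before
# routing, queuing, and the inference itself add their share.
```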
According to Dell, 82% of AI workloads will be carried out at colocation providers or in on-prem data centres. A colocation provider offers the infrastructure, rack space, and cooling your servers need in an existing UK data centre near your HQ. On-prem, meanwhile, means commissioning a data centre company to build a new facility on your enterprise’s premises. What’s not so ideal for inference is a large, centralised 100MW+ facility. Data centres this big are usually located on remote industrial estates, so they can’t deliver the short distances and low latency that inference requires.
So, urban data centres located near tech HQ hubs are the best fit for inference workloads. What’s more, these data centres are also perfectly placed to reuse heat: district heating systems are closer and easier to connect to, and the captured heat loses less energy on its shorter journey to a second use. Choosing a data centre that reuses heat cuts costs and reduces emissions for your enterprise, benefiting your sustainability performance and your bottom line.
Inside Deep Green’s inference-ready set-up
From the appropriate cooling systems to established heat reuse projects, Deep Green is equipped for inference workloads and well-versed in supporting enterprises with inference requirements. Whether you’re after colocation in one of our existing data centres or an on-prem build suited to your enterprise’s needs, Deep Green has the inference expertise to make it happen. Want to know more about how we build heat reuse into our data centres, creating holistic urban ecologies?