MIT Says 95 Percent of Generative AI Pilots Fail. What Legal Can Learn. 

MIT grabbed headlines recently by reporting that 95 percent of generative AI pilots never drive measurable results, with only 5 percent of projects achieving real scale. Every stalled pilot drains budget, slows transformation, builds opposition, and piles pressure on legal ops leaders to prove ROI. Below, we examine which lessons from the study apply to your legal operations AI strategy.

As Nick Whitehouse, Chief AI Officer at Onit, explained in a recent response to the study: “Are 95% of generative AI projects failing? Nope, not really. The way we are measuring them and the way we are running them, yes, but the technology itself, no.” 

Why Generative AI Pilots Fail in Legal Departments 

AI that is disconnected from workflow 

The MIT study found that the leading reason generative AI pilots fail is weak integration with workflows, not weak technology. If AI doesn’t live inside e-Billing, matter management, or contract review, it lives on the sidelines. Integration reduces the barriers to adoption, and adoption is impact. This is exactly why modern legal ops software that is built to connect workflows matters.

Chasing sizzle over substance  

The same research showed that companies spend heavily on high-profile pilots in sales and marketing, while the biggest ROI sits in back-office automation. In legal, it can be tempting to tackle impressive use cases like advice giving and negotiation strategy. However, AI can deliver more value, faster, in invoice review, spend management, and contract analytics than in “showcase” projects.

As Whitehouse noted, “The MIT study was actually helpful: it found the biggest wins often come from focusing on back-office tasks. These are well documented, process-driven, and usually have the data you need to measure results. That’s where success tends to show up first.” 

The build versus buy dilemma 

Whether you build or buy, what separates success from failure is focus. Analysts noted that vendor tools succeed about twice as often as internal builds. Internal builds can work, but only if they are targeted and resourced properly. General-purpose AI models still require significant tuning, configuration, custom UX, and, most importantly, scientific validation to maximize their value in a particular domain. That takes domain-specific focus and attention.

Tackling too much at once  

Spreading resources too thin is another culprit. Startups focusing on one use case often scale quickly, reaching $20 million in revenue in under a year. Larger enterprises tend to pilot too many initiatives at once and struggle to show meaningful progress. Legal ops leaders who pick one workflow, deliver results, and scale from there build momentum and credibility.  

As Onit’s Chief AI Officer pointed out, “About 5% of businesses in the study are doing an exceptionally good job using AI to solve real pain points. You see this especially in the startup ecosystem. In fact, the study showed that businesses working with specialized companies or products focused on these problems have twice the success rate compared to those going it alone.” 

How to Make Sure Your Legal Ops Project is Among the 5% of Success Stories 

The MIT study rattled more than legal departments. It shook investors and boards, fueling skepticism about whether AI can actually deliver measurable business value.  

Meanwhile, adoption pressure is only growing. A recent survey found that 96 percent of legal professionals already use AI in their daily workflows, and nearly half describe it as essential. Legal cannot afford to treat AI as optional. However, moving from experimentation to execution requires a structured approach.  

Here’s how to set generative AI pilots up for success in legal ops: 

Pinpoint one high-impact process 

Look for workflows where time and cost converge, like invoice review or contract approvals. 

Embed AI into the systems your team already uses

If it isn’t integrated, it won’t be adopted. Seamless connection to your core legal tools is non-negotiable. 

Measure what matters 

Track cycle-time reductions, outside counsel savings, and compliance improvements. These are the metrics leadership cares about, not abstract productivity gains.

Fund what works 

Double down on pilots that deliver measurable ROI and resist the urge to scatter budget across experiments. 

Lead the adoption curve 

Change management is not an afterthought. Equip your team with training, communicate wins, and build trust in the tools. 

From Pilot Failure to Legal Ops Success 

Loosely structured generative AI pilots are failing at high rates, but that doesn’t mean legal ops is destined for the same fate. Infosys research shows that 95 percent of executives using AI have experienced a mishap, and only 2 percent of organizations meet responsible AI standards. Like the MIT study, those findings make an explicit argument for structure, governance, and discipline.

“The right way to look at this is as a maturity journey: moving from ad hoc experiments to fully integrated AI in your core processes. That’s when you’ll start seeing massive success.” – Nick Whitehouse 

Legal ops leaders who focus on integration, measure what matters, and scale intentionally will be the ones who prove AI’s real value.  
 