The insider's guide to choosing an AI consultant (from someone who sold consulting at McKinsey)
By Neha Mazumdar, Partner, Strategy & Digital Transformation at Millennial AI. Education: MBA, IIM Ahmedabad; B.E., Delhi College of Engineering. Previously at McKinsey and Pocket FM.
Published on March 17, 2026. Category: AI Strategy.
Summary: The AI consulting market is projected to hit $257.6B by 2033, and most firms entering it have never shipped a production AI system. Use the 8-point evaluation framework in this guide, covering build capability, production evidence, team structure, pricing transparency, data readiness, success metrics, post-deployment support, and references. With 80% of AI projects failing (RAND) and 42% of companies abandoning most of their AI initiatives in 2025 (S&P Global), your choice of consultant is the single biggest variable you control. Ask the 10 questions in this guide during your first meeting. The answers (and non-answers) will tell you everything.
Why choosing an AI consultant is different now
A year ago, finding an AI consultant meant choosing between a handful of specialized firms and the Big Four. Today, every strategy house, digital agency, and two-person outfit with a ChatGPT API key calls itself an AI consultancy. The market tells the story: AI consulting sits at $16.4 billion and is projected to reach $257.6 billion by 2033. That kind of growth attracts serious talent and serious pretenders in equal measure.

[RSM's 2025 Middle Market AI Survey](https://rsmus.com/insights/services/digital-transformation/rsm-middle-market-ai-survey-2025.html) found that 91% of mid-market companies are already using generative AI, but 92% have hit significant challenges and 53% felt underprepared for what they encountered. [Gartner predicted in July 2024](https://www.gartner.com/en/newsroom/press-releases/2024-07-29-gartner-predicts-30-percent-of-generative-ai-projects-will-be-abandoned-after-proof-of-concept-by-end-of-2025) that 30% of generative AI projects would be abandoned after proof of concept by the end of 2025. [RAND's research](https://www.rand.org/pubs/research_reports/RRA2680-1.html) puts the broader AI project failure rate at 80%. And [S&P Global's 2025 data](https://www.spglobal.com/market-intelligence/en/news-insights/research/ai-experiences-rapid-adoption-but-with-mixed-outcomes) shows 42% of companies abandoned most of their AI initiatives entirely.

These numbers should alarm you. They should also make you very careful about who you hire to guide your AI investments. The gap between a consultant who can build a compelling slide deck and one who can deliver a working system has never been wider.

I've sat on both sides of this table. I spent years at McKinsey, where I helped package and sell consulting engagements to exactly the kind of companies now shopping for AI help. Today, I run strategy and transformation at Millennial AI, where we build and deploy the systems we recommend. That dual perspective is what this guide is built on.
What I learned selling consulting at McKinsey
McKinsey taught me how to sell ideas. The machinery is impressive: a team of brilliant people distills complex problems into structured frameworks, pressure-tests them with data, and presents them in decks that make clients feel like they're buying certainty. But here's what I noticed from the inside: the sale almost always happened before the client could evaluate the work. [HBR's classic piece "Consulting Is More Than Giving Advice"](https://hbr.org/1982/09/consulting-is-more-than-giving-advice) nails this: clients struggle to judge consulting quality in advance because the product is intangible until it's delivered. You're buying a promise wrapped in a brand.

At McKinsey, the brand did most of the heavy lifting. A partner would walk into a room, reference three analogous engagements, show a proprietary framework, and the deal was nearly done. The actual delivery team, often staffed weeks later, might have deep expertise or might be learning on the job. I'm not criticizing McKinsey. They hire extraordinary people and do real work. But the model reveals something important about how consulting gets sold: the buying process is designed to favor the seller. The pitch is polished, the references are curated, and the proposal is structured to make saying yes feel like the safe choice.

AI consulting has inherited all of these dynamics and added a new one: technical complexity that most buyers can't independently verify. When a firm tells you they've "implemented large language models at scale" or "built custom AI solutions for Fortune 500 clients," how do you check that? What does "at scale" even mean? This asymmetry is why you need a structured evaluation approach. Gut feel and brand recognition aren't enough when the failure rate is this high.
The 8-point evaluation framework
[Forrester's Wave for AI Services (Q2 2024)](https://www.forrester.com/report/the-forrester-wave-tm-ai-services-q2-2024/RES180736) uses a 19-criterion evaluation framework for assessing AI service providers. That's thorough, but it's built for analysts comparing enterprise vendors, and most companies hiring AI consultants need something more practical. The framework below distills what actually matters into eight criteria. Each one is designed to surface a specific kind of risk that won't show up in a standard proposal review. I've organized them in pairs because they reinforce each other: a firm that scores well on one criterion in a pair but poorly on the other is sending you a signal worth paying attention to.
Do they build what they recommend? Can they show production systems?
Criterion 1: Builder credentials. The single most important question you can ask an AI consultant is whether they build the systems they recommend. Many firms operate as "strategy plus vendor selection" shops: they'll assess your needs, recommend an architecture, and then hand you off to a systems integrator or tell your internal team to execute. This model worked reasonably well for ERP implementations and cloud migrations. It fails for AI because the gap between a technically sound recommendation and a working system is enormous. AI projects require constant iteration between strategy and implementation. The data behaves differently than expected. The model performs well in testing and falls apart in production. The integration points with existing systems surface problems nobody anticipated. A consultant who only advises can't course-correct when these things happen. They can write a change order, but they can't fix the model. Ask for specifics: What systems has your team built and deployed? What's running in production today? Can you walk me through one implementation from problem statement to live system?

Criterion 2: Production evidence. Demos are easy to fake. Proofs of concept are cheap to build. What matters is production: systems running daily, handling real data, delivering measurable outcomes over months. When a firm shows you their portfolio, push past the case study format. Ask about uptime. Ask about edge cases they discovered after deployment. Ask what they had to rebuild. A firm with real production experience will have war stories. They'll know which parts of the process are genuinely hard and which are just tedious. A firm selling vaporware will give you generalities.

The combination of these two criteria eliminates about 70% of the market. Most AI consultancies either don't build (they advise and refer) or haven't shipped anything to production (they're still in perpetual POC mode). Finding a firm that does both narrows your search dramatically.
Team structure and pricing transparency
Criterion 3: Who actually does the work? The bait-and-switch is consulting's oldest trick. Senior partners sell the engagement, then junior staff deliver it. In traditional consulting, this is annoying but manageable: a smart MBA can synthesize research and build a strategy deck regardless of seniority. AI work is different. The person building your model needs genuine technical depth. The person designing your data pipeline needs hands-on engineering experience. You can't staff these roles with generalists learning on the job. Ask to meet the delivery team before you sign. Ask about their backgrounds, what they've built, and how long they've been working together. A firm that can't introduce you to the people who'll do the work is telling you something: either those people haven't been hired yet, or they're being pulled from other engagements based on availability. Also ask about subcontracting. Some firms present a unified team but actually outsource significant portions of the work. This isn't automatically bad, but you should know about it and have visibility into who those subcontractors are.

Criterion 4: How do they scope and price? AI projects are inherently uncertain. Anyone who gives you a fixed price for a six-month AI engagement in the first meeting is either padding the estimate enormously or doesn't understand the work well enough. Good AI consultants scope in phases. They'll propose a discovery phase with clear deliverables, then scope subsequent phases based on what they learn. The pricing should be transparent: you should be able to see the rate card, understand how hours are allocated, and know what triggers additional costs. Watch for firms that bundle everything into a single large number and resist breaking it down. Watch for firms that price based on "value delivered" without a clear mechanism for measuring that value. And watch for firms that lowball the initial estimate and then rack up change orders; this is the model that gives consulting its worst reputation. The best arrangement I've seen is a fixed-price discovery phase (2-4 weeks) followed by time-and-materials work within an agreed budget range, with weekly financial transparency. This structure gives you a clear exit point after discovery if the fit isn't right, and ongoing visibility into where your money goes.
Data readiness and success metrics
Criterion 5: How do they handle your data reality? Every company thinks its data is messier than everyone else's. They're usually right: everyone's data is messy in its own unique way. The question is how your AI consultant approaches that mess. A firm that jumps straight to model architecture before understanding your data infrastructure is building on sand. The best consultants will spend a disproportionate amount of the discovery phase on data: where it lives, how it flows, who owns it, what's missing, what's duplicated, and what governance exists around it. Ask them to walk you through their data readiness assessment process. If they don't have one, that's your answer. If they do, listen for specificity. Generic answers like "we'll do a data audit" are worthless. You want to hear about specific tools they use, specific dimensions they evaluate, and specific criteria that determine whether to proceed, restructure, or pause (the sketch at the end of this section shows what a few of those dimensions look like in practice). Also ask what happens when the data isn't ready. A good consultant will have a clear answer because they've been in that situation many times. They'll talk about data engineering sprints, interim solutions, and realistic timelines for getting data to the point where AI work can begin.

Criterion 6: Do they define success before starting? This sounds obvious, but a stunning number of AI engagements launch without clear success metrics. The consultants are happy to start billing, the client is eager to get going, and nobody pauses to define what "done" looks like. You want a firm that insists on defining measurable outcomes before work begins. And these metrics need to be business metrics, not technical ones. "Model accuracy of 95%" is a technical metric. "Reduce customer churn by 15% within six months of deployment" is a business metric. Both matter, but if your consultant only talks about the first kind, they're thinking about the project from an engineering perspective rather than a business one. Ask how they track and report on these metrics during the engagement. Ask what happens when the metrics aren't being met. The answer should involve course correction, candid conversation, and potentially recommending that you stop the project if the ROI case has collapsed. A consultant who will never tell you to stop spending money is a consultant optimizing for their revenue.
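To make "specific dimensions" concrete, here is a minimal sketch of what one slice of a data readiness check can look like for a single table, assuming tabular data in pandas. The three dimensions shown (completeness, key uniqueness, freshness) and the proceed/pause thresholds are illustrative assumptions on my part; a real assessment also covers lineage, ownership, access controls, and governance.

```python
import pandas as pd

def readiness_snapshot(df: pd.DataFrame, key_col: str, ts_col: str) -> dict:
    """Score one table on three common readiness dimensions."""
    null_rate = df.isna().mean().max()                      # worst-column completeness
    dup_rate = 1 - df[key_col].nunique() / max(len(df), 1)  # key uniqueness
    staleness = (pd.Timestamp.now() - pd.to_datetime(df[ts_col]).max()).days
    return {
        "worst_null_rate": round(float(null_rate), 3),
        "duplicate_key_rate": round(float(dup_rate), 3),
        "days_since_last_record": int(staleness),
        # Hypothetical proceed/pause criteria -- tune these to your context.
        "proceed": bool(null_rate < 0.2 and dup_rate < 0.05 and staleness < 30),
    }
```

A firm with a real assessment process runs checks like this across every source system and can tell you, per table, whether to proceed, restructure, or pause. If their "data audit" can't be described at even this level of specificity, criterion 5 is failing.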
Post-deployment support and reference checks
Criterion 7: What happens after launch? AI systems are living systems. They drift, they encounter new data patterns, they need retraining, and the business context around them changes. A consultant who delivers a model and disappears is leaving you with a depreciating asset. Ask about their post-deployment model. Do they offer ongoing support? What does it look like, and what does it cost? How do they handle model monitoring and retraining? (A sketch of one basic drift check follows at the end of this section.) What's their response time when something breaks? The best firms build knowledge transfer into the engagement from the start. They're training your team alongside the implementation so that you can eventually own the system independently. They'll be honest about which components your team can maintain and which will need ongoing specialized support. Watch out for firms that create dependency by design: systems that only they can maintain, proprietary tools that lock you in, or architectures that require their ongoing involvement to function. You want a partner who makes themselves less necessary over time, even though that conflicts with their revenue model.

Criterion 8: What do their past clients actually say? References are standard in any procurement process, but most people waste them by asking generic questions. When you talk to a consultant's references, ask questions that reveal what the proposal won't. Specific questions that work:

- What surprised you about working with them?
- What would you do differently if you hired them again?
- How did they handle the first major setback?
- Did the team that pitched the work actually deliver it?
- What's the system doing today compared to what was promised?
- Would you hire them for a different kind of project, or only for exactly what they did for you?

That last question is particularly revealing. A firm with genuine breadth will get enthusiastic referrals across project types. A firm that did one thing well and sells everything will get hesitant answers when you ask about adjacent capabilities. Also ask for references they didn't volunteer. Any firm can give you three happy clients. Ask for a client where things went sideways and they had to recover. How a firm handles failure tells you more about them than how they handle success.
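Since "we monitor for drift" is easy to say and hard to verify, it helps to know what a basic drift check actually looks like. Here is a minimal sketch of one widely used statistic, the population stability index (PSI), comparing a feature's live distribution against the distribution the model was trained on. This is illustrative, not any particular firm's methodology; real monitoring also tracks prediction quality, latency, and pipeline health, and the thresholds in the docstring are common rules of thumb, not universal standards.

```python
import numpy as np

def population_stability_index(reference: np.ndarray, live: np.ndarray, bins: int = 10) -> float:
    """PSI between training-time data and live data for one feature.

    Common rule of thumb: < 0.1 stable, 0.1-0.25 watch closely,
    > 0.25 investigate and consider retraining.
    """
    # Bin edges come from the reference (training-time) distribution.
    edges = np.unique(np.quantile(reference, np.linspace(0, 1, bins + 1)))
    # Clip live values into the reference range so nothing falls outside the bins.
    live = np.clip(live, edges[0], edges[-1])
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    live_pct = np.histogram(live, bins=edges)[0] / len(live)
    # Floor the proportions to avoid division by zero and log(0).
    ref_pct = np.clip(ref_pct, 1e-6, None)
    live_pct = np.clip(live_pct, 1e-6, None)
    return float(np.sum((live_pct - ref_pct) * np.log(live_pct / ref_pct)))
```

A consultant with real post-deployment experience can talk at this level of detail about what they monitor, how often, and what triggers a retrain. Vague assurances are your cue to dig deeper.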
Red flags that should end the conversation
Some warning signs are subtle. These are not. If you encounter any of the following, walk away.

- They guarantee specific outcomes. AI projects carry genuine uncertainty. A firm that guarantees results is either lying or has padded the guarantee with so many conditions that it's meaningless. Confident firms speak in ranges and probabilities. Desperate or dishonest firms guarantee.
- They can't explain their approach without jargon. Technical depth and jargon are inversely correlated in my experience. People who deeply understand AI can explain what they do in plain language. People who are performing expertise hide behind terminology. If you can't understand what they're proposing after an hour-long meeting, the problem is their communication.
- They resist introducing the delivery team. I've mentioned this already, but it bears repeating because it's the most reliable predictor of a bad engagement. If the people selling you can't produce the people who'll do the work, the team doesn't exist yet.
- They have no failed projects. Every firm that has done real work has projects that didn't go as planned. A firm that claims a perfect track record is either lying or hasn't done enough work to have encountered real problems. Ask about failures. The quality of the answer matters more than the content.
- Their case studies are all "strategy" and "roadmap" with no production systems. This is the calling card of an AI-washed strategy firm. They've renamed their digital transformation practice and added "AI" to every slide. The work hasn't changed; the packaging has.
- They propose a massive engagement from day one. Any firm suggesting a six-figure, multi-month commitment before a discovery phase is optimizing for their pipeline, not your outcome. Good AI work starts small, proves value, and scales.
- They own the IP. Read the contract carefully. Some firms retain ownership of models, code, or even data derivatives created during the engagement. You should own everything built with your data and your money. Full stop.

| # | Criterion | What to Look For | Red Flag |
|---|-----------|------------------|----------|
| 1 | Build what they recommend | Production systems shown | "We partner with others for build" |
| 2 | Production evidence | Live systems, real users | Only POCs and demos |
| 3 | Team transparency | Named people with profiles | "We'll staff based on needs" |
| 4 | Pricing clarity | Milestone-based, scope change plan | Vague T&M estimates |
| 5 | Data assessment | Structured process, specific tools | "We'll do a data audit" (generic) |
| 6 | Success metrics | Business metrics before start | Only technical metrics |
| 7 | Post-deployment | Ongoing support, iteration plan | Engagement ends at launch |
| 8 | References | 3 clients from last 12 months | No references or 2+ years old |
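If you're comparing several shortlisted firms, one practical way to apply the table above is a simple weighted scorecard, with any outright red flag treated as disqualifying. The sketch below is illustrative; the weights and the 0-3 scale are my assumptions, not a prescribed methodology.

```python
# Hypothetical scorecard for the eight criteria above.
# Score each criterion 0-3: 0 = red flag observed, 1 = weak,
# 2 = adequate, 3 = strong, verifiable evidence.
WEIGHTS = {
    "builds_what_it_recommends": 3,
    "production_evidence": 3,
    "team_transparency": 2,
    "pricing_clarity": 2,
    "data_assessment": 2,
    "success_metrics": 2,
    "post_deployment": 1,
    "references": 1,
}

def score_firm(scores: dict) -> float:
    """Return a weighted score in [0, 1]; any red flag disqualifies."""
    if any(scores[c] == 0 for c in WEIGHTS):
        return 0.0
    earned = sum(w * scores[c] for c, w in WEIGHTS.items())
    possible = sum(3 * w for w in WEIGHTS.values())
    return earned / possible
```

Scoring this way forces you to gather evidence on every criterion before comparing firms, which is exactly the discipline a polished pitch is designed to short-circuit.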
The first meeting: 10 questions that cut through the pitch
You've screened firms against the framework. You've got a shortlist. Now you're sitting in the first meeting, and they're running their standard pitch. Here are ten questions that will break them out of the script and reveal what they actually know and how they actually work.

1. "Walk me through the last AI project that failed and what you learned." Tests honesty and self-awareness. A rehearsed answer about a minor hiccup doesn't count.
2. "Who specifically will be writing code on my project, and can I talk to them this week?" Forces them to commit to a team or admit they haven't staffed it.
3. "What's your data readiness assessment process, step by step?" Reveals whether they have a real methodology or are making it up.
4. "How do you price a project when neither of us knows exactly what we'll find in the data?" Tests commercial maturity. Good answers involve phased approaches and transparency.
5. "Show me something running in production right now that your team built." Separates builders from advisors instantly.
6. "What would make you tell us to stop this project?" Tests whether they'll prioritize your interests over their billing.
7. "How do you handle knowledge transfer so we're not dependent on you in 12 months?" Reveals whether they build dependency or independence.
8. "What's your rate card, and how do you track hours against budget?" Tests financial transparency. Evasion here predicts billing problems later.
9. "Give me a reference from a project that went sideways." Tests confidence and honesty. Firms that refuse this one are hiding something.
10. "What would you do differently if you were us, evaluating consultants?" An open-ended question that reveals their actual worldview. Listen for self-awareness and genuine advice versus a pitch disguised as candor.

Take notes during these conversations. The pattern across answers matters as much as any individual response. A firm that gives you direct, specific answers to most of these questions is worth your time. A firm that deflects, generalizes, or gets defensive is telling you exactly what the engagement will feel like.
How we built Millennial AI around these principles
I'd be dishonest if I pretended this guide was purely academic. We built Millennial AI specifically because we saw the problems described above and wanted to build a firm that solved them.

Every partner at Millennial AI builds. We don't have a strategy team that hands off to a separate delivery team. The people who design the solution are the people who implement it. This isn't scalable in the traditional consulting sense, and we're fine with that. We'd rather do fewer engagements well than many engagements poorly. We publish our rate cards. We scope in phases. We define success metrics before we write a line of code. We build knowledge transfer into every engagement so that our clients can run their own systems within a reasonable timeframe. We've turned down projects where we didn't believe the data could support the proposed solution, even when the revenue would have been meaningful for a firm our size.

We do this because we've been on the other side. We've seen what happens when smart companies hire the wrong AI partner, and it's ugly: months of wasted time, six or seven figures spent, and nothing to show for it except a deck full of recommendations and a proof of concept that never made it to production.

The 80% failure rate in AI projects isn't a force of nature. It's the result of misaligned incentives, poor scoping, inadequate technical depth, and consultants who sell what they can't deliver. Every one of those factors is within your control as a buyer. This guide gives you the framework to exercise that control. Use the eight criteria to evaluate candidates. Ask the ten questions in your first meeting. Watch for the red flags. And remember that the best predictor of a successful AI engagement isn't the consultant's brand, their slide deck, or their client list. It's whether they've built and deployed systems like the one you need, with the transparency and accountability to prove it.

Your AI investment is too important to leave to a polished pitch. Make them earn it.