Mid-Market Management Consulting Firm
4x faster proposal turnaround through AI-powered knowledge search and document generation.
The client is a management consulting firm with 180 employees across offices in New York, Chicago, and London, specializing in operational transformation for mid-market healthcare and financial services companies. Annual revenue was approximately $42M at the time of engagement, with a target of reaching $55M within two years. The growth plan depended heavily on increasing proposal win rate and proposal volume without a proportional increase in headcount.
The firm had accumulated 12 years of project deliverables, proposals, case studies, frameworks, and client research across an estimated 38,000 documents. That institutional knowledge should have been the firm's main competitive advantage. Instead, it was inaccessible. Documents were scattered across a SharePoint instance with an inconsistent folder structure, individual consultants' OneDrive accounts, a legacy Confluence wiki that had been partially migrated and then abandoned, and a shared drive that predated the SharePoint deployment.
When a partner or senior consultant needed to assemble a proposal, the process was slow. The typical approach was to email three or four colleagues asking if anyone remembered a similar engagement, search SharePoint with keyword queries that returned hundreds of irrelevant results, and manually review past proposals to find reusable sections. A partner estimated that assembling a competitive proposal for a new engagement took an average of 34 hours of senior consultant time, spread across two to three weeks. The firm was producing approximately 8 proposals per month and winning 22% of them.
Knowledge loss from departing employees made this worse over time. When a principal with 9 years of tenure left the firm, an estimated 40% of the institutional knowledge related to their client relationships and sector expertise left with them. New hires spent their first three to four months asking colleagues where to find things and rebuilding personal reference libraries.
A previous initiative to organize the knowledge base — a six-month taxonomy project led by a knowledge management consultant — produced a detailed classification scheme that was never adopted because it required consultants to manually tag and categorize their own documents. Compliance with the tagging protocol dropped below 5% within two months.
Millennial AI structured the engagement around a principle the firm's previous taxonomy project had missed: the system has to meet people where they work, not ask them to change how they work. The solution needed to make finding and reusing knowledge easier than the current workaround of emailing colleagues — without requiring anyone to tag, categorize, or migrate documents.
Diagnose: Knowledge Audit & Architecture (Weeks 1-3)
The first three weeks focused on understanding what existed, where it lived, and how people actually searched for it. A structured audit of the four document repositories identified 38,400 documents, of which approximately 29,000 were substantive (the remainder being duplicates, empty templates, and outdated drafts). Interviews with 14 consultants across all seniority levels mapped the real information-seeking workflows: who people asked, what search terms they used, and where they gave up. The audit identified five high-value document categories that accounted for 80% of reuse: past proposals, project deliverables, sector research, client presentations, and internal frameworks. The retrieval-augmented generation (RAG) architecture was designed around these categories, with retrieval tuned to surface the specific sections within documents rather than returning entire 80-page deliverables.
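The audit tooling itself isn't described in the engagement materials. As a rough sketch of the duplicate-detection pass, a content-hash scan like the one below is enough to separate exact duplicates and stub files from substantive documents; the repository paths and the minimum-size threshold here are hypothetical.

```python
# Hypothetical sketch of the audit's duplicate-detection pass (stdlib only).
# Repository mount points and the 1 KB stub threshold are illustrative,
# not details from the engagement.
import hashlib
from pathlib import Path

def audit(repo_roots: list[Path]) -> dict[str, list[Path]]:
    """Group files by content hash; any group with >1 entry is a set of exact duplicates."""
    by_hash: dict[str, list[Path]] = {}
    for root in repo_roots:
        for path in root.rglob("*"):
            if not path.is_file() or path.stat().st_size < 1024:
                continue  # skip empty templates and stub drafts (heuristic)
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash.setdefault(digest, []).append(path)
    return by_hash

if __name__ == "__main__":
    groups = audit([Path("/mnt/sharepoint"), Path("/mnt/legacy_share")])
    dupes = {h: paths for h, paths in groups.items() if len(paths) > 1}
    print(f"{len(groups)} unique documents, {len(dupes)} duplicate groups")
```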
Design & Deploy: RAG System & Semantic Search (Weeks 3-8)
The RAG system was built in three layers. The first was a document ingestion pipeline that connected to SharePoint, OneDrive, and the legacy shared drive, processing documents into chunks optimized for retrieval. Each chunk retained its source metadata: document title, author, date, client (anonymized for the search layer), and project type. The second was a vector database storing embeddings generated by a fine-tuned embedding model, with hybrid search combining semantic similarity and keyword matching. The third was an LLM layer that synthesized retrieved chunks into direct answers with source citations. A consultant could ask “What operational metrics did we track in the Mercy Health supply chain project?” and receive a synthesized answer with links to the specific sections of the relevant deliverables, rather than a list of 30 documents that might contain the answer.
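A minimal sketch of the hybrid retrieval layer is below. The `Chunk` schema, the cosine/keyword blend, and the `alpha=0.7` weighting are assumptions for illustration; the production system used a fine-tuned embedding model (stood in for by `query_vec` here) and a vector database rather than in-memory lists.

```python
# Minimal sketch of hybrid retrieval over metadata-tagged chunks.
# The schema, scoring blend, and weights are assumptions, not the
# production configuration.
from dataclasses import dataclass, field
import math

@dataclass
class Chunk:
    text: str
    meta: dict  # title, author, date, anonymized client, project type
    vector: list[float] = field(default_factory=list)

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm + 1e-9)

def keyword_score(query: str, text: str) -> float:
    """Crude keyword overlap; a production system would use BM25 or similar."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / (len(q) or 1)

def hybrid_search(query: str, query_vec: list[float], chunks: list[Chunk],
                  k: int = 5, alpha: float = 0.7) -> list[Chunk]:
    """Blend semantic similarity with keyword overlap and return the top-k chunks."""
    scored = [(alpha * cosine(query_vec, c.vector)
               + (1 - alpha) * keyword_score(query, c.text), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]
```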
Deploy: Proposal Generation Engine (Weeks 8-12)
The proposal engine was built on top of the RAG system. When a partner initiated a new proposal, they provided the prospect name, industry, engagement type, and a brief description of the scope. The system then retrieved relevant past proposals, extracted reusable sections (firm credentials, methodology descriptions, similar case studies, team bios, and pricing structures), and assembled a first-draft proposal in the firm's standard template. The draft was not meant to be sent as-is. It was a 70-80% complete starting point that a senior consultant could refine, customize, and finalize. The system also flagged potential conflicts of interest by cross-referencing the prospect against the anonymized client database. Partners described the output as equivalent to having a senior associate who had read every proposal the firm had ever written.
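As an illustration of the assembly step, the sketch below fills each reusable-section slot from a retrieval function like `hybrid_search` above and runs the conflict check. The section names, request fields, and `retrieve` callable (a query-to-ranked-chunks function with the embedding applied) are assumptions drawn from the description, not the firm's actual template.

```python
# Illustrative sketch of the proposal engine's assembly step. Section slots,
# request fields, and the conflict check are assumptions about the flow
# described above, not the firm's actual implementation.
from dataclasses import dataclass

REUSABLE_SECTIONS = ["credentials", "methodology", "case studies", "team bios", "pricing"]

@dataclass
class ProposalRequest:
    prospect: str
    industry: str
    engagement_type: str
    scope: str

def assemble_draft(req: ProposalRequest, retrieve, known_clients: set[str]) -> str:
    """Fill each section slot with the best-matching past material and flag conflicts."""
    # Conflict-of-interest check against the anonymized client database.
    if req.prospect.lower() in {c.lower() for c in known_clients}:
        print(f"CONFLICT FLAG: {req.prospect} matches a record in the client database")
    sections = {}
    for name in REUSABLE_SECTIONS:
        hits = retrieve(f"{name} {req.industry} {req.engagement_type} {req.scope}", k=1)
        sections[name] = hits[0].text if hits else f"[{name}: no prior material found]"
    # A 70-80% complete first draft for a senior consultant to refine, not a finished proposal.
    return "\n\n".join(f"{name.upper()}\n{body}" for name, body in sections.items())
```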
Scale: Deployment & Adoption (Weeks 12-16)
The system was rolled out in two phases. The search tool was deployed firm-wide first, with a simple chat interface accessible from the firm's intranet and a Slack integration for quick queries. Usage was voluntary and no training was mandated. Adoption was tracked by weekly active users. Within three weeks, 67% of consultants had used the search tool at least once, and 41% were using it daily. The proposal engine was rolled out to the partner group and senior managers in week 14, with individual onboarding sessions. Search queries, generated proposals, and consultant edits all fed back into the relevance model, so results improved with use.
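The feedback loop is the part worth sketching: consultant interactions nudge per-document priors that bias future rankings. The update rule and weights below are illustrative only; the case study does not specify how the relevance model was trained.

```python
# Sketch of the feedback loop: opens and reuses nudge per-document priors
# that bias future rankings. The update rule and weights are illustrative;
# the engagement doesn't specify how the relevance model was trained.
from collections import defaultdict

class FeedbackReranker:
    def __init__(self, lr: float = 0.1):
        self.prior: defaultdict[str, float] = defaultdict(float)  # doc_id -> learned boost
        self.lr = lr

    def record(self, doc_id: str, useful: bool) -> None:
        """useful=True when a result was opened or reused in a proposal; False when skipped."""
        self.prior[doc_id] += self.lr * (1.0 if useful else -0.5)

    def rerank(self, scored: list[tuple[float, str]]) -> list[tuple[float, str]]:
        """Re-sort (base_score, doc_id) pairs with the learned per-document boost added."""
        return sorted(scored, key=lambda pair: pair[0] + self.prior[pair[1]], reverse=True)
```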
4x faster proposal turnaround: average assembly time from 34 hours to 8 hours.
31% proposal win rate: up from 22%, attributed to higher proposal quality and faster response.
60% reduction in research time: consultants finding relevant past work in minutes instead of days.
67% firm-wide adoption in 3 weeks: without any mandatory training or usage requirements.
$3.8M incremental revenue pipeline: from increased proposal volume in the first quarter post-launch.
12 proposals per month: up from 8, with the same team.
The most telling indicator came from new hire onboarding. A consultant who joined six weeks after the system went live described it as having a senior colleague with perfect recall of every project the firm had done, available whenever they needed it. The managing partner noted that the proposal quality improvement was not just speed — proposals were now consistently referencing the firm's most relevant past work, because the system surfaced engagements that the proposal author might not have known existed. The London office, which had historically operated with limited visibility into US engagements, reported using the search tool more frequently than either US office.
“The previous taxonomy project failed because it asked 180 consultants to change how they work. This succeeded because it didn't ask anyone to change anything. You search the way you think, and the system figures out what you need. Our proposal process went from a scavenger hunt to a starting line.”
Managing Partner, Management Consulting Firm
Institutional knowledge trapped in shared drives?
If your team spends more time searching for past work than building on it, we can change that.