Technology & Innovation • Published March 2026 • 9 min read

Voice AI in the Drive-Thru: Why 85% Accuracy Isn't Good Enough

Major chains are betting billions on automated ordering systems, but the technology still can't match human performance when it matters most

Tags: drive-thru, Automation, Technology

QSR Pro Staff

The QSR Pro editorial team covers the quick service restaurant industry with in-depth analysis, data-driven reporting, and operator-first perspective.


The future of fast food was supposed to arrive in a Chicago McDonald's parking lot. In 2021, the world's largest burger chain partnered with IBM to test voice AI at the drive-thru speaker - a technology that promised to take orders faster, more accurately, and without the labor shortage headaches plaguing the industry. Three years later, in July 2024, McDonald's pulled the plug.

The problem wasn't that the technology didn't work. It did - about 85% of the time. The problem was that 85% isn't good enough when you're processing millions of transactions per day, each one representing a customer who expects their order to be right and a franchisee who needs to protect razor-thin margins in an industry where order accuracy directly impacts profitability.

## The Great Voice AI Experiment: Who's In and Who's Out

The quick-service restaurant industry is in the middle of a high-stakes race to automate the drive-thru, the channel that generates an estimated 70% of revenue for major chains. But the scoreboard reveals a technology still finding its footing.

McDonald's announced in June 2024 that it would end its Automated Order Taker (AOT) partnership with IBM, shutting down the technology at more than 100 test locations no later than July 26, 2024. The three-year pilot had struggled with what the industry politely calls "order accuracy incidents" - viral TikTok videos showed the system adding hundreds of chicken nuggets to orders, mishearing "ice cream" as "bacon," and failing to process cancellation requests.

But while McDonald's was exiting, others were doubling down. Wendy's announced in February 2025 that it would deploy its FreshAI system - built in partnership with Google Cloud - to between 500 and 600 locations by year's end. The company had started with a single Columbus, Ohio restaurant in 2023, expanded to 36 company-operated stores across Ohio and Florida in 2024, and is now scaling nationwide.
Wendy's has claimed a "success rate of nearly 99%" - though that metric counts any order started by the AI and submitted to the point-of-sale system, even if a human had to intervene mid-conversation to fix errors.

Taco Bell parent company Yum Brands announced in July 2024 that it would bring voice AI to hundreds of U.S. drive-thrus by the end of the year, expanding from a pilot that started with five California locations. As of early 2024, more than 100 Taco Bell restaurants were using the technology. The company has since announced a broader partnership with Nvidia to accelerate its AI deployments across all Yum brands, including KFC and Pizza Hut.

Meanwhile, SoundHound AI - one of the industry's most aggressive voice AI providers - has quietly built what may be the largest footprint in the sector. The company said it had processed more than 100 million customer interactions by October 2024 and reported powering over 14,000 restaurant locations by Q2 2025. Its client roster includes White Castle, Chipotle, Jersey Mike's, Church's Texas Chicken, Applebee's, and numerous regional chains.

## The Accuracy Problem: Why Every Percentage Point Matters

The industry's dirty secret is that even human order-takers aren't perfect. Traditional drive-thrus achieve order accuracy rates of 80-85% during peak hours, according to multiple industry sources. Some studies peg human accuracy as high as 89-92% under optimal conditions.

Voice AI systems have reached comparable - sometimes superior - levels in controlled deployments. SoundHound claims its AI can complete more than 90% of orders without requiring human intervention. Presto Automation reported in December 2024 that its "non-interaction rate" (NIR) - the percentage of orders taken entirely by AI - averaged 85% across all Presto Voice-enabled restaurants, with certain locations hitting 95%.

But raw accuracy numbers obscure the real challenge: AI and humans fail differently.

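
The vendor figures above are not measuring the same thing, which is easier to see when the definitions are written out. The sketch below is a rough illustration only: the `Order` fields, the choice of denominators, and the 20-order sample are all hypothetical, and none of it reflects how Wendy's, Presto, or any other vendor actually computes its numbers.

```python
from dataclasses import dataclass


@dataclass
class Order:
    started_by_ai: bool     # the AI opened the conversation
    submitted_to_pos: bool  # the order reached the point-of-sale system
    human_intervened: bool  # a staff member stepped in mid-conversation
    correct: bool           # the customer received what they asked for


def pct(part, whole):
    """Percentage rounded to one decimal place (0.0 for an empty sample)."""
    return round(100 * part / whole, 1) if whole else 0.0


def success_rate(orders):
    """AI-started orders that reached the POS, with or without human help."""
    ai = [o for o in orders if o.started_by_ai]
    return pct(sum(o.submitted_to_pos for o in ai), len(ai))


def non_interaction_rate(orders):
    """AI-started orders that reached the POS with no human involvement."""
    ai = [o for o in orders if o.started_by_ai]
    done_alone = sum(o.submitted_to_pos and not o.human_intervened for o in ai)
    return pct(done_alone, len(ai))


def order_accuracy(orders):
    """Orders that were actually correct, regardless of who took them."""
    return pct(sum(o.correct for o in orders), len(orders))


# Hypothetical sample of 20 orders (illustrative, not real data).
SAMPLE = (
    [Order(True, True, False, True)] * 15    # AI alone, correct
    + [Order(True, True, False, False)] * 3  # AI alone, wrong
    + [Order(True, True, True, True)] * 1    # human stepped in, correct
    + [Order(True, False, True, False)] * 1  # abandoned before reaching the POS
)

print(success_rate(SAMPLE), non_interaction_rate(SAMPLE), order_accuracy(SAMPLE))
```

On this made-up sample the same 20 orders yield a 95% success rate, a 90% non-interaction rate, and 80% order accuracy, which is why a headline percentage means little without its definition.
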
A 2025 study by customer experience firm Intouch Insight found that traditional drive-thrus achieved 89% order accuracy, while AI-powered systems dropped to 83%. The critical finding: 65% of AI errors involved customizations - the "no pickles," "extra sauce," "make it a meal" modifications that represent the most profitable upsell opportunities and the most frustrating customer experience failures.

Humans might mishear an order during a lunch rush, but they excel at the conversational inference that makes drive-thru ordering work. When a customer says "I'll take a large chocolate milkshake," a human knows they probably mean a large chocolate Frosty at Wendy's, or a McCafé shake at McDonald's. AI systems require extensive training on brand-specific menu terminology - and they still struggle.

The IBM system that McDonald's abandoned reportedly achieved 85% accuracy in its Chicago test market. But McDonald's serves roughly 70 million customers per day globally. Applied across all of them, an 85% accuracy rate would mean as many as 10.5 million incorrect orders daily (15% of 70 million) - an unacceptable customer service and cost burden, even accounting for the fact that not all of those customers use drive-thrus.

## The Triple Challenge: Accents, Dialects, and Ambient Noise

Drive-thrus are among the most acoustically hostile environments imaginable for speech recognition technology. Forbes reported in February 2026 that McDonald's IBM partnership ended largely because "the system struggled with interpreting different accents, dialects and background noise."

The challenges are layered and specific:

Background noise fluctuates wildly. A drive-thru microphone must filter out idling engines, honking horns, construction noise from nearby roads, weather conditions, and multiple voices inside the vehicle - all while isolating the speaker's voice clearly enough to distinguish "Coke" from "Diet Coke."

Accent and dialect variation creates a moving target. The same menu item can be pronounced dozens of ways across regional accents. "Large" sounds different in Boston than in Birmingham. "Fountain drink" might be called "soda," "pop," or "Coke" (as a generic term for all soft drinks) depending on geography. SoundHound acknowledged in 2022 that "the complexities of language differences make creating accent-agnostic speech recognition systems nearly as challenging as offering distinct languages."

Real-time conversational complexity means AI must process interruptions, mid-order changes, simultaneous speakers (a parent ordering for kids in the back seat), slang, incomplete sentences, and the rapid-fire customization requests that define modern fast food ordering.

Technology providers have made significant progress on these challenges. Presto Automation highlighted in June 2025 that its system, designed specifically for restaurants located on busy roads, "passed multiple background noise challenges during the order-taking process." Amazon's Nova Sonic model, announced in late 2025, promises "accurate recognition of streaming speech across accents with reliability to background noise."

But the gap remains measurable - and costly.

## The Hybrid Model: AI's Actual Future in the Drive-Thru

No major chain is pursuing fully autonomous voice AI at scale. Instead, the industry is converging on a hybrid model: AI handles routine transactions and escalates complex orders to human staff.

This isn't an admission of failure - it's practical economics. The average drive-thru order is simple: a combo meal, maybe a drink, minimal customization. For those transactions, AI is fast, consistent, never calls in sick, and can upsell ("Would you like to add a dessert?") without the social awkwardness that makes human workers hesitate.

Presto Automation's hybrid approach explicitly combines AI with human backup. The company's "Presto Voice" system is designed to hand off smoothly to a human operator when the AI detects uncertainty - a customer repeating themselves, requesting an unusual modification, or expressing frustration.

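
A handoff rule of this kind can be sketched in a few lines. This is only an illustration of the general idea, not Presto's implementation: the `TurnSignals` fields, the recognizer confidence score, and both thresholds are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TurnSignals:
    """Signals gathered during one conversational turn (all hypothetical)."""
    asr_confidence: float       # 0.0-1.0 score from the speech recognizer (assumed available)
    repeat_count: int           # times the customer has repeated the same request
    unknown_modifiers: int      # requested modifications not found on the menu
    frustration_detected: bool  # e.g. the customer asks to speak to a person


# Hypothetical thresholds; a real deployment would tune these per location.
MIN_CONFIDENCE = 0.80
MAX_REPEATS = 1


def should_escalate(signals: TurnSignals) -> bool:
    """Return True when the order should be handed to a human operator."""
    return (
        signals.asr_confidence < MIN_CONFIDENCE
        or signals.repeat_count > MAX_REPEATS
        or signals.unknown_modifiers > 0
        or signals.frustration_detected
    )


# A routine combo order stays with the AI; a repeated request goes to a human.
routine = TurnSignals(asr_confidence=0.95, repeat_count=0,
                      unknown_modifiers=0, frustration_detected=False)
repeated = TurnSignals(asr_confidence=0.95, repeat_count=2,
                       unknown_modifiers=0, frustration_detected=False)
print(should_escalate(routine), should_escalate(repeated))
```

The design choice worth noting is that any single signal triggers the handoff. That errs toward involving a human, which matches the article's point that chains are prioritizing brand risk over maximum automation.
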
Wendy's FreshAI pairs the voice system with a digital menu board showing a visual confirmation of the order, allowing customers to catch errors in real time and reducing the burden on the AI to achieve perfect transcription.

The economics work when AI handles 70-80% of orders completely, reducing the number of human order-takers needed per shift while keeping experienced staff available for the 20-30% of transactions that require human judgment. This isn't full automation - it's labor reallocation.

## The ROI Question: When Does 85% Beat 92%?

Despite lower accuracy rates, voice AI deployments are accelerating. The reason is simple: cost.

Labor represents one of the largest variable expenses in QSR operations, typically 25-30% of revenue. An AI system that can handle even 70% of drive-thru orders without human intervention generates immediate labor savings, particularly during overnight and off-peak hours when staffing is most challenging.

The technology also addresses the industry's chronic staffing shortage. U.S. restaurant operators have struggled to fill positions since the pandemic, with turnover rates in QSR often exceeding 100% annually. An AI system that never quits, never needs training, and works 24/7 has inherent value even if it occasionally adds unwanted McNuggets to an order.

Revenue impact extends beyond labor savings. AI systems consistently outperform humans at upselling. They don't forget to suggest adding fries, they're never too busy to offer a dessert, and they don't experience the social discomfort that makes human workers hesitate to push additional items. Industry estimates suggest AI-driven upselling can increase average check size by 10-15%.

But the flip side is measurable too: incorrect orders drive customer dissatisfaction, generate waste, and require costly remakes. A single viral video of an AI system failing spectacularly can damage brand reputation in ways that are hard to quantify but impossible to ignore.

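
One way to frame the trade-off in this section is as a per-order calculation: upsell and labor gains on one side, the cost of additional errors on the other. The sketch below is a back-of-the-envelope model, not an industry formula. The 85% and 92% accuracy figures and the 10% upsell lift come from the ranges quoted in this article, while the average check, labor saving per order, and cost per error are purely hypothetical placeholders.

```python
def net_gain_per_order(avg_check, upsell_lift, labor_saved,
                       ai_error_rate, human_error_rate, cost_per_error):
    """Estimated net gain per order from switching to AI order-taking."""
    upsell_gain = avg_check * upsell_lift
    extra_error_cost = (ai_error_rate - human_error_rate) * cost_per_error
    return upsell_gain + labor_saved - extra_error_cost


def break_even_error_cost(avg_check, upsell_lift, labor_saved,
                          ai_error_rate, human_error_rate):
    """Cost per error at which the AI's gains are exactly cancelled out."""
    extra_errors = ai_error_rate - human_error_rate
    if extra_errors <= 0:
        return float("inf")  # AI is at least as accurate; errors never cancel the gain
    return (avg_check * upsell_lift + labor_saved) / extra_errors


# Hypothetical inputs: $10 average check, $0.25 labor saved per order,
# $5 all-in cost per incorrect order. Accuracy: 85% (AI) vs 92% (human).
gain = net_gain_per_order(10.00, 0.10, 0.25, 0.15, 0.08, 5.00)
threshold = break_even_error_cost(10.00, 0.10, 0.25, 0.15, 0.08)
print(f"net gain per order: ${gain:.2f}")
print(f"break-even cost per error: ${threshold:.2f}")
```

With these placeholder inputs the AI comes out roughly $0.90 ahead per order, and only falls behind if each additional error costs more than about $17.86. The model ignores reputational damage, which the article argues is the cost that is hardest to quantify.
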
## The Timeline: When Will Voice AI Be "Good Enough"?

Industry insiders suggest we're two to four years away from voice AI systems that can match or exceed human performance across all common drive-thru scenarios - not just average accuracy, but reliable handling of customizations, accents, and noisy environments.

The path forward depends on three converging trends:

Larger training datasets from real-world deployments. Every order processed by a Wendy's FreshAI system or a White Castle SoundHound installation generates training data that improves the underlying models. The current generation of systems has processed hundreds of millions of real transactions - an advantage earlier pilots lacked.

Improved natural language models from the broader AI revolution. The same large language model advances powering ChatGPT and Claude are being adapted for real-time voice applications. Amazon's Nova Sonic, Google Cloud's contributions to Wendy's FreshAI, and Nvidia's partnership with Yum Brands all use cutting-edge speech-to-speech models that didn't exist three years ago.

Better hardware and infrastructure at the edge. Microphone arrays, noise-canceling technology, and on-device processing are improving rapidly. The physical equipment capturing customer voices in 2025 is substantially better than what McDonald's deployed in 2021.

By 2027-2028, industry observers predict that voice AI will reliably handle 90%+ of drive-thru orders without human intervention, achieving or exceeding human-level accuracy on customizations and performing well across diverse accents and noise conditions.

But "good enough" is a moving target. As AI improves, customer expectations rise. The viral failures of early systems have raised awareness to the point that many customers now explicitly ask whether they're speaking to a human or a machine - and some refuse to engage with AI systems at all.

## What McDonald's Walking Away Really Means

McDonald's decision to end its IBM partnership wasn't a rejection of voice AI - it was a rejection of that particular implementation. The company explicitly stated it remains committed to exploring voice ordering technology with alternative vendors.

The failure taught the industry several critical lessons:

First, brand risk matters more than cost savings. McDonald's couldn't afford the reputational damage from viral videos of malfunctioning AI, even if the technology saved money on average.

Second, pilot success doesn't guarantee scalability. The IBM system worked adequately in controlled tests but struggled when deployed across diverse locations, customer demographics, and operating conditions.

Third, the technology provider ecosystem matters. IBM, despite its AI credentials, wasn't a natural fit for the real-time, customer-facing demands of drive-thru ordering. The vendors seeing success - SoundHound, Google Cloud, specialized startups like Presto - have deep expertise in voice interfaces and restaurant operations specifically.

## The 85% Threshold

Voice AI in the drive-thru has reached a critical inflection point. The technology works well enough to be useful, but not well enough to be invisible. It's accurate enough to save money in many deployments, but not reliable enough to replace human workers entirely.

That 85% accuracy rate - whether it's Presto's non-interaction rate, McDonald's IBM performance, or the industry average - represents both remarkable progress and a stubborn plateau. The final 10-15 percentage points, the gap between "mostly right" and "reliably excellent," is where the real challenge lies.

The chains betting billions on this technology aren't waiting for perfection. They're deploying hybrid systems that combine AI efficiency with human judgment, banking on incremental improvements while managing customer expectations.

For now, that means your next drive-thru order might be taken by AI - but there's still a human listening in, ready to step in when the robot can't quite understand that you want "no onions, extra pickles, and make it a large."

Because in an industry built on speed, consistency, and customer satisfaction, 85% accuracy isn't a failure. But it's not good enough yet, either.

## Related Reading

  • Why QSR Drive-Thru Speakers Are Getting an AI Upgrade and What It Means for Order Accuracy
  • The Kiosk Tipping Point: Why 2026 Is the Year Self-Order Kiosks Become Standard in Every QSR
  • How AI-Powered Menu Boards Are Increasing QSR Average Ticket by 15%: Inside the Dynamic Pricing Revolution
  • The App Is the Restaurant: How Mobile Ordering Became the QSR Business Model
Q

QSR Pro Staff

The QSR Pro editorial team covers the quick service restaurant industry with in-depth analysis, data-driven reporting, and operator-first perspective.

More from QSR

Frequently Asked Questions

Table of Contents

The future of fast food was supposed to arrive in a Chicago McDonald's parking lot. In 2021, the world's largest burger chain partnered with IBM to test voice AI at the drive-thru speaker - a technology that promised to take orders faster, more accurately, and without the labor shortage headaches plaguing the industry. Three years later, in July 2024, McDonald's pulled the plug.

The problem wasn't that the technology didn't work. It did - about 85% of the time. The problem was that 85% isn't good enough when you're processing millions of transactions per day, each one representing a customer who expects their order to be right, and a franchisee who needs to protect razor-thin margins in an industry where order accuracy directly impacts profitability.

## The Great Voice AI Experiment: Who's In and Who's Out

The quick-service restaurant industry is in the middle of a high-stakes race to automate the drive-thru, the channel that generates an estimated 70% of revenue for major chains. But the scoreboard reveals a technology still finding its footing.

McDonald's announced in June 2024 that it would end its Automated Order Taker (AOT) partnership with IBM, shutting down the technology at more than 100 test locations no later than July 26, 2024. The three-year pilot had struggled with what the industry politely calls "order accuracy incidents" - viral TikTok videos showed the system adding hundreds of chicken nuggets to orders, mishearing "ice cream" as "bacon," and failing to process cancellation requests.

But while McDonald's was exiting, others were doubling down.

Wendy's announced in February 2025 that it would deploy its FreshAI system - built in partnership with Google Cloud - to between 500 and 600 locations by year's end. The company had started with a single Columbus, Ohio restaurant in 2023, expanded to 36 company-operated stores across Ohio and Florida in 2024, and is now scaling nationwide. Wendy's has claimed a "success rate of nearly 99%" - though that metric counts any order started by the AI and submitted to the point-of-sale system, even if a human had to intervene mid-conversation to fix errors.

Taco Bell parent company Yum Brands announced in July 2024 that it would bring voice AI to hundreds of U.S. drive-thrus by the end of the year, expanding from a pilot that started with five California locations. As of early 2024, more than 100 Taco Bell restaurants were using the technology. The company has since announced a broader partnership with Nvidia to accelerate its AI deployments across all Yum brands, including KFC and Pizza Hut.

Meanwhile, SoundHound AI - one of the industry's most aggressive voice AI providers - has quietly built what may be the largest footprint in the sector. The company reported powering over 14,000 restaurant locations by Q2 2025, and said it had processed more than 100 million customer interactions by October 2024. Its client roster includes White Castle, Chipotle, Jersey Mike's, Church's Texas Chicken, Applebee's, and numerous regional chains.

## The Accuracy Problem: Why Every Percentage Point Matters

The industry's dirty secret is that even human order-takers aren't perfect. Traditional drive-thrus achieve order accuracy rates of 80-85% during peak hours, according to multiple industry sources. Some studies peg human accuracy as high as 89-92% under optimal conditions.

Voice AI systems have reached comparable - sometimes superior - levels in controlled deployments. SoundHound claims its AI can complete more than 90% of orders without requiring human intervention. Presto Automation reported in December 2024 that its "non-interaction rate" (NIR) - the percentage of orders taken entirely by AI - averaged 85% across all Presto Voice-enabled restaurants, with certain locations hitting 95%.

But raw accuracy numbers obscure the real challenge: AI and humans fail differently.

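
The percentages quoted in this section measure different things, which is easy to miss when they sit side by side. As a rough illustration, the Python sketch below computes three rates from a single order log: a "success rate" in the sense described for Wendy's (AI-started orders that reach the point of sale, with or without human help), a non-interaction rate in the sense Presto describes (orders taken entirely by AI), and plain order accuracy. The `Order` fields, the function names, and the 20-order log are hypothetical, invented for this example; they are not taken from any vendor's published formula.

```python
from dataclasses import dataclass


@dataclass
class Order:
    started_by_ai: bool      # the AI began the conversation
    submitted_to_pos: bool   # the order reached the point-of-sale system
    human_intervened: bool   # a staff member stepped in mid-conversation
    correct: bool            # the customer got what they asked for


def success_rate(orders):
    """AI-started orders that reached the POS, human help or not."""
    started = [o for o in orders if o.started_by_ai]
    return sum(o.submitted_to_pos for o in started) / len(started)


def non_interaction_rate(orders):
    """AI-started orders that reached the POS with no human intervention."""
    started = [o for o in orders if o.started_by_ai]
    untouched = sum(o.submitted_to_pos and not o.human_intervened for o in started)
    return untouched / len(started)


def order_accuracy(orders):
    """Submitted orders that turned out to be correct."""
    submitted = [o for o in orders if o.submitted_to_pos]
    return sum(o.correct for o in submitted) / len(submitted)


# Hypothetical log of 20 AI-started orders (illustrative numbers only).
log = (
    [Order(True, True, False, True)] * 15     # AI alone, correct
    + [Order(True, True, False, False)] * 2   # AI alone, wrong
    + [Order(True, True, True, True)] * 2     # human stepped in, correct
    + [Order(True, False, False, False)] * 1  # abandoned before the POS
)

print(f"success rate:         {success_rate(log):.0%}")
print(f"non-interaction rate: {non_interaction_rate(log):.0%}")
print(f"order accuracy:       {order_accuracy(log):.0%}")
```

On this made-up log, the same 20 orders yield a 95% success rate, an 85% non-interaction rate, and roughly 89% accuracy, which is why a headline percentage means little without its definition.
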
A 2025 study by customer experience firm Intouch Insight found that traditional drive-thrus achieved 89% order accuracy, while AI-powered systems dropped to 83%. The critical finding: 65% of AI errors involved customizations - the "no pickles," "extra sauce," "make it a meal" modifications that represent the most profitable upsell opportunities and the most frustrating customer experience failures.

Humans might mishear an order during a lunch rush, but they excel at the conversational inference that makes drive-thru ordering work. When a customer says "I'll take a large chocolate milkshake," a human knows they probably mean a large chocolate Frosty at Wendy's, or a McCafé shake at McDonald's. AI systems require extensive training on brand-specific menu terminology - and they still struggle.

The IBM system that McDonald's abandoned reportedly achieved 85% accuracy in its Chicago test market. But McDonald's serves roughly 70 million customers per day globally. Applied to every one of those customers, an 85% accuracy rate implies a 15% error rate, or about 10.5 million incorrect orders daily - an unacceptable customer service and cost burden, even accounting for the fact that not all of those customers use drive-thrus.

## The Triple Challenge: Accents, Dialects, and Ambient Noise

Drive-thrus are among the most acoustically hostile environments imaginable for speech recognition technology. Forbes reported in February 2026 that McDonald's IBM partnership ended largely because "the system struggled with interpreting different accents, dialects and background noise."

The challenges are layered and specific:

Background noise fluctuates wildly. A drive-thru microphone must filter out idling engines, honking horns, construction noise from nearby roads, weather conditions, and multiple voices inside the vehicle - all while isolating the speaker's voice clearly enough to distinguish "Coke" from "Diet Coke."

Accent and dialect variation creates a moving target. The same menu item can be pronounced dozens of ways across regional accents. "Large" sounds different in Boston than in Birmingham. "Fountain drink" might be called "soda," "pop," or "Coke" (as a generic term for all soft drinks) depending on geography. SoundHound acknowledged in 2022 that "the complexities of language differences make creating accent-agnostic speech recognition systems nearly as challenging as offering distinct languages."

Real-time conversational complexity means AI must process interruptions, mid-order changes, simultaneous speakers (a parent ordering for kids in the back seat), slang, incomplete sentences, and the rapid-fire customization requests that define modern fast food ordering.

Technology providers have made significant progress on these challenges. Presto Automation highlighted in June 2025 that its system "passed multiple background noise challenges during the order-taking process," designed specifically for restaurants located on busy roads. Amazon's Nova Sonic model, announced in late 2025, promises "accurate recognition of streaming speech across accents with reliability to background noise."

But the gap remains measurable - and costly.

## The Hybrid Model: AI's Actual Future in the Drive-Thru

No major chain is pursuing fully autonomous voice AI at scale. Instead, the industry is converging on a hybrid model: AI handles routine transactions and escalates complex orders to human staff.

This isn't an admission of failure - it's practical economics. The average drive-thru order is simple: a combo meal, maybe a drink, minimal customization. For those transactions, AI is fast, consistent, never calls in sick, and can upsell ("Would you like to add a dessert?") without the social awkwardness that makes human workers hesitate.

Presto Automation's hybrid approach explicitly combines AI with human backup. The company's "Presto Voice" system is designed to hand off smoothly to a human operator when the AI detects uncertainty - a customer repeating themselves, requesting an unusual modification, or expressing frustration.

Wendy's FreshAI pairs the voice system with a digital menu board showing a visual confirmation of the order, allowing customers to catch errors in real time and reducing the burden on the AI to achieve perfect transcription.

The economics work when AI handles 70-80% of orders completely, reducing the number of human order-takers needed per shift while keeping experienced staff available for the 20-30% of transactions that require human judgment. This isn't full automation - it's labor reallocation.

## The ROI Question: When Does 85% Beat 92%?

Despite lower accuracy rates, voice AI deployments are accelerating. The reason is simple: cost.

Labor represents one of the largest variable expenses in QSR operations, typically 25-30% of revenue. An AI system that can handle even 70% of drive-thru orders without human intervention generates immediate labor savings, particularly during overnight and off-peak hours when staffing is most challenging.

The technology also addresses the industry's chronic staffing shortage. U.S. restaurant operators have struggled to fill positions since the pandemic, with turnover rates in QSR often exceeding 100% annually. An AI system that never quits, never needs training, and works 24/7 has inherent value even if it occasionally adds unwanted McNuggets to an order.

Revenue impact extends beyond labor savings. AI systems consistently outperform humans at upselling. They don't forget to suggest adding fries, they're never too busy to offer a dessert, and they don't experience the social discomfort that makes human workers hesitate to push additional items. Industry estimates suggest AI-driven upselling can increase average check size by 10-15%.

But the flip side is measurable too: incorrect orders drive customer dissatisfaction, generate waste, and require costly remakes. A single viral video of an AI system failing spectacularly can damage brand reputation in ways that are hard to quantify but impossible to ignore.

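
The trade-off in this section can be put into a back-of-the-envelope model: labor savings plus upsell margin, minus the cost of the extra errors AI introduces. The Python sketch below is one way to frame it, not an industry-standard formula. The function and parameter names are hypothetical. The 70% AI share, the 10% upsell lift, and the 89% versus 83% accuracy figures come from the article; the order volume, check size, margin, wage, hours saved, and remake cost are made-up placeholders an operator would replace with their own numbers.

```python
def daily_net_impact(
    orders_per_day: float,
    avg_check: float,
    ai_share: float,           # fraction of orders the AI handles end to end
    upsell_lift: float,        # fractional increase in check size on AI orders
    upsell_margin: float,      # contribution margin on the upsold items (assumed)
    labor_hours_saved: float,  # order-taker hours no longer scheduled per day
    hourly_wage: float,
    human_accuracy: float,
    ai_accuracy: float,
    remake_cost: float,        # food, labor, and goodwill cost per wrong order
) -> float:
    """Rough daily profit change from adding voice AI to one drive-thru."""
    ai_orders = orders_per_day * ai_share
    labor_saving = labor_hours_saved * hourly_wage
    upsell_gain = ai_orders * avg_check * upsell_lift * upsell_margin
    # Only errors beyond what human order-takers would have made count against AI.
    extra_errors = ai_orders * max(human_accuracy - ai_accuracy, 0.0)
    remake_loss = extra_errors * remake_cost
    return labor_saving + upsell_gain - remake_loss


# Illustrative inputs only; see the assumptions listed above.
net = daily_net_impact(
    orders_per_day=500, avg_check=10.0, ai_share=0.70,
    upsell_lift=0.10, upsell_margin=0.60,
    labor_hours_saved=8, hourly_wage=15.0,
    human_accuracy=0.89, ai_accuracy=0.83, remake_cost=5.0,
)
print(f"Estimated net impact: ${net:,.2f} per day")
```

With these placeholder inputs the model comes out to about $225 a day in the AI's favor, but the result is sensitive to the remake cost: it deliberately leaves out the reputational damage of a viral failure, which the article notes is hard to quantify.
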
## The Timeline: When Will Voice AI Be "Good Enough"?

Industry insiders suggest we're two to four years away from voice AI systems that can match or exceed human performance across all common drive-thru scenarios - not just average accuracy, but reliable handling of customizations, accents, and noisy environments.

The path forward depends on three converging trends:

Larger training datasets from real-world deployments. Every order processed by a Wendy's FreshAI system or a White Castle SoundHound installation generates training data that improves the underlying models. The current generation of systems has processed hundreds of millions of real transactions - an advantage earlier pilots lacked.

Improved natural language models from the broader AI revolution. The same large language model advances powering ChatGPT and Claude are being adapted for real-time voice applications. Amazon's Nova Sonic, Google Cloud's contributions to Wendy's FreshAI, and Nvidia's partnership with Yum Brands all use cutting-edge speech-to-speech models that didn't exist three years ago.

Better hardware and infrastructure at the edge. Microphone arrays, noise-canceling technology, and on-device processing are improving rapidly. The physical equipment capturing customer voices in 2025 is orders of magnitude better than what McDonald's deployed in 2021.

By 2027-2028, industry observers predict that voice AI will reliably handle 90%+ of drive-thru orders without human intervention, achieving or exceeding human-level accuracy on customizations and performing well across diverse accents and noise conditions.

But "good enough" is a moving target. As AI improves, customer expectations rise. The viral failures of early systems have created enough awareness that many customers now explicitly ask whether they're speaking to a human or a machine - and some refuse to engage with AI systems at all.

## What McDonald's Walking Away Really Means

McDonald's decision to end its IBM partnership wasn't a rejection of voice AI - it was a rejection of that particular implementation. The company explicitly stated it remains committed to exploring voice ordering technology with alternative vendors.

The failure taught the industry several critical lessons:

First, brand risk matters more than cost savings. McDonald's couldn't afford the reputational damage from viral videos of malfunctioning AI, even if the technology saved money on average.

Second, pilot success doesn't guarantee scalability. The IBM system worked adequately in controlled tests but struggled when deployed across diverse locations, customer demographics, and operating conditions.

Third, the technology provider ecosystem matters. IBM, despite its AI credentials, wasn't a natural fit for the real-time, customer-facing demands of drive-thru ordering. The vendors seeing success - SoundHound, Google Cloud, specialized startups like Presto - have deep expertise in voice interfaces and restaurant operations specifically.

## The 85% Threshold

Voice AI in the drive-thru has reached a critical inflection point. The technology works well enough to be useful, but not well enough to be invisible. It's accurate enough to save money in many deployments, but not reliable enough to replace human workers entirely.

That 85% accuracy rate - whether it's Presto's non-interaction rate, McDonald's IBM performance, or the industry average - represents both remarkable progress and a stubborn plateau. The final 10-15 percentage points, the gap between "mostly right" and "reliably excellent," is where the real challenge lies.

The chains betting billions on this technology aren't waiting for perfection. They're deploying hybrid systems that combine AI efficiency with human judgment, banking on incremental improvements while managing customer expectations.

For now, that means your next drive-thru order might be taken by AI - but there's still a human listening in, ready to step in when the robot can't quite understand that you want "no onions, extra pickles, and make it a large."

Because in an industry built on speed, consistency, and customer satisfaction, 85% accuracy isn't a failure. But it's not good enough yet, either.