IBM’s Watson Supercomputer Fails as a Tutor: A Cautionary AI Story

By Greg Toppo
April 9, 2024

As a new race begins to build the next teaching chatbot, IBM’s abandoned five-year, $100 million education push offers valuable lessons about both the promise and the limits of AI.

Feb. 16, 2011, was a pivotal day for artificial intelligence.

That day, IBM’s Watson supercomputer completed a triumphant three-game sweep of Jeopardy! titans Ken Jennings and Brad Rutter. Trailing by more than $30,000, Jennings, now the show’s host, wrote his Final Jeopardy response with an air of resignation: “I, for one, welcome our computer overlords.”

Some saw the match as mere amusement, but it galvanized Satya Nitta, a veteran computer scientist at IBM’s Watson Research Center in Yorktown Heights, New York. Tasked with applying the supercomputer’s capabilities to education, he set his sights on the field’s most elusive goal: building the world’s premier AI-driven tutoring system, one able to deliver personalized instruction without human intervention.


“I sensed they were prepared to make a significant advancement in this field,” he said in an interview.

Nitta persuaded his superiors to commit more than $100 million to the initiative and assembled a team of 130 technologists, including 30 to 40 Ph.D.s, drawn from research labs on four continents.

By 2017, however, the ambitious tutoring venture was essentially dead, and Nitta had concluded that effective, long-term, personalized tutoring was “a poor application of AI – a stance that still stands today.”

For all its computational power, Watson was ineffectual as a teacher. It couldn’t engage or inspire students, push them to excel, or keep them focused on the material – all qualities of exceptional mentors.

The finding resonates in the current era of AI-induced anxiety about humanity’s future in a world dominated by advancing technology. “AI excels in certain aspects,” Nitta said, “but it cannot substitute for human interaction.”

His five-year journey to an apparent dead end could serve as a guidepost as ChatGPT and similar programs launch a new, multimillion-dollar effort that may yet prove him wrong.

Major players in education technology, from Google to Microsoft, are racing to pick up where Watson left off, offering AI tools designed to boost student learning. Sal Khan, the founder of Khan Academy, said last year that AI has the potential to bring about “probably the most significant positive transformation” in the history of education. He wants to give “every student on the planet an artificially intelligent but exceptional personal tutor.”

An Extensive Journey

Research on high-dose, one-on-one, in-person tutoring is unequivocal: it is one of the most powerful interventions available, producing substantial gains in students’ academic performance, particularly in mathematics, reading, and writing.

But traditional tutoring is expensive and difficult to scale. Paige Johnson, a Microsoft education vice president, noted that a West Texas school district recently spent more than $5.6 million in federal relief funds to tutor 6,000 students, a cost beyond the reach of many parents and schools.

“We missed something important. At the heart of education, at the heart of any learning, is engagement.”

Satya Nitta, IBM Research’s former global head of AI solutions for learning

For IBM, the chance to rebalance that equation in students’ favor was hard to resist.

The Watson lab is legendary in computer science, counting six Nobel laureates and six Turing Award winners among its members. It pioneered modern speech recognition, invented the barcode, and devised the magnetic stripe on credit cards that made ATMs possible. In 1997, its Deep Blue computer made waves by defeating world chess champion Garry Kasparov, laying the groundwork for the idea of AI with “human-like” thought processes.

Chess enthusiasts watch world chess champion Garry Kasparov on a television monitor as he holds his head in his hands at the start of the sixth and final match May 11, 1997, against IBM’s Deep Blue computer in New York. Kasparov lost the match in just 19 moves. (Stan Honda/Getty)

The fervent atmosphere at the lab fueled Nitta’s sense of mission: “I felt a deep obligation to undertake a substantial endeavor, not a trivial pursuit.”

Within a few years of Watson’s victory, Nitta, who had joined IBM in 2000 as a chip specialist, rose to become IBM Research’s global head of AI solutions for learning. Put in charge of the Watson education effort, he had an open-ended mandate: use Watson to achieve breakthroughs in education.

Nitta immersed himself in learning theory, delving into cognitive science, neuroscience, and the academic history of “intelligent tutoring systems.” Among the most important work he read was that of Stanford neuroscientist Vinod Menon, who ran a 12-week math tutoring experiment with elementary school students and found heightened neural connectivity following the tutoring sessions.

Pitching the idea of an AI-driven cognitive tutor to his bosses, Nitta told them, “I’ve got a compelling proposal here that can revolutionize learning at large. However, it’s a 25-year pilgrimage, not a brief, three or four-year jaunt.”

IBM forged partnerships with early education powerhouse Sesame Workshop and Pearson, the renowned international publisher.

One product envisioned with Sesame was a voice-activated Elmo doll that would serve as an interactive digital tutoring companion, talking with children to gauge their skills and offering verbal encouragement as they progressed.

One proposed application of IBM’s planned Watson tutoring app was a voice-activated Elmo doll acting as an interactive digital companion. (Getty)

Pearson, for its part, promised to let college students “converse with Watson in live interactions.”

Nitta and his team began building lessons and testing them with students, both in classrooms and in the lab. Favoring interactive dialogue over one-way questioning, they asked kids to answer in their own words rather than pick from multiple-choice options.

The results were disappointing.

While some students interacted with the chatbot, others simply replied “IDK” (I don’t know). Even those who did engage gave shorter and shorter responses over time.

Nitta and his team came to recognize the stark reality behind the problem: for all its computational power, Watson was not engaging. Without that appeal, it delivered no notable learning benefits. It wasn’t merely dull; it was also ineffective.

Satya Nitta (left) and part of his team at IBM’s Watson Research Center, which spent five years building an AI-driven tutor using the Watson supercomputer.

“Human dialogue is profound,” he said. “In mutual exchanges, I witness a diversified perception. The tutor shapes the student, and vice versa. The shared discourse evolution is remarkably profound, and it seems implausible to replicate that with an impersonal bot, despite my AI background.”

As students’ usage dwindled, “we had to confront the reality,” Nitta said. “We then came to the realization that our concept – an intelligent tutoring system universally assisting all students – was inherently flawed.”

‘A Vital Oversight’

IBM subsequently changed course, unveiling another crowd-pleasing Watson iteration, this one built to take part in Oxford-style debates. In a televised exhibition in 2019, the supercomputer took on debate champion Harish Natarajan on the question “Should we subsidize preschools?”, arguing for funding and, intriguingly, linking good preschools to “future crime prevention.” The current version, Watsonx, focuses on helping businesses build AI applications such as “intelligent customer care.”

Nitta left IBM, eventually bringing several colleagues with him to found Merlyn Mind. The startup uses voice-activated AI to help teachers with routine tasks such as updating digital gradebooks, launching PowerPoint presentations, and communicating with students and parents.

Thirteen years after Watson’s celebrated Jeopardy! victory and more than a year into the ChatGPT era, Nitta’s expectations for AI are decidedly pragmatic: his AI serves as a carefully designed assistant that fits seamlessly into a teacher’s daily routine.

Sophisticated tasks such as generating quizzes from course material and refining student essays are within AI’s reach. But the notion that machines or chatbots can reliably teach the way humans do is “a misinterpretation of AI’s capabilities,” he said.

Although he retains deep admiration for the Watson lab, Nitta conceded, “We disregarded a crucial element. At the crux of education, at the core of information acquisition, there lies engagement. That’s the ultimate aspiration.”

These insights come as no surprise to tutoring professionals. Varsity Tutors, which provides live and online tutoring in 500 school districts, uses AI to devise personalized lesson plans. But the tutoring itself is delivered by humans, said Anthony Salcito, chief institution officer at Nerdy, Varsity’s parent company.

“Students love their tutors. I’m not sure we’re at a point where students are going to love an AI agent.”

Anthony Salcito, Nerdy

Research shows that the most important factor in a student’s success with tutoring is consistent attendance. However adept an AI chatbot may be at the mechanics of learning, it is unclear whether most students, especially struggling ones, would keep showing up for an inanimate agent or respect its guidance.

Salcito finds today’s AI bots for education wanting. Most, he said, “fail to reimagine the educational landscape substantially.” They often function as little more than convenient, repackaged search engines, with no transformative effect.

In most cases, he added, personalized, one-on-one tutoring works because students become more honest about their abilities, advocate for themselves, and demand more from their education. “In a classroom setting, a student may claim comprehension. But when facing a tutor, they admit, ‘Hey, I need assistance,’” he said.

Cognitive science suggests that for unmotivated or uncertain students, only individualized attention can bolster comprehension. That calls for an attentive, empathetic human who observes closely, asks probing questions, and reads students’ cues.

Jeremy Roschelle, a learning scientist and executive director of Digital Promise, a federally funded research hub, said usage typically declines with most educational technology products. “Students lose interest. It’s not exclusive to tutoring. The novelty factor entices students, longing for the next innovation,” he said.

“There’s a novelty factor for students. They thirst for the upcoming novelty.”

Jeremy Roschelle, Digital Promise

Nitta now points to research showing that major commercial AI applications engage users far less than leading entertainment and social media platforms such as YouTube, Instagram, and TikTok. One recent analysis called user engagement on platforms like ChatGPT “lackluster”: only about 14% of monthly active users engage with such applications on a given day, a sign that they are not very sticky.

By contrast, 60% to 65% of social media platforms’ monthly active users engage on a given day.

One notable AI outlier is an app that lets users interact with historical and fictional figures such as Socrates and Bart Simpson. It boasts a 41% stickiness score.
