IBM Watson
IBM Watson is a computer system capable of answering questions posed in natural language. It was developed as part of IBM's DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM's founder and first CEO, industrialist Thomas J. Watson.
The system was initially developed to answer questions on the popular quiz show Jeopardy! In 2011, Watson competed on Jeopardy! against champions Brad Rutter and Ken Jennings, winning the first-place prize of US$1 million.
In February 2013, IBM announced that Watson's first commercial application would be for utilization management decisions in lung cancer treatment, at Memorial Sloan Kettering Cancer Center, New York City, in conjunction with WellPoint.
In 2022, IBM spun off its Watson Health division as Merative, which was sold to Francisco Partners, an American private equity firm. The division had cost $4 billion to develop but was sold for $1 billion. By 2023, Watson had contributed to a 10% loss in IBM's stock value, having cost the company four times more than it brought in and resulting in mass layoffs.
Description
Watson is a question-answering computing system that IBM built to apply advanced natural language processing, information retrieval, knowledge representation, automated reasoning, and machine learning technologies to the field of open-domain question answering. The underlying technology is named DeepQA. IBM stated that Watson uses "more than 100 different techniques to analyze natural language, identify sources, find and generate hypotheses, find and score evidence, and merge and rank hypotheses."
In recent years, Watson's capabilities have been extended and the way in which Watson works has been changed to take advantage of new deployment models, evolved machine learning capabilities, and optimized hardware available to developers and researchers.
Software
Watson uses IBM's DeepQA software and the Apache UIMA framework implementation. The system was written in various languages, including Java, C++, and Prolog, and runs on the SUSE Linux Enterprise Server 11 operating system, using the Apache Hadoop framework to provide distributed computing.

In addition to the DeepQA system, Watson contained several strategy modules. For example, one module calculated the amount to bet in Final Jeopardy! from Watson's confidence in answering correctly and the current scores of all contestants. Another module used Bayes' rule to calculate the probability that each unrevealed clue hid the Daily Double, using historical data from the J! Archive as the prior. When a Daily Double was found, the amount to wager was computed by a two-layer neural network of the same kind as those used by TD-Gammon, a backgammon-playing neural network developed by Gerald Tesauro in the 1990s. The parameters in the strategy modules were tuned by benchmarking candidate settings against a statistical model of human contestants fitted to data from the J! Archive, and selecting the best one.
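The Daily Double module described above can be sketched as a simple Bayesian update. This is a hedged illustration, not IBM's code: the prior counts stand in for board-position frequencies mined from the J! Archive, and the likelihood is simply zero for cells already revealed and one otherwise.

```python
def daily_double_posterior(prior_counts, revealed):
    """Return P(Daily Double is at cell) for each unrevealed board cell.

    prior_counts: dict mapping (row, col) -> historical count of Daily
                  Doubles at that position (hypothetical J! Archive data).
    revealed: set of cells already revealed and found not to hide the DD.
    """
    # Bayes' rule with a 0/1 likelihood: revealed cells get zero mass,
    # and the remaining prior mass is renormalized over unrevealed cells.
    unrevealed = {cell: c for cell, c in prior_counts.items()
                  if cell not in revealed}
    total = sum(unrevealed.values())
    return {cell: c / total for cell, c in unrevealed.items()}
```

With every reveal the posterior concentrates on the historically likely hiding spots, which is the behavior the strategy module exploited when hunting for Daily Doubles.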
Hardware
The system is workload-optimized, integrating massively parallel POWER7 processors and built on IBM's DeepQA technology, which it uses to generate hypotheses, gather evidence at scale, and analyze data. Watson employs a cluster of ninety IBM Power 750 servers, each of which uses a 3.5 GHz POWER7 eight-core processor with four threads per core. In total, the system uses 2,880 POWER7 processor threads and 16 terabytes of RAM. According to John Rennie, Watson can process 500 gigabytes per second. IBM master inventor and senior consultant Tony Pearson estimated Watson's hardware cost at about three million dollars. Its Linpack performance stands at 80 teraFLOPs, about half the cut-off for the TOP500 supercomputer list. According to Rennie, all content was stored in Watson's RAM for the Jeopardy! game because data stored on hard drives would have been too slow to compete with human Jeopardy! champions.
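The headline thread count follows directly from the cluster figures above, as a quick arithmetic check shows:

```python
# Sanity-check the hardware figures: 90 servers, each with one
# eight-core POWER7 CPU running four hardware threads per core.
servers = 90
cores_per_processor = 8
threads_per_core = 4

total_threads = servers * cores_per_processor * threads_per_core
print(total_threads)  # 2880, matching the figure cited in the text
```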
Data
The sources of information for Watson include encyclopedias, dictionaries, thesauri, newswire articles, and literary works. Watson also used databases, taxonomies, and ontologies, including DBpedia, WordNet, and YAGO. The IBM team provided Watson with millions of documents, including dictionaries, encyclopedias, and other reference material, that it could use to build its knowledge.
Operation
Watson parses questions into different keywords and sentence fragments in order to find statistically related phrases. Watson's main innovation was not the creation of a new algorithm for this operation, but rather its ability to quickly execute hundreds of proven language analysis algorithms simultaneously. The more algorithms that find the same answer independently, the more likely Watson is to be correct. Once Watson has a small number of potential solutions, it is able to check against its database to ascertain whether the solution makes sense or not.
Comparison with human players
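The agreement-based ranking described above can be sketched as a simple voting scheme. This is a hedged toy stand-in for DeepQA's far more elaborate evidence scoring: each independent analysis algorithm nominates a candidate answer, and confidence is taken here as just the fraction of algorithms that agree.

```python
from collections import Counter

def merge_candidates(algorithm_outputs):
    """Rank candidate answers by cross-algorithm agreement.

    algorithm_outputs: list of candidate-answer strings, one per
    independent analysis algorithm. Returns (answer, confidence)
    pairs sorted from most to least agreed-upon, where confidence
    is the fraction of algorithms proposing that answer.
    """
    votes = Counter(algorithm_outputs)
    total = len(algorithm_outputs)
    return [(answer, count / total) for answer, count in votes.most_common()]
```

For example, if three of four hypothetical algorithms propose "Chicago" and one proposes "Toronto", the merged ranking puts "Chicago" first with confidence 0.75, mirroring the idea that independent agreement raises Watson's confidence in an answer.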
Watson's basic working principle is to parse keywords in a clue while searching for related terms as responses. This gives Watson some advantages and disadvantages compared with human Jeopardy! players. Watson can read, analyze, and learn from natural language, which gives it the ability to make human-like decisions, but it has deficiencies in understanding the context of the clues; as a result, human players usually generate responses faster than Watson, especially to short clues. Watson's programming prevents it from using the popular tactic of buzzing before it is sure of its response. However, Watson has consistently better reaction time on the buzzer once it has generated a response, and it is immune to human players' psychological tactics, such as jumping between categories on every clue.

In a sequence of 20 mock games of Jeopardy!, human participants were able to exploit the six to seven seconds that Watson needed to hear the clue and decide whether to signal for responding. During that window, Watson also had to evaluate its response and determine whether it was sufficiently confident in the result to signal. Part of the system used to win the Jeopardy! contest was the electronic circuitry that receives the "ready" signal and then examines whether Watson's confidence level is great enough to activate the buzzer. Given the speed of this circuitry compared with human reaction times, Watson's reaction time was faster than the human contestants' except when a human anticipated the ready signal. After signaling, Watson speaks with an electronic voice and gives its responses in Jeopardy!'s question format. Watson's voice was synthesized from recordings that actor Jeff Woodman made for an IBM text-to-speech program in 2004.
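The buzzer logic described above reduces to a simple gate: once the "ready" signal arrives, buzz only if the top answer's confidence clears a threshold. The sketch below is illustrative only; the 0.5 cutoff is an assumed placeholder, not IBM's actual tuned value.

```python
def should_buzz(top_confidence, threshold=0.5):
    """Decide whether to activate the buzzer on the 'ready' signal.

    A toy stand-in for the circuitry described above: Watson signaled
    only when its best answer's confidence exceeded a tuned threshold.
    The 0.5 default here is a hypothetical placeholder.
    """
    return top_confidence >= threshold
```

The point of the threshold is the trade-off the text describes: Watson never buzzed before it was sure of its response, but once it decided to buzz, the electronic path (about eight milliseconds) beat human reaction times except when a human anticipated the signal.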
The Jeopardy! staff used different means to notify Watson and the human players when to buzz, which was critical in many rounds. The humans were notified by a light, which took them tenths of a second to perceive. Watson was notified by an electronic signal and could activate the buzzer within about eight milliseconds. The humans tried to compensate for the perception delay by anticipating the light, but the variation in the anticipation time was generally too great to fall within Watson's response time. Watson did not attempt to anticipate the notification signal.
History
Development
Since Deep Blue's victory over Garry Kasparov in chess in 1997, IBM had been searching for a new challenge. In 2004, IBM Research manager Charles Lickel, at dinner with coworkers, noticed that the restaurant they were in had fallen silent. He soon discovered the cause: Ken Jennings, who was then in the middle of his successful 74-game run on Jeopardy!. Nearly the entire restaurant had crowded around the televisions, mid-meal, to watch the show. Intrigued by the quiz show as a possible challenge for IBM, Lickel passed the idea on, and in 2005, IBM Research executive Paul Horn backed him, pushing for someone in his department to take up the challenge of playing Jeopardy! with an IBM system. Though Horn initially had trouble finding research staff willing to take on what looked to be a much more complex challenge than the wordless game of chess, David Ferrucci eventually took him up on the offer. In competitions managed by the United States government, Watson's predecessor, a system named Piquant, was usually able to respond correctly to only about 35% of clues and often required several minutes to respond. To compete successfully on Jeopardy!, Watson would need to respond in no more than a few seconds, and at that time the problems posed by the game show were deemed impossible to solve.

In initial tests run during 2006 by David Ferrucci, the senior manager of IBM's Semantic Analysis and Integration department, Watson was given 500 clues from past Jeopardy! programs. While the best real-life competitors buzzed in half the time and responded correctly to as many as 95% of clues, Watson's first pass could get only about 15% correct. During 2007, the IBM team was given three to five years and a staff of 15 people to solve the problems. John E. Kelly III succeeded Paul Horn as head of IBM Research in 2007.
InformationWeek described Kelly as "the father of Watson" and credited him for encouraging the system to compete against humans on Jeopardy!. By 2008, the developers had advanced Watson such that it could compete with Jeopardy! champions. By February 2010, Watson could beat human Jeopardy! contestants on a regular basis.
During the game, Watson had access to 200 million pages of structured and unstructured content consuming four terabytes of disk storage, including the full text of the 2011 edition of Wikipedia, but was not connected to the Internet. For each clue, Watson's three most probable responses were displayed on the television screen. Watson consistently outperformed its human opponents on the game's signaling device, but had trouble in a few categories, notably those with short clues containing only a few words.
Although the system is primarily an IBM effort, Watson's development involved faculty and graduate students from Rensselaer Polytechnic Institute, Carnegie Mellon University, University of Massachusetts Amherst, the University of Southern California's Information Sciences Institute, the University of Texas at Austin, the Massachusetts Institute of Technology, and the University of Trento, as well as students from New York Medical College. Among the team of IBM programmers who worked on Watson was 2001 Who Wants to Be a Millionaire? top prize winner Ed Toutant, who himself had appeared on Jeopardy! in 1989.