Post-Game Analysis: How Watson Did in Round 1 of 'Jeopardy!'
Watson early in the Single Jeopardy round, putting the hurt on the humans.
Last night (Feb. 14) saw the completion of the first round of the "Jeopardy!" contest between an IBM computer called Watson and the quiz show's best human contestants, and it was a thriller.
In the spirit of Monday morning quarterbacking, TechNewsDaily requested post-game analyses from researchers into artificial intelligence and from the author of an upcoming book on Watson. Here's some of the play-by-play, with commentary from the experts, in preparation for tonight's second round.
(Warning: Spoilers ahead!)
Digital strategy right out of the gate
All-time "Jeopardy!" money-winner Brad Rutter was awarded the chance to go first, and after he gave the correct response to a $200 clue, he went for the next clue in the category, for $400. Watson pounced on it. The computer's unassuming, warbly, Stephen Hawking-like voice then asked for an $800 clue, well down the board, in an untouched category – an unusual move so early in the game – and it paid off: Watson nabbed the round's only Daily Double.
Stephen Baker, who has long followed the Watson project for his book "Final Jeopardy: Man vs. Machine and the Quest to Know Everything," recognized the maneuver from the dozens of matches Watson played against former "Jeopardy!" contestants to hone its programming. "Throughout the sparring matches, Watson has hunted Daily Doubles – that's key to its strategy," Baker said.
Although Watson had won only $400 at that point, "he" – as host Alex Trebek referred to the computer – by rule could wager up to $1,000. Watson did so and correctly answered the clue in the "Literary Characters APB" category (with APB standing for "all points bulletin"): "Wanted for killing Sir Danvers Carew; appearance – pale & dwarfish; seems to have a split personality." (Watson's answer: "Who is Hyde?")
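The wager ceiling comes down to simple arithmetic. As a minimal sketch, assuming the standard rule that a Daily Double wager may go as high as the larger of the player's current score and the top clue value on the board (taken here to be $1,000 for this round, matching the figure above), the limit could be computed as follows. The function and parameter names are illustrative only and are not drawn from the show's rulebook or from Watson's software.

```python
def max_daily_double_wager(score: int, top_clue_value: int = 1000) -> int:
    """Return the largest wager allowed on a Daily Double.

    Assumption: the cap is the larger of the player's current score
    and the highest clue value on the board in the current round.
    """
    return max(score, top_clue_value)


# Watson held $400 when it uncovered the Daily Double.
print(max_daily_double_wager(400))  # prints 1000
```

Under the same assumption, a contestant who already holds more than the top clue value could risk the entire score.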
Watson then proceeded to pluck the low-hanging fruit: the cheap and typically easier questions at the top of each category. As play progressed, the computer responded correctly to nine of the next 12 clues. By the first commercial break, the computer had $5,200 compared to Rutter's $1,000 and $200 for Ken Jennings, who has won the most "Jeopardy!" games.
Throughout this early session, "I bet that the human contestants also knew the correct answers, and that what we saw was Watson's speed advantage," said Ted Senator, vice president and technical fellow at the Science Applications International Corp. Senator, a former "Jeopardy!" champ himself, is also secretary-treasurer of the Association for the Advancement of Artificial Intelligence.
Although Watson cannot buzz in faster than a human – the machine must also physically press a button – its programming evidently allows it to devour certain question types.
Three of the clues Watson answered correctly in this stretch, and four of five by the end of the round, were in the category "Beatles People." The clues contained song snippets from the seminal British rock group.
"Watson is very good at producing Beatles song titles from lyric fragments, and this is the same kind of ability that Google search has," said Michael Dyer, a professor of computer science at the University of California, Los Angeles.
Watson is not connected to the Internet, but it can rummage through the 200 million pages or so of text in its memory – which apparently includes Beatles songbooks – at lightning speed compared with our mental look-up abilities.
But that's old hat for computers. "Pulling ahead on the easy clues because of speed isn’t all that impressive – computers have been faster than people on simple tasks for a long time," Senator said. "It’s what will happen later [in the round] that will be far more interesting."
Second half proves rough for the machine
Indeed, Watson's dominance after the first 15 questions evaporated. The computer flubbed the very first clue after the commercial break, which read: "From the Latin for 'end,' this is where trains can also originate." Watson, whose answers come with a confidence ranking, was 97 percent sure that "finis," the Latin word for "end," was correct.
Dyer wrote in an e-mail: "Watson should have gotten 'terminal' and 'stations' from 'trains originate' and then should have noticed that 'terminal' overlaps with 'end' via the related notion of 'terminate' (to end), but I guess this was too indirect for Watson (but, not apparently, for my brain)."
On the 20th question, Jennings engaged in a Watson- (and human-) beating strategy: Buzz first and then use the five seconds available to come up with an answer. Humans can exploit the fact that Watson buzzes in only when it has high confidence in an answer choice.
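The contrast between the two approaches can be sketched in a few lines. This is a simplified illustration, not IBM's actual logic: the `Candidate` class, the `machine_buzzes` function and the 0.5 cutoff are all hypothetical, and the only figure taken from the broadcast is the 97 percent confidence Watson showed for "finis."

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer: str
    confidence: float  # between 0.0 and 1.0


def machine_buzzes(candidates: list[Candidate], threshold: float = 0.5) -> bool:
    """Buzz only if the best candidate clears the confidence threshold.

    The threshold value is a made-up placeholder for illustration.
    """
    if not candidates:
        return False
    best = max(candidates, key=lambda c: c.confidence)
    return best.confidence >= threshold


# High confidence (even when wrong, as with "finis") leads to a buzz.
print(machine_buzzes([Candidate("finis", 0.97)]))

# A hypothetical low-confidence guess keeps the machine off the buzzer,
# leaving room for a human who buzzes first and thinks afterward.
print(machine_buzzes([Candidate("hypothetical guess", 0.20)]))
```

A human using the buzz-first strategy effectively skips the threshold check and relies on the five-second window to come up with a response.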
Watson got the next clue wrong due to linguistic trickiness, according to IBM's chief Watson scientist David Ferrucci, who spoke to Baker. The clue: "It was the anatomical oddity of U.S. gymnast George Eyser, who won a gold medal on the parallel bars in 1904."
Jennings employed the buzz-in-first strategy again and guessed, "What is he only had one hand?" That was incorrect. Watson then offered "What is leg?"
The correct answer needed the specificity of "missing a leg" – a hard grab for Watson, given the word "oddity" in the clue, Ferrucci explained to Baker. Watson "would have had to find documents indicating that not only that Eyser was missing a leg, but that this was odd," Baker wrote on his blog.
Another Watson flub occurred after Jennings incorrectly answered, "What is the 1920s?" in the "Name the Decade" category. Watson, which is effectively deaf to its opponents' spoken responses, repeated the same answer.
Now the action really starts
Overall, as the round entered its late stages with most of the remaining clues located in the harder $800 level, "Watson didn’t buzz in as often," Senator noted. "The humans appeared more confident and aggressive."
Watson found himself losing to Rutter, but the computer pulled even by answering the last clue correctly.
When the dust settled, Rutter and Watson were tied at $5,000 apiece and Jennings had $2,000.
"The producers couldn’t ask for a better cliffhanger," Senator said.
Ultimately, the Single Jeopardy round does not matter all that much, Baker said. "The first pass is almost more show. It's Double Jeopardy when the much bigger dollars trot out. The scores are meaningless right now."