"From the TED Talk by Kenneth Cukier: Big data is better data"

Unscramble the Blue Letters

So what is the value of big data? Well, think about it. You have more information. You can do things that you couldn't do before. One of the most impressive aares where this concept is taking place is in the area of machine learning. Machine learning is a branch of artificial intelligence, which itself is a branch of computer science. The general idea is that instead of instructing a computer what to do, we are going to siplmy throw data at the problem and tell the cmuteopr to figure it out for itself. And it will help you understand it by seeing its origins. In the 1950s, a computer scientist at IBM named Arthur Samuel liked to play cechkers, so he wrote a computer program so he could play against the computer. He plyaed. He won. He played. He won. He played. He won, because the computer only knew what a legal move was. Arthur Samuel knew something else. Arthur Samuel knew strategy. So he wrote a small sub-program alongside it operating in the buakrnocgd, and all it did was score the pioribblaty that a given board coitniurfoagn would likely lead to a winning board versus a linsog board after every move. He plays the computer. He wins. He plays the computer. He wins. He plays the computer. He wins. And then Arthur sumael leaves the computer to play itself. It plays itself. It collects more data. It cotlcles more data. It increases the aaccucry of its prediction. And then ahturr Samuel goes back to the computer and he plays it, and he loses, and he plays it, and he loses, and he plays it, and he loses, and Arthur Samuel has created a machine that surpasses his atiilby in a task that he taught it.

Open Cloze

So what is the value of big data? Well, think about it. You have more information. You can do things that you couldn't do before. One of the most impressive _____ where this concept is taking place is in the area of machine learning. Machine learning is a branch of artificial intelligence, which itself is a branch of computer science. The general idea is that instead of instructing a computer what to do, we are going to ______ throw data at the problem and tell the ________ to figure it out for itself. And it will help you understand it by seeing its origins. In the 1950s, a computer scientist at IBM named Arthur Samuel liked to play ________, so he wrote a computer program so he could play against the computer. He ______. He won. He played. He won. He played. He won, because the computer only knew what a legal move was. Arthur Samuel knew something else. Arthur Samuel knew strategy. So he wrote a small sub-program alongside it operating in the __________, and all it did was score the ___________ that a given board _____________ would likely lead to a winning board versus a ______ board after every move. He plays the computer. He wins. He plays the computer. He wins. He plays the computer. He wins. And then Arthur ______ leaves the computer to play itself. It plays itself. It collects more data. It ________ more data. It increases the ________ of its prediction. And then ______ Samuel goes back to the computer and he plays it, and he loses, and he plays it, and he loses, and he plays it, and he loses, and Arthur Samuel has created a machine that surpasses his _______ in a task that he taught it.

Solution

  1. played
  2. collects
  3. background
  4. arthur
  5. areas
  6. samuel
  7. configuration
  8. checkers
  9. ability
  10. losing
  11. probability
  12. simply
  13. accuracy
  14. computer

Original Text

So what is the value of big data? Well, think about it. You have more information. You can do things that you couldn't do before. One of the most impressive areas where this concept is taking place is in the area of machine learning. Machine learning is a branch of artificial intelligence, which itself is a branch of computer science. The general idea is that instead of instructing a computer what to do, we are going to simply throw data at the problem and tell the computer to figure it out for itself. And it will help you understand it by seeing its origins. In the 1950s, a computer scientist at IBM named Arthur Samuel liked to play checkers, so he wrote a computer program so he could play against the computer. He played. He won. He played. He won. He played. He won, because the computer only knew what a legal move was. Arthur Samuel knew something else. Arthur Samuel knew strategy. So he wrote a small sub-program alongside it operating in the background, and all it did was score the probability that a given board configuration would likely lead to a winning board versus a losing board after every move. He plays the computer. He wins. He plays the computer. He wins. He plays the computer. He wins. And then Arthur Samuel leaves the computer to play itself. It plays itself. It collects more data. It collects more data. It increases the accuracy of its prediction. And then Arthur Samuel goes back to the computer and he plays it, and he loses, and he plays it, and he loses, and he plays it, and he loses, and Arthur Samuel has created a machine that surpasses his ability in a task that he taught it.
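
The idea in the passage (play many games, record which positions led to wins, and use the tallies as a win-probability score) can be illustrated with a toy sketch. This is not Samuel's checkers program: it assumes a much simpler stand-in game (players alternately take 1 or 2 stones from a pile, and whoever takes the last stone wins), random moves, and a hypothetical function name `estimate_win_probabilities`.

```python
import random
from collections import defaultdict

def estimate_win_probabilities(games=5000, start=10, seed=0):
    """Estimate, by random self-play, how often leaving N stones leads to a win.

    Toy stand-in for the talk's checkers example (assumption, not the original
    program): players alternately take 1 or 2 stones; taking the last one wins.
    """
    rng = random.Random(seed)
    wins = defaultdict(int)    # stones left -> games won by the player who left them
    visits = defaultdict(int)  # stones left -> times that position was reached
    for _ in range(games):
        stones, player, history = start, 0, []
        while stones > 0:
            take = rng.choice([t for t in (1, 2) if t <= stones])
            stones -= take
            history.append((player, stones))  # position this player left behind
            player = 1 - player
        winner = history[-1][0]  # whoever took the last stone
        for mover, left in history:
            visits[left] += 1
            wins[left] += int(mover == winner)
    # More games means more data per position, so the estimates get steadier.
    return {left: wins[left] / visits[left] for left in visits}

scores = estimate_win_probabilities()
for left in sorted(scores):
    print(left, round(scores[left], 2))
```

Leaving 0 stones always scores 1.0 (the mover just won) and leaving 1 stone always scores 0.0 (the opponent takes it). The positions in between settle toward stable values only as more games are played, which is the "collects more data, increases the accuracy" step the talk describes.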

ngrams of length 2

collocation frequency
big data 16
arthur samuel 6
machine learning 5
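
For readers curious how a table like this can be produced, here is a minimal sketch of counting ngrams of length 2 (adjacent word pairs). It is an assumption about the general technique, not the page's actual method: the function name `bigram_counts` is hypothetical, and this version does no stop-word filtering and lets pairs cross sentence boundaries. The counts in the table come from the full talk, so a short sample will not reproduce them.

```python
import re
from collections import Counter

def bigram_counts(text):
    """Count adjacent word pairs in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    # zip pairs each word with the word that follows it
    return Counter(zip(words, words[1:]))

sample = "Arthur Samuel knew something else. Arthur Samuel knew strategy."
counts = bigram_counts(sample)
for (first, second), n in counts.most_common(3):
    print(first, second, n)
```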

Important Words

  1. ability
  2. accuracy
  3. area
  4. areas
  5. arthur
  6. artificial
  7. background
  8. big
  9. board
  10. branch
  11. checkers
  12. collects
  13. computer
  14. concept
  15. configuration
  16. created
  17. data
  18. figure
  19. general
  20. ibm
  21. idea
  22. impressive
  23. increases
  24. information
  25. instructing
  26. intelligence
  27. knew
  28. lead
  29. learning
  30. leaves
  31. legal
  32. loses
  33. losing
  34. machine
  35. move
  36. named
  37. operating
  38. origins
  39. place
  40. play
  41. played
  42. plays
  43. prediction
  44. probability
  45. problem
  46. program
  47. samuel
  48. science
  49. scientist
  50. score
  51. simply
  52. small
  53. strategy
  54. surpasses
  55. task
  56. taught
  57. throw
  58. understand
  59. winning
  60. wins
  61. won
  62. wrote