
From the TED Talk by Kenneth Cukier: Big data is better data

Unscramble the Blue Letters

So what is the value of big data? Well, think about it. You have more information. You can do things that you couldn't do before. One of the most isevmisrpe areas where this concept is taking pacle is in the area of machine learning. Machine learning is a branch of artificial intelligence, which itself is a branch of computer science. The general idea is that instead of instructing a computer what to do, we are going to simply throw data at the proelbm and tell the computer to figure it out for itself. And it will help you untnsaderd it by seeing its oingris. In the 1950s, a computer scientist at IBM named aruhtr Samuel liked to play checkers, so he wrote a computer pgorram so he could play against the computer. He palyed. He won. He played. He won. He played. He won, because the computer only knew what a legal move was. Arthur Samuel knew something else. Arthur Samuel knew strategy. So he wrote a small sub-program alongside it operating in the bncgurkoad, and all it did was score the probability that a given board configuration would likely lead to a winning board versus a losing board after every move. He plays the computer. He wins. He plays the computer. He wins. He plays the computer. He wins. And then Arthur Samuel leaves the computer to play itself. It plays itself. It clotelcs more data. It collects more data. It increases the accuracy of its prediction. And then Arthur Samuel goes back to the ceumotpr and he plays it, and he loses, and he plays it, and he loses, and he plyas it, and he loses, and Arthur Samuel has created a machine that surpasses his atbiliy in a task that he taught it.

Open Cloze

So what is the value of big data? Well, think about it. You have more information. You can do things that you couldn't do before. One of the most __________ areas where this concept is taking _____ is in the area of machine learning. Machine learning is a branch of artificial intelligence, which itself is a branch of computer science. The general idea is that instead of instructing a computer what to do, we are going to simply throw data at the _______ and tell the computer to figure it out for itself. And it will help you __________ it by seeing its _______. In the 1950s, a computer scientist at IBM named ______ Samuel liked to play checkers, so he wrote a computer _______ so he could play against the computer. He ______. He won. He played. He won. He played. He won, because the computer only knew what a legal move was. Arthur Samuel knew something else. Arthur Samuel knew strategy. So he wrote a small sub-program alongside it operating in the __________, and all it did was score the probability that a given board configuration would likely lead to a winning board versus a losing board after every move. He plays the computer. He wins. He plays the computer. He wins. He plays the computer. He wins. And then Arthur Samuel leaves the computer to play itself. It plays itself. It ________ more data. It collects more data. It increases the accuracy of its prediction. And then Arthur Samuel goes back to the ________ and he plays it, and he loses, and he plays it, and he loses, and he _____ it, and he loses, and Arthur Samuel has created a machine that surpasses his _______ in a task that he taught it.

Solution

  1. program
  2. collects
  3. impressive
  4. played
  5. origins
  6. ability
  7. understand
  8. place
  9. plays
  10. Arthur
  11. background
  12. problem
  13. computer

Original Text

So what is the value of big data? Well, think about it. You have more information. You can do things that you couldn't do before. One of the most impressive areas where this concept is taking place is in the area of machine learning. Machine learning is a branch of artificial intelligence, which itself is a branch of computer science. The general idea is that instead of instructing a computer what to do, we are going to simply throw data at the problem and tell the computer to figure it out for itself. And it will help you understand it by seeing its origins. In the 1950s, a computer scientist at IBM named Arthur Samuel liked to play checkers, so he wrote a computer program so he could play against the computer. He played. He won. He played. He won. He played. He won, because the computer only knew what a legal move was. Arthur Samuel knew something else. Arthur Samuel knew strategy. So he wrote a small sub-program alongside it operating in the background, and all it did was score the probability that a given board configuration would likely lead to a winning board versus a losing board after every move. He plays the computer. He wins. He plays the computer. He wins. He plays the computer. He wins. And then Arthur Samuel leaves the computer to play itself. It plays itself. It collects more data. It collects more data. It increases the accuracy of its prediction. And then Arthur Samuel goes back to the computer and he plays it, and he loses, and he plays it, and he loses, and he plays it, and he loses, and Arthur Samuel has created a machine that surpasses his ability in a task that he taught it.

Frequently Occurring Word Combinations

ngrams of length 2

collocation frequency
big data 14
arthur samuel 6
machine learning 4
favorite pie 2
supermarket sales 2
smaller amounts 2
term big 2
small data 2
national security 2
security agency 2
martin luther 2
telltale signs 2
samuel knew 2

ngrams of length 3

collocation frequency
term big data 2
national security agency 2
arthur samuel knew 2
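
How collocation counts like these can be produced is sketched below. This is only an illustration, not the method used to build the tables above: the function name `ngram_counts`, the tokenizer, and the sample sentences are assumptions, and the counts in the tables come from the whole talk rather than this short sample.

```python
import re
from collections import Counter


def ngram_counts(text, n):
    """Count runs of n adjacent words, ignoring case and punctuation.

    Assumption: words are runs of letters and apostrophes, and
    sentence boundaries are ignored when pairing neighbours.
    """
    words = re.findall(r"[a-z']+", text.lower())
    # Slide a window of width n across the word list.
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


# Three sentences taken from the excerpt (hypothetical sample input).
sample = (
    "Arthur Samuel knew something else. Arthur Samuel knew strategy. "
    "And then Arthur Samuel leaves the computer to play itself."
)

bigrams = ngram_counts(sample, 2)
trigrams = ngram_counts(sample, 3)

print(bigrams["arthur samuel"])        # 3
print(bigrams["samuel knew"])          # 2
print(trigrams["arthur samuel knew"])  # 2
```

Running the same function over the full transcript and keeping only the combinations that occur at least twice would give tables of this shape, although the exact numbers depend on how words are split and which common words are filtered out.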

Important Words

  1. ability
  2. accuracy
  3. area
  4. areas
  5. arthur
  6. artificial
  7. background
  8. big
  9. board
  10. branch
  11. checkers
  12. collects
  13. computer
  14. concept
  15. configuration
  16. created
  17. data
  18. figure
  19. general
  20. ibm
  21. idea
  22. impressive
  23. increases
  24. information
  25. instructing
  26. intelligence
  27. knew
  28. lead
  29. learning
  30. leaves
  31. legal
  32. loses
  33. losing
  34. machine
  35. move
  36. named
  37. operating
  38. origins
  39. place
  40. play
  41. played
  42. plays
  43. prediction
  44. probability
  45. problem
  46. program
  47. samuel
  48. science
  49. scientist
  50. score
  51. simply
  52. small
  53. strategy
  54. surpasses
  55. task
  56. taught
  57. throw
  58. understand
  59. winning
  60. wins
  61. won
  62. wrote