From the TED Talk by Kenneth Cukier: "Big data is better data"

Unscramble the Blue Letters

Machine learning is at the bsais of many of the things that we do online: search engines, Amazon's personalization algorithm, computer translation, voice recognition systems. Researchers recently have looked at the question of biopsies, cancerous biopsies, and they've asked the computer to identify by looking at the data and svvuaril rates to demnierte whether cells are actually cancerous or not, and sure enough, when you tohrw the data at it, through a machine-learning algorithm, the mnachie was able to itfedniy the 12 telltale signs that best predict that this biposy of the breast cancer cells are indeed cancerous. The problem: The medical literature only knew nine of them. Three of the traits were ones that people didn't need to look for, but that the machine spotted.

Open Cloze

Machine learning is at the _____ of many of the things that we do online: search engines, Amazon's personalization algorithm, computer translation, voice recognition systems. Researchers recently have looked at the question of biopsies, cancerous biopsies, and they've asked the computer to identify by looking at the data and ________ rates to _________ whether cells are actually cancerous or not, and sure enough, when you _____ the data at it, through a machine-learning algorithm, the _______ was able to ________ the 12 telltale signs that best predict that this ______ of the breast cancer cells are indeed cancerous. The problem: The medical literature only knew nine of them. Three of the traits were ones that people didn't need to look for, but that the machine spotted.

Solution

  1. basis
  2. survival
  3. determine
  4. throw
  5. machine
  6. identify
  7. biopsy

Original Text

Machine learning is at the basis of many of the things that we do online: search engines, Amazon's personalization algorithm, computer translation, voice recognition systems. Researchers recently have looked at the question of biopsies, cancerous biopsies, and they've asked the computer to identify by looking at the data and survival rates to determine whether cells are actually cancerous or not, and sure enough, when you throw the data at it, through a machine-learning algorithm, the machine was able to identify the 12 telltale signs that best predict that this biopsy of the breast cancer cells are indeed cancerous. The problem: The medical literature only knew nine of them. Three of the traits were ones that people didn't need to look for, but that the machine spotted.

ngrams of length 2

collocation        frequency
big data           16
arthur samuel      6
machine learning   5
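
For readers curious how two-word counts like these can be produced, here is a minimal sketch in Python. It is an illustration only: this page does not say how its counts were computed, and the frequencies above appear to come from the full talk and to leave out common function-word pairs, so the sketch will not reproduce them. The function name `bigram_counts` and the sample text are assumptions made for the example.

```python
import re
from collections import Counter


def bigram_counts(text):
    """Count adjacent word pairs (n-grams of length 2) in a text."""
    # Lowercase the text and treat runs of letters or apostrophes as words,
    # so "machine-learning" is split into "machine" and "learning".
    words = re.findall(r"[a-z']+", text.lower())
    # Pair every word with the word that follows it. Pairs are also formed
    # across sentence boundaries, a simplification made for brevity.
    return Counter(zip(words, words[1:]))


# Hypothetical sample adapted from the excerpt above, not the full talk.
sample = (
    "Machine learning is at the basis of many of the things that we do online. "
    "A machine-learning algorithm was able to identify the signs."
)

counts = bigram_counts(sample)
print(counts[("machine", "learning")])
```

A real collocation list would usually also filter out pairs made up only of very common words (such as "of the") before ranking by frequency.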

Important Words

  1. algorithm
  2. asked
  3. basis
  4. biopsies
  5. biopsy
  6. breast
  7. cancer
  8. cancerous
  9. cells
  10. computer
  11. data
  12. determine
  13. engines
  14. identify
  15. knew
  16. learning
  17. literature
  18. looked
  19. machine
  20. medical
  21. people
  22. personalization
  23. predict
  24. question
  25. rates
  26. recognition
  27. researchers
  28. search
  29. signs
  30. spotted
  31. survival
  32. systems
  33. telltale
  34. throw
  35. traits
  36. translation
  37. voice