From the TED Talk by Joseph Redmon: How computers learn to recognize objects instantly

Unscramble the Blue Letters

Now, with this kind of result, we can do a lot more with our computer vosiin algorithms. We see that it knows that there's a cat and a dog. It knows their relative locations, their size. It may even know some extra information. There's a book snititg in the bagrcounkd. And if you want to build a system on top of cepuotmr vision, say a self-driving vhilcee or a robiotc system, this is the kind of information that you want. You want something so that you can itecnrat with the physical world. Now, when I started working on object detection, it took 20 seconds to process a single image. And to get a feel for why speed is so important in this domian, here's an example of an object detector that takes two seconds to process an image. So this is 10 times faster than the 20-seconds-per-image detector, and you can see that by the time it makes predictions, the entire state of the world has changed, and this wouldn't be very useful for an application.

Open Cloze

Now, with this kind of result, we can do a lot more with our computer ______ algorithms. We see that it knows that there's a cat and a dog. It knows their relative locations, their size. It may even know some extra information. There's a book _______ in the __________. And if you want to build a system on top of ________ vision, say a self-driving _______ or a _______ system, this is the kind of information that you want. You want something so that you can ________ with the physical world. Now, when I started working on object detection, it took 20 seconds to process a single image. And to get a feel for why speed is so important in this ______, here's an example of an object detector that takes two seconds to process an image. So this is 10 times faster than the 20-seconds-per-image detector, and you can see that by the time it makes predictions, the entire state of the world has changed, and this wouldn't be very useful for an application.

Solution

  1. vision
  2. sitting
  3. background
  4. computer
  5. vehicle
  6. robotic
  7. interact
  8. domain

Original Text

Now, with this kind of result, we can do a lot more with our computer vision algorithms. We see that it knows that there's a cat and a dog. It knows their relative locations, their size. It may even know some extra information. There's a book sitting in the background. And if you want to build a system on top of computer vision, say a self-driving vehicle or a robotic system, this is the kind of information that you want. You want something so that you can interact with the physical world. Now, when I started working on object detection, it took 20 seconds to process a single image. And to get a feel for why speed is so important in this domain, here's an example of an object detector that takes two seconds to process an image. So this is 10 times faster than the 20-seconds-per-image detector, and you can see that by the time it makes predictions, the entire state of the world has changed, and this wouldn't be very useful for an application.

Frequently Occurring Word Combinations

ngrams of length 2

collocation        frequency
computer vision    5
object detection   4
real time          3
neural network     2
bounding boxes     2
times faster       2
detection system   2
stop signs         2

Important Words

  1. algorithms
  2. application
  3. background
  4. book
  5. build
  6. cat
  7. changed
  8. computer
  9. detection
  10. detector
  11. dog
  12. domain
  13. entire
  14. extra
  15. faster
  16. feel
  17. image
  18. important
  19. information
  20. interact
  21. kind
  22. locations
  23. lot
  24. object
  25. physical
  26. predictions
  27. process
  28. relative
  29. result
  30. robotic
  31. seconds
  32. single
  33. sitting
  34. size
  35. speed
  36. started
  37. state
  38. system
  39. takes
  40. time
  41. times
  42. top
  43. vehicle
  44. vision
  45. working
  46. world