full transcript

From the TED Talk by Madhumita Murgia: How data brokers sell your identity

Unscramble the Blue Letters

I'm a 26-year-old British Asian wmoan working in media and linivg in a South West postcode in London. I have previously lived at two adsedsers in Sussex, and two others in North East London. While gonwrig up, my family lived in a detached house in Kent and took holidays to India every year. They mostly did their shopping online at Ocado, gave menoy to charities and read the Financial Times. Now, I live in a recently coeernvtd flat with a private landlord, and I have a housemate. I'm interested in movies and startups, and I have taken five holidays in the past 12 mnthos, mostly to visit friends abroad. I'm about to buy flights within 14 days. My annual salary is between 30,000 and 40,000 pounds a year. I don't own a TV or watch any scheduled pomnmairgrg, but I do enojy on-demand services such as Netflix or Now TV. Last week, I passed through ueppr Street in North London on Monday and waedndsey evnngies at 7 p.m. I cook a little, but I tend to eat out or get tyaakewas often. My favourite cuisines are Thai and Mexican food. I don't own any furniture, and I don't have any children. On weeknights, I tend to spend the evenings with my uevirnitsy friends having dinner. I usually buy my groceries at Sainsbury's but only because it's on my way home. I don't care for cars or own one. I don't like any form of housework, and I have a cnleaer who lets herself in while I'm at work. On Fridays, you'll find me at the pub after work. At home, I'm far more likely to be browsing restaurant reviews rather than managing my finances or looking at property prices online. I like the idea of living abroad someday. I prefer to work as a team than on my own. I'm ambitious, and it's important to me that my family thinks I'm doing well. I'm rarley swayed by others' vwies. This mleoty set of characteristics, attteidus, thoughts, and desires come very close to defining me as a person. 
It is also a precise and accurate description of what a gruop of companies I had never hread of, personal data trackers, had learned about me. My journey to uncover what data companies knew began in 2014, when I became curious about the murky world of data brokers, a multi-billion-pound industry of companies that collect, package, and sell detailed profiles of individuals based on their online and offline behaviours. I decided to write about it for Wired Magazine. What I found out shocked me, and reinforced my ainitxees about a profit-led system designed to log behaviours every time we interact with the connected world. I already knew about my daily records being collected by services such as ggoole Maps, Search, Facebook, or contactless credit card transactions. But you combine that with public information such as land registry, council tax, or voter rdceros, along with my shopping habits and real-time health and location ioafimotnrn, and these benign data sets begin to reveal a lot, such as whether you're optimistic, political, ambitious, or a risk-taker. Even as you're listening to me, you may be sedentary, but your stmhranpoe can reveal your exact loticaon, and even your posture. Your life is being converted into such a data package to be sold on. Ultimately, you are the product. Ostensibly, we're all protected by data protection laws. In the UK, the law states that any personal data set has to be stripped of identifiers such as your name or your National Insurance number. prsoanel data is considered anything that can be traced directly back to you without the need for adoaiditnl information. This doesn't mean it can't be sold on. It only menas that they need your permission. Simple examples of personal data include your full credit card nemubr, your bank statement, or a criminal rcerod. However, I drevoicesd that oninle anonymity is a complete myth. 
Particulars such as your postcode, your date of birth, and your gedner can be traded freely and without your prsiimseon because they're not considered personal but pseudonymous. In other wrdos, they can't be traced back to you without the need for additional information. So why does it matter if a bnuch of companies you've never heard of know your age or your postcode, you may think. Well, it mteatrs quite a lot. About a decade ago, Latanya Sweeney, a professor of privacy at hrraavd University proved that about 87% of US citizens could be uuielnqy identified by just three facts about them: their zip code, their date of birth, and their gender. In the UK, where we have far fewer citizens serviced by much longer postcodes, that probability is far higher. Professor Sweeney proved this in a rather cheeky way when wliliam Weld, a former governor of cdarmigbe, Massachusetts, in the US decided to support the commercial release of 135,000 sttae employee hletah records along with their flaiiems, including his own. These records did not contain a name or a social security number, but did contain hundreds of fields of sensitive medical information icundling dgurs preriscebd, hospitalisations, and procedures performed on these emypeelos. For $20, poeforssr Sweeney pcreahusd the voter records for Cambridge, Massachusetts, containing the names, zip codes, dates of birth, and gender for every vtoer in the area, and then cross-referenced this with their health records. Within minutes, she had pinpointed Governor Weld's own health records. Only six people in Cambridge saherd his date of birth. Three of them were men. And he was the only one living in his zip code. Professor Sweeney sent the governor his health records in the post. (Laughter) Every day, we hear about new examples of companies digging ever deeper into our personal lives. 
In the nvbeeomr US presidential election, a little-known British company known as Cambridge Analytica was tasked with winning the election for a certain candidate: daonld Trump, using data analytics. The cpnaomy eyolmepd cookies online to track people around the web, lggiong every website visited, every search term tepyd, and every video watched. They also created a viral Facebook quiz to dig into people's personalities, which was taken by over six million ppeole. In total, they managed to amass data on 220 million vntoig Americans with an argveae of about 5,000 pieces of data on each peorsn. They then used this data to understand people's inner feelings and then targeted adverts to them on Facebook. Researchers have called them a pdgonrapaa machine. It's not just lgrae companies dginigg into your life; it's free apps and salml startups as well. I rlaeeisd on my phone that every time I logged fitness data into the app Endomondo, it was sharing my details including my location and gender with third-party advertisers. WebMD, a symptom chkerecs app, was sharing even more sensitive information including the symptoms, procedures, and drugs viewed by users within its app with its third parties. Fitbit was sanhrig data with yohao. A pregnancy tracking app was slnelig on information about its users' ovulation cycles and fertility cycles with people or advertisers like inmboi. As long as my phone is turned on, my location can be treakcd, not just by the obvious apps like Google Maps, but a whole host of unrelated seirvces from Uber to Twitter, Photos, Snapchat, TripAdvisor, and others. You're not even safe in your own home. In 2015, Samsung was found to be recording people in the homes in which their TVs had been sold using their voice riocgineton systems. They have now adapted this so they only record when the voice recognition is activated. But the creepy factor remains. 
Even services like Google and Facebook, trusted and used by bliilnos around the world, have been accused of crossing the line. A few weeks ago, my hsnubad and I were dvrinig home from work and discussing where we should have dinner. I sgeetgusd a restaurant that I knew was somewhere on our way back and then opened up Google Maps to plot it. Turns out it was already mkerad on the map with a little bubble. That sikning feeling of being wcatehd is not unique to me. There have been several anecdotal reports of people being shwon adverts based on things and cneoovnastris they were having in real life, prompting concerns that Facebook and Google are eavesdropping on people via their personal dvieces. To piece together what all these companies knew about me, I spoke to a data profiler called Eyeota. Eyeota uses cookies to assign me to thousands of different categories, including my job, how many children I have, and whether I'm likely to buy Star Wars memorabilia. (Laughter) They don't know my name, but they know more about me than my nurbehoigs do. Eyeota also buys information from third parties such as the credit rating agency Experian, which amasses a missave database of 15 different demographic types and 66 lifestyles, all based on people's post codes. Because Eyeota buys this information, it knows that I'm more likely to take taxis home rather than night besus late at night and that I'm very, very unlikely to ever be found in a DIY store. (lugaethr) It can then sell this information on to the highest bidder. Sometimes, large data sets can be useful for the public good, for example for the use of health researchers or city and urban plnerans. But most of this information being collected is sustained by advertisers and traded calmloiercmy. In fact, eMarketer has predicted that the online advertising industry, which is based almost completely on data targeting and tracking, will hit an all-time high of 77 biiloln dollars this year. 
If you think you don't care about being unmasked, you may want to reconsider. Personalised browser ads may be harmless, but connecting disparate aspects of your life to predict your fturue behaviour could lead to unexpected consequences. For instance, decisions on whether your child gets to go to a certain university or what price you pay for your home or car insurance permmuis could be made based on data given to third pitears that you never identned to, such as your own lifestyle habits or family members' ailetmns. In 2014, Ross Anderson, a professor of Privacy and sucietry at Cambridge University found that the NHS had been sharing its hospitals' database, which idlunced details of hospitalisations for every citizen in Britain with the ittusitne and fcuatly of Actuaries, a body that was researching how likely people are to devleop chronic illnesses at certain ages. Of course, this resulted in an increase in health insurance premiums. As the amount of data that is collected increases exponentially, it becomes much easier to identify you. For example, your Fitbit measures your heart rate or your gait patterns and these can be used to estimate things like your height, your wgheit, or even your gender. These are details that are very hard to miimc or change. The data is no leongr about you. It is you. cneaopims are also starting to pcidret future behaviours - for example, whether you're a trustworthy driver, a good employee, or a good credit risk, based on things like your social media activity, your health and fitness, or your home energy use. The more the companies know about you - where you live, how many children you have, what your medical ailments are, what you buy - your anionmtyy becomes ielveranrt. What's more, you lose your right to free choice, as companies make decisions on your behalf without your knowledge. Along my journey of dirceovsy, my first reaction was shock. I immediately wrote to my local council and asked them to make my voter records private. 
I made up a fake email adderss, and I started registering with a fake age and gender. I turned off targeted advertising, and I aksed Facebook to send me all the information that they held on me, including things I had deleted, and spent hours poring over it obsessively. But after a few weeks I realised this was a psntelios exercise. I couldn't be a digital hemirt. It wasn't realistic for me to stop using social media, scareh and navigation apps, and my iPhone, all a part of mdoren life that I cherished and needed. Instead, I realised that the knowledge itself was empowering. Knowing all the different ways in which my data was being shared and collected made me more responsible about where I put it. For example, I stopped signing up to supposedly free services, for example, a VIP card at my local hairdresser or a discount coupon at your sreapmkreut. Whenever I download an app, I make sure to check my snegtits to see what permissions it has. Anything that seems unnecessary like access to my location, I turn off. Ultimately, there is hope. As more of us begin to realise the enxett of our data footprint, we will start to demand custody and control of this data. Some critics have even suggested that people be paid for their data in order to give them more control. This means it will become too expensive for companies, gornmvetnes, and non-profits to recklessly mine and hold our data, and sell it on ilacrnimsediitny. But until the data enoocmy matures, and pewor meovs back from the corporation to the idianduivl, I have lost more than my anonymity. I have given up my right to self-determination and free cohcie. All I have left is my name. Thank you. (Applause)

Open Cloze

I'm a 26-year-old British Asian _____ working in media and ______ in a South West postcode in London. I have previously lived at two _________ in Sussex, and two others in North East London. While _______ up, my family lived in a detached house in Kent and took holidays to India every year. They mostly did their shopping online at Ocado, gave _____ to charities and read the Financial Times. Now, I live in a recently _________ flat with a private landlord, and I have a housemate. I'm interested in movies and startups, and I have taken five holidays in the past 12 ______, mostly to visit friends abroad. I'm about to buy flights within 14 days. My annual salary is between 30,000 and 40,000 pounds a year. I don't own a TV or watch any scheduled ___________, but I do _____ on-demand services such as Netflix or Now TV. Last week, I passed through _____ Street in North London on Monday and _________ ________ at 7 p.m. I cook a little, but I tend to eat out or get _________ often. My favourite cuisines are Thai and Mexican food. I don't own any furniture, and I don't have any children. On weeknights, I tend to spend the evenings with my __________ friends having dinner. I usually buy my groceries at Sainsbury's but only because it's on my way home. I don't care for cars or own one. I don't like any form of housework, and I have a _______ who lets herself in while I'm at work. On Fridays, you'll find me at the pub after work. At home, I'm far more likely to be browsing restaurant reviews rather than managing my finances or looking at property prices online. I like the idea of living abroad someday. I prefer to work as a team than on my own. I'm ambitious, and it's important to me that my family thinks I'm doing well. I'm ______ swayed by others' _____. This ______ set of characteristics, _________, thoughts, and desires come very close to defining me as a person. 
It is also a precise and accurate description of what a _____ of companies I had never _____ of, personal data trackers, had learned about me. My journey to uncover what data companies knew began in 2014, when I became curious about the murky world of data brokers, a multi-billion-pound industry of companies that collect, package, and sell detailed profiles of individuals based on their online and offline behaviours. I decided to write about it for Wired Magazine. What I found out shocked me, and reinforced my _________ about a profit-led system designed to log behaviours every time we interact with the connected world. I already knew about my daily records being collected by services such as ______ Maps, Search, Facebook, or contactless credit card transactions. But you combine that with public information such as land registry, council tax, or voter _______, along with my shopping habits and real-time health and location ___________, and these benign data sets begin to reveal a lot, such as whether you're optimistic, political, ambitious, or a risk-taker. Even as you're listening to me, you may be sedentary, but your __________ can reveal your exact ________, and even your posture. Your life is being converted into such a data package to be sold on. Ultimately, you are the product. Ostensibly, we're all protected by data protection laws. In the UK, the law states that any personal data set has to be stripped of identifiers such as your name or your National Insurance number. ________ data is considered anything that can be traced directly back to you without the need for __________ information. This doesn't mean it can't be sold on. It only _____ that they need your permission. Simple examples of personal data include your full credit card ______, your bank statement, or a criminal ______. However, I __________ that ______ anonymity is a complete myth. 
Particulars such as your postcode, your date of birth, and your ______ can be traded freely and without your __________ because they're not considered personal but pseudonymous. In other _____, they can't be traced back to you without the need for additional information. So why does it matter if a _____ of companies you've never heard of know your age or your postcode, you may think. Well, it _______ quite a lot. About a decade ago, Latanya Sweeney, a professor of privacy at _______ University proved that about 87% of US citizens could be ________ identified by just three facts about them: their zip code, their date of birth, and their gender. In the UK, where we have far fewer citizens serviced by much longer postcodes, that probability is far higher. Professor Sweeney proved this in a rather cheeky way when _______ Weld, a former governor of _________, Massachusetts, in the US decided to support the commercial release of 135,000 _____ employee ______ records along with their ________, including his own. These records did not contain a name or a social security number, but did contain hundreds of fields of sensitive medical information _________ _____ __________, hospitalisations, and procedures performed on these _________. For $20, _________ Sweeney _________ the voter records for Cambridge, Massachusetts, containing the names, zip codes, dates of birth, and gender for every _____ in the area, and then cross-referenced this with their health records. Within minutes, she had pinpointed Governor Weld's own health records. Only six people in Cambridge ______ his date of birth. Three of them were men. And he was the only one living in his zip code. Professor Sweeney sent the governor his health records in the post. (Laughter) Every day, we hear about new examples of companies digging ever deeper into our personal lives. 
In the ________ US presidential election, a little-known British company known as Cambridge Analytica was tasked with winning the election for a certain candidate: ______ Trump, using data analytics. The _______ ________ cookies online to track people around the web, _______ every website visited, every search term _____, and every video watched. They also created a viral Facebook quiz to dig into people's personalities, which was taken by over six million ______. In total, they managed to amass data on 220 million ______ Americans with an _______ of about 5,000 pieces of data on each ______. They then used this data to understand people's inner feelings and then targeted adverts to them on Facebook. Researchers have called them a __________ machine. It's not just _____ companies _______ into your life; it's free apps and _____ startups as well. I ________ on my phone that every time I logged fitness data into the app Endomondo, it was sharing my details including my location and gender with third-party advertisers. WebMD, a symptom ________ app, was sharing even more sensitive information including the symptoms, procedures, and drugs viewed by users within its app with its third parties. Fitbit was _______ data with _____. A pregnancy tracking app was _______ on information about its users' ovulation cycles and fertility cycles with people or advertisers like ______. As long as my phone is turned on, my location can be _______, not just by the obvious apps like Google Maps, but a whole host of unrelated ________ from Uber to Twitter, Photos, Snapchat, TripAdvisor, and others. You're not even safe in your own home. In 2015, Samsung was found to be recording people in the homes in which their TVs had been sold using their voice ___________ systems. They have now adapted this so they only record when the voice recognition is activated. But the creepy factor remains. 
Even services like Google and Facebook, trusted and used by ________ around the world, have been accused of crossing the line. A few weeks ago, my _______ and I were _______ home from work and discussing where we should have dinner. I _________ a restaurant that I knew was somewhere on our way back and then opened up Google Maps to plot it. Turns out it was already ______ on the map with a little bubble. That _______ feeling of being _______ is not unique to me. There have been several anecdotal reports of people being _____ adverts based on things and _____________ they were having in real life, prompting concerns that Facebook and Google are eavesdropping on people via their personal _______. To piece together what all these companies knew about me, I spoke to a data profiler called Eyeota. Eyeota uses cookies to assign me to thousands of different categories, including my job, how many children I have, and whether I'm likely to buy Star Wars memorabilia. (Laughter) They don't know my name, but they know more about me than my __________ do. Eyeota also buys information from third parties such as the credit rating agency Experian, which amasses a _______ database of 15 different demographic types and 66 lifestyles, all based on people's post codes. Because Eyeota buys this information, it knows that I'm more likely to take taxis home rather than night _____ late at night and that I'm very, very unlikely to ever be found in a DIY store. (________) It can then sell this information on to the highest bidder. Sometimes, large data sets can be useful for the public good, for example for the use of health researchers or city and urban ________. But most of this information being collected is sustained by advertisers and traded ____________. In fact, eMarketer has predicted that the online advertising industry, which is based almost completely on data targeting and tracking, will hit an all-time high of 77 _______ dollars this year. 
If you think you don't care about being unmasked, you may want to reconsider. Personalised browser ads may be harmless, but connecting disparate aspects of your life to predict your ______ behaviour could lead to unexpected consequences. For instance, decisions on whether your child gets to go to a certain university or what price you pay for your home or car insurance ________ could be made based on data given to third _______ that you never ________ to, such as your own lifestyle habits or family members' ________. In 2014, Ross Anderson, a professor of Privacy and ________ at Cambridge University found that the NHS had been sharing its hospitals' database, which ________ details of hospitalisations for every citizen in Britain with the _________ and _______ of Actuaries, a body that was researching how likely people are to _______ chronic illnesses at certain ages. Of course, this resulted in an increase in health insurance premiums. As the amount of data that is collected increases exponentially, it becomes much easier to identify you. For example, your Fitbit measures your heart rate or your gait patterns and these can be used to estimate things like your height, your ______, or even your gender. These are details that are very hard to _____ or change. The data is no ______ about you. It is you. _________ are also starting to _______ future behaviours - for example, whether you're a trustworthy driver, a good employee, or a good credit risk, based on things like your social media activity, your health and fitness, or your home energy use. The more the companies know about you - where you live, how many children you have, what your medical ailments are, what you buy - your _________ becomes __________. What's more, you lose your right to free choice, as companies make decisions on your behalf without your knowledge. Along my journey of _________, my first reaction was shock. I immediately wrote to my local council and asked them to make my voter records private. 
I made up a fake email _______, and I started registering with a fake age and gender. I turned off targeted advertising, and I _____ Facebook to send me all the information that they held on me, including things I had deleted, and spent hours poring over it obsessively. But after a few weeks I realised this was a _________ exercise. I couldn't be a digital ______. It wasn't realistic for me to stop using social media, ______ and navigation apps, and my iPhone, all a part of ______ life that I cherished and needed. Instead, I realised that the knowledge itself was empowering. Knowing all the different ways in which my data was being shared and collected made me more responsible about where I put it. For example, I stopped signing up to supposedly free services, for example, a VIP card at my local hairdresser or a discount coupon at your ___________. Whenever I download an app, I make sure to check my ________ to see what permissions it has. Anything that seems unnecessary like access to my location, I turn off. Ultimately, there is hope. As more of us begin to realise the ______ of our data footprint, we will start to demand custody and control of this data. Some critics have even suggested that people be paid for their data in order to give them more control. This means it will become too expensive for companies, ___________, and non-profits to recklessly mine and hold our data, and sell it on ________________. But until the data _______ matures, and _____ _____ back from the corporation to the __________, I have lost more than my anonymity. I have given up my right to self-determination and free ______. All I have left is my name. Thank you. (Applause)

Solution

  1. takeaways
  2. uniquely
  3. health
  4. future
  5. planners
  6. company
  7. enjoy
  8. record
  9. converted
  10. months
  11. motley
  12. professor
  13. modern
  14. google
  15. sinking
  16. faculty
  17. shown
  18. marked
  19. information
  20. group
  21. recognition
  22. anonymity
  23. laughter
  24. donald
  25. number
  26. conversations
  27. institute
  28. intended
  29. bunch
  30. views
  31. personal
  32. billion
  33. weight
  34. premiums
  35. average
  36. permission
  37. commercially
  38. logging
  39. develop
  40. employed
  41. rarely
  42. supermarket
  43. parties
  44. employees
  45. asked
  46. william
  47. longer
  48. selling
  49. money
  50. addresses
  51. additional
  52. evenings
  53. online
  54. extent
  55. checkers
  56. driving
  57. services
  58. heard
  59. typed
  60. moves
  61. cleaner
  62. means
  63. address
  64. cambridge
  65. individual
  66. drugs
  67. companies
  68. devices
  69. power
  70. pointless
  71. words
  72. including
  73. choice
  74. woman
  75. suggested
  76. discovery
  77. yahoo
  78. purchased
  79. prescribed
  80. person
  81. sharing
  82. small
  83. tracked
  84. included
  85. voting
  86. programming
  87. husband
  88. matters
  89. neighbours
  90. attitudes
  91. shared
  92. upper
  93. irrelevant
  94. ailments
  95. smartphone
  96. anxieties
  97. watched
  98. billions
  99. buses
  100. voter
  101. economy
  102. records
  103. mimic
  104. university
  105. harvard
  106. growing
  107. gender
  108. state
  109. wednesday
  110. digging
  111. realised
  112. discovered
  113. people
  114. search
  115. families
  116. indiscriminately
  117. hermit
  118. propaganda
  119. living
  120. large
  121. inmobi
  122. governments
  123. security
  124. location
  125. predict
  126. november
  127. massive
  128. settings

Original Text

I'm a 26-year-old British Asian woman working in media and living in a South West postcode in London. I have previously lived at two addresses in Sussex, and two others in North East London. While growing up, my family lived in a detached house in Kent and took holidays to India every year. They mostly did their shopping online at Ocado, gave money to charities and read the Financial Times. Now, I live in a recently converted flat with a private landlord, and I have a housemate. I'm interested in movies and startups, and I have taken five holidays in the past 12 months, mostly to visit friends abroad. I'm about to buy flights within 14 days. My annual salary is between 30,000 and 40,000 pounds a year. I don't own a TV or watch any scheduled programming, but I do enjoy on-demand services such as Netflix or Now TV. Last week, I passed through Upper Street in North London on Monday and Wednesday evenings at 7 p.m. I cook a little, but I tend to eat out or get takeaways often. My favourite cuisines are Thai and Mexican food. I don't own any furniture, and I don't have any children. On weeknights, I tend to spend the evenings with my university friends having dinner. I usually buy my groceries at Sainsbury's but only because it's on my way home. I don't care for cars or own one. I don't like any form of housework, and I have a cleaner who lets herself in while I'm at work. On Fridays, you'll find me at the pub after work. At home, I'm far more likely to be browsing restaurant reviews rather than managing my finances or looking at property prices online. I like the idea of living abroad someday. I prefer to work as a team than on my own. I'm ambitious, and it's important to me that my family thinks I'm doing well. I'm rarely swayed by others' views. This motley set of characteristics, attitudes, thoughts, and desires come very close to defining me as a person. 
It is also a precise and accurate description of what a group of companies I had never heard of, personal data trackers, had learned about me. My journey to uncover what data companies knew began in 2014, when I became curious about the murky world of data brokers, a multi-billion-pound industry of companies that collect, package, and sell detailed profiles of individuals based on their online and offline behaviours. I decided to write about it for Wired Magazine. What I found out shocked me, and reinforced my anxieties about a profit-led system designed to log behaviours every time we interact with the connected world. I already knew about my daily records being collected by services such as Google Maps, Search, Facebook, or contactless credit card transactions. But you combine that with public information such as land registry, council tax, or voter records, along with my shopping habits and real-time health and location information, and these benign data sets begin to reveal a lot, such as whether you're optimistic, political, ambitious, or a risk-taker. Even as you're listening to me, you may be sedentary, but your smartphone can reveal your exact location, and even your posture. Your life is being converted into such a data package to be sold on. Ultimately, you are the product. Ostensibly, we're all protected by data protection laws. In the UK, the law states that any personal data set has to be stripped of identifiers such as your name or your National Insurance number. Personal data is considered anything that can be traced directly back to you without the need for additional information. This doesn't mean it can't be sold on. It only means that they need your permission. Simple examples of personal data include your full credit card number, your bank statement, or a criminal record. However, I discovered that online anonymity is a complete myth. 
Particulars such as your postcode, your date of birth, and your gender can be traded freely and without your permission because they're not considered personal but pseudonymous. In other words, they can't be traced back to you without the need for additional information. So why does it matter if a bunch of companies you've never heard of know your age or your postcode, you may think. Well, it matters quite a lot. About a decade ago, Latanya Sweeney, a professor of privacy at Harvard University proved that about 87% of US citizens could be uniquely identified by just three facts about them: their zip code, their date of birth, and their gender. In the UK, where we have far fewer citizens serviced by much longer postcodes, that probability is far higher. Professor Sweeney proved this in a rather cheeky way when William Weld, a former governor of Cambridge, Massachusetts, in the US decided to support the commercial release of 135,000 state employee health records along with their families, including his own. These records did not contain a name or a social security number, but did contain hundreds of fields of sensitive medical information including drugs prescribed, hospitalisations, and procedures performed on these employees. For $20, Professor Sweeney purchased the voter records for Cambridge, Massachusetts, containing the names, zip codes, dates of birth, and gender for every voter in the area, and then cross-referenced this with their health records. Within minutes, she had pinpointed Governor Weld's own health records. Only six people in Cambridge shared his date of birth. Three of them were men. And he was the only one living in his zip code. Professor Sweeney sent the governor his health records in the post. (Laughter) Every day, we hear about new examples of companies digging ever deeper into our personal lives.
In the November US presidential election, a little-known British company known as Cambridge Analytica was tasked with winning the election for a certain candidate: Donald Trump, using data analytics. The company employed cookies online to track people around the web, logging every website visited, every search term typed, and every video watched. They also created a viral Facebook quiz to dig into people's personalities, which was taken by over six million people. In total, they managed to amass data on 220 million voting Americans with an average of about 5,000 pieces of data on each person. They then used this data to understand people's inner feelings and then targeted adverts to them on Facebook. Researchers have called them a propaganda machine. It's not just large companies digging into your life; it's free apps and small startups as well. I realised on my phone that every time I logged fitness data into the app Endomondo, it was sharing my details including my location and gender with third-party advertisers. WebMD, a symptom checkers app, was sharing even more sensitive information including the symptoms, procedures, and drugs viewed by users within its app with its third parties. Fitbit was sharing data with Yahoo. A pregnancy tracking app was selling on information about its users' ovulation cycles and fertility cycles with people or advertisers like InMobi. As long as my phone is turned on, my location can be tracked, not just by the obvious apps like Google Maps, but a whole host of unrelated services from Uber to Twitter, Photos, Snapchat, TripAdvisor, and others. You're not even safe in your own home. In 2015, Samsung was found to be recording people in the homes in which their TVs had been sold using their voice recognition systems. They have now adapted this so they only record when the voice recognition is activated. But the creepy factor remains. 
Even services like Google and Facebook, trusted and used by billions around the world, have been accused of crossing the line. A few weeks ago, my husband and I were driving home from work and discussing where we should have dinner. I suggested a restaurant that I knew was somewhere on our way back and then opened up Google Maps to plot it. Turns out it was already marked on the map with a little bubble. That sinking feeling of being watched is not unique to me. There have been several anecdotal reports of people being shown adverts based on things and conversations they were having in real life, prompting concerns that Facebook and Google are eavesdropping on people via their personal devices. To piece together what all these companies knew about me, I spoke to a data profiler called Eyeota. Eyeota uses cookies to assign me to thousands of different categories, including my job, how many children I have, and whether I'm likely to buy Star Wars memorabilia. (Laughter) They don't know my name, but they know more about me than my neighbours do. Eyeota also buys information from third parties such as the credit rating agency Experian, which amasses a massive database of 15 different demographic types and 66 lifestyles, all based on people's post codes. Because Eyeota buys this information, it knows that I'm more likely to take taxis home rather than night buses late at night and that I'm very, very unlikely to ever be found in a DIY store. (Laughter) It can then sell this information on to the highest bidder. Sometimes, large data sets can be useful for the public good, for example for the use of health researchers or city and urban planners. But most of this information being collected is sustained by advertisers and traded commercially. In fact, eMarketer has predicted that the online advertising industry, which is based almost completely on data targeting and tracking, will hit an all-time high of 77 billion dollars this year. 
If you think you don't care about being unmasked, you may want to reconsider. Personalised browser ads may be harmless, but connecting disparate aspects of your life to predict your future behaviour could lead to unexpected consequences. For instance, decisions on whether your child gets to go to a certain university or what price you pay for your home or car insurance premiums could be made based on data given to third parties that you never intended to, such as your own lifestyle habits or family members' ailments. In 2014, Ross Anderson, a professor of Privacy and Security at Cambridge University found that the NHS had been sharing its hospitals' database, which included details of hospitalisations for every citizen in Britain with the Institute and Faculty of Actuaries, a body that was researching how likely people are to develop chronic illnesses at certain ages. Of course, this resulted in an increase in health insurance premiums. As the amount of data that is collected increases exponentially, it becomes much easier to identify you. For example, your Fitbit measures your heart rate or your gait patterns and these can be used to estimate things like your height, your weight, or even your gender. These are details that are very hard to mimic or change. The data is no longer about you. It is you. Companies are also starting to predict future behaviours - for example, whether you're a trustworthy driver, a good employee, or a good credit risk, based on things like your social media activity, your health and fitness, or your home energy use. The more the companies know about you - where you live, how many children you have, what your medical ailments are, what you buy - your anonymity becomes irrelevant. What's more, you lose your right to free choice, as companies make decisions on your behalf without your knowledge. Along my journey of discovery, my first reaction was shock. I immediately wrote to my local council and asked them to make my voter records private.
I made up a fake email address, and I started registering with a fake age and gender. I turned off targeted advertising, and I asked Facebook to send me all the information that they held on me, including things I had deleted, and spent hours poring over it obsessively. But after a few weeks I realised this was a pointless exercise. I couldn't be a digital hermit. It wasn't realistic for me to stop using social media, search and navigation apps, and my iPhone, all a part of modern life that I cherished and needed. Instead, I realised that the knowledge itself was empowering. Knowing all the different ways in which my data was being shared and collected made me more responsible about where I put it. For example, I stopped signing up to supposedly free services, for example, a VIP card at my local hairdresser or a discount coupon at your supermarket. Whenever I download an app, I make sure to check my settings to see what permissions it has. Anything that seems unnecessary like access to my location, I turn off. Ultimately, there is hope. As more of us begin to realise the extent of our data footprint, we will start to demand custody and control of this data. Some critics have even suggested that people be paid for their data in order to give them more control. This means it will become too expensive for companies, governments, and non-profits to recklessly mine and hold our data, and sell it on indiscriminately. But until the data economy matures, and power moves back from the corporation to the individual, I have lost more than my anonymity. I have given up my right to self-determination and free choice. All I have left is my name. Thank you. (Applause)

Frequently Occurring Word Combinations

ngrams of length 2

collocation frequency
personal data 4
health records 4
professor sweeney 3
companies knew 2
credit card 2
data sets 2
additional information 2
information including 2
voter records 2
companies digging 2
voice recognition 2
insurance premiums 2

Important Words

  1. access
  2. accurate
  3. accused
  4. activated
  5. activity
  6. actuaries
  7. adapted
  8. additional
  9. address
  10. addresses
  11. ads
  12. advertisers
  13. advertising
  14. adverts
  15. age
  16. agency
  17. ages
  18. ailments
  19. amass
  20. amasses
  21. ambitious
  22. americans
  23. amount
  24. analytica
  25. analytics
  26. anderson
  27. anecdotal
  28. annual
  29. anonymity
  30. anxieties
  31. app
  32. applause
  33. apps
  34. area
  35. asian
  36. asked
  37. aspects
  38. assign
  39. attitudes
  40. average
  41. bank
  42. based
  43. began
  44. behalf
  45. behaviour
  46. behaviours
  47. benign
  48. bidder
  49. billion
  50. billions
  51. birth
  52. body
  53. britain
  54. british
  55. brokers
  56. browser
  57. browsing
  58. bubble
  59. bunch
  60. buses
  61. buy
  62. buys
  63. called
  64. cambridge
  65. car
  66. card
  67. care
  68. cars
  69. categories
  70. change
  71. characteristics
  72. charities
  73. check
  74. checkers
  75. cheeky
  76. cherished
  77. child
  78. children
  79. choice
  80. chronic
  81. citizen
  82. citizens
  83. city
  84. cleaner
  85. close
  86. code
  87. codes
  88. collect
  89. collected
  90. combine
  91. commercial
  92. commercially
  93. companies
  94. company
  95. complete
  96. completely
  97. concerns
  98. connected
  99. connecting
  100. consequences
  101. considered
  102. contactless
  103. control
  104. conversations
  105. converted
  106. cook
  107. cookies
  108. corporation
  109. council
  110. coupon
  111. created
  112. credit
  113. creepy
  114. criminal
  115. critics
  116. crossing
  117. cuisines
  118. curious
  119. custody
  120. cycles
  121. daily
  122. data
  123. database
  124. date
  125. dates
  126. day
  127. days
  128. decade
  129. decided
  130. decisions
  131. deeper
  132. defining
  133. deleted
  134. demand
  135. demographic
  136. description
  137. designed
  138. desires
  139. detached
  140. detailed
  141. details
  142. develop
  143. devices
  144. dig
  145. digging
  146. digital
  147. dinner
  148. discount
  149. discovered
  150. discovery
  151. discussing
  152. disparate
  153. diy
  154. dollars
  155. donald
  156. download
  157. driver
  158. driving
  159. drugs
  160. easier
  161. east
  162. eat
  163. eavesdropping
  164. economy
  165. election
  166. email
  167. emarketer
  168. employed
  169. employee
  170. employees
  171. empowering
  172. endomondo
  173. energy
  174. enjoy
  175. estimate
  176. evenings
  177. exact
  178. examples
  179. exercise
  180. expensive
  181. experian
  182. exponentially
  183. extent
  184. eyeota
  185. facebook
  186. fact
  187. factor
  188. facts
  189. faculty
  190. fake
  191. families
  192. family
  193. favourite
  194. feeling
  195. feelings
  196. fertility
  197. fields
  198. finances
  199. financial
  200. find
  201. fitbit
  202. fitness
  203. flat
  204. flights
  205. food
  206. footprint
  207. form
  208. free
  209. freely
  210. fridays
  211. friends
  212. full
  213. furniture
  214. future
  215. gait
  216. gave
  217. gender
  218. give
  219. good
  220. google
  221. governments
  222. governor
  223. groceries
  224. group
  225. growing
  226. habits
  227. hairdresser
  228. hard
  229. harmless
  230. harvard
  231. health
  232. hear
  233. heard
  234. heart
  235. height
  236. held
  237. hermit
  238. high
  239. higher
  240. highest
  241. hit
  242. hold
  243. holidays
  244. home
  245. homes
  246. hope
  247. hospitalisations
  248. host
  249. hours
  250. house
  251. housemate
  252. housework
  253. hundreds
  254. husband
  255. idea
  256. identified
  257. identifiers
  258. identify
  259. illnesses
  260. immediately
  261. important
  262. include
  263. included
  264. including
  265. increase
  266. increases
  267. india
  268. indiscriminately
  269. individual
  270. individuals
  271. industry
  272. information
  273. inmobi
  274. instance
  275. institute
  276. insurance
  277. intended
  278. interact
  279. interested
  280. iphone
  281. irrelevant
  282. job
  283. journey
  284. kent
  285. knew
  286. knowing
  287. knowledge
  288. land
  289. landlord
  290. large
  291. latanya
  292. late
  293. laughter
  294. law
  295. laws
  296. lead
  297. learned
  298. left
  299. lets
  300. life
  301. lifestyle
  302. lifestyles
  303. line
  304. listening
  305. live
  306. lived
  307. lives
  308. living
  309. local
  310. location
  311. log
  312. logged
  313. logging
  314. london
  315. long
  316. longer
  317. lose
  318. lost
  319. lot
  320. machine
  321. magazine
  322. managed
  323. managing
  324. map
  325. maps
  326. marked
  327. massachusetts
  328. massive
  329. matter
  330. matters
  331. matures
  332. means
  333. measures
  334. media
  335. medical
  336. memorabilia
  337. men
  338. mexican
  339. million
  340. mimic
  341. minutes
  342. modern
  343. monday
  344. money
  345. months
  346. motley
  347. moves
  348. movies
  349. murky
  350. myth
  351. names
  352. national
  353. navigation
  354. needed
  355. neighbours
  356. netflix
  357. nhs
  358. night
  359. north
  360. november
  361. number
  362. obsessively
  363. obvious
  364. ocado
  365. offline
  366. online
  367. opened
  368. optimistic
  369. order
  370. ostensibly
  371. ovulation
  372. package
  373. paid
  374. part
  375. particulars
  376. parties
  377. passed
  378. patterns
  379. pay
  380. people
  381. performed
  382. permission
  383. permissions
  384. person
  385. personal
  386. personalised
  387. personalities
  388. phone
  389. photos
  390. piece
  391. pieces
  392. pinpointed
  393. planners
  394. plot
  395. pointless
  396. political
  397. poring
  398. post
  399. postcode
  400. postcodes
  401. posture
  402. pounds
  403. power
  404. precise
  405. predict
  406. predicted
  407. prefer
  408. pregnancy
  409. premiums
  410. prescribed
  411. presidential
  412. previously
  413. price
  414. prices
  415. privacy
  416. private
  417. probability
  418. procedures
  419. product
  420. professor
  421. profiler
  422. profiles
  423. programming
  424. prompting
  425. propaganda
  426. property
  427. protected
  428. protection
  429. proved
  430. pseudonymous
  431. pub
  432. public
  433. purchased
  434. put
  435. quiz
  436. rarely
  437. rate
  438. rating
  439. reaction
  440. read
  441. real
  442. realise
  443. realised
  444. realistic
  445. recklessly
  446. recognition
  447. reconsider
  448. record
  449. recording
  450. records
  451. registering
  452. registry
  453. reinforced
  454. release
  455. remains
  456. reports
  457. researchers
  458. researching
  459. responsible
  460. restaurant
  461. resulted
  462. reveal
  463. reviews
  464. risk
  465. ross
  466. safe
  467. salary
  468. samsung
  469. scheduled
  470. search
  471. security
  472. sedentary
  473. sell
  474. selling
  475. send
  476. sensitive
  477. serviced
  478. services
  479. set
  480. sets
  481. settings
  482. shared
  483. sharing
  484. shock
  485. shocked
  486. shopping
  487. shown
  488. signing
  489. simple
  490. sinking
  491. small
  492. smartphone
  493. snapchat
  494. social
  495. sold
  496. south
  497. spend
  498. spent
  499. spoke
  500. star
  501. start
  502. started
  503. starting
  504. startups
  505. state
  506. statement
  507. states
  508. stop
  509. stopped
  510. store
  511. street
  512. stripped
  513. suggested
  514. supermarket
  515. support
  516. supposedly
  517. sussex
  518. sustained
  519. swayed
  520. sweeney
  521. symptom
  522. symptoms
  523. system
  524. systems
  525. takeaways
  526. targeted
  527. targeting
  528. tasked
  529. tax
  530. taxis
  531. team
  532. tend
  533. term
  534. thai
  535. thinks
  536. thoughts
  537. thousands
  538. time
  539. times
  540. total
  541. traced
  542. track
  543. tracked
  544. trackers
  545. tracking
  546. traded
  547. transactions
  548. tripadvisor
  549. trump
  550. trusted
  551. trustworthy
  552. turn
  553. turned
  554. turns
  555. tv
  556. tvs
  557. twitter
  558. typed
  559. types
  560. uber
  561. uk
  562. ultimately
  563. uncover
  564. understand
  565. unexpected
  566. unique
  567. uniquely
  568. university
  569. unmasked
  570. unnecessary
  571. unrelated
  572. upper
  573. urban
  574. users
  575. video
  576. viewed
  577. views
  578. vip
  579. viral
  580. visit
  581. visited
  582. voice
  583. voter
  584. voting
  585. wars
  586. watch
  587. watched
  588. ways
  589. web
  590. webmd
  591. website
  592. wednesday
  593. week
  594. weeknights
  595. weeks
  596. weight
  597. weld
  598. west
  599. william
  600. winning
  601. wired
  602. woman
  603. words
  604. work
  605. working
  606. world
  607. write
  608. wrote
  609. yahoo
  610. year
  611. zip