Twitter Data Reveal Habits that Make You Sick
The latest update to GermTracker, a model of disease outbreaks based entirely on Twitter data, gives a detailed look into how people's habits correlate with their likelihood of getting sick.
CREDIT: University of Rochester
A new computer model uses only your tweets to predict, with 91 percent accuracy, whether you'll be sick in the next week, say its creators. The model, called GermTracker, finds correlations between people's lifestyles and personalities and their likelihood of getting sick at any time, all without the people in the study submitting any surveys, or even knowing they're part of a study.
GermTracker's creators, two University of Rochester computer scientists named Adam Sadilek and Henry Kautz, first introduced the model last year. They've since added far more detail, allowing them to link health to things like a person's popularity, neighborhood quality and socioeconomic status.
"If you want to know, down to the individual level, how many people are sick in a population, you would have to survey the population, which is costly and time-consuming," Sadilek said in a statement. "Twitter and the technology we have developed allow us to do this passively, quickly and inexpensively."
In the future, GermTracker could supplement more traditional flu-tracking efforts, such as the U.S. Centers for Disease Control and Prevention's estimates, Sadilek and Kautz wrote in a paper they presented today (Feb. 8) at a conference hosted by the Association for Computing Machinery. Researchers could also tweak the computer model to help companies predict customer behavior and preferences, Sadilek explains on his website.
How does it know all that?
Over the past few years, researchers have realized that people's digital habits can reveal a great deal about their physical health. For example, Google's Flu Trends estimates how many people around the world have the illness, based on Google searches for flu-related terms. Meanwhile, several research groups have identified disease outbreaks and trends using Twitter.
What's new about GermTracker is its detailed look at individual people's lives. GermTracker figured out who's been sick and for how long, and it found what kinds of behaviors and personal attributes correlated with getting sick more often. The researchers tracked 6,237 people in New York City in this way, testing whether 70 different factors corresponded to people's illness rates.
To find who was ill at any given time, GermTracker automatically "read" tweets. The model didn't just look for keywords, either; it distinguished between relevant phrases such as "I've been sick and stuck in bed all day" and unrelated content like "I'm so sick of this traffic."
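To make the difference between keyword matching and reading in context concrete, here is a minimal sketch in Python. It is not GermTracker's actual method, which the researchers' paper describes; the pattern lists and function names below are hypothetical and were invented only for this illustration.

```python
import re

# Hypothetical patterns, invented for this illustration only.
# They are NOT taken from GermTracker.
FIGURATIVE = re.compile(r"\bsick (of|and tired of)\b")
HEALTH_CUES = re.compile(r"\b(in bed|fever|flu|cough|headache|doctor)\b")


def keyword_match(tweet: str) -> bool:
    """Naive approach: flag any tweet containing the word 'sick'."""
    return re.search(r"\bsick\b", tweet.lower()) is not None


def context_match(tweet: str) -> bool:
    """Toy context check: require 'sick', reject figurative uses,
    and require at least one health-related cue nearby."""
    text = tweet.lower()
    if not keyword_match(text):
        return False
    if FIGURATIVE.search(text):
        return False
    return HEALTH_CUES.search(text) is not None


ill = "I've been sick and stuck in bed all day"
annoyed = "I'm so sick of this traffic."

# The keyword approach flags both tweets; the context check flags only the first.
print(keyword_match(ill), keyword_match(annoyed))
print(context_match(ill), context_match(annoyed))
```

A hand-written rule list like this breaks down quickly on real tweets, which is one reason a system that learns from labeled examples is more practical at scale.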
As for people's habits, tweets provided a wealth of information. The location data in tweets showed GermTracker what places people visited, when they rode the subway and how often they encountered sick Twitter users. The system found that regular gym-goers got sick slightly more often, for example, though those who talked about going to the gym but never went got sick even more.
GermTracker inferred which Twitter users had higher social status by seeing which Twitter posts were retweeted and favorited most often. The model then linked that popularity to getting sick less often.
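As a rough illustration of what it means to correlate a factor such as popularity with illness, the Python sketch below computes a Pearson correlation coefficient. The numbers are made up for the example and are not data from the study, and the article does not say which statistical measure the researchers used, so Pearson correlation is an assumption here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up values for five hypothetical users (NOT study data):
# average retweets per tweet, and days each user tweeted about being ill.
retweets_per_tweet = [0.1, 0.5, 1.0, 2.0, 4.0]
sick_days = [6, 5, 4, 3, 1]

r = pearson(retweets_per_tweet, sick_days)
# A negative r means more popular users in this toy sample reported fewer sick days.
print(f"r = {r:.2f}")
```

A coefficient near -1 or +1 indicates a strong linear relationship, while a value near 0 indicates little or none. In either case, a correlation alone does not show that popularity causes better health.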
Sadilek and Kautz also combined the locations of tweets with a map of polluted areas in New York to see if people living near pollution were more likely to get sick (they were).
There are a couple of major limits to what GermTracker can do. For one, Twitter users aren't a representative sample of the population. Younger people and minorities are more likely to be active on Twitter, so a Twitter-derived model of illness may apply better to those people than to the overall population.
In addition, the model doesn't directly measure when people are sick, only when they tweet about being sick. Some people never tweet about illness, while others may say they're ill when a doctor wouldn't agree, making the model potentially inaccurate.
Still, GermTracker could be an important supplement to other disease models and studies aiming to discover risk factors for falling ill, Sadilek and Kautz wrote. Next, the researchers plan to work with the University of Rochester's medical center to study health on Twitter in more depth.