Computer Program Spots Fake Product Reviews
Really worth five stars? Computer scientists have developed a mathematical way to spot fake reviews on sites such as Amazon or TripAdvisor.
A new computer program can spot fake online product reviews by determining whether the distribution of one-star through five-star scores appears natural or not.
A graph of an animal population in the wild often has a distinctive shape, with the animal’s numbers rising and falling in cycles in response to how many predators there are around or how much food there is to eat.
One team of computer scientists applied this insight to online reviews. When they graphed the number of one-star through five-star reviews on different review sites, they found distinctive graph shapes. Fake reviews knock those shapes askew, they said, which lets them mathematically scan for what may be paid-for positive feedback.
Companies often pay people to write glowing reviews, as many news stories have reported, because a high average star rating online is so good for business. At the same time, review sites are on constant lookout for ways to weed out the fakers, to maintain the trustworthiness of their sites.
A group of computer scientists at Stony Brook University in New York analyzed reviews for almost 4,000 hotels on TripAdvisor and more than 700,000 products on Amazon. They graphed how the reviews were distributed among different star ratings, then looked for unusual shapes.
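The paper's actual statistical models aren't described in this article, but the basic idea can be sketched with a short, hypothetical example: compare a product's star-rating histogram to the site-wide shape and flag products whose distributions deviate sharply. All names and the distance measure below are illustrative assumptions, not the researchers' method.

```python
from collections import Counter

def star_distribution(ratings):
    """Return the fraction of reviews at each star level, 1 through 5."""
    counts = Counter(ratings)
    total = len(ratings)
    return [counts.get(star, 0) / total for star in range(1, 6)]

def distribution_distance(product_ratings, site_ratings):
    """Sum of squared differences between a product's rating shape and the
    site-wide shape; a larger value means a more unusual-looking histogram."""
    p = star_distribution(product_ratings)
    q = star_distribution(site_ratings)
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

# A product whose reviews are almost all five-star looks "askew"
# next to a more typical site-wide mix of ratings.
site_wide = [5, 4, 5, 3, 4, 5, 2, 4, 5, 1, 3, 5, 4]
suspicious = [5, 5, 5, 5, 5, 5, 5, 4]
ordinary = [5, 4, 3, 5, 4, 2, 5, 4]

print(distribution_distance(suspicious, site_wide))  # relatively large
print(distribution_distance(ordinary, site_wide))    # relatively small
```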
For example, some of their graphs showed that one-time reviewers are more likely to give out five-star reviews than people who have written many reviews. That’s not necessarily suspicious, as many one-time reviewers might have logged on because they had such a great time that they were motivated to write a review when they normally wouldn’t.
However, compared with products with lower average ratings, products with the highest average star ratings had more five-star reviews from one-time reviewers than from regular reviewers. That is, the highest-rated products had a disproportionate number of people who had apparently logged on only once, to give a rave review. That suggests that some of the highest-rated products are padded with fake reviews, the researchers wrote in their paper, which they will present June 5 at a social media conference held by the Association for the Advancement of Artificial Intelligence.
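The article doesn't give the researchers' formula, but the signal it describes could be approximated with a simple, assumed metric: the share of a product's five-star reviews written by accounts that have only one review. The function name and data format below are hypothetical.

```python
from collections import Counter

def one_timer_five_star_share(reviews):
    """reviews: list of (reviewer_id, stars) tuples for one product.
    Returns the fraction of five-star reviews written by reviewers who
    appear only once in this data set. (In practice, reviewer activity
    would be counted across the whole site, not just one product.)"""
    review_counts = Counter(reviewer for reviewer, _ in reviews)
    five_star = [reviewer for reviewer, stars in reviews if stars == 5]
    if not five_star:
        return 0.0
    from_one_timers = sum(1 for reviewer in five_star if review_counts[reviewer] == 1)
    return from_one_timers / len(five_star)

# A product where nearly every rave comes from a single-review account
# scores close to 1.0; a more balanced product scores lower.
reviews = [("a1", 5), ("a2", 5), ("a3", 5), ("b7", 4), ("b7", 5)]
print(one_timer_five_star_share(reviews))  # 0.75
```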
Using what they learned from their graphs of how reviews are distributed, the researchers developed three mathematical ways to ferret out fakers. Their best method was 72 percent accurate at identifying non-genuine reviews. That makes their method less accurate than a computer program developed last year at Cornell University, which spotted fakers with almost 90 percent accuracy.
This new method could complement the Cornell method, the Stony Brook researchers wrote. The Cornell team's paper pointed out some of the linguistic quirks of fakers, such as their overuse of the pronoun "I" and phrases such as "my husband." (There's nothing stopping fakers from learning to avoid those mistakes, however.)
The new method is also easier to apply to many different sites than the Cornell method, which required hiring people to write fake reviews, creating a database their program could scan for patterns. The Stony Brook method doesn’t need a database of fake reviews, only data on how reviews are distributed.
Fake reviews are nearly impossible to tell apart from real ones. People perform only slightly better than chance at finding imposters, the Cornell team discovered in their 2011 study. Yet it appears the hired hands leave fingerprints behind, not only in their words, but in their very numbers.