Summary: | Many customers rely on online reviews to make an informed decision about purchasing products and services. Unfortunately, fake reviews, which can mislead customers, are increasingly common. Therefore, there is a growing need for effective methods of detection. In this article, we present a case study showing research aimed at recognizing fake reviews in Google Maps places in Poland. First, we describe a method of construction and validation of a dataset, named GMR–PL (Google Maps Reviews—Polish), containing a selection of 18 thousand fake and genuine reviews in Polish. Next, we show how we used this dataset to train machine learning models to detect fake reviews and the accounts that published them. We also propose a novel metric for measuring the typicality of an account name and a metric for measuring the geographical dispersion of reviewed places. Initial recognition results were promising: we achieved an F1 score of <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mn>0.92</mn></mrow></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mrow><mn>0.74</mn></mrow></semantics></math></inline-formula> when detecting fake accounts and reviews, respectively. We believe that our experience will help in creating real-life review datasets for other languages and, in turn, will help in research aimed at the detection of fake reviews on the Internet.
|