Summary: | A growing number of people rely on social media to get the latest news and post their views. As a result, some opinion leaders attract large numbers of followers. Because their social accounts carry significant influence, some of them begin to place native advertisements in their articles; such articles are generally known as content marketing articles. However, content marketing articles tend to go viral because of a lack of supervision. Some of them, for instance, contain misleading information that can seriously harm the interests of ordinary consumers. In this paper, we take the initiative to address this problem and propose a fundamental approach for detecting content marketing articles based on semantic features. Drawing on the characteristics of content marketing articles, we then propose a novel approach that enhances detection through sentence graph and word graph analysis. We extract graph-related and community-related features from the two types of graphs, respectively. We then train a supervised classifier on a manually labeled dataset and evaluate its effectiveness through extensive experiments. The results show that combining the different kinds of features significantly improves detection accuracy and recall. In addition, we develop an algorithm that extracts the advertising content from a detected content marketing article, with the aim of helping remove illegal advertisements from social platforms. Finally, we analyze the writing patterns of content marketing articles on WeChat Subscription and report some interesting findings.
|