Summary: | With the remarkable growth of e-commerce transactions, shoppers increasingly base their purchase decisions on the experiences other people report in product and service reviews. Automatic analysis of such corpora requires advanced algorithms based on natural language processing and opinion mining. Moreover, linguistic differences make extending existing algorithms from one language to another challenging and, in some cases, impossible. Opinion mining covers several review-analysis tasks, such as spam detection, aspect elicitation, and polarity allocation. In this article, we focus on the detection of explicit aspects and propose a methodology for handling difficult, problematic aspect compounds that take a multi-word form in the Persian language. Our approach constructs a directed weighted graph (the ADG structure) from information yielded by the FP-Growth frequent pattern identification algorithm on our corpus of Persian sentences. Traversing particular paths within the ADG graph according to our developed rules leads to the extraction of the problematic multi-word aspects. We use the Neo4j NoSQL graph database and its Cypher query language to create the ADG graph and to retrieve the paths that reflect these rules. An evaluation of our methodology against existing approaches to aspect derivation in Persian, including ELDA, SAM, an MMI-based algorithm, and an LRT-based algorithm, indicates the robustness of our approach.
|
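The general idea of deriving multi-word candidates from a directed weighted graph can be illustrated with a minimal Python sketch. It is not the ADG construction or the rule set described in the summary: it substitutes plain adjacent-pair counting for FP-Growth, an in-memory dictionary for Neo4j, and a simple "follow every retained edge" traversal for the authors' rules. The function names `build_graph` and `multiword_candidates`, the `min_support` and `max_len` parameters, and the English toy tokens are all assumptions made for illustration.

```python
from collections import Counter


def build_graph(sentences, min_support=2):
    """Build a directed weighted graph from adjacent word pairs.

    Assumption: adjacent-pair counting stands in for FP-Growth here.
    An edge head -> tail is kept only if the pair occurs at least
    min_support times; its weight is that count.
    """
    counts = Counter()
    for tokens in sentences:
        counts.update(zip(tokens, tokens[1:]))
    graph = {}
    for (head, tail), weight in counts.items():
        if weight >= min_support:
            graph.setdefault(head, {})[tail] = weight
    return graph


def multiword_candidates(graph, max_len=3):
    """Enumerate simple paths of 2..max_len words along retained edges.

    Assumption: this traversal is a hypothetical placeholder for the
    path rules that the summary mentions but does not spell out.
    """
    found = set()

    def walk(path):
        if len(path) >= 2:
            found.add(" ".join(path))
        if len(path) == max_len:
            return
        for nxt in graph.get(path[-1], {}):
            if nxt not in path:  # avoid revisiting a word (no cycles)
                walk(path + [nxt])

    for start in graph:
        walk([start])
    return sorted(found)


# Toy, already-tokenized review fragments (hypothetical data).
sentences = [
    ["battery", "life", "good"],
    ["battery", "life", "short"],
    ["screen", "quality", "good"],
    ["screen", "quality", "poor"],
    ["price", "good"],
]
graph = build_graph(sentences)
candidates = multiword_candidates(graph)
print(candidates)
```

On this toy input only the pairs that recur across sentences survive the support threshold, so the traversal returns the two-word candidates rather than one-off pairings with opinion words.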