Extraction of Phrasal Verbs from the Comparable English Corpus of Legal Texts

Bibliographic Details
Main Authors: Marija Bilić, Angelina Gaspar
Format: Article
Language: English
Published: Lasting Impressions Press 2018-06-01
Series: International Journal of English Language and Translation Studies
Online Access: http://www.eltsjournal.org/archive/value6%20issue2/21-6-2-18.pdf
Description
Summary: This paper presents a corpus-based approach to the semi-automatic extraction of English phrasal verbs, which are highly productive but complex and often non-transparent lexical units. Extraction proceeds via the particles (prepositions, adverbs) that phrasal verbs contain, which are among the top-ranking function words in the word list of the British National Corpus (BNC). The research is carried out on a comparable English corpus of publicly available legal texts consisting of 392 255 words, using WordSmith Tools 6.0. System efficiency is evaluated with the statistical measures of Precision, Recall and F-measure, and the list of phrasal verbs is checked against a reference source, the Cambridge Phrasal Verbs Dictionary (2015). The results show that semi-automatic extraction of phrasal verbs requires considerable human intervention, as well as verification via their verbal segments, since the process revealed instances of incorrect phrasal verb usage. Furthermore, the results point to the low frequency of phrasal verbs in legal texts, since they account for only 2% of the total number of words, and to their unequal distribution, since the 5 most frequent phrasal verbs account for nearly half, and the 25 most frequent for more than 90%, of all such items. Finally, a tendency towards nominalisation of phrasal verbs, which is in line with the nature of legal language, is evident, especially in the texts originally written in English.
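The particle-based extraction and its evaluation described in the summary can be illustrated with a minimal sketch. It is not the authors' WordSmith Tools workflow: the particle list, the function names `candidates` and `precision_recall_f1`, the sample sentence and the small reference set are all assumptions made for illustration. Precision, Recall and F-measure follow their standard definitions.

```python
import re

# Hypothetical particle list for illustration only; the paper selects its
# particles from the top-ranking function words of the BNC word list.
PARTICLES = {"up", "out", "off", "on", "in", "down", "over"}


def candidates(text, particles=PARTICLES):
    """Return (word, particle) pairs in which a particle directly follows a word.

    This deliberately over-generates: any word followed by a particle is a
    candidate, so non-verbs such as ("committee", "in") are also returned
    and would need manual checking.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return {
        (word, nxt)
        for word, nxt in zip(tokens, tokens[1:])
        if nxt in particles and word not in particles
    }


def precision_recall_f1(extracted, reference):
    """Standard Precision, Recall and F-measure over two sets of items."""
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Invented sample sentence and reference entries, not data from the paper.
    sample = ("The parties shall carry out the agreement "
              "and set up a committee in Brussels.")
    reference = {("carry", "out"), ("set", "up"), ("draw", "up")}
    found = candidates(sample)
    print(sorted(found))
    print(precision_recall_f1(found, reference))
```

In this toy run the false candidate ("committee", "in") lowers Precision and the unseen reference entry ("draw", "up") lowers Recall, which mirrors the summary's point that particle-driven extraction needs human checking of the verbal segment.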
ISSN: 2308-5460