Les caractéristiques de la terminologie des sciences relatives à la famille du point de vue de l’extraction terminologique

Bibliographic Details
Main Author: Ágoston Nagy
Format: Article
Language: ces
Published: Ostium 2013-12-01
Series: Ostium
Subjects: Terminology; Term Extraction; Term; Syntactic Pattern of Terms; Prepositional Phrase
Online Access: http://www.ostium.sk/index.php?mod=magazine&act=show&aid=514
author Ágoston Nagy
collection DOAJ
description Nagy, Á.: The Characteristics of Terminology of Sciences in relation to Family from point of view of Terminological Extraction. According to Eugen Wüster, terms are lexical units that belong to a scientific domain, where each is connected to the concept it denotes; terms must therefore have a precise definition. In term extraction, terms are recognised mainly by their morphosyntactic patterns: for example, noun+noun is a typical term pattern in French (e.g. navigateur web). One aim of this article is to identify the typical term patterns and their frequencies in the social sciences. To this end, three social science articles were chosen as the corpus, on the criterion that they frequently use the words famille 'family' and/or individu 'individual'. All terms in the three articles were annotated manually. A second aim is to compare the frequencies of term patterns in the social sciences with the results of previous research on the terms of a computer science corpus. A further aim is to determine whether an automatic term extractor fine-tuned for computer science texts could also be used on a social science corpus. To that end, problematic patterns, such as adjectives preceding the nominal head of a term, are also examined. The results show that the IT corpus and the human sciences corpus follow the same tendency; however, juxtaposed nouns are less frequent in the latter, which prefers the noun-adjective sequence. As for the problematic patterns, the two corpora show no important differences: such patterns are rare in both (~7%). The same rule-based extractor could therefore work well on both corpora; however, psychological and sociological terms occur more often in everyday language, which makes statistical filtering more difficult.
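The pattern-based recognition described in the abstract can be illustrated with a minimal sketch. It is not the extractor used in the article: the pattern inventory, the tag names, the function names, and the hand-tagged toy sentence (including the phrase famille nucléaire) are all assumptions made for illustration, and a real pipeline would obtain the tags from a part-of-speech tagger.

```python
from collections import Counter

# Hypothetical pattern inventory; the article's actual list is not given
# in this record. ADJ+N stands in for the "problematic" pattern in which
# an adjective precedes the nominal head.
PATTERNS = {
    "N+N": ("NOUN", "NOUN"),
    "N+ADJ": ("NOUN", "ADJ"),
    "ADJ+N": ("ADJ", "NOUN"),
    "N+PREP+N": ("NOUN", "PREP", "NOUN"),
}


def extract_candidates(tagged):
    """Return (pattern_name, phrase) for each tag window matching a pattern."""
    tags = [tag for _, tag in tagged]
    found = []
    for name, pattern in PATTERNS.items():
        n = len(pattern)
        for i in range(len(tagged) - n + 1):
            if tuple(tags[i:i + n]) == pattern:
                phrase = " ".join(word for word, _ in tagged[i:i + n])
                found.append((name, phrase))
    return found


def pattern_frequencies(candidates):
    """Relative frequency of each pattern among the extracted candidates."""
    counts = Counter(name for name, _ in candidates)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Hand-tagged toy input (assumed tags, not corpus data from the article).
tagged = [
    ("le", "DET"), ("navigateur", "NOUN"), ("web", "NOUN"),
    ("et", "CONJ"),
    ("la", "DET"), ("famille", "NOUN"), ("nucléaire", "ADJ"),
]

candidates = extract_candidates(tagged)
frequencies = pattern_frequencies(candidates)
```

Comparing two corpora then amounts to computing `pattern_frequencies` for each and comparing the shares per pattern. The statistical filtering step mentioned in the abstract is not covered by this sketch.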
first_indexed 2024-04-12T19:33:56Z
format Article
id doaj.art-391efa3e5987448eb162757625b93648
institution Directory Open Access Journal
issn 1336-6556
language ces
last_indexed 2024-04-12T19:33:56Z
publishDate 2013-12-01
publisher Ostium
record_format Article
series Ostium
spelling doaj.art-391efa3e5987448eb162757625b93648; 2022-12-22T03:19:16Z; ces; Ostium; Ostium; 1336-6556; 2013-12-01; 94; Les caractéristiques de la terminologie des sciences relatives à la famille du point de vue de l’extraction terminologique; Ágoston Nagy (0); Szegedi Tudományegyetem, Francia Nyelvi és Irodalmi Tanszék, Szeged, Hungary; http://www.ostium.sk/index.php?mod=magazine&act=show&aid=514; Terminology; Term Extraction; Term; Syntactic Pattern of Terms; Prepositional Phrase
title Les caractéristiques de la terminologie des sciences relatives à la famille du point de vue de l’extraction terminologique
topic Terminology
Term Extraction
Term
Syntactic Pattern of Terms
Prepositional Phrase
url http://www.ostium.sk/index.php?mod=magazine&act=show&aid=514