Lingua::YaTeA

Perl extension for extracting terms from a corpus and providing a syntactic analysis in a head-modifier format.

Lingua::YaTeA Ranking & Summary

  • License: Perl Artistic License
  • Price: FREE
  • Publisher Name: Thierry Hamon
  • Publisher web site: http://search.cpan.org/~thhamon/

Lingua::YaTeA Description

Lingua::YaTeA is a Perl extension for extracting terms from a corpus and providing a syntactic analysis in a head-modifier format.

SYNOPSIS

    use Lingua::YaTeA::YaTeA;

    my %config = Lingua::YaTeA::load_config($rcfile);
    $yatea = Lingua::YaTeA->new($config{"OPTIONS"}, \%config);
    $corpus = Lingua::YaTeA::Corpus->new($corpus_path, $yatea->getOptionSet, $yatea->getMessageSet);
    $yatea->termExtraction($corpus);

This module is the main module of the YaTeA software. It aims to extract noun phrases that look like terms from a corpus, and it provides their syntactic analysis in a head-modifier format.

As input, the term extractor requires a corpus that has been segmented into words and sentences, lemmatized, and tagged with part-of-speech (POS) information. The implementation is designed to process large corpora. The data provided with YaTeA support term extraction from English and French texts, and new linguistic features can be integrated to extract terms from other languages. Linguistic features can also be modified or created for a sub-language or a tagset.

The main strategy for analyzing term candidates relies on simple parsing patterns and endogenous disambiguation. Exogenous disambiguation is also possible: external resources, i.e. lists of testified terms, can be used to identify and analyze term candidates.

Requirements:

  • Perl
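As a rough illustration of the kind of input and analysis described above, the following standalone Perl sketch reads a small POS-tagged, lemmatized sample and applies a single simple parsing pattern (a modifier directly followed by a noun) to produce head-modifier pairs. It does not use or reproduce YaTeA's own code. The tab-separated "word, POS, lemma" layout, the Penn Treebank style tag names, and the helper names parse_tokens and extract_candidates are all assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical input: one token per line, as "word<TAB>POS<TAB>lemma".
# The layout and the tag names (NN, NNS, JJ, SENT) are assumptions.
my $tagged = join "\n",
    "Gene\tNN\tgene",
    "expression\tNN\texpression",
    "is\tVBZ\tbe",
    "regulated\tVBN\tregulate",
    "by\tIN\tby",
    "transcription\tNN\ttranscription",
    "factors\tNNS\tfactor",
    ".\tSENT\t.";

# Split the tagged text into token records (word, POS tag, lemma).
sub parse_tokens {
    my ($text) = @_;
    my @tokens;
    for my $line (split /\n/, $text) {
        my ($word, $pos, $lemma) = split /\t/, $line;
        next unless defined $lemma;    # skip malformed lines
        push @tokens, { word => $word, pos => $pos, lemma => $lemma };
    }
    return @tokens;
}

# Apply one toy pattern: an adjective or noun directly followed by a noun.
# In this English pattern the second token is taken as the head.
sub extract_candidates {
    my (@tokens) = @_;
    my @candidates;
    for my $i (0 .. $#tokens - 1) {
        my ($mod, $head) = @tokens[$i, $i + 1];
        next unless $mod->{pos} =~ /^(?:JJ|NN)/ && $head->{pos} =~ /^NN/;
        push @candidates, { head => $head->{lemma}, modifier => $mod->{lemma} };
    }
    return @candidates;
}

my @candidates = extract_candidates(parse_tokens($tagged));

# Print each candidate with its head and modifier lemmas.
printf "%s %s (head: %s, modifier: %s)\n",
    $_->{modifier}, $_->{head}, $_->{head}, $_->{modifier}
    for @candidates;
```

A real extractor would use a much larger pattern set and would disambiguate between competing analyses of longer noun phrases; this sketch only shows what a head-modifier pair looks like for two-word candidates.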

