
dc.contributor.author: Andersen, Gisle
dc.date.accessioned: 2016-06-24T07:42:30Z
dc.date.available: 2016-06-24T07:42:30Z
dc.date.issued: 2011
dc.identifier.citation: SYNAPS - A Journal of Professional Communication 26(2013) [nb_NO]
dc.identifier.issn: 1893-0506
dc.identifier.uri: http://hdl.handle.net/11250/2393973
dc.description.abstract: Multiword expressions are words that co-occur so often that they are perceived as a linguistic unit (Stubbs 2007). Identifying them correctly is important for a variety of tasks within terminology, lexicography and language technology. This paper presents a methodology for the systematic and corpus-driven study of multiword expressions in Norwegian. It reports on a series of experiments using a variety of different association measures in order to identify multiword expressions that occur in a large corpus consisting of Norwegian newspapers (Andersen & Hofland forthcoming). The output of each association measure is a ranked list of bigrams and trigrams in the corpus. The value of different association measures for terminology purposes is assessed by considering the relevance and salience of ranked candidates among the bigrams and trigrams in the data. It is shown that the association measures differ greatly in their ability to pick out relevant term candidates. The paper also briefly evaluates the corpus itself and its relevance for terminology work (Kristiansen Forthcoming). [nb_NO]
dc.language.iso: eng [nb_NO]
dc.publisher: NHH [nb_NO]
dc.title: Evaluation of alternative association measures for extraction of terminology based on a large Norwegian corpus [nb_NO]
dc.type: Journal article [nb_NO]
dc.source.pagenumber: 62-68 [nb_NO]
dc.source.volume: 26 [nb_NO]
dc.source.journal: SYNAPS - A Journal of Professional Communication [nb_NO]

