The web as a corpus for multilingual term and knowledge processing

2011

SYNAPS - A Journal of Professional Communication 11(2002) pp.18-31

The web as a corpus for multilingual term and knowledge extraction has considerable potential in cases where a translator or knowledge engineer needs to identify unknown equivalents. With the expansion of knowledge and the parallel creation of new terms in practically all fields, the need to search for unknown equivalents arises more and more frequently. It is a major barrier to multilingual information exchange that relevant multilingual or bilingual term repositories, and even monolingual resources, are usually updated long after new phenomena have been born and baptised. This paper discusses methods and tools for translation-oriented knowledge extraction from the uncontrolled mass of texts on the web. The main focus is on ways of identifying candidate target language terms in cases where no clues are at hand. Subsequently, smart searches may be used to validate the degree of equivalence between source terms and candidate equivalents. The proposed method is referred to as "multilingual term and knowledge extraction through clustered searches".

NHH
