The web as a corpus for multilingual term and knowledge processing
Original version: SYNAPS - A Journal of Professional Communication 11 (2002), pp. 18-31
Abstract
The web as a corpus for multilingual term and knowledge extraction has considerable
potential in cases where a translator or knowledge engineer needs to identify unknown
equivalents. With the expansion of knowledge and parallel creation of new terms in
practically all fields, the need to search for unknown equivalents occurs more and more
frequently. It is a major barrier to multilingual information exchange that relevant multi- or
bilingual term repositories and even monolingual resources are usually updated long after new
phenomena have been born and baptised. This paper discusses methods and tools for
translation-oriented knowledge extraction from the uncontrolled mass of texts on the web.
The main focus will be on ways of identifying candidate target language terms in cases where
no clues are at hand. Subsequently, smart searches may be used to validate the degree of
equivalence between source terms and candidate equivalents. The proposed method is
referred to as "multilingual term and knowledge extraction through clustered searches".