The web as a corpus in multilingual term and knowledge processing (Nettet som korpus ved flersproglig term- og vidensbearbejdning)
Original version: SYNAPS - A Journal of Professional Communication 11 (2002), pp. 18-31
The web as a corpus for multilingual term and knowledge extraction has important potential in cases where a translator or knowledge engineer needs to identify unknown equivalents. With the expansion of knowledge and the parallel creation of new terms in practically all fields, the need to search for unknown equivalents arises more and more frequently. It is a major barrier to multilingual information exchange that relevant multi- or bilingual term repositories, and even monolingual resources, are usually updated long after new phenomena have been born and baptised. This paper discusses methods and tools for translation-oriented knowledge extraction from the uncontrolled mass of texts on the web. The main focus is on ways of identifying candidate target language terms in cases where no clues are at hand. Subsequently, smart searches may be used to validate the degree of equivalence between source terms and candidate equivalents. The proposed method is referred to as "multilingual term and knowledge extraction through clustered searches".