The purpose of a corporate semantic web is to let searchers find what authors have written. In both roles, as a searcher and as an author, people tend to express their knowledge and questions
- in inconsistent ways and
- in inherently variable natural language.
Moreover, neither the searchers' nor the authors' language can realistically be standardized, so one has to be semantically mapped to the other. This mapping requires background knowledge, because the meaning of a term is not contained in its representation. For example, there is nothing big about the letter sequence b-i-g, nor is there anything small about the sequence s-m-a-l-l.
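The following minimal Python sketch illustrates why representation alone does not carry meaning. It compares words purely by their characters, using `difflib.SequenceMatcher` from the standard library. The helper name `surface_similarity` and the word pairs are assumptions chosen for illustration only.

```python
from difflib import SequenceMatcher


def surface_similarity(a: str, b: str) -> float:
    """Return a character-level similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()


# "big" and "bag" look alike but mean unrelated things;
# "big" and "large" mean nearly the same but share almost no characters.
look_alike = surface_similarity("big", "bag")
same_meaning = surface_similarity("big", "large")

print(f"big/bag:   {look_alike:.2f}")
print(f"big/large: {same_meaning:.2f}")
```

The look-alike pair scores higher than the pair with the same meaning, so a purely string-based comparison would map the searcher's term to the wrong author's term. Bridging that gap is the job of the background knowledge.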
An internet search engine like Google can learn this background knowledge, a.k.a. "term semantics", using statistical methods. This approach is not feasible in the realm of corporate semantic webs, because
- there are far fewer documents with far fewer linkages,
- documents originate from far more diverse sources and
- there are far fewer users (and these users behave as mentioned above).
This is why, for the time being, term semantics should be declared manually in a corporate thesaurus. This calls for pragmatic, dynamic and periodic corporate terminology management, which is characterized by
- pragmatically accepting preliminary gaps, simplifications and inexactitudes,
- avoiding irrelevant modeling and
- correcting thesaurus items dynamically during periodic thesaurus reviews.
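A manually declared thesaurus of this kind could be sketched as follows. This is a minimal Python illustration under stated assumptions, not a reference implementation: the class names `Concept` and `Thesaurus`, the methods `declare` and `expand`, and the example terms are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus entry: a preferred term plus its known variants."""
    preferred: str
    synonyms: set[str] = field(default_factory=set)


class Thesaurus:
    """Hypothetical, hand-maintained mapping from terms to concepts."""

    def __init__(self) -> None:
        self._by_term: dict[str, Concept] = {}

    def declare(self, preferred: str, *synonyms: str) -> None:
        """Add a concept, or extend an existing one during a later review."""
        concept = self._by_term.get(preferred.lower())
        if concept is None:
            concept = Concept(preferred.lower())
            self._by_term[concept.preferred] = concept
        for term in synonyms:
            concept.synonyms.add(term.lower())
            self._by_term[term.lower()] = concept

    def expand(self, term: str) -> set[str]:
        """Return all terms that share a concept with `term`.

        Unknown terms are returned unchanged, so gaps in the thesaurus
        narrow the search instead of breaking it.
        """
        concept = self._by_term.get(term.lower())
        if concept is None:
            return {term.lower()}
        return {concept.preferred} | concept.synonyms


thesaurus = Thesaurus()
thesaurus.declare("invoice", "bill")                  # initial, incomplete entry
thesaurus.declare("invoice", "statement of charges")  # added in a later review

print(sorted(thesaurus.expand("Bill")))
print(sorted(thesaurus.expand("purchase order")))     # not modeled yet
```

Returning an unknown term unchanged reflects the pragmatic acceptance of preliminary gaps, and calling `declare` again for an existing concept corresponds to correcting an item during a periodic review.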