A text annotation method based on semantic sequences
This paper presents a text annotation method based on semantic sequences for labelling a single document or a cluster of documents. The basic idea underlying the semantic sequence approach is to find locally frequent meanings, using an ontology such as WordNet, and to use them as the labels of a document. The ontology is also used to measure the semantic similarity between labels, which in turn indicates the similarity between documents. Further, a text clustering method based on four natural rules is introduced to cluster documents and label each cluster. Unlike partitioning clustering methods, this method does not require a pre-defined number of clusters, and unlike hierarchical clustering methods, it avoids the need to set appropriate levels.
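The idea of labelling a document with locally frequent meanings can be illustrated with a minimal sketch. The abstract does not give the paper's definitions, so everything here is an assumption made for illustration: the toy dictionary `TOY_LEXICON` stands in for an ontology such as WordNet, "locally frequent" is taken to mean "occurring at least `min_count` times inside some sliding window of `window` tokens", and label similarity is approximated by Jaccard overlap rather than an ontology-based measure. The function names are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for an ontology such as WordNet:
# each surface word maps to one coarse "meaning" identifier.
TOY_LEXICON = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "bank": "finance", "loan": "finance", "money": "finance",
    "river": "water", "stream": "water",
}


def locally_frequent_meanings(tokens, window=5, min_count=2):
    """Return the meanings that occur at least min_count times in some window.

    Assumption: this windowed count is only one plausible reading of
    "locally frequent"; the paper may define it differently.
    """
    # Map every token to its meaning; unknown words become None.
    meanings = [TOY_LEXICON.get(t.lower()) for t in tokens]
    labels = set()
    # Slide a fixed-size window over the document (at least one window).
    for start in range(max(1, len(meanings) - window + 1)):
        in_window = meanings[start:start + window]
        counts = Counter(m for m in in_window if m is not None)
        labels.update(m for m, c in counts.items() if c >= min_count)
    return labels


def label_similarity(labels_a, labels_b):
    """Jaccard overlap of two label sets (an assumed stand-in measure)."""
    if not labels_a and not labels_b:
        return 0.0
    return len(labels_a & labels_b) / len(labels_a | labels_b)


doc1 = "the car passed a truck near the bank".split()
doc2 = "an automobile and a truck crossed the river".split()

labels1 = locally_frequent_meanings(doc1)
labels2 = locally_frequent_meanings(doc2)
print(labels1, labels2, label_similarity(labels1, labels2))
```

Because "car", "automobile" and "truck" all map to the same meaning, the two example documents receive the same label even though they share only one content word. That is the kind of effect a meaning-based label is intended to capture; a real implementation would replace the toy lexicon with ontology lookups and would need word sense disambiguation.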
Item Type | Article |
---|---|
Uncontrolled Keywords | semantic sequences; text annotation; WordNet; clustering |
Date Deposited | 14 Nov 2024 10:45 |
Last Modified | 14 Nov 2024 10:45 |