DYNAMIC CLUSTERING
CAAT’s powerful Dynamic Clustering is an automatic operation that uses CAAT’s conceptual analytics to identify and cluster documents with conceptually similar content. Dynamic Clustering does not rely on previously identified topics or keywords. Instead, CAAT determines the conceptual content of each document in an index, identifies which documents are conceptually similar, groups them into clusters, and finally creates a name for each cluster that is representative of the documents it contains. CAAT gains all the knowledge it needs to accurately identify conceptually similar documents, and the resulting clusters, from the group of documents being indexed.
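The sequence described above (characterize each document, group similar ones, name each group) can be illustrated with a minimal sketch. This is not CAAT's method: CAAT's conceptual analytics are not described here, so the sketch substitutes plain word counts and cosine similarity as a crude stand-in, and every function name and the `threshold` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercased word counts; a crude stand-in for 'conceptual content'."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_documents(docs, threshold=0.3):
    """Greedy single pass: join the most similar cluster or start a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for i, text in enumerate(docs):
        vec = term_vector(text)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)  # fold the document into the cluster
            best["members"].append(i)
    return clusters


def cluster_name(cluster, n_terms=2):
    """Name a cluster after its most frequent terms."""
    return " ".join(t for t, _ in cluster["centroid"].most_common(n_terms))


docs = [
    "merger agreement draft merger terms",
    "merger agreement signed merger closing",
    "lunch menu pizza lunch order",
    "pizza lunch friday lunch",
]
clusters = cluster_documents(docs)
members = [c["members"] for c in clusters]  # [[0, 1], [2, 3]]
names = [cluster_name(c) for c in clusters]  # ["merger agreement", "lunch pizza"]
```

No topic list or keyword list is supplied in the sketch: the clusters and their names come only from the documents themselves, which is the property the description above emphasizes.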
Dynamic Clustering provides a number of settings for users, including controls for how tightly or how generally documents are clustered and for how many levels (i.e., sub-clusters) are created based on conceptual content. CAAT also places documents that fail to reach a specified threshold of conceptual similarity into a separate cluster for further analysis. Several options for automatic title generation let users control how CAAT identifies and names the clusters it creates.
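To make the effect of these settings concrete, here is a small sketch of how a similarity threshold and a level limit might interact. The setting names `min_similarity` and `max_levels`, the `"unclustered"` label, and the sample scores are all hypothetical illustrations, not CAAT option names or CAAT output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ClusterSettings:
    # Hypothetical names; not CAAT's actual option names.
    min_similarity: float = 0.5  # higher value = tighter clusters
    max_levels: int = 2          # 1 = top-level clusters only


UNCLUSTERED = ("unclustered",)


def place(level_scores, settings):
    """Return the cluster path for one document.

    level_scores lists (label, similarity) pairs from the top level down.
    """
    path = []
    for label, sim in level_scores[: settings.max_levels]:
        if sim < settings.min_similarity:
            break  # not similar enough to descend any further
        path.append(label)
    return tuple(path) or UNCLUSTERED


def group(scores, settings):
    """Group document ids by the cluster path each one is placed in."""
    groups = defaultdict(list)
    for doc_id, level_scores in scores.items():
        groups[place(level_scores, settings)].append(doc_id)
    return dict(groups)


# Made-up similarity scores for three documents at two levels.
scores = {
    "doc1": [("contracts", 0.9), ("mergers", 0.8)],
    "doc2": [("contracts", 0.7), ("leases", 0.4)],
    "doc3": [("contracts", 0.2), ("leases", 0.1)],
}
loose = group(scores, ClusterSettings(min_similarity=0.3, max_levels=2))
tight = group(scores, ClusterSettings(min_similarity=0.6, max_levels=1))
```

With the loose settings, `doc1` and `doc2` land in separate sub-clusters under `contracts`; with the tight, single-level settings they share one top-level cluster. In both cases `doc3` falls below the threshold and is set aside in the separate bucket.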
CAAT’s powerful collection of filters ensures that “noise” such as headers and footers, email metadata, or even OCR errors does not distort CAAT’s ability to discern conceptual meaning and accurately cluster documents. Users can also provide “stop” and “go” words to manually identify terminology that they want to be either ignored or specially treated during the clustering process (an example would be “spam”, which a number of email solutions add to the titles of bulk mail).
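As a rough illustration of stop and go words, the sketch below drops stop words before analysis and gives go words extra weight. How CAAT actually treats go words is not specified here; repeating them is only one assumed interpretation of "specially treated", and the `tokenize` function and its `go_weight` parameter are hypothetical.

```python
import re


def tokenize(text, stop_words=(), go_words=(), go_weight=3):
    """Split text into lowercase tokens, applying stop and go word lists.

    Stop words are dropped. Go words are repeated go_weight times so that
    they count more heavily (an assumed treatment, not CAAT's documented one).
    """
    stop = {w.lower() for w in stop_words}
    go = {w.lower() for w in go_words}
    tokens = []
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        if word in stop:
            continue  # ignored entirely during clustering
        tokens.extend([word] * (go_weight if word in go else 1))
    return tokens


subject = "[SPAM] Quarterly budget review"
tokens = tokenize(subject, stop_words=["spam"], go_words=["budget"])
# tokens == ["quarterly", "budget", "budget", "budget", "review"]
```

Here the “spam” tag inserted by a mail system never reaches the clustering step, so it cannot pull unrelated bulk messages into one cluster.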