Latent semantic indexing (LSI) is the machine-learning technique that enables CAAT technology to identify, represent, and compare the concepts contained in a collection of documents or data.
LSI is a mathematical approach to text analytics designed to capture the contextual relations among all terms in all text objects within a collection. From those relations it generates a vector-space representation of all terms, using many dimensions (often in the hundreds). Within that space, proximity is a good indicator of conceptual similarity, so similar items can be identified by the concepts they contain rather than only by the exact words they share. Because LSI is mathematics-based, CAAT requires no word lists, taxonomies, or thesauri to accurately identify conceptually similar text.
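To make the idea of proximity in a vector space concrete, here is a minimal sketch of textbook LSI in Python with NumPy. It is not CAAT's implementation. The toy corpus, the function names, raw term counts as weights, and the choice of two dimensions are all assumptions made for illustration; a real collection would use far more documents and, as noted above, often hundreds of dimensions.

```python
import numpy as np

# Hypothetical toy corpus. "car" and "automobile" never appear in the
# same document, but they appear in similar contexts.
docs = [
    "car engine repair",
    "automobile engine repair",
    "car dealer",
    "automobile dealer",
    "fruit apple banana",
    "banana apple smoothie",
]

def build_term_doc_matrix(docs):
    """Return the vocabulary index and a term-by-document count matrix."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.split():
            matrix[index[word], j] += 1.0
    return index, matrix

def lsi_term_vectors(matrix, k):
    """Project terms into a k-dimensional space via truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    # Keep the k largest singular values; scale term coordinates by them.
    return u[:, :k] * s[:k]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

index, matrix = build_term_doc_matrix(docs)
terms = lsi_term_vectors(matrix, k=2)  # k=2 is an assumption for this toy data

sim_related = cosine(terms[index["car"]], terms[index["automobile"]])
sim_unrelated = cosine(terms[index["car"]], terms[index["banana"]])
print("car vs automobile:", round(sim_related, 3))
print("car vs banana:", round(sim_unrelated, 3))
```

In this sketch the first similarity should come out far higher than the second, even though "car" and "automobile" share no document and no thesaurus was supplied. The relation is inferred purely from the contexts the two terms have in common.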
The Content Analyst team has evolved the original concept of LSI, creating major extensions and refinements to the technology, and the company now holds a number of patents related to text analytics.
Please refer to our White Papers section to read more about LSI.

Copyright 2012 Content Analyst Company, LLC. All rights reserved.