Latent Semantic Indexing (LSI) is the core machine-learning technique that enables Content Analyst's CAAT and Cerebrant products to identify, represent, and compare concepts that exist within a collection of documents or other forms of unstructured content.
LSI is a mathematical approach to text analytics that analyzes how terms are used across every text object in a collection in order to capture the contextual relations among them. It then generates a vector-space representation of all terms based on those relations, using many dimensions (often in the hundreds). Within that space, proximity is a good indicator of conceptual similarity, so related material can be identified by the concepts it expresses rather than by exact keyword matches. Because LSI is mathematics-based, CAAT and Cerebrant require no word lists, taxonomies, or thesauri to accurately identify conceptually related text.
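The idea of a vector space in which proximity indicates conceptual similarity can be illustrated with a minimal sketch of generic, textbook LSI: build a term-document count matrix, reduce it with a truncated singular value decomposition, and compare the resulting vectors by cosine similarity. This is not CAAT or LSI+ code. The function names, the toy corpus, and the choice of k=2 dimensions are all assumptions made for illustration, and a real collection would use far more dimensions.

```python
import numpy as np

def build_term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i, j] counts term i in document j."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for tok in toks:
            matrix[index[tok], j] += 1.0
    return vocab, matrix

def lsi_embed(matrix, k):
    """Project terms and documents into a k-dimensional space via truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    term_vectors = u[:, :k] * s[:k]      # one row per term
    doc_vectors = vt[:k, :].T * s[:k]    # one row per document
    return term_vectors, doc_vectors

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical toy corpus: three vehicle documents, two food documents.
docs = [
    "car engine repair",
    "automobile engine repair",
    "car automobile dealer",
    "fruit salad recipe",
    "banana fruit smoothie recipe",
]
vocab, matrix = build_term_document_matrix(docs)
term_vecs, doc_vecs = lsi_embed(matrix, k=2)  # assumption: k=2 for a toy corpus

# Documents about the same concept land close together in the reduced space.
same_topic = cosine(doc_vecs[0], doc_vecs[1])
cross_topic = cosine(doc_vecs[0], doc_vecs[3])

# Terms are compared the same way, with no thesaurus linking them.
car = term_vecs[vocab.index("car")]
automobile = term_vecs[vocab.index("automobile")]
fruit = term_vecs[vocab.index("fruit")]
synonym_sim = cosine(car, automobile)
unrelated_sim = cosine(car, fruit)

print(f"doc 0 vs doc 1: {same_topic:.3f}, doc 0 vs doc 3: {cross_topic:.3f}")
print(f"car vs automobile: {synonym_sim:.3f}, car vs fruit: {unrelated_sim:.3f}")
```

In this sketch "car" and "automobile" end up close together purely because of the contexts they share, while "car" and "fruit" do not. The toy example says nothing about scale or about the refinements a production engine would need.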
The Content Analyst team has extended and refined the original concept of LSI in major ways, and the company now holds a number of patents related to text analytics. In addition, Content Analyst provides industry-leading integration support services to ensure fast, seamless integration and continuous enhancements.
There are two common misconceptions regarding LSI technology. The first questions its scalability; the second treats open source LSI libraries as comparable to Content Analyst's proprietary CAAT analytics engine.
CAAT is Content Analyst's proprietary text analytics engine. It was built using LSI technology for much of its core functionality, including conceptual search, clustering, auto-categorization, instant context, and conceptual near-duplicate detection. LSI+ is Content Analyst's extensively enhanced application of LSI, and it is only one part of CAAT's broad set of text analytics capabilities. Dozens of Content Analyst developers have been enhancing, strengthening, and augmenting the core LSI capabilities for more than 10 years. The resulting LSI+ technology adds many significant enhancements to core LSI, giving it the scalability and the functionality to support even the most sensitive and rigorous text analytics applications.
Content Analyst's dedicated software development team has sustained uninterrupted innovation and enhancement of its LSI+ technology for more than 10 years. The company's roadmap promises further enhancements to maintain its reputation for superior analytics capabilities among the extensive community of software product companies that depend on CAAT and LSI+ to serve their customers.