

Latent Semantic Indexing (LSI) Technology

LSI is the machine-learning technique that enables CAAT technology to identify, represent, and compare concepts that exist within a collection of documents or data.

LSI is a mathematical approach to text analytics that models the contextual relations among the terms in every text object within a collection. From those relations it generates a vector-space representation of all terms, using many dimensions (often in the hundreds). Within that space, proximity is a good indicator of conceptual similarity, so text can be matched on the concepts it expresses rather than only on shared keywords. Because LSI is mathematics-based, CAAT requires no word lists, taxonomies, or thesauri to accurately identify conceptually similar text.
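To make the vector-space idea concrete, here is a minimal sketch of textbook LSI using a truncated singular value decomposition in NumPy. It is an illustration only, not CAAT's implementation: the toy documents, the helper names (`build_term_document_matrix`, `lsi_term_vectors`, `cosine`), the use of raw term counts without weighting, and the choice of 2 dimensions (rather than hundreds) are all assumptions made for the example.

```python
import numpy as np

def build_term_document_matrix(docs):
    """Return (vocab, A) where A[i, j] counts term i in document j."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return vocab, A

def lsi_term_vectors(A, k):
    """Project terms into a k-dimensional space via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * s[:k]  # one row per term

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector has zero length."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

# Hypothetical toy collection: three vehicle documents, two food documents.
docs = [
    "car engine repair",
    "automobile engine repair",
    "car automobile dealer",
    "banana fruit salad",
    "fruit salad recipe",
]

vocab, A = build_term_document_matrix(docs)
T = lsi_term_vectors(A, k=2)
vec = {w: T[i] for i, w in enumerate(vocab)}

# "engine" and "dealer" never appear in the same document, but both
# co-occur with "car" and "automobile", so they land close together.
sim_related = cosine(vec["engine"], vec["dealer"])
# "engine" and "banana" share no context at all.
sim_unrelated = cosine(vec["engine"], vec["banana"])

print("engine vs dealer:", round(sim_related, 3))
print("engine vs banana:", round(sim_unrelated, 3))
```

In this sketch, terms that share contexts end up near each other even when they never occur together, without any word list or thesaurus. A real collection would normally apply term weighting before the decomposition and keep far more dimensions.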

The Content Analyst team has built on the original concept of LSI with major extensions and refinements to the technology, and the company now holds a number of patents related to text analytics.

Please refer to our White Papers section to read more about LSI.

 


Copyright 2012 Content Analyst Company, LLC. All rights reserved.
