File and Content Inspection

Enable deep content inspection with the Document Filters SDK

If your application processes files it did not create, identifying and inspecting their content is a crucial first step. Document Filters enables software developers to embed industry-leading content identification and inspection functionality into their solutions. Because it identifies files from their content rather than from the filename extension, Document Filters gives your software deep content inspection, format conversion, output manipulation and viewing for virtually any type of document.

Discover the benefits of Document Filters file and content inspection functionality

  • Identify and inspect the text and metadata of more than 550 file formats, including Word, Excel, PowerPoint, PDF, AutoCAD, ZIP, MSG, Visio and hundreds more
  • Perform optical character recognition (OCR) of document images to inspect contents
  • Analyze all text and metadata including previously hidden information such as tracked changes, comments, notes, annotations and embedded links
  • Identify and inspect contents of packaged, archived, compressed and other container files
  • Determine the true nature of content, ensuring that source information is accurately identified for filtering without relying on filename extensions
  • Deploy it your way — Document Filters runs natively on 27 platforms, and flexible APIs give you the choice of language to integrate with your application
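
To illustrate what identifying a file without relying on its extension means in practice, here is a minimal Python sketch that checks a file's leading bytes against a few well-known signatures. It is not the Document Filters API; the `SIGNATURES` table and the `identify` function are hypothetical names chosen for this example, and a real identifier would cover far more formats and look deeper than the first few bytes.

```python
# Illustrative sketch only: a tiny signature table, not the Document Filters API.
# Each entry pairs a well-known leading byte sequence with a coarse format label.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip"),  # also the container used by DOCX, XLSX and PPTX
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),  # legacy DOC, XLS, PPT, MSG
    (b"\x89PNG\r\n\x1a\n", "png"),
]


def identify(data: bytes) -> str:
    """Return a coarse format label from the leading bytes, ignoring any filename."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first few bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

With this approach, a PDF saved as `report.txt` is still labeled `pdf`, because the decision depends only on the bytes. Note that a `zip` result is only a starting point: telling a DOCX from a plain archive requires opening the container and inspecting its entries.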

Document Filters has powered industry-leading software products in areas such as email archival, antivirus protection, content management, business intelligence, document imaging and intelligent capture for well over 25 years.

It is a powerful and proven SDK alternative to open-source options and other OEM solutions. Document Filters has been used in content mining and intelligence-gathering applications such as compliance systems, eDiscovery, text analytics and Lucene deployments.