Outside In Search Export

Outside In Search Export extracts text and metadata from nearly 500 supported file types and converts them into XML, HTML or plain text designed specifically for search and forensic applications. The SDK offers a rich feature set and a choice of four output formats:

  • SearchML: Lightweight XML containing text, embeddings and metadata, optimized for search and text extraction;
  • SearchHTML: HTML optimized for Web crawlers but with limited display formatting;
  • SearchText: Plain text file (UTF-8 encoded Unicode) with properties and body text from the input file;
  • PageML: XML that provides paginated text.

It is suited to search, forensics or any application that needs to extract content and convert it into a format conducive to post-processing and analysis.
