dtSearch Corp.’s enterprise and developer text retrieval product line instantly searches terabytes of text across a broad variety of online and offline data types. All dtSearch products embed dtSearch’s proprietary document filters, enabling search of not only file directories but also emails and nested attachments, dynamic and static online data, and other databases (including BLOB data). At the core of the dtSearch product line is the dtSearch Engine.

The dtSearch Engine for Android beta will join the existing dtSearch Engine for Linux (native 64-bit/32-bit C++ and Java APIs) and dtSearch Engine for Win & .NET (native 64-bit/32-bit C++, Java and .NET APIs) in making available dtSearch’s instant searching and document filters for a wide range of Internet, Intranet and other commercial applications. (Please see www.dtsearch.com/casestudies.html for hundreds of developer case studies.)

Document Filters and Supported Data. dtSearch products can parse, index, search, display with highlighted hits, and (through the developer APIs) extract content from the full text and metadata of the following data types:

  • Web-ready static and dynamic content: integrated image and text support in HTML, XML/XSL, PDF, ASP.NET, PHP, SharePoint, etc.
  • Other databases: XML, Access, XBASE, CSV, etc.; the dtSearch Engine APIs also support SQL-type data along with the full text of BLOB data.
  • MS Office formats: integrated browser-ready image and text support in Word (RTF/DOC/DOCX), PowerPoint (PPT/PPTX), Excel (XLS/XLSX), Access (MDB/ACCDB) and OneNote (ONE).
  • PDF, other “Office” documents, compression formats: PDF with integrated image and text support, other “Office” formats, and RAR, ZIP, GZIP/TAR, etc.
  • Emails and attachments: integrated browser-ready image and text support, plus attachment support, in Outlook/Exchange (MSG/PST/OST) and Thunderbird, Eudora, etc. (EML/MBOX).
  • Recursively embedded objects: recursively embedded objects and images in supported email types and MS Office formats. For example, the dtSearch document filters support an email attachment consisting of a ZIP container that holds both a PDF and an Access database, where the database in turn includes an embedded PowerPoint with embedded images.
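The nested-container case in the last item can be pictured as a recursive walk over a tree of items. The following is only a minimal sketch of that idea, not the dtSearch document filters or their API: the `Node` class, the `walk` function, and the sample file names are all hypothetical, and real filters would parse actual file bytes rather than an in-memory tree.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a parsed item (container or leaf document)."""
    name: str
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, str]]:
    """Yield (nested path, text) for every item that carries text, at any depth."""
    here = path + (node.name,)
    if node.text:
        yield "/".join(here), node.text
    for child in node.children:
        # Recurse so that containers inside containers are handled uniformly.
        yield from walk(child, here)


# Illustrative data mirroring the example: email -> ZIP -> PDF + Access DB -> PowerPoint -> image.
message = Node("message.eml", "email", "Quarterly report attached", [
    Node("archive.zip", "zip", children=[
        Node("report.pdf", "pdf", "Revenue grew"),
        Node("data.accdb", "access", "customer table", [
            Node("slides.pptx", "powerpoint", "Roadmap", [
                Node("chart.png", "image"),
            ]),
        ]),
    ]),
])

extracted = list(walk(message))
for nested_path, text in extracted:
    print(nested_path, "->", text)
```

Recording the full nested path with each piece of text is one way a search result could point back to where inside the container chain a hit was found.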

Terabyte Indexer. dtSearch enterprise and developer products can index over a terabyte of text in a single index, spanning multiple directories, emails and attachments, online data and other databases. The products can create and search any number of indexes. Indexed search time is typically less than a second, even across terabytes of data.

Concurrent, Multithreaded Searching. dtSearch developer products provide efficient multithreaded searching, with no limit on the number of concurrent search threads. For online search, the products can run in a completely stateless manner, which makes them very easy to scale.
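As a rough illustration of why stateless searching scales easily, the sketch below runs several queries concurrently against a shared, read-only index. It is a generic toy model under stated assumptions, not the dtSearch Engine: the `INDEX` contents, the `search` function, and the AND-only query semantics are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, FrozenSet, List

# Hypothetical read-only inverted index: term -> ids of documents containing it.
INDEX: Dict[str, FrozenSet[int]] = {
    "search": frozenset({1, 2, 3}),
    "engine": frozenset({2, 3}),
    "android": frozenset({3}),
}


def search(query: str) -> List[int]:
    """Return ids of documents containing every query term.

    The function keeps no per-user or per-session state and never mutates
    INDEX, so any number of threads can call it at the same time.
    """
    terms = query.lower().split()
    if not terms:
        return []
    postings = [INDEX.get(term, frozenset()) for term in terms]
    return sorted(frozenset.intersection(*postings))


queries = ["search", "search engine", "android engine", "missing"]
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() preserves the order of the submitted queries.
    results = list(pool.map(search, queries))

for query, hits in zip(queries, results):
    print(query, "->", hits)
```

Because each request depends only on its query and the shared index, requests can in principle be handled by any thread or any server holding a copy of the index.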

Federated Searching and the dtSearch Spider. dtSearch products offer federated searching across any number of directories, emails with attachments, and databases. The dtSearch Spider adds local and remote, static and dynamic online content to a search. The Spider can index sites to any level of depth, with support for public and private or secure online content, including log-ins and forms-based authentication. dtSearch products support integrated relevancy ranking with highlighted hits across both online and offline data repositories.

Faceted Search and Other Data Classification Options. The dtSearch Engine supports categorization based on document full-text contents, internal document metadata, database content, or data attributes associated with documents during indexing. Advanced data classification options include faceted search and positive and negative variable term weighting on full-text and/or fielded data.
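To make the two classification ideas concrete, the sketch below counts facet values from document metadata and ranks documents using positive and negative term weights. It is a simplified illustration only: the `DOCS` records, field names, weights, and scoring formula are assumptions made for the example and do not describe how the dtSearch Engine computes facets or relevancy.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical documents with metadata fields that can serve as facets.
DOCS = [
    {"id": 1, "type": "pdf", "year": "2013", "text": "android search engine beta"},
    {"id": 2, "type": "email", "year": "2013", "text": "search results draft draft"},
    {"id": 3, "type": "pdf", "year": "2012", "text": "linux search engine"},
]


def facet_counts(docs: List[dict], field: str) -> Counter:
    """Count how many documents fall under each value of a metadata field."""
    return Counter(doc[field] for doc in docs)


def score(doc: dict, weights: Dict[str, float]) -> float:
    """Sum weight * occurrences; negative weights push a document down."""
    words = Counter(doc["text"].split())
    return sum(weights.get(word, 0.0) * count for word, count in words.items())


def ranked(docs: List[dict], weights: Dict[str, float]) -> List[int]:
    """Return ids of documents with a positive score, best first."""
    scored = [(score(doc, weights), doc["id"]) for doc in docs]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for value, doc_id in ordered if value > 0]


weights = {"search": 1.0, "android": 5.0, "draft": -2.0}
print(facet_counts(DOCS, "type"))
print(ranked(DOCS, weights))
```

In this toy data, the negative weight on "draft" outweighs the single match on "search" in the second document, so that document drops out of the ranked results.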

25+ Search Options; International Language Support. The dtSearch product line offers 25+ hit-highlighted search options, including special forensics search features. dtSearch products provide Unicode support for international language text, including support for right-to-left languages, and Chinese/Japanese/Korean character handling options.

Other dtSearch Products. Beyond the dtSearch Engine for Win & .NET, Linux, and now Android, other dtSearch products include: dtSearch Web with Spider, for publishing instantly searchable data to an Internet or Intranet site using HTML5 templates; dtSearch Network with Spider, for instant concurrent network-based searching; dtSearch Desktop with Spider, for desktop search; and dtSearch Publish, for publishing searchable data to portable media.