October 07, 2022

Hyland’s Document Filters is more than a commodity

Far from a commodity, Document Filters has the ability to inspect, extract data from, convert and manipulate over 550 different file formats!

Sam Babic

SVP, chief technology officer of Hyland

The IT industry, as much as, if not more than, any other industry, is built out of commodities. In IT, commodities are everywhere, from software tools to hardware and services.

With the rapid pace of technology, the commoditization of IT is heavily compressed, and some technologies go from innovation to commodity within just a few short years. The continuing expansion of SaaS and cloud computing has further accelerated our expectation of rapid distillation of leading hardware and software capabilities into commodity services.

Throughout my time with Hyland, I’ve seen my own share of commoditization of many capabilities in the content services and document management industry. It’s not that these capabilities become less critical to organizations; it’s just that they become more widely available without any significant differences from one vendor to the next.

Content viewing is a great example. It is a seemingly simple, yet critical capability without which many business processes would grind to a halt. Today, every enterprise-grade content management system provides some ability to view content. Whether it is a PDF, a Microsoft document, an image or a video, document viewing capabilities are often taken for granted — we just expect to be able to view our files!

Twenty-two years ago, when I joined the industry, content viewing already pretty much felt like a commodity. Multiple vendors offered document viewing and rendering capabilities. Sure, there were still some difficulties caused by the lack of maturity of many file formats. Anyone who had to process a TIFF image at that time knew that not all TIFF images are created equal. That was also true for many other ‘standard’ formats.

Such nuanced format-related quirks made life more difficult, but we’ve had several decades to unwind the secrets of these image formats. So, one would think that over the last 20+ years, file viewing technology has pretty much had its story written. Or has it?

Hyland’s Document Filters: A hidden gem

Over the past few years, Hyland has acquired a number of industry-leading companies that also help customers manage and deliver content. Much of the coverage that followed from the media and analysts focused on our expanded capabilities around helping our customers manage business documents, clinical content, videos, customer communications and more.

However, in the portfolio of the acquired technologies, there was also a hidden gem — innocently called Document Filters — that did not get much coverage. I suspect this is because most people assumed it was just a commodity. It’s not.

To explain, let’s go back to our example of file viewing. Today, file viewing is anything but trivial. Not only are there now hundreds of possible file formats, far more than 20 years ago, but our data also has many more layers. Besides the plain text, our files now often contain useful metadata about the file’s origin and contents, as well as comments, annotations, embedded and linked data, images and other, sometimes hidden, information. A good file viewer needs to be able to not only correctly identify the file type, but also inspect and extract all useful data (not just the plain text) and then render it without relying on access to the native application for that file type. And it needs to do this quickly and repeatedly to keep up with today’s accelerating pace of digital transformation.
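To make the first of those steps concrete, here is a minimal sketch of identifying a file type from its leading bytes rather than trusting its extension. It is purely illustrative and is not the Document Filters API: the function names and the short signature table are assumptions made for this example, and a production-grade identifier has to handle hundreds of formats and look far deeper into each file.

```python
# Illustrative sketch only; NOT the Document Filters API.
# Identifies a file type from its leading "magic" bytes instead of its extension.

# A small, hand-picked table of well-known file signatures (assumption: this
# list is for illustration and is nowhere near complete).
MAGIC_SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"PK\x03\x04", "ZIP container (e.g. DOCX, XLSX, PPTX)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g. legacy DOC, XLS)"),
]


def identify_file_type(header: bytes) -> str:
    """Return a format name for the given leading bytes, or 'unknown'."""
    for signature, name in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return name
    return "unknown"


def identify_path(path: str) -> str:
    """Read the first few bytes of a file on disk and identify its type."""
    with open(path, "rb") as f:
        return identify_file_type(f.read(16))
```

Even this toy version hints at why the problem is hard: TIFF alone has two byte orders, and a ZIP signature only says the file is a container, not whether it holds a word-processing document, a spreadsheet or something else entirely.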

At Hyland, as at many other software firms, we have had to rely on a number of third-party commodity technologies over the years to enable file inspection and content rendering in our document viewers. These commodity technologies worked well, but they did not offer our customers any additional value over the other viewers available on the market. This all changed when we added Document Filters to our technology portfolio.

Document Filters is a software development kit (SDK) for file processing

So, what exactly is Document Filters? It’s not an application that your users would interact with the way they interact with, say, OnBase. Instead, it is a software development kit — an “under-the-hood” technology that enables many of the functionalities your users need to interact with content. Software developers can use Document Filters to give their applications the ability to identify and inspect files, extract text and metadata and even annotate documents and convert between file formats.

At the core of Document Filters’ capabilities is its uncanny prowess in data parsing. It can even use optical character recognition (OCR) to extract text from images. Being able to deeply inspect all data within a file gives it the ability to render the various formats with near-pixel-perfect precision in a viewer on users’ screens.

Document Filters provides industry-leading scope of coverage

Today, Document Filters has the ability to inspect, extract data from, convert and manipulate over 550 different file formats! This is a staggering, industry-leading scope of coverage that can offer unparalleled flexibility to end users without requiring them to have access to the native applications for all these formats.

And, because Document Filters runs natively on 27 platforms, developers have the flexibility to use it across everything from Windows to mobile, to highly customized versions of Linux or even embedded systems. This ability to interact with so many file formats, together with its outstanding cross-platform compatibility, has made Document Filters the top choice for many companies outside Hyland.

Today, Document Filters is licensed to many major technology providers to support a wide range of use cases beyond file viewing, from inspecting files for malicious content in security software and hardware, to data loss prevention in email applications, to generating file previews on multi-function printers, to enabling search, document classification, metadata extraction and many, many others. I can’t name most of those organizations here due to contractual obligations, but trust me when I say that you have likely already experienced the power of Document Filters without even knowing it.

So next time you’re viewing a document in your favorite application or device, take a moment to appreciate all that goes into the viewing of that document. Whether that vendor is using a technology similar to Document Filters, or actually using Document Filters, that technology under the hood is far from a commodity, and has the power to unlock a whole array of possibilities and use cases.

Want to learn more about Document Filters or even take it for a spin? Start here.