10 Things You Don’t Want to See in eDiscovery

This article was posted as original content on the ACEDS Blog and was written by Gavin W. Manes.

An eDiscovery project can be complex even when considering only its general parameters. Maybe it’s a certain document type, production style, or naming convention; there are more than a few things that make eDiscovery life difficult. But keep in mind that these are exceptions rather than the rule. Of the hundreds of millions of documents Avansic has loaded for online review, the vast majority fall into only seven file types. So, if one of the items that follow appears in your case, it may require some exception handling, extra time, and (so you’re prepared) potentially some extra cost.

10 Things You Don’t Want to See

1. DWG or CAD files

These don’t fit neatly into the 8.5×11 page that all of us think a document should be. They contain layers and renderings, and it’s difficult to determine how to present them in a data set composed of other documents that do fit on a “page.” This is especially true for CAD drawings with multiple layers. Isolating just the data needed for presentation can be very helpful. There may be instances where the opposing party has a viewer that can accept CAD files, in which case providing the native version is an easy solution.

2. Select Email on a Macintosh

In this case, the difficulty is that email headers, messages, and attachments are stored separately. Common eDiscovery and forensics tools don’t understand the relationship between these fractured parts of an email. The solution is to re-create the email from its parts using a custom tool based on the data. Alternatively, if the Macintosh device is available, it can be used to export the email into PST or MBox format.
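
As a rough illustration of what "re-creating the email from its parts" can mean, the sketch below uses Python's standard `email` library to combine separately stored headers, body text, and attachments into a single message. The function name `rebuild_email` and the shape of its inputs are assumptions made for this example; a real tool would first have to read the parts from the Macintosh mail store, which is not shown here.

```python
from email.message import EmailMessage


def rebuild_email(headers, body, attachments):
    """Combine separately stored parts into one message (hypothetical helper).

    headers:     dict of header name -> value, e.g. {"Subject": "..."}
    body:        plain-text message body
    attachments: list of (filename, raw_bytes) tuples
    """
    msg = EmailMessage()
    for name, value in headers.items():
        msg[name] = value
    msg.set_content(body)
    for filename, data in attachments:
        # The true MIME type is unknown in this sketch, so a generic one is used.
        msg.add_attachment(
            data,
            maintype="application",
            subtype="octet-stream",
            filename=filename,
        )
    # The returned bytes can be saved as an .eml file for loading into a review tool.
    return msg.as_bytes()
```

Keeping the parts together in one message is what lets downstream tools treat the email and its attachments as a single family.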

3. Encrypted/Password Protected Files

Without a password, these files aren’t accessible. By far the best solution is to obtain the password from the user. There are methods to figure out passwords, but they require very significant computing power, time, and cost, and ultimately there is no guarantee of success.

4. Email Archivers

Each archiver is different and may have different problem areas in terms of eDiscovery. For example, some archivers gather mail as it is inbound to the system and don’t record a custodian; sales@company.com might redirect to whichever employee was in charge of sales at that time. Archivers also often store emails in a manner that makes it easy to search the body but not the attachments, in which case the entire archive contents must be exported in order to perform proper searches. Here, the best solution is to work with the archive vendor to determine methods for extraction.

5. Poor Packaging

A loose hard drive in a cardboard box is not an effective way to ensure digital media will arrive at its destination intact. Before you ship, discuss proper shipping guidelines with the party receiving the materials.

6. Cloud Data That Isn’t eDiscovery Ready

More cloud providers are popping up, but they are not all equal. Most make it very easy to get your data loaded but don’t necessarily have similarly robust procedures for you to get your data exported when you need it. It’s common for a mass export to take several times longer than it took to upload. In addition, the associated metadata may be missing or inaccurate. Unfortunately, one of the most common email providers makes it difficult to download hosted email that is ready to be processed in eDiscovery. Evaluate each provider at the earliest opportunity to determine how long collection will take and whether special handling or post-processing is needed.

7. Files Within Files (Complex File Types)

The most common example is a PST or OST file that contains emails that themselves have a PST attached. Most eDiscovery tools don’t track a hierarchical relationship of attachments, which makes it difficult to understand the context of an item in a given family. Proper searching of any file container will require the expansion of these files beforehand, which frequently results in many more documents than originally anticipated.
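
The idea of expanding containers while keeping track of where each item came from can be sketched as follows. This is only an illustration: it uses nested ZIP archives as a stand-in for PST files (which Python's standard library cannot read), and the names `expand_container` and `make_zip` are invented for the example.

```python
import io
import zipfile


def make_zip(files):
    """Build an in-memory ZIP from {name: bytes}; used only to create demo data."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buf.getvalue()


def expand_container(data, path="", depth=0, max_depth=5):
    """Yield (family_path, depth, payload) for every leaf file.

    family_path records the chain of parent containers, so an item's
    position in its family is preserved after expansion.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            child_path = f"{path}/{info.filename}" if path else info.filename
            payload = archive.read(info)
            # Recurse when the child is itself a container (depth-limited).
            if depth < max_depth and zipfile.is_zipfile(io.BytesIO(payload)):
                yield from expand_container(payload, child_path, depth + 1, max_depth)
            else:
                yield child_path, depth, payload


inner = make_zip({"memo.txt": b"nested memo"})
outer = make_zip({"cover.txt": b"cover note", "inner.zip": inner})
items = list(expand_container(outer))
```

Here one container expands into two searchable documents, one of them at depth 1 with the path `inner.zip/memo.txt`. Note that some ordinary documents (modern Office files, for example) are also ZIP containers, so a real tool would decide by file type which items to expand.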

8. Non-Searchable Types That Contain Text Within ESI

A good example of this is a scanned document that is an attachment to an email within a PST. Searching the email for content within that document wouldn’t necessarily locate the text in the scanned document. Similar to container file issues, proper searching requires these items to be expanded and their individual parts to be processed as well, including imaging and OCR for items without extractable text.

9. Legacy and Uncommon Email Formats

This includes email in older or uncommon formats or programs such as older MBox, GroupWise, and Lotus Notes NSF files. Often these types require a large amount of additional pre-processing to get to what is considered the bare minimum for other formats. The most common solution is to convert these to the EML format, which in many cases loses valuable information, including format-specific metadata, and once converted, the result still may not look like regular email. One solution is to find eDiscovery vendors that have experience with the format or the development resources to decode it.
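
For the simplest of these formats, MBox, the conversion to EML can be sketched with Python's standard `mailbox` module. This is a minimal example under stated assumptions, not a description of any vendor's process: the function name `mbox_to_eml` and the output file naming are made up for illustration, and GroupWise or NSF data would need entirely different tooling.

```python
import mailbox
from pathlib import Path


def mbox_to_eml(mbox_path, out_dir):
    """Write each message in an mbox file to its own .eml file (hypothetical helper)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    box = mailbox.mbox(mbox_path, create=False)  # raise if the file is missing
    try:
        for key in box.iterkeys():
            target = out / f"{key:06d}.eml"
            # get_bytes() omits the mbox "From " separator line by default,
            # so the envelope sender and delivery date on that line are not
            # carried into the EML file.
            target.write_bytes(box.get_bytes(key))
            written.append(target)
    finally:
        box.close()
    return written
```

The dropped separator line is a small, concrete case of the information loss described above: the converted file is valid email, but something the original format recorded is gone unless it is captured separately.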

10. Legacy Hardware

Challenges include poor connection speeds, device types no longer supported by modern operating systems, and drives that may not spin up. A vendor with experience getting information off those devices is needed, as well as an understanding that the time frame may be longer than expected.


Each of these issues requires some expertise to address. It is worth the time to talk with your eDiscovery vendor and determine their level of familiarity with the data type before sending the work, along with evaluating the importance and value of the information to your case. Most of these issues will require additional work that can often be time-consuming and expensive. Ask the vendor if they know of options to get a similar result without processing the most difficult types directly.