Best Practices for Email in Discovery

Email is a critical part of daily business and personal activity and, as such, provides a tremendous amount of information potentially relevant to discovery. This post begins with basic background on the types of email and the features that matter for corporate and legal use. It then covers email in discovery, with practical suggestions for requesting, processing and reviewing it, followed by email collection, retention and deletion. The post concludes with a brief discussion of email security and encryption.

Types of Email

There are five types of email servers in two categories: cloud-based, and on-premise or data center. Most companies use on-premise approaches with either Microsoft Exchange technology or a non-Exchange email system (Unix mail, Lotus Notes, etc.). The most common cloud-based email systems are Gmail and Office 365, with many other mail services offered by Yahoo, cable and telephone providers, and more.

Best practices for using email depend on which type of mail system is in use. Exchange (on-premise, or cloud-based through Office 365) and Gmail are the most flexible. Among the cloud offerings, free accounts generally do not have management features.

Corporate Use

From a discovery perspective, there are two key features needed in an email system: the ability to place litigation holds and to search and gather mail across mailboxes.

Systems that allow for litigation and other legal holds will prevent users from permanently deleting email; legacy systems and free accounts may not have this feature.

Searching and retrieving email is simple in the current versions of Gmail and Exchange, but keyword searches need care: some items in these systems, such as image-only attachments, may have no text that can be indexed, so a keyword search will not find them. Some systems allow for the export of these “unsearchable” items.

Email in Discovery

There are two main ways that email is handled in eDiscovery: requesting and reviewing email from your client, and requesting and reviewing email from the opposing party.

The best way to receive your client's email for review during discovery is in true native form, such as MSG, EML, PST, OST or MBOX. Other formats are commonly referred to as native but do not provide as much information or ease of review; these include MHT, HTML and PDF.
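
To illustrate what a true native format makes available, here is a minimal sketch that reads an MBOX file with Python's standard library and lists a few metadata fields per message. The function name summarize_mbox and the choice of fields are assumptions made for this example, not part of any review platform.

```python
import mailbox
from email.utils import parsedate_to_datetime


def _parse_date(value):
    """Return a datetime for a Date header, or None if it is missing or malformed."""
    try:
        return parsedate_to_datetime(value) if value else None
    except (TypeError, ValueError):
        return None


def summarize_mbox(path):
    """Return one metadata dict per message in an MBOX file (hypothetical helper)."""
    rows = []
    box = mailbox.mbox(path, create=False)  # do not create the file if it is missing
    try:
        for msg in box:
            rows.append({
                "message_id": msg.get("Message-ID"),
                "from": msg.get("From"),
                "to": msg.get("To"),
                "subject": msg.get("Subject"),
                "sent": _parse_date(msg.get("Date")),
                # Thread-Index is only present when the sending client wrote it.
                "thread_index": msg.get("Thread-Index"),
                "attachments": [p.get_filename() for p in msg.walk() if p.get_filename()],
            })
    finally:
        box.close()
    return rows
```

An image or PDF rendering of the same messages would generally carry only what is visible on the page, which is why the native file is the better starting point for processing and filtering.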

Having a true native file set allows access to all relevant metadata for processing, filtering, sorting and review. When receiving email produced by other parties, the most common format is Bates-stamped images with a load file that includes select metadata. One of the most overlooked and useful metadata fields is Conversation Index, a field written by Microsoft Outlook and Exchange that identifies the thread an email belongs to and its position in that thread. It is used for de-duping, email threading and missing-email analysis, so including it in a request for data is an advantage during eDiscovery. For instance, using Conversation Index, you can de-dupe your client's data against the data received from an opposing party and avoid re-reviewing the same messages. Whenever possible, request Conversation Index as part of your production specification.
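
As a rough illustration of why Conversation Index supports threading and missing-email analysis, the sketch below decodes the value as it commonly appears in the Thread-Index header (base64 text). It relies on the documented layout of a 22-byte header followed by one 5-byte block per reply or forward. The function names, and the choice to use the whole 22-byte header as the thread key, are assumptions for this example.

```python
import base64

HEADER_LEN = 22  # 1 reserved byte + 5 time bytes + 16-byte GUID
CHILD_LEN = 5    # one block is appended for each reply or forward


def parse_conversation_index(b64_value):
    """Decode a base64 Conversation Index into thread key, depth and parent."""
    raw = base64.b64decode(b64_value)
    if len(raw) < HEADER_LEN or (len(raw) - HEADER_LEN) % CHILD_LEN:
        raise ValueError("unexpected Conversation Index length")
    depth = (len(raw) - HEADER_LEN) // CHILD_LEN
    return {
        "thread_id": raw[:HEADER_LEN].hex(),  # shared by every message in the thread
        "depth": depth,                       # 0 for the original message
        "self": raw.hex(),
        "parent": raw[:-CHILD_LEN].hex() if depth else None,
    }


def missing_parents(b64_values):
    """Return parent indexes that are referenced but absent from the set."""
    parsed = [parse_conversation_index(v) for v in b64_values]
    present = {p["self"] for p in parsed}
    return sorted({p["parent"] for p in parsed
                   if p["parent"] and p["parent"] not in present})
```

A reply whose parent value is not in the collected set points to an email that was sent but is not in the data, which is the starting point for a missing-email analysis.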

The most common mistakes made when reviewing client email are not reviewing true native files, not tracking the custodian from which each email came, and de-duping email based on content hash.
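
One way to avoid two of these mistakes at once is to de-dupe on a metadata identifier while recording every custodian who held a copy. The sketch below assumes each record is a dictionary with a custodian field and a message_id field. Using Message-ID as the key is an assumption for this example, and a Conversation Index value could be substituted through the key parameter.

```python
def dedupe_keep_custodians(records, key="message_id"):
    """Collapse duplicate emails but keep the list of custodians for each one.

    records: iterable of dicts that contain the key field and a 'custodian' field.
    Returns one dict per unique key, in first-seen order, with a 'custodians' list.
    """
    unique = {}
    for rec in records:
        k = rec[key]
        if k not in unique:
            first = {name: value for name, value in rec.items() if name != "custodian"}
            first["custodians"] = [rec["custodian"]]
            unique[k] = first
        elif rec["custodian"] not in unique[k]["custodians"]:
            unique[k]["custodians"].append(rec["custodian"])
    return list(unique.values())
```

Reviewers then see each email once, and the custodian list remains available for production and privilege questions.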

Document review tools generally allow for viewing emails within their threads, and some allow the filtering of non-inclusive emails, which potentially reduces review. If the email set has rich, deep threads, an inclusive-only review covers only the emails with unique content or attachments. Where an email has gone back and forth multiple times and has no attachments, it is possible to review only the last email in the chain. It is important, however, to analyze the email set before assuming that thread review will save time, since the benefit depends entirely on the data. Other advanced review techniques include breaking the context of emails and their attachments apart and reviewing only the content of a final near-duped set.
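
A very simplified version of inclusive-only filtering can be sketched as follows. It assumes each message carries a thread position string in an index field (for example a hex-encoded Conversation Index, where a reply's value begins with its parent's value) and a has_attachments flag; both field names are assumptions. Real review tools also compare message text, which this sketch does not do, so an email whose quoted text was altered in a later reply would not be caught.

```python
def inclusive_only(messages):
    """Keep messages that end a thread branch or that carry attachments.

    messages: list of dicts with 'index' (str) and 'has_attachments' (bool).
    A message has a descendant when another index extends its own index.
    """
    indexes = [m["index"] for m in messages]
    keep = []
    for m in messages:
        has_descendant = any(
            other != m["index"] and other.startswith(m["index"])
            for other in indexes
        )
        if m["has_attachments"] or not has_descendant:
            keep.append(m)
    return keep
```

Comparing the size of the filtered set to the full set is a quick way to judge whether a thread-based review is likely to save time for a given collection.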

Email Collection

Historically, the use of the SMTP and POP protocols meant that email could be on multiple devices in different forms and might not exist in a central location. If this is the case, eDiscovery collection should be considered for every device that potentially holds email. Most modern email is accessed via a syncing protocol, so that sent email, drafts, calendar appointments and received email exist both on devices and on a central server. In addition, other items that contain useful information may be stored on email servers for a given user, such as text messages, loose files and journals.

In the past, email collections were performed by downloading the mail using IMAP, POP, or the Exchange connector. Collection with either the platform's embedded tools or third-party tools is becoming more common.

Email Retention and Deletion

Much like files on a computer, deleting an email may not mean it is gone. Deleted emails move to a trash can or recycle bin, and in modern email the deletion on a user's computer syncs with the server, so the item moves to the trash can on all connected devices. Once the item is deleted from the trash can, it is removed from the user's view of their mailbox; however, it may not be removed from the server. The settings on the server determine whether and when an email is deleted from the server. Note that once it is removed from the server, it may still exist in external email archives, backups, exports, and even in the free space of local computers.

In the last few years, the vocabulary for how email is kept has shifted from IT terms (tombstoning) to legal terms of art (litigation hold), which allows better communication between technology administrators and legal professionals. It still takes someone with both IT know-how and legal knowledge to fully understand and operate the search, recall and export functions of modern email platforms.

Absent a corporate policy, most users tend to keep email indefinitely and are selective about what they delete (such as spam). If a litigation hold only covers older emails, a simple export of the mailbox may suffice instead of preventing future deletions. However, if the litigation hold is ongoing, you may have to keep all email, including spam. In this case, settings will need to be changed to preserve all email and prevent future deletions. Some policies require and enforce the deletion of email after a set time frame. Care must be taken to ensure these policies follow the regulatory retention requirements that apply to the industry (HIPAA, SOX, GLB, etc.) and that deletion is suspended in the case of a litigation hold.
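
The interaction between a time-based deletion policy and a litigation hold comes down to a simple rule: the hold always wins. The sketch below is only an illustration of that rule. The function name and parameters are hypothetical, and real platforms apply this through their own retention and hold settings rather than custom code.

```python
from datetime import datetime, timedelta, timezone


def eligible_for_deletion(received, retention_days, on_hold, now=None):
    """Return True only if the email is past retention and not under a hold.

    received: timezone-aware datetime the email was received.
    retention_days: how long the policy keeps email before deletion.
    on_hold: True if the mailbox or item is subject to a litigation hold.
    """
    if on_hold:
        return False  # a hold suspends policy-driven deletion
    now = now or datetime.now(timezone.utc)
    return now - received > timedelta(days=retention_days)
```

For example, an email received 152 days ago under a 90-day policy would be eligible for deletion, but the same email under a hold would not.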

Email Security

Email security is an entire topic on its own, but briefly: most email is sent in plain text, which means that if it is intercepted it can be easily read. However, if a protocol called TLS is used on the email server, the server will attempt to send the message encrypted. This type of encryption prevents someone who intercepts the email in transit from mail server to mail server from reading its contents. When collecting email for discovery, this type of encryption generally does not interfere, as sent and received messages are stored and viewed in a readable form.
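
Although transport encryption leaves the stored message readable, it can leave a trace in the headers. Many mail servers record the keyword ESMTPS or ESMTPSA (registered in RFC 3848 for ESMTP with STARTTLS) in the Received header for a hop that used TLS. The sketch below checks each hop for those keywords. The function name is an assumption, not every server records the keyword, and headers can be altered, so the result is a hint rather than proof.

```python
import re
from email import message_from_string

# "with" protocol keywords that indicate STARTTLS was used on that hop (RFC 3848).
TLS_PROTOCOLS = {"ESMTPS", "ESMTPSA"}
WITH_TOKEN = re.compile(r"\bwith\s+([A-Za-z0-9]+)", re.IGNORECASE)


def hops_with_tls(raw_message):
    """Return one boolean per Received header, in the order they appear.

    True means the hop recorded a TLS-indicating protocol keyword.
    """
    msg = message_from_string(raw_message)
    results = []
    for received in msg.get_all("Received", []):
        # A header may contain several "with ..." phrases, so collect them all.
        tokens = {t.upper() for t in WITH_TOKEN.findall(received)}
        results.append(bool(tokens & TLS_PROTOCOLS))
    return results
```

Received headers are added top-down, so the first entry in the result corresponds to the last hop before delivery.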

End users also have the option to encrypt individual emails. This is most common in banking and other financial institutions, which send an email containing a link that routes you to a website where you retrieve the message. In this case, the message content itself is never sent by email; only the notification is. Other technologies may be more secure but are less convenient and limit the types of devices that can be used. The most common of these is PGP, which requires pre-configuration by both the sending and receiving parties. These types of encryption do impact collecting email for eDiscovery, and passwords or keys may be needed to open the email.
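
During processing it helps to flag encrypted messages early so that keys or passwords can be requested. The sketch below looks for three common markers: the PGP/MIME content type defined in RFC 3156, an S/MIME enveloped-data content type, and an inline PGP block in a plain body. The function name and the returned labels are assumptions for this example, and the checks are not exhaustive.

```python
from email import message_from_string

PGP_ARMOR = "-----BEGIN PGP MESSAGE-----"


def encryption_type(raw_message):
    """Return a label for the kind of message-level encryption found, or None."""
    msg = message_from_string(raw_message)
    ctype = msg.get_content_type()

    # PGP/MIME: multipart/encrypted with the PGP protocol parameter.
    if ctype == "multipart/encrypted" and \
            msg.get_param("protocol") == "application/pgp-encrypted":
        return "PGP/MIME"

    # S/MIME: enveloped-data means encrypted (signed-data would only be signed).
    if ctype in ("application/pkcs7-mime", "application/x-pkcs7-mime") and \
            msg.get_param("smime-type") == "enveloped-data":
        return "S/MIME"

    # Inline PGP: an armored block pasted into an ordinary text body.
    if not msg.is_multipart():
        payload = msg.get_payload()
        if isinstance(payload, str) and PGP_ARMOR in payload:
            return "PGP inline"

    return None
```

Messages delivered through a secure web portal would not be flagged by this check, because the email itself is an ordinary notification with a link.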


Email in its classic definition has not changed since it became common in the late 1990s. How it is used by the end user, and the servers and clients behind it, have all changed dramatically, bringing centralized storage, better control for litigation holds, and more metadata and information available for use in eDiscovery. This means far more potentially relevant information, but also more information to review during discovery, a problem solved by careful planning and consideration during eDiscovery projects.