Guest post by Doug Austin, Editor of eDiscovery Today
Unless you live under a rock, you’ve probably been inundated lately with stories about Artificial Intelligence (AI) and machine learning. The introduction of OpenAI’s ChatGPT late last year has thrust AI – particularly generative AI – into the spotlight even more than it already was (which is saying a lot). I’ve been covering it so much lately that I’m wondering if I shouldn’t change the name of my blog to Artificial Intelligence Today!
From an eDiscovery and forensics perspective, organizations have been leveraging AI algorithms to address Big Data challenges for years now – without them, we would be drowning in data. AI algorithms have become required components of any legal technology toolset today, and the ability to apply AI to more eDiscovery and forensics use cases than ever creates significant benefits!
However, AI also creates significant challenges. With close to a billion users a month now on ChatGPT, chances are you’ve at least tried it, as it can do some amazing things and produce some impressive content. But ChatGPT isn’t the only generative AI solution by any means – it’s simply the one that has gotten the most coverage recently, in large part because of its ease of use and its virtually immediate delivery of responses to inquiries. Generative AI solutions can generate all types of content – text, images, audio and video – and the content they generate is creating challenges for organizations in authenticating data, which has magnified the importance of eDiscovery and forensics even more.
Let’s look at the eDiscovery and forensics use cases that AI and machine learning technology can help address, as well as challenges that AI creates that eDiscovery and forensics can help address.
AI Use Cases for Digital Forensics and eDiscovery
For eDiscovery, the most well-known use case of machine learning over the years has been predictive coding/technology assisted review (TAR). But there are several other use cases for AI/machine learning technology that can help streamline eDiscovery and forensics, including:
- Personally Identifiable Information (PII) Identification: Machine learning algorithms can be applied to help identify PII, which is a key focus for eDiscovery workflows to support incident response after a data breach in helping identify individuals affected by the breach.
- Auto Redaction: Similar AI algorithms can be used to automatically apply redactions to PII or other sensitive or privileged information within a document collection.
- Bulk Document Classification: AI and machine learning can be used to perform bulk document classification on a data set to identify characteristics, such as determining what user-created information exists on a storage device.
- User Behavior Analysis: Sentiment analysis is a natural language processing (NLP) technique used to determine whether data is positive, negative or neutral. It can lead to determining emotion, such as whether a subject is happy, sad or angry. Sentiment analysis can help to prioritize documents for review, such as documents with a strong negative sentiment that might be more likely to indicate evidence of wrongdoing.
- Conversion of Data to Text: AI algorithms can even help convert non-textual data (such as images or audio files) into a text format that can be more easily analyzed (e.g., determine which images have a red car, identify relevant information quickly in audio files, etc.).
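To make the first two use cases above a little more concrete, here is a minimal Python sketch of PII identification and auto redaction. It is only an illustration: it swaps in simple regular expressions where a real tool would use trained machine learning models, and the pattern set and the function names (`find_pii`, `redact`) are hypothetical, not taken from any particular product.

```python
import re

# Hypothetical patterns for illustration only. Real PII tools rely on trained
# models and much broader coverage than these three regular expressions.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def find_pii(text):
    """Return (label, matched_text) pairs for every pattern hit, in document order."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group()))
    return [(label, value) for _, label, value in sorted(hits)]


def redact(text):
    """Replace each pattern hit with a bracketed label such as [SSN]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Contact jane.doe@example.com or 555-123-4567; SSN 123-45-6789."
print(find_pii(sample))
print(redact(sample))
```

Keeping identification and redaction as separate steps mirrors how the two use cases differ in practice: the hit list can feed a breach notification workflow, while the redacted text is what gets produced.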
How eDiscovery and Forensics Are Helping Address AI Challenges
Of course, what AI “giveth”, AI tries to “taketh” away in the form of challenges that it creates. One of the biggest challenges can be summed up with one word: Deepfakes. As generative AI tools are getting more sophisticated for creating manipulated audio, video, or image content, deepfakes can make it challenging to differentiate between authentic and manipulated evidence.
In other words, the potential of deepfakes in evidence for discovery and investigations today is a complete game changer when it comes to the evidence that might be produced or analyzed. Not only is there a need to identify evidence that might be a deepfake, but there is also a growing need to simply prove that evidence isn’t a deepfake. A recent paper I covered (The GPTJudge: Justice in a Generative AI World, written by Maura R. Grossman, Paul W. Grimm, Daniel G. Brown, and Molly (Yiming) Xu) discusses concern about the emergence of “the deepfake defense”: the idea that, as people become more aware of how easy it is to manipulate audio and visual evidence, defendants will use that skepticism to their benefit. The lack of deepfake evidence in a case doesn’t exclude the possibility that deepfakes will still be an issue in the case in the age of generative AI.
So, what to do? This is where forensics can help draw the line through authentication of evidence. Metadata has become so important in recent years in many cases I’ve covered. In last year’s Johnny Depp/Amber Heard trial, the metadata associated with the famous Amber Heard bruise photo showed that the file was saved in “Photos 3.0” – a photo editing software program – instead of the iOS version associated with the iPhone 6, which certainly cast doubt on the authenticity of the photo.
There was also the Rossbach case from a couple of years ago – what I like to call the “smoking emoji” case – where the lack of metadata (and the use of an emoji in the fabrication of a text exchange that wasn’t possible with the plaintiff’s version of iOS) was critical in the case being dismissed.
While those aren’t deepfake examples, they illustrate the importance of metadata in authenticating evidence. Forensics will be the key to analyzing that evidence – especially the metadata – to determine which evidence is a deepfake and which isn’t. The role of eDiscovery in that process will involve working with experts who understand the issues, how to evaluate the evidence, and how to request ESI in the proper forms of production to ensure that forensic analysis can be conducted.
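The kind of metadata check at issue in those two cases can be sketched in a few lines of Python. This is a simplified illustration, not a forensic method: it assumes the metadata has already been extracted into a dictionary by a forensic tool, and the function name (`flag_metadata`), the field names, and the sample values are all hypothetical.

```python
def flag_metadata(metadata, expected_software,
                  required_fields=("Software", "DateTimeOriginal")):
    """Return a list of flags for an examiner; an empty list means nothing was flagged.

    metadata          -- dict of already-extracted metadata (hypothetical field names)
    expected_software -- set of Software values consistent with the claimed device
    """
    flags = []

    # Missing metadata is itself worth a closer look.
    for field in required_fields:
        if not metadata.get(field):
            flags.append(f"missing field: {field}")

    # A Software value that does not match the claimed device suggests the
    # file was saved by some other program.
    software = metadata.get("Software")
    if software and software not in expected_software:
        flags.append(f"unexpected software: {software}")

    return flags


# Made-up sample values for illustration only.
expected = {"9.3.1"}
print(flag_metadata({"Software": "9.3.1", "DateTimeOriginal": "2016:05:21 19:04:11"}, expected))
print(flag_metadata({"Software": "Photos 3.0", "DateTimeOriginal": "2016:05:21 19:04:11"}, expected))
print(flag_metadata({}, expected))
```

A flag here proves nothing on its own. It simply marks a file for closer examination by a forensic expert, which is why requesting ESI in a form of production that preserves metadata matters so much.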
Expect a Bumpy Ride
To paraphrase Bette Davis in All About Eve, “fasten your seatbelts, it’s going to be a bumpy ride!” AI and machine learning technology is going to change – rapidly. Forget the word “evolve” – that word is too slow to reflect how quickly things will change when it comes to the benefits and challenges of AI.
Regardless of how quickly things change, eDiscovery and forensics will be more important than ever – either in terms of how AI can be applied to eDiscovery and forensics use cases, or how eDiscovery and forensics can help organizations address the challenges that AI brings in terms of authentication of evidence. When you talk about eDiscovery and forensics today, their impact on AI – and vice versa – must be part of the discussion.
Doug Austin is the Editor of eDiscovery Today and an established eDiscovery thought leader with over 30 years of experience providing eDiscovery best practices, legal technology consulting, and technical project management services to numerous commercial and government clients.