Story Engine™ Success Story: Understanding Massive Volumes of Emails without Metadata

In an ideal world, we’d only deal with clean, nicely labeled data. Unfortunately, that isn’t the reality of a world where new data is created every second. I thought it would be interesting to share a story from one of the companies using Story Engine™ because it showcases just how powerful natural language processing (NLP) and machine learning are for understanding huge amounts of data, even when that data isn’t nicely formatted and labeled.

When time is of the essence, organizations can't wait days to understand their data.

The Problem: Short Deadline, No Metadata

The company works with law firms to extract information relevant to their cases. In one of their recent projects, they received a request to identify people who would need to be contacted for depositions. They had to have a list of important people ready within a few days, so they had to act quickly.

The problem? The data set they received had absolutely no metadata, and it contained hundreds of thousands of emails as well as attachments. This would make it impossible for traditional legal technology tools to efficiently recognize who the people involved in the communications were. 

In fact, without Story Engine, the company would have had to open each email and read it line by line to work out who was sending and receiving it and whether it was relevant to the case.

How Story Engine Understands Unstructured Data

When Story Engine looks at an unstructured data set, it uses NLP technology to understand the context within the entire data universe. This means it can identify key facts, such as who’s communicating and what they’re talking about, even when there is no metadata to build a communications analysis from.

One of the ways Story Engine can do this is through its sophisticated entity extraction technique: it looks beyond a single document to understand the context around people and names. This enables our AI solution to identify not only names but also any aliases or nicknames that refer to the same person within a data set. All of this is done automatically, so users can then turn to Story Engine’s visual exploration tool to focus on high-profile communicators (even if they've used different emails or aliases) or filter by numerous other Features (e.g., emotional sentiment, behavioral anomalies, high-pressure conversations).
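To make the idea of alias resolution a little more concrete, here is a deliberately tiny Python sketch. It is not how Story Engine works internally; it only illustrates the general idea of mapping different spellings of a name (and email addresses) to one person. The nickname table and the function names `canonical_key` and `group_aliases` are hypothetical and exist only for this illustration.

```python
import re
from collections import defaultdict

# Hypothetical nickname table, assumed for illustration only.
# A real system would need a far larger, context-aware mapping.
NICKNAMES = {"bob": "robert", "rob": "robert", "liz": "elizabeth", "beth": "elizabeth"}


def canonical_key(mention):
    """Reduce a name or email address to a rough (first, last) key."""
    text = mention.lower().strip()
    if "@" in text:
        # Keep only the local part of an email address, e.g. "bob.smith".
        text = text.split("@", 1)[0]
    if "," in text:
        # Turn "Smith, Robert" into "robert smith".
        last, first = [part.strip() for part in text.split(",", 1)]
        text = f"{first} {last}"
    # Split on anything that is not a letter (spaces, dots, underscores, digits).
    parts = [p for p in re.split(r"[^a-z]+", text) if p]
    if not parts:
        raise ValueError(f"no name found in {mention!r}")
    first, last = parts[0], parts[-1]
    return NICKNAMES.get(first, first), last


def group_aliases(mentions):
    """Group raw mentions that appear to refer to the same person."""
    groups = defaultdict(set)
    for mention in mentions:
        groups[canonical_key(mention)].add(mention)
    return dict(groups)


if __name__ == "__main__":
    found = ["Robert Smith", "Smith, Bob", "bob.smith@example.com", "Liz Jones"]
    for person, aliases in group_aliases(found).items():
        print(person, sorted(aliases))
```

A rule-based key like this breaks down quickly on real data (shared surnames, initials, misspellings), which is exactly why looking at context across the whole data set matters.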

Story Engine Shines a Light on Dark Data

Can you see your dark data?

In this case, the company used Story Engine's communication analysis functionality to identify the top 20 communicators, who they communicated with, and what these conversations were about. This enabled them to quickly identify who communicated the most and let them focus on emails that were relevant to their client.
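As a rough illustration of what a "top communicators" ranking involves, the Python sketch below counts how often each person appears as a sender or recipient and how often each pair communicates. It assumes the senders and recipients have already been inferred from the email text, which is the hard part that Story Engine handled in this story. The function name `top_communicators` and the input format are assumptions made for this example, not part of the product.

```python
from collections import Counter


def top_communicators(messages, n=20):
    """Rank people by how many messages they sent or received.

    `messages` is an iterable of (sender, recipients) pairs, where
    `recipients` is a list of names. The pairs are assumed to have been
    inferred from the email text already.
    """
    activity = Counter()  # messages sent or received, per person
    pairs = Counter()     # messages exchanged, per unordered pair of people
    for sender, recipients in messages:
        activity[sender] += 1
        for recipient in recipients:
            activity[recipient] += 1
            pairs[frozenset((sender, recipient))] += 1
    return activity.most_common(n), pairs


if __name__ == "__main__":
    sample = [
        ("alice", ["bob"]),
        ("alice", ["carol", "bob"]),
        ("bob", ["alice"]),
        ("alice", ["carol"]),
    ]
    ranking, pairs = top_communicators(sample, n=2)
    print(ranking)
    print(pairs.most_common(1))
```

Counting the pairs alongside individual activity shows who talks to whom, which is what lets a reviewer narrow in on the conversations most relevant to a client.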

With Story Engine, the team was able to recreate the information missing from metadata within a matter of hours. This allowed them to create a list of people who would likely need to be contacted for depositions before they even started looking at the actual content of the emails.

This may be a legal technology story, but the concept behind it has far broader implications. After all, how much unstructured data does your organization have, and how much of it is actually being used to generate value?

The biggest advantage that NLP and machine learning bring to unstructured data is adaptability. We’ve been able to train Story Engine on just about any kind of unstructured data because it understands more than individual words: it can understand complex concepts and their relationships within massive communications data sets.

Are you interested in learning more about what artificial intelligence can do? Schedule a demo to see Story Engine in action.

Not sure where to start with AI? You can always contact our team of AI, machine learning, and natural language processing experts to help find the best answer to your legal technology and business communication challenges.