By Rupa Bhatt*
Electronic discovery is a multi-stage process of collecting, processing, analyzing, searching and reviewing electronically stored information. Yet all these stages are directed towards two basic tasks: identifying relevant information and preventing the disclosure of privileged information.
For cases involving significant collections of ESI, a key stage in the accomplishment of these two tasks is review. Even as search and analytics enable more sophisticated identification of documents through computer technology, there is still the need in virtually every case for eyes-on review by trained reviewers.
But if, as the saying goes, “Beauty is in the eye of the beholder,” the same can be said for relevance. It is a slippery concept and its application is somewhat subjective. As reviewers move from document to document, they make judgment calls about each one’s relevance. Their judgment calls are not always perfect or consistent.
The fact of the matter is that, in e-discovery document review, a certain number of inconsistent review calls are not only to be expected, but are unavoidable. Given this, how can you prepare to keep these inconsistencies from affecting your outcomes?
Later in this post, I will offer some suggestions on best practices to follow in order to minimize the impact of inconsistent relevance determinations. But first, some background.
Guideposts for Determining Relevance
The federal rules provide the two primary guideposts that we follow in defining relevance in discovery and at trial:
FRCP Rule 26: Duty to Disclose; General Provisions Governing Discovery. (b) Discovery Scope and Limits. (1) Scope in General. Unless otherwise limited by court order, the scope of discovery is as follows: Parties may obtain discovery regarding any nonprivileged matter that is relevant to any party’s claim or defense. … Relevant information need not be admissible at the trial if the discovery appears reasonably calculated to lead to the discovery of admissible evidence.
FRE Rule 401: Definition of “Relevant Evidence.” “Relevant evidence” means evidence having any tendency to make the existence of any fact that is of consequence to the determination of the action more probable or less probable than it would be without the evidence.
Given the broadness of these definitions, the determination of relevance can be a highly subjective judgment call. At the least, it is dependent on the case and on the counsel or reviewer. And the impact of that judgment call can be wide and varied. It can affect every subsequent stage of the e-discovery and document review process.
A reviewer’s subjective determination on one document often shapes the relevance determination for other, similar documents. Given this, one can always question the relevance assessment criteria, whether human and subjective or electronic and objective, applied at every stage of document review. Inconsistent judgment calls in coding documents are often referred to as “coding inconsistencies” or “bad coding calls.” Such inconsistencies can have serious consequences, ranging from production of irrelevant or privileged documents to the imposition of judicial sanctions such as an adverse inference.
How Reviewers Determine Relevance
Relevance does not behave. People behave. Although various factors can contribute to deviations in relevance determinations, the single most important factor is the subjectivity of the reviewer.
A paper produced by the TREC 2008 Legal Track described it this way:
While the ultimate determination of responsiveness (and whether or not to produce a given document) is a binary decision, the breadth or narrowness with which “responsiveness” is defined is often dependent on numerous subjective determinations involving, among other things, the nature of the risk posed by production, the party requesting the information, the willingness of the producing party to face a challenge for underproduction, and the level of knowledge that the producing party has about the matter at a particular point in time. Lawyers can and do draw these lines differently for different types of opponents, on different matters, and at different times on the same matter. This makes it exceedingly difficult to establish a “gold standard” against which to measure relevance/responsiveness and explains why document review cannot be completely automated.
Hence, the interplay of all these factors defines and directs the outcome of the review itself. Some examples of “influencers” that may impact the relevance determinations include:
Human/subjective factors. These factors can include the prior experience and knowledge of the reviewers as a whole (e.g., Is this an off-shore review team unfamiliar with U.S. litigation? Is the review team trained on the review platform?), as well as individual reviewers’ ability to grasp ideas and concepts, and their perceptions, emotional state, cultural norms, etc.
Instead of “subjective,” it may be more appropriate to say that discovery involves judgment about the situation as well as about the documents and their contents. Some judgments bias the reviewer to be more inclusive and some bias the reviewer to be less inclusive, but these judgments are not made willy-nilly. As opposed to pure errors, which are random, these judgment calls are based on a systematic interpretation of the evidence and the situation.
Objective factors. These factors can include relevance criteria as presented by lead counsel (e.g., How complex is the review in terms of numbers of issues and parties? How many reviewers are there? What is the timeline? Is the review happening in different geographical locations and time zones? What is the review workflow? Is it well formulated?).
Technology factors. These can include the review platform (e.g., How easy is it for counsel and reviewers to navigate through documents? How fast is the tool? Are there any technological limitations hindering the review?).

Figure 1
Ultimately, the determination of document relevance involves the interplay among these three factors and can be represented in many ways, as illustrated by Figure 1 and Figure 2.

Figure 2
Apart from the above factors, the reviewer is also working out a pattern while trawling through the documents. This happens intuitively and at a subconscious level, and it draws on the matter, the training and the information provided before the start of review.
Research indicates that reviewers look for clues and triggers in the document. While there is no wide consensus, on a general level, we can group these triggers into the following categories:
- Overt and latent semantic content: Topic, quality, depth, scope, treatment, clarity.
- Object: Characteristics of the document, e.g. type, organization, representation, format, availability, accessibility, information flow, threads.
- Validity: Accuracy of information provided, authority, trustworthiness of sources, verifiability with coding docket.
- Situational match: Appropriateness to situation or tasks, urgency.
- Belief match: Credence given to information, acceptance as to truth, reality, confidence.
Reviewers’ Learning Curve as a Factor
As a reviewer progresses through documents, the reviewer’s knowledge of the matter and comfort with the review platform improve. As they do, the accuracy of document relevance determinations also improves.
We typically see more coding inconsistencies within the first five days of a review project. During this time, the reviewers are learning the matter as well as getting a feel for the underlying documents. A reviewer’s understanding and knowledge of the review platform can either benefit or hinder the ability to accurately assess document relevance.
By way of illustration, many of you might have answered market research surveys. Observe that a question might have been asked three different times using different terms and in different sections of the questionnaire. Ever wondered why? It is because the researcher is checking the reliability and consistency of the individual’s response over time. Invariably, there is an inconsistency observed in the individual’s response.
Another example is the study conducted by Ellen M. Voorhees (2000) with TREC data. One set of reviewers was asked to review a random sample of documents and assess them for relevance pursuant to pre-defined criteria. A second team then assessed the same set of documents (already coded for relevance). On analyzing the results, it was determined that, of the documents considered relevant by the first set of reviewers, only 80% were also considered relevant by the second set of reviewers. So what changed? The documents and the pre-defined criteria were the same, but the reviewers were different.
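To make that kind of figure concrete, here is a minimal sketch of how an agreement rate between two review teams could be computed. It is an illustration only, not the method used in the study: the function name, the document IDs and the coding calls are all made up, and the calls are assumed to be stored as simple document-to-decision mappings.

```python
def overlap_of_relevant(first_calls, second_calls):
    """Share of documents the first team coded relevant that the
    second team also coded relevant.

    Both arguments are hypothetical dicts mapping a document ID to
    True (relevant) or False (not relevant).
    """
    first_relevant = {doc for doc, relevant in first_calls.items() if relevant}
    if not first_relevant:
        raise ValueError("the first team coded no documents as relevant")
    # A document missing from the second team's calls counts as not relevant.
    agreed = {doc for doc in first_relevant if second_calls.get(doc, False)}
    return len(agreed) / len(first_relevant)


# Made-up coding calls for six documents (not data from any study).
first_team = {
    "DOC-001": True, "DOC-002": True, "DOC-003": True,
    "DOC-004": True, "DOC-005": True, "DOC-006": False,
}
second_team = {
    "DOC-001": True, "DOC-002": True, "DOC-003": True,
    "DOC-004": True, "DOC-005": False, "DOC-006": False,
}

rate = overlap_of_relevant(first_team, second_team)
print(f"{rate:.0%}")  # 80%
```

In this toy data the second team agrees on four of the five documents the first team called relevant. Tracking a number like this over the course of a review is one way to see whether coding calls are converging.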
Best Practices to Prevent Inconsistent Relevance Assessments
By now you get the picture. Relevance is a slippery and subjective concept. As I said at the outset, a certain number of inconsistent review calls are not only to be expected, but are unavoidable. That does not mean, however, that you should throw your hands up in defeat. There are various best practices you should employ in review in order to minimize the number and impact of relevance miscalls.
- Plan, plan, plan. It is imperative to walk through a review workflow and to integrate counsel into each stage of the quality control and assurance process.
- Provide repeatable, detailed case training and easily accessible case documents. If you use an outside review team, consider recording their training session. The recording will always help the team revisit the issues that may come up repeatedly and can be used as a training tool for new reviewers.
- Choose a good team. In most cases, the team of reviewers can either win or lose the case. Hence, choose wisely.
- Implement a thorough review workflow. The workflow should include escalation points. Supplement it by creating a detailed review journal outlining your expectations for reviewers. Clearly identify the issues and the criteria for relevance. Make certain to find example documents to use for training. Address the logistics of the review, including how quickly reviewers should move through documents and how they should handle parent–child relationships, email threads, near duplicates, redactions and annotations. Our suggestion is to create “cheat sheets” for easy reviewer reference throughout the review project.
- Identify a senior attorney as a dedicated subject-matter expert. During the initial days of a review, it is essential to follow strict quality control procedures to identify inconsistencies in coding. Involving counsel in the initial days (to answer questions, offer insight, clarify what may seem like minor points, participate in “hands on” quality control, and provide further ad hoc, on-demand training) can offer immense benefits in ensuring accurate and consistent document coding.
- Build in tools for quality control. Your review process should include QC checks for coding calls regarding issues, privilege, responsiveness, etc. The process should provide timely reports of review quality, accuracy, speed, code reversals, etc. The system should provide alerts for inconsistent coding across families. If a parent document is tagged responsive, the tool should be able to tag all the family members as responsive (or something similar). It is a good practice to use automated rules and intelligent auto-coding methodologies, as well as bulk updates. To ensure uniformity and prioritize review, consider leveraging software analytics, such as classifiers, clustering and e-mail threading.
- Second-level QC plan. This should be developed by senior attorneys, specifically to address the case at hand.
- Review audits. Conduct these periodically and document findings, successes and failures.
- Employ sampling. Sample documents extensively for relevance and privilege.
- Post-review audit. You should conduct an extensive post-review audit and QC check.
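Two of the practices above, alerts for inconsistent coding across families and sampling for QC, lend themselves to a short sketch. The example below is only an illustration under stated assumptions: the record layout (doc_id, family_id, responsive), the function names and the sample data are hypothetical and do not describe any particular review platform.

```python
import random
from collections import defaultdict


def find_inconsistent_families(docs):
    """Return the IDs of families whose members do not all carry
    the same responsiveness tag."""
    tags_by_family = defaultdict(set)
    for doc in docs:
        tags_by_family[doc["family_id"]].add(doc["responsive"])
    return sorted(fam for fam, tags in tags_by_family.items() if len(tags) > 1)


def draw_qc_sample(docs, sample_size, seed=0):
    """Draw a simple random sample of coded documents for second-level QC.

    A fixed seed (an assumption made here) keeps the sample reproducible,
    so the audit can be documented and repeated.
    """
    rng = random.Random(seed)
    return rng.sample(docs, min(sample_size, len(docs)))


# Hypothetical coded documents: family A has mixed tags, B and C do not.
docs = [
    {"doc_id": "A-1", "family_id": "A", "responsive": True},
    {"doc_id": "A-2", "family_id": "A", "responsive": False},
    {"doc_id": "B-1", "family_id": "B", "responsive": True},
    {"doc_id": "B-2", "family_id": "B", "responsive": True},
    {"doc_id": "C-1", "family_id": "C", "responsive": False},
]

flagged = find_inconsistent_families(docs)
sample = draw_qc_sample(docs, sample_size=2)

print(flagged)      # ['A']
print(len(sample))  # 2
```

Flagged families would go back to a reviewer or to the subject-matter expert, and the sampled documents would be re-reviewed, with any reversals logged for the audit.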
These best practices won’t make relevance any less slippery a concept to nail down. But they can help prevent the review process from slipping out of your control.
*Rupa Bhatt is employed by Catalyst as a member of the Catalyst Consulting team.