E-Discovery Search Blog

Federal Court Affirms Judge Peck’s Predictive Coding Order

When last we left the case of Da Silva Moore v. Publicis Groupe–the groundbreaking case in which U.S. Magistrate Judge Andrew J. Peck issued the first judicial opinion to endorse the use of computer-assisted review and predictive coding–it was headed for review by U.S. District Judge Andrew L. Carter Jr. Now, thanks to a heads-up from Evan Koblentz at Law Technology News, we learn that Judge Carter has issued his ruling and has adopted Judge Peck’s opinion.

“The Court adopts Judge Peck’s rulings because they are well reasoned and they consider the potential advantages and pitfalls of the predictive coding software,” Judge Carter wrote in an opinion filed today.

In challenging Judge Peck’s order, the plaintiffs had argued that he had mischaracterized and confused the issue of whether they had consented to the use of predictive coding. Judge Carter concluded that any such confusion was immaterial.

The confusion is immaterial because the ESI protocol contains standards for measuring the reliability of the process and the protocol builds in levels of participation by Plaintiffs. It provides that the search methods will be carefully crafted and tested for quality assurance, with Plaintiffs participating in their implementation. For example, Plaintiffs’ counsel may provide keywords and review the documents and the issue coding before the production is made. If there is a concern with the relevance of the culled documents, the parties may raise the issue before Judge Peck before the final production. Further, upon the receipt of the production, if Plaintiffs determine that they are missing relevant documents, they may revisit the issue of whether the software is the best method.

Plaintiffs also challenged Judge Peck’s order on the ground that predictive coding is not a reliable method. Judge Carter ruled that this issue is also premature. As the litigation continues, if the parties believe the predictive coding software is flawed or that the process produces incomplete results, they can raise their concerns with Judge Peck and ask him to reconsider, Judge Carter noted. “To call the method unreliable at this stage is speculative.”

There simply is no review tool that guarantees perfection. The parties and Judge Peck have acknowledged that there are risks inherent in any method of reviewing electronic documents. Manual review with keyword searches is costly, though appropriate in certain situations. However, even if all parties here were willing to entertain the notion of manually reviewing the documents, such review is prone to human error and marred with inconsistencies from the various attorneys’ determination of whether a document is responsive. Judge Peck concluded that under the circumstances of this particular case, the use of the predictive coding software as specified in the ESI protocol is more appropriate than keyword searching. The Court does not find a basis to hold that his conclusion is clearly erroneous or contrary to law.

As to that secondary issue I mentioned in an earlier blog post–whether Rule 702 and Daubert apply to a court’s acceptance of a predictive-coding protocol–Judge Carter made short work of that. In a footnote, he wrote: “The Court adopts Judge Peck’s analysis of Rule 26(g) and Fed. R. Evidence 702 for similar reasons provided in his written opinion.”

Thus, Judge Peck’s predictive coding order has stood its ground and, with Judge Carter’s adoption of his reasoning, the use of predictive coding has taken another giant step towards the mainstream.

Spoliation Meets the Backyard Barbecue: The Bonfire of Veracity

We don’t often write about spoliation cases here, but some cases are so burning hot that they ignite our interest. Such is the recent ruling in an employment discrimination case from U.S. District Chief Judge William H. Steele of the Southern District of Alabama, Evans v. Mobile County Health Department.

The plaintiff, Sandra Evans, appealed to Judge Steele from a finding of a U.S. magistrate judge. The magistrate concluded that she had purposefully destroyed electronic evidence contained on her personal computer during the pendency of the litigation and that the destruction amounted to spoliation, deserving of sanctions. The plaintiff was clearly hot under the collar about this, asserting that her PC contained no discoverable evidence.

Judge Steele made short work of the plaintiff’s assertion. There was abundant evidence that her PC contained information that was potentially discoverable, he concluded. This included emails she had admitted forwarding from her work computer to her home computer and a daily work diary she kept on her home computer.

But the one fact that seemed to most burn the judge–and that turned the plaintiff’s arguments to ashes–was the method by which she destroyed the evidence.

She took her home computer out into her yard and set it ablaze.

How did she explain this rather unusual act? Her computer had crashed, she contended, so she set it on fire in order to keep tax information it contained out of the hands of third parties.

Given all this, the judge’s response was rather cool. Her explanation was “facially implausible,” he said, “leaving only the destruction of discoverable information as a likely motive.” Sanctions affirmed.

Had I been the plaintiff’s counsel, I would have cited that famous Sedona Conference e-discovery document in her defense. You know the one, The Conflagration Proclamation.

Should the ‘Daubert’ Standard Apply to Predictive Coding? We May Know Soon

It’s been a month since U.S. Magistrate Judge Andrew J. Peck issued his seminal opinion on predictive coding, Da Silva Moore v. Publicis Groupe, and it continues to make waves. Notably, it appears that U.S. District Judge Andrew L. Carter Jr. will weigh in on the issue. On March 13, he entered an order granting plaintiffs’ request to submit additional briefing on their objections to Judge Peck’s order.

A key issue Judge Carter may need to address is one given short shrift in coverage of and commentary on Judge Peck’s opinion. Understandably, most of the commentary focused on the fact that Judge Peck’s opinion marked a milestone — the first judicial opinion to recognize that computer-assisted review is an acceptable way to search for electronically stored information.

But in the course of that opinion, Judge Peck made another significant ruling. He concluded that Federal Rule of Evidence 702 and the Supreme Court’s decision in Daubert v. Merrell Dow Pharmaceuticals do not apply to a court’s acceptance of a predictive-coding protocol.

Rule 702 and Daubert give trial judges the responsibility to act as “gatekeepers” to exclude unreliable scientific and technical expert testimony. Judge Peck reasoned that these did not apply to the Da Silva Moore case because no one was trying to put anything into evidence. Here is how he explained it:

If MSL sought to have its expert testify at trial and introduce the results of its ESI protocol into evidence, Daubert and Rule 702 would apply. Here, in contrast, the tens of thousands of emails that will be produced in discovery are not being offered into evidence at trial as the result of a scientific process or otherwise. The admissibility of specific emails at trial will depend upon each email itself (for example, whether it is hearsay, or a business record or party admission), not how it was found during discovery.

Rule 702 and Daubert simply are not applicable to how documents are searched for and found in discovery.

You may recall that before Judge Peck issued his written opinion in this case on Feb. 22, he made oral rulings at the motion hearing on Feb. 8. On Feb. 22, just as Judge Peck was issuing his written opinion, the plaintiffs filed objections to his Feb. 8 rulings. One of their central arguments was that Judge Peck erred in disregarding his gatekeeper role under Daubert.

Because predictive coding is a novel technology, they argued, Judge Peck should have required expert testimony regarding its reliability or appropriateness. They cited Magistrate Judge Paul Grimm’s well-known ruling in Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 260 n.10 (D. Md. 2008), where he said, “[R]esolving contested issues of whether a particular search and information retrieval method was appropriate … involves scientific, technical or specialized information.” Relying on this, the plaintiffs argued:

[A]t no point did the Magistrate review any evidence to support his decision. The Magistrate took no judicial notice of any documents or studies that support the reliability of MSL’s method, nor did he receive any affidavits or declarations from purported experts that supported the methodology of MSL’s method. To his credit, the Magistrate did ask the parties to bring the ESI experts they had hired to advise them regarding the creation of an ESI protocol. These experts, however, were never sworn in, and thus the statements they made in court at the hearings were not sworn testimony made under penalty of perjury. The Magistrate judge never asked for or evaluated the qualifications of these experts, nor were the parties given an opportunity to question or cross-examine the experts in order for the Court to make a finding regarding the reliability of the experts’ opinions. Thus, the Magistrate’s decision relies only on the arguments made by counsel.

On March 7, the defendants responded to plaintiffs’ objections. With regard to the Daubert issue, they took the same position as Judge Peck–that Rule 702 and Daubert do not apply to the methods used to take discovery, but only kick in when evidence is presented at trial.

Plaintiffs simply are incorrect in their assertion that Victor Stanley requires expert testimony regarding the methodology selected by a party to search for electronically stored information. Rather, this case only requires that the selected methodology was carefully planned by qualified persons, contains provisions for quality assurance, and is supported by persons with the requisite qualifications and experience.

After receiving the defendants’ response, the plaintiffs wrote to Judge Carter on March 9 asking for leave to file their own response.

[W]hile Plaintiffs were denied an opportunity to respond to Magistrate Judge Peck’s written opinion, MSL had the benefit of filing its opposition approximately two weeks after the written opinion had been issued. Indeed, MSL’s opposition brief and supporting expert declarations not only reference, but also largely rely upon Magistrate Judge Peck’s observations regarding Plaintiffs’ Objection.

Judge Carter granted the plaintiffs’ request on March 13. (The brief was due March 19.) That means that we can expect him to issue a ruling of his own. It seems unavoidable that any ruling he issues will address the core issue of the appropriateness of computer-assisted review, at least in this case. Most likely, he will also have to address this secondary issue of the applicability of Daubert. If he does, in fact, squarely address these issues–and regardless of whether he agrees with Judge Peck–his ruling will be yet another milestone for predictive coding.

Judge Peck Provides a Primer on Computer-Assisted Review

Magistrate Judge Andrew J. Peck issued a landmark decision in Monique Da Silva Moore v. MSL Group, filed on Feb. 24, 2012. This much-blogged-about decision made headlines as being the first judicial opinion to approve the process of “predictive coding,” which is one of the many terms people use to describe computer-assisted coding.

Well, Judge Peck did just that. As he hinted during his presentations at LegalTech, this was the first time a court had the opportunity to consider the propriety of computer-assisted coding. Without hesitation, Judge Peck ushered us into the next generation of e-discovery review—people assisted by a friendly robot. That set the e-discovery blogosphere buzzing, as Bob Ambrogi pointed out in an earlier post.

I recommend reading the decision (and its accompanying predictive-coding protocol) not for its result but for its reasoning. This is one of the best sources I have seen on the reasons for and processes underlying predictive coding. Indeed, Judge Peck provided a primer on how to conduct predictive coding that is must reading for anyone wanting to get up to speed on this process.

What is Computer-Assisted Review?

Judge Peck started by quoting from his earlier article in Law Technology News:

By computer-assisted coding, I mean tools (different vendors use different names) that use sophisticated algorithms to enable the computer to determine relevance, based on interaction with (i.e. training by) a human reviewer.

As Judge Peck concluded: “This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.”

Why Do We Need Computer-Assisted Review?

The answer for Judge Peck was simple: Other methods of finding relevant documents are expensive and less effective. As he explained:

  • The objective of e-discovery is to identify as many relevant documents as possible while reviewing as few non-relevant documents as possible.
  • Linear review is often too expensive. Although it is seen as the “gold standard,” studies show that the computerized searches underlying predictive coding are at least as accurate as human review, if not more accurate.
  • Studies also show a high rate of disagreement among human reviewers as to whether a document is relevant. In most cases, the difference is attributable to human error or fatigue.
  • Key word searches to reduce data sets also miss a large percentage of relevant documents. The typical practice of opposing parties choosing keywords resembles a game of “Go Fish,” as Ralph Losey once pointed out.
  • Key word searches are often over-inclusive, finding large numbers of irrelevant documents that increase review costs. They can also be under-inclusive, missing relevant documents. In one key study the recall rate was just 20%.
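The two complaints about keyword searching in the list above, over-inclusive and under-inclusive, correspond to the standard information-retrieval measures of precision and recall. Here is a minimal sketch of how each is computed. The document counts are hypothetical, chosen only so that recall comes out at the 20% figure cited above; they are not taken from the opinion or from any study.

```python
def recall(relevant_found: int, relevant_total: int) -> float:
    """Share of all relevant documents that the search actually retrieved."""
    return relevant_found / relevant_total


def precision(relevant_found: int, retrieved_total: int) -> float:
    """Share of the retrieved documents that turned out to be relevant."""
    return relevant_found / retrieved_total


# Hypothetical numbers, for illustration only.
relevant_total = 10_000   # relevant documents that exist in the collection
retrieved_total = 8_000   # documents the keyword search pulled back
relevant_found = 2_000    # retrieved documents that are actually relevant

print(f"recall    = {recall(relevant_found, relevant_total):.0%}")
print(f"precision = {precision(relevant_found, retrieved_total):.0%}")
```

Low recall means relevant documents were missed; low precision means reviewers spend time on irrelevant ones. A search can suffer from both at once, as these made-up numbers do.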

Ultimately, Judge Peck reminded us of the goals underlying the Federal Rules of Civil Procedure. Perfection is not required. The goal is the “just, speedy, and inexpensive determination” of lawsuits.

Judge Peck concluded that the use of predictive coding was appropriate in this case for the following reasons:

  1. The parties’ agreement.
  2. The vast amount of ESI (over 3 million documents).
  3. The superiority of computer-assisted review over manual review or keyword searches.
  4. The need for cost effectiveness and proportionality.
  5. The transparent process proposed by the parties.

The last point was perhaps the most important factor leading to the decision: “MSL’s transparency in its proposed ESI search protocol made it easier for the Court to approve the use of predictive coding.”

How Does the Process Work?

The court attached the parties’ proposed protocol to the opinion. While it does not represent the only way to do computer-assisted review, it provides a helpful look into how the process works.

  1. The process in this case began with attorneys developing an understanding of the files and identifying a small number that will function as an initial seed set representative of the categories to be reviewed and coded. There are a number of ways to develop the seed set, including the use of search tools and other filters, interviews, key custodian review, etc. You can see more on this subject below.
  2. Opposing counsel should be advised of the hit counts and keyword searches used to develop the seed set and invited to submit their own keywords. They should also be provided with the resulting seed documents and allowed to review and comment on the coding done on the seed documents.
  3. The seed sets are then used to begin the predictive coding process. Each seed set (one per issue being reviewed) is used to begin training the software.
  4. The software uses each seed set to identify and prioritize all similar documents over the complete corpus under review. The attorneys then review at least 500 of the computer-selected documents to confirm that the computer is properly categorizing the documents. This is a calibration process.
  5. Transparency requires that opposing counsel be given a chance to review all non-privileged documents used in the calibration process. If the parties disagree on tagging, they meet and confer to resolve the dispute.
  6. At the conclusion of the training process, the system then identifies relevant documents from the larger set. These documents are reviewed manually for production. In this case, the producing party reserved the right to seek relief should too many documents be identified.
  7. Accuracy during the process should be tested and quality controlled by both judgmental and statistical sampling.
  8. Statistical sampling involves a small set of documents randomly selected from the total files to be tested. That allows the parties to project error rates from the sample.
  9. Here, the parties agreed on a series of issues that will, of necessity, vary on other cases. The key point is that the parties agree on the issues and test the coding during the process.

Random Samples

It is important to create an initial random sample from the entire document set. The parties used a 95% confidence level with an error margin of 2%. They determined that the sample size should be 2,399 documents. You can figure this out using one of the publicly available sample-size calculators such as Raosoft, which we often use.
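For readers who want to see the arithmetic behind a number like 2,399, here is a rough sketch using the standard sample-size formula for a proportion with a finite-population correction. Several things here are assumptions on my part rather than details from the protocol: the population of exactly 3,000,000 documents (the opinion says only "over 3 million"), the worst-case 50% response distribution, and the rounding rule. Online calculators may differ slightly in how they round.

```python
from statistics import NormalDist


def sample_size(population: int, confidence: float = 0.95,
                margin: float = 0.02, proportion: float = 0.5) -> int:
    """Sample size for estimating a proportion, with finite-population correction.

    proportion=0.5 is the conservative worst case (it maximizes the sample).
    """
    # Two-sided z-score for the confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    x = z * z * proportion * (1 - proportion)
    n = population * x / ((population - 1) * margin ** 2 + x)
    return round(n)


# Assumed population: the opinion mentions "over 3 million" documents.
print(sample_size(3_000_000))
```

Under these assumptions the result lands on or within a document of the 2,399 figure the parties used. Note how little the answer depends on the population once it is large: the sample size is driven almost entirely by the confidence level and the margin of error.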

Seed Sets

The protocol goes on to describe a number of ways to generate seed sets including:

  • Agreed-upon search terms.
  • Judgmental analysis.
  • Concept search.

The parties frequently sampled the results from searches to evaluate their effectiveness.

There is at least one good blog post to be written about seed sets. Some computer-assisted coding systems, like the one used for this case, start their process with seed sets. The notion is that attorneys understand the cases, know what is and is not relevant and can train the system to recognize more relevant documents more effectively than starting with no seed documents.

Others think this is a mistake. They believe that however well meaning, the attorneys will bias the system to find what they think is relevant and get self-reinforcing results. In this regard, they are suggesting that the attorneys will make the same mistakes found in key word searches—thinking that you know which words will be most effective at finding your documents.

Systems following this logic urge the user to start from scratch, telling the system what is and is not relevant based on reviewing documents. As you do that, the system begins developing its own profile of relevant documents and builds out the searches. The belief is that the system may create a better search through this process than it might if you bias it with your seed documents.

There is a middle ground here as well. Many of the latter systems (no seed) will allow you to submit a limited number of seed documents as part of the training process. That may represent the best of both worlds or it may not, depending on your beliefs. The important point is that there are different approaches to computer-assisted processing. This protocol shows you one approach only.

Training Iterations

The process involves a number of computer runs to find responsive documents. The parties started with a first set of potentially relevant documents based on analysis of the seed set. After that review, the computer was asked to consider the new tagging and find a second set for testing. Then a third and a fourth.

The protocol suggested that the parties run through this process seven times. The key is to watch the change in the number of relevant documents predicted by the system after each round of testing. Once that number dropped below a delta of 5%, the parties had the option to stop. The notion is that the system has become stable by that time, with further review unlikely to uncover many more relevant documents.
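The stopping rule described above can be sketched in a few lines. This is only an illustration: I am assuming the "delta" is the relative change in the predicted relevant count from one round to the next, which is one plausible reading, and the counts for the seven rounds are invented.

```python
from typing import Optional, Sequence


def first_stable_round(counts: Sequence[int], threshold: float = 0.05) -> Optional[int]:
    """Return the 1-based round whose change versus the prior round first
    falls below the threshold, or None if the counts never settle."""
    for i in range(1, len(counts)):
        prev, curr = counts[i - 1], counts[i]
        if prev == 0:
            continue  # no meaningful relative change from a zero baseline
        delta = abs(curr - prev) / prev
        if delta < threshold:
            return i + 1
    return None


# Hypothetical predicted-relevant counts after each of seven training rounds.
counts = [40_000, 52_000, 58_000, 61_000, 62_500, 63_000, 63_200]
print(first_stable_round(counts))
```

With these made-up numbers the change between rounds shrinks from 30% to about 2.5% by the fifth round, which is where this sketch would say the parties had the option to stop.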

Finishing the Process

Once the training has completed and the system is “stable,” we move from computer-assisted to human-powered review. At that point, the producing party reviews all of the potentially responsive documents and produces accordingly.

Final QC Protocol

As a final stage, the parties need to focus on the potentially non-responsive documents—the ones the system says to ignore. The parties select a random sample (2,399 documents again) to see how many were, in fact, responsive.

These same documents (non-privileged ones) must be produced to the opposing party for review. If that party finds too many responsive documents in the sample or otherwise objects, it is time for a meet-and-confer to resolve the dispute. Failing that, you can always go to the court and fight it out.
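To make the final QC step concrete, here is a hedged sketch of how the parties might project from the sample to the whole discard pile (a measure sometimes called the elusion rate, which is my term, not the protocol's). The number of responsive documents found and the size of the discard pile are hypothetical, and the confidence interval uses a simple normal approximation rather than anything the protocol prescribes.

```python
import math
from statistics import NormalDist


def elusion_estimate(sample_size: int, responsive_in_sample: int,
                     null_set_size: int, confidence: float = 0.95):
    """Estimate the share of responsive documents left in the discard pile.

    Returns (rate, low, high, projected_missed) using a normal-approximation
    confidence interval around the sampled rate.
    """
    rate = responsive_in_sample / sample_size
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(rate * (1 - rate) / sample_size)
    low, high = max(0.0, rate - margin), min(1.0, rate + margin)
    projected_missed = round(rate * null_set_size)
    return rate, low, high, projected_missed


# Hypothetical: 24 responsive documents turn up in the 2,399-document sample
# drawn from an assumed 2.5 million documents the system said to ignore.
rate, low, high, missed = elusion_estimate(2_399, 24, 2_500_000)
print(f"rate = {rate:.2%} (interval {low:.2%} to {high:.2%})")
print(f"projected responsive documents left behind: {missed:,}")
```

Whether a projection like this is "too many" is exactly the kind of question the meet-and-confer, and failing that the court, would have to resolve.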

Is This the Bible on Predictive Coding?

Certainly not. There are a lot of ways to approach this process. However, first opinions on any topic carry a lot of weight. We chose a profession that is guided by precedent, and these are first tracks on this new and exciting subject. The suggested procedures make sense to me and provide a starting point for your predictive coding efforts. This opinion and its accompanying protocol are important reading whether you are proposing or opposing the process for your next case.

 

In A Milestone for Predictive Coding, Judge Peck Says, ‘Go Ahead, Dive In!’

Lawyers and predictive coding are like kids around the swimming hole — no one wants to be the first to dive in for fear the water is cold or it harbors scary creatures. But once someone takes the lead, dives in and declares the water fine, everyone else is quick to follow.

That is why U.S. Magistrate Judge Andrew J. Peck’s opinion published Friday marks a major milestone for the use of predictive coding in e-discovery. It is the first judicial opinion in which a court has expressly approved the use of computer-assisted review.

Just last October, a prescient Judge Peck published an article in which he described (metaphorically speaking) the lawyers standing around the predictive-coding swimming hole:

To my knowledge, no reported case (federal or state) has ruled on the use of computer-assisted coding. While anecdotally it appears that some lawyers are using predictive coding technology, it also appears that many lawyers (and their clients) are waiting for a judicial decision approving of computer-assisted review.

Well, on Friday he gave them what they were waiting for. “This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases,” he wrote in Da Silva Moore v. Publicis Groupe.

Predictive Coding Delivers Precision and Value

We wrote about this case just two weeks ago, recounting Judge Peck’s on-the-record colloquy about predictive coding with counsel and their e-discovery consultants. Near the end of that hearing, after discussing the issue at length, Judge Peck stated, “This may be for the benefit of the greater bar, but I may wind up issuing an opinion on some of what we did today.” If he did issue an opinion regarding predictive coding, I said then, it would no doubt be a milestone in the industry’s adoption of the technique.

In the opinion, Judge Peck readily concedes that his decision to allow predictive coding in this case was relatively easy, given that the parties agreed to its use. Their disagreement hinged on how to implement predictive coding, not whether it should be used.

Even so, he forcefully makes the case for predictive coding and computer-assisted review. In a section of the opinion titled, “Further Analysis and Lessons for the Future,” he describes why predictive coding delivers greater precision and value than either manual review or keyword searching.

The good old-fashioned method of linear, manual review is simply too expensive to be practical in cases with large numbers of documents, Judge Peck notes. In Da Silva Moore, for example, there are over 3 million emails. He then goes on to challenge the “myth” that manual review is the “gold standard.”

Moreover, while some lawyers still consider manual review to be the “gold standard,” that is a myth, as statistics clearly show that computerized searches are at least as accurate, if not more so, than manual review. Herb Roitblatt, Anne Kershaw, and Patrick Oot of the Electronic Discovery Institute conducted an empirical assessment to “answer the question of whether there was a benefit to engaging in a traditional human review or whether computer systems could be relied on to produce comparable results,” and concluded that “[o]n every measure, the performance of the two computer systems was at least as accurate (measured against the original review) as that of human re-review.”

Likewise, Wachtell, Lipton, Rosen & Katz litigation counsel Maura Grossman and University of Waterloo professor Gordon Cormack, studied data from the Text Retrieval Conference Legal Track (TREC) and concluded that: “[T]he myth that exhaustive manual review is the most effective – and therefore the most defensible – approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.” … The technology-assisted reviews in the Grossman-Cormack article also demonstrated significant cost savings over manual review: “The technology-assisted reviews require, on average, human review of only 1.9% of the documents, a fifty-fold savings over exhaustive manual review.” (Citations omitted.)

The Limits of Keyword Searching

Next, Judge Peck reviews the limitations of keyword searching. Keyword searching has its place, he acknowledges, to cull data to a more manageable volume and to identify documents for predictive-coding seed sets. But far too often, he writes, “the way lawyers choose keywords is the equivalent of the child’s game of ‘Go Fish.’” (Judge Peck cites Ralph Losey’s blog for the analogy. We’ve used it here in a different context.)

The requesting party guesses which keywords might produce evidence to support its case without having much, if any, knowledge of the responding party’s “cards” (i.e., the terminology used by the responding party’s custodians). Indeed, the responding party’s counsel often does not know what is in its own client’s “cards.”

Another problem with keyword searching is that it is often over-inclusive, finding large numbers of irrelevant documents, Judge Peck writes. Even worse, keyword searches are not very effective, he points out. He cites the landmark e-discovery study by David C. Blair and M.E. Maron, An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System, 28 COMMUNC’NS. OF THE ACM 289, 295 (1985). In that study, the attorneys were confident that their searches had found more than 75% of the responsive documents. But they were wrong. In fact, the searches had only found 20% of the relevant documents. More recent TREC Legal Track studies have corroborated these results.

Given the weaknesses in both manual review and keyword searching, Judge Peck concludes:

Computer-assisted review appears to be better than the available alternatives, and thus should be used in appropriate cases. While this Court recognizes that computer-assisted review is not perfect, the Federal Rules of Civil Procedure do not require perfection. … Courts and litigants must be cognizant of the aim of Rule 1, to “secure the just, speedy, and inexpensive determination” of lawsuits. Fed. R. Civ. P. 1. That goal is further reinforced by the proportionality doctrine set forth in Rule 26(b)(2)(C).

Returning to the case at bar, Judge Peck lists the factors that led him to conclude that predictive coding was appropriate:

  1. The parties’ agreement.
  2. The vast amount of ESI to be reviewed (over 3 million documents).
  3. The superiority of computer-assisted review to the available alternatives (i.e., linear manual review or keyword searches).
  4. The need for cost effectiveness and proportionality under Rule 26(b)(2)(C).
  5. The transparent process proposed by defendants.

With reference to the transparency of the process, Judge Peck stresses that this was a key factor in his decision, describing himself as a strong supporter of the Sedona Conference Cooperation Proclamation. While not all e-discovery counsel will be as transparent as counsel were in this case, “such transparency allows the opposing counsel (and the Court) to be more comfortable with computer-assisted review, reducing fears about the so-called ‘black box’ of the technology. This Court highly recommends that counsel in future cases be willing to at least discuss, if not agree to, such transparency in the computer-assisted review process.”

Other ‘Lessons for the Future’

Judge Peck wraps up his “Lessons for the Future” with four additional points:

  • It is unlikely that courts will be able to approve a party’s proposal as to when review and production can stop until the computer-assisted review software has been trained and the results are quality control verified. Only at that point can the parties and the court see where there is a clear drop off from highly relevant to marginally relevant to not likely to be relevant documents.
  • Staging of discovery by starting with the sources and custodians that are most likely to be relevant–without prejudice to the requesting party seeking more after conclusion of that first stage review–is a way to control discovery costs.
  • In many cases, the requesting party’s client will have knowledge of the producing party’s records, either because of an employment relationship or because of other dealings between the parties. Lawyers should be sure to use their client’s knowledge of the opposing party’s custodians and document sources. Think of cooperation to mean “strategic proactive disclosure of information,” Judge Peck advises. “If you are knowledgeable about and tell the other side who your key custodians are and how you propose to search for the requested documents, opposing counsel and the court are more apt to agree to your approach.”
  • It was helpful that the parties had their e-discovery vendors present at the court hearings where the ESI protocol was discussed, Judge Peck says. “Even where as here counsel is very familiar with ESI issues, it is very helpful to have the parties’ e-discovery vendors (or in-house IT personnel or in-house e-discovery counsel) present at court conferences where ESI issues are being discussed. It also is important for the vendors and/or knowledgeable counsel to be able to explain complicated e-discovery concepts in ways that make it easily understandable to judges who may not be tech-savvy.”

What This Means for E-Discovery

No one need wonder what this opinion means for the future of e-discovery, because Judge Peck answers that question himself.

The opinion does not mean that computer-assisted review must be used in all cases, he says. Nor should the opinion be considered an endorsement of any particular vendor or of any particular review tool.

What the Bar should take away from this Opinion is that computer-assisted review is an available tool and should be seriously considered for use in large-data-volume cases where it may save the producing party (or both parties) significant amounts of legal fees in document review. Counsel no longer have to worry about being the “first” or “guinea pig” for judicial acceptance of computer-assisted review. As with keywords or any other technological solution to e-discovery, counsel must design an appropriate process, including use of available technology, with appropriate quality control testing, to review and produce relevant ESI while adhering to Rule 1 and Rule 26(b)(2)(C) proportionality. Computer-assisted review now can be considered judicially-approved for use in appropriate cases.

In other words, Judge Peck has taken the dive into the scary pool of predictive coding and now he is beckoning to all of us, “Come on in, the water’s fine!”

Check for Privilege Before Turning Over Your Database: The Lesson in Thorncreek Apartments

Before you give opposing counsel the keys to your production database, run at least one check on the privilege field to see if any of your documents are marked “privileged.” That is the lesson a federal judge taught a hapless defense counsel in Thorncreek Apartments III v. Village of Park Forest, 2011 U.S. Dist. Lexis 88281 (N.D. Ill. August 9, 2011). If you don’t, you may be deemed to have waived the privilege. I hate when that happens!

“What’s going on here?” you might ask. Can anyone be that sloppy? “Maybe,” I say in response. At least that’s what it seemed like here. Counsel literally made a production database available for more than seven months without once checking to see if it included privileged documents. A waiver is not inadvertent if you were hopelessly sloppy about it. Here is the story.

The Facts

The plaintiffs filed a motion before the District Court arguing that privilege was waived for six documents included in a production database provided by defendant Village of Park Forest. The Village argued that the documents were inadvertently produced in a production database hosted by its online vendor Kroll Ontrack. (I don’t think Kroll did anything wrong here.)

Defense counsel went through what seemed like a reasonable process in pulling files off of a number of tape backups. First, they conducted a key word search to pull back documents that might be responsive. The key words had been agreed upon by opposing counsel and in some cases ordered by the court. (Looks like there may have been some controversy around the key words and no, people weren’t talking about predictive coding in this case.)

As a second step, Kroll put the potentially responsive documents in an online database for defense counsel to review. They did so, marking documents as responsive, non-responsive and privileged.

The third step was to place the documents released by defense counsel in another online database made accessible to plaintiffs’ counsel. As the court noted, documents that the Village elected to withhold from production were not placed in this database. So far it all makes sense.

Producing the Privileged Documents

Here is where the rub came. At some point, plaintiffs complained that they were not able to see the documents returned from the agreed-upon searches but marked non-responsive. The Village had previously said it would include non-responsive documents in the production database in order to show how many its review had identified. In an attempt to be magnanimous, the Village elected to begin placing all of the documents in the production database—responsive and non-responsive.

As the court noted, the parties’ briefs left it a bit “murky” as to how the Village intended to handle the documents counsel had reviewed and marked “privileged.” Why they were not pulled out of the population before the production database was marked live is simply beyond me.

But they were not and 159 privileged documents went online, easily available to plaintiffs’ counsel. Even more surprising, during the seven months the production database was live, Village counsel did not bother to produce a privilege log. At one point, counsel claimed that there were no privileged documents to withhold. Somebody tell me how that happened.

So, as you probably guessed by now, depositions started and some wiseacre slapped two of the juicy privileged documents in front of a witness. Village counsel erupted, claiming privilege and inadvertent production. The game was on.

Game On: Reel Those Privileged Documents Back In

Actually, the game was over, at least for defense counsel. It appears that the parties came to agreement with respect to most of the privileged documents (probably the non-important ones) but disagreed with respect to six of them. Quickly concluding that at least some of the six were privileged, the court thus was required to review the doctrine of inadvertent waiver.

The test for inadvertent waiver is pretty simple, made more so by the recent amendments to F.R.E. 502:

  1. Were the documents privileged?
  2. Was the production inadvertent?
  3. Should privilege be waived nonetheless?

The privilege discussion didn’t interest me much. I did all that in law school. So, my attention was focused on the second and third prongs of the test.

Was the Disclosure Inadvertent?

As the court noted, this issue could be wrapped up with the third element, which goes to the heart of forgiveness. Rather than do that, the court espoused a simple analysis for this element.

It simply presumed based on the evidence before it that counsel didn’t really mean to include all of those privileged documents in the production database. And who would?

Certainly counsel wasn’t using those documents in an affirmative way, which was the original reason courts held that the full privilege would be waived. The old, “I did it because counsel told me to do it,” was a key way to waive privilege for that entire subject matter. That didn’t happen here.

Nor did the Village sit silent as opposing counsel was cramming those two juicy privileged documents down their witness’s throat. They got up and objected, allowing the deposition to go forward only under protest.

Surprisingly (and this seemed important to the court), it then took more than four months before Village counsel came up with a proper privilege log to join the dispute. While the court did not treat this as a dispositive fact, you can just tell it didn’t like counsel’s lackadaisical approach. Neither would I; I wonder what happened there.

Anyway, the court let them off the hook and ruled that the production of privileged documents was inadvertent. On to the next step in the test. (But don’t you take four months to produce your privilege log.)

Reasonable Steps to Prevent Disclosure?

Once the court concluded the production was inadvertent, the Village had two more hurdles to cross:

  1. Did the holder of the privilege take reasonable steps to prevent production of privileged documents?
  2. Did the holder of the privilege promptly attempt to rectify the error after it became known?

This case was about the first prong of that test. The court complained that the Village provided “precious little” about the steps taken to find and isolate privileged documents. Counsel for the Village wrote an email stating that he spent “countless hours” reviewing a “relatively large amount” of documents to find those with privileged content. Why, the court asked, was there no affidavit to support this allegation? Why indeed.

The court had no sympathy for the next argument—counsel thought that marking the document “privileged” would keep it out of the Kroll database. As the court astutely pointed out:

It would have been a simple matter for the Village to check the production database created by Kroll—before it went live online and became available to [plaintiffs]—to verify that privileged documents were not disclosed.

Duh! I am not privy to the Kroll software but I bet there was a simple way to search to see if anything was privileged—either by privilege tag, if that was included in the production database, or by the reference number given to the privileged documents. What happened here?
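To make the point concrete, here is a minimal sketch of the kind of check the court had in mind. It does not reflect Kroll's software, which I have not seen. The table name `production`, the column `privilege_tag` and the tag values are all assumptions made up for illustration, using an in-memory SQLite database as a stand-in for a hosted review platform.

```python
import sqlite3

# Stand-in for a production database; the schema and tag values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE production (doc_id TEXT PRIMARY KEY, privilege_tag TEXT)")
conn.executemany(
    "INSERT INTO production VALUES (?, ?)",
    [("DOC-001", None), ("DOC-002", "Privileged"), ("DOC-003", "Non-Responsive")],
)

def privileged_doc_ids(connection):
    """Return the IDs of documents whose tag marks them as privileged."""
    rows = connection.execute(
        "SELECT doc_id FROM production "
        "WHERE LOWER(privilege_tag) IN ('privileged', 'potentially privileged') "
        "ORDER BY doc_id"
    )
    return [row[0] for row in rows]

flagged = privileged_doc_ids(conn)
print(flagged)  # any result here means stop before the database goes live
```

An empty list is the only acceptable answer before opposing counsel gets a login. Anything else should halt the release.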

In a somewhat gratuitous fashion, the court piled on by noting that not a single privileged document was withheld and that no privilege log was produced.

I confess that I am puzzled myself as to how this happened. Counsel went to the trouble of marking 159 documents privileged. Why in the seven months that followed did counsel not ask for a printout from the Kroll system sufficient to produce a simple privilege log? Alas, we case readers don’t get to ask these questions.

As the court concluded—ironically citing yet another case as precedent for the point:

It is axiomatic that a screening procedure that fails to detect confidential documents that are actually listed as privileged is patently inadequate.

Sorry Charlie. You lose on that one.

Failing to Notice and Rectify

The court didn’t stop there. It went on to fail the Village on the second element of the test. It held that the Village failed to rectify the error in a timely way. Again, ignorance in using the Kroll database seemed to be at the heart of the finding.

For starters, the court forgave the Village for taking another four months after the deposition to issue the privilege log. Methinks the court didn’t like that fact at all but just didn’t want to say so.

Instead, the court jumped on counsel for failing to find its own error for over nine months, the period from March to December 2009 when the production database was available to the plaintiffs. According to the court, defense counsel should have logged in and run a privilege search during this period. The 150+ hits that came back would have been a dead giveaway.

As the court said:

Yet for some nine months, the Village apparently had no inkling that the production database contained documents that the Village wished to withhold as privileged, or that [plaintiffs] were reviewing and obtaining those documents. If that is true (and we accept that it is), that means the Village was not paying any attention whatsoever to what documents its opponent in the litigation was selecting from the database. Perhaps [plaintiff] simply selected all of them; the parties’ briefs do not tell us if this is so. But, even if that were the case, a single visit to the production database could have alerted the Village to the problem.

This seems like piling on to me. Counsel clearly didn’t know much about databases or pay much attention to the process. The process was sloppy or non-existent. The client paid the price.

The court did go on to make one last point that brings us full circle. It noted that the problem might have come to light earlier had Village counsel provided a privilege log, which was its duty in the first place. Doing that might have forced plaintiff’s counsel to acknowledge what it knew—that the database had a bunch of privileged documents. But with no privilege log and a seeming statement that there were no documents being withheld on privilege grounds, all bets were off. Plaintiffs’ counsel could sit quietly until the deposition and then drop the bombshell on a hapless witness.

What Can We Learn From This?

This is a simple-enough case with a simple-enough message: Don’t produce documents in an online database without doing some basic checking first. Assuming, as the case does, that there was a field containing a privilege designation, it would take counsel milliseconds to realize that something bad was about to happen. If lead counsel wasn’t comfortable running the search, how about that tech-savvy associate or legal assistant? If not them, how about your friendly vendor? If asked, the Kroll people could have spotted the mistake. Some might say they should have but not me.

We built an automated production system that can be run by our clients with no Catalyst intervention. Rather than allow these kinds of mistakes to happen, we added a QC rule-set that will not allow a document marked “privileged” or “potentially privileged” to be produced without a specific override. Even if foldered for production, these documents are pulled out into a special folder that must be addressed by the client before a production can go through.
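A rule of this kind can be sketched in a few lines. This is an illustration of the idea only, not the actual Catalyst rule-set. The `Document` class, the tag strings and the `override` flag are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical tag values; a real review platform would use its own field values.
BLOCKED_TAGS = {"privileged", "potentially privileged"}

@dataclass
class Document:
    doc_id: str
    privilege_tag: str = ""   # empty string means no privilege designation
    override: bool = False    # set only when a reviewer explicitly approves production

def split_production(docs):
    """Return (cleared, held); held documents must be addressed before production."""
    cleared, held = [], []
    for doc in docs:
        blocked = doc.privilege_tag.strip().lower() in BLOCKED_TAGS
        if blocked and not doc.override:
            held.append(doc)
        else:
            cleared.append(doc)
    return cleared, held

docs = [
    Document("DOC-001"),
    Document("DOC-002", privilege_tag="Privileged"),
    Document("DOC-003", privilege_tag="Potentially Privileged", override=True),
]
cleared, held = split_production(docs)
print([d.doc_id for d in cleared])  # documents that may be produced
print([d.doc_id for d in held])     # documents pulled into the hold folder
```

The design choice worth noting is that the default is to block. A privileged document goes out only when someone has made a specific, recorded decision to release it.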

These rules are just another step in trying to make the process easier and more foolproof for lawyers who are not comfortable with technology. It doesn’t guarantee that a privileged document will never be produced (we have seen cases where the documents are marked privileged after they are produced), but it can cut down on mistakes. With the stakes (and the volumes) this high, you have to do everything you can to avoid an inadvertent waiver.

This is not a case where counsel took dozens of steps to avoid producing privileged documents yet something slipped through, as I have written about before. (See Bad Facts Make Bad Law: ‘Mt. Hawley’ A Step Backward for Rule 502(b).) Rather, it is about a simple mistake that anyone could have caught with just a smidgen of effort. I don’t feel as bad for Village counsel as I do for some of the other victims. This case is no “derelict on the waters of the law”; it is a fair ruling on somewhat extreme circumstances. Counsel were sleeping on the job (or so it seems to me) and paid the price.

So, the lesson here? Don’t produce those documents without checking to see if privileged files might have snuck through. Run some searches, sample some documents and for God’s sake check the privilege field. You will sleep better, and pay lower malpractice premiums, if you do.

 

The Case of the Missing Acronyms, Or How to Avoid Having to C.Y.A. in Search

You don’t have to read too many legal documents to know that lawyers love to use acronyms and abbreviations. Why is it, then, that lawyers sometimes forget about acronyms and abbreviations when constructing keyword searches? If you’re going to search for “Federal Trade Commission,” shouldn’t you also search for “FTC”?

A recent order by a U.S. magistrate judge in the Southern District of California is a reminder of why lawyers should never forget about acronyms and abbreviations in e-discovery. Further, the ruling underscores the importance of thinking about these terms early on in a case.

In this antitrust matter, discovery was fairly far along before the plaintiffs realized something was missing. As the defendants began to produce documents in response to plaintiffs’ discovery requests, the plaintiffs realized that some of the various defendants routinely referred to each other using abbreviations or acronyms. Upon learning this, they filed a motion seeking to compel the defendants to run document searches containing these abbreviations and acronyms.

That might seem to be a reasonable request. But U.S. Magistrate Judge Louisa S. Porter denied the motion. To understand why, we need to go back to the beginning of discovery in this case.

The Perils of Cooperation

Before the defendants responded to plaintiffs’ discovery requests, they notified the plaintiffs that they intended to use keyword searches to find relevant documents. They asked the plaintiffs to provide search terms, but the plaintiffs contended they could not do this because they lacked sufficient information at that point to construct meaningful searches. The defendants went ahead and created their own list of search terms, which they then showed to plaintiffs. Upon seeing it, plaintiffs protested that the terms were too restrictive and were unlikely to capture some highly relevant documents. At that point, the two sides sat down and negotiated a list of agreed-upon search terms. The negotiated list included several terms specifically targeted to capturing defendant-to-defendant communications.

With the parties having agreed on search terms, the defendants began to produce documents. From the court’s opinion, it appears that it was through review of those produced documents that plaintiffs discovered the frequent use of abbreviations and acronyms. Thus came plaintiffs’ motion and thus we arrive back at the magistrate-judge’s denial of that motion.

Plaintiffs had ‘Ample Opportunity’

As I began to read the judge’s reasoning, I was sure she was heading towards ruling in the plaintiffs’ favor. She starts out talking about the Sedona principles of cooperation. She talks about the need for keyword searching to be “a cooperative and informed process.” She emphasizes the importance of “a full and transparent discussion among counsel of the search terminology.” All of this had me thinking she was about to chastise the defendants for their collective failure to mention the acronyms and abbreviations.

Instead, it was the plaintiffs whom the judge chastised. Her message to them–these are my words, not hers–was this: “You had plenty of chances to ask about abbreviations and acronyms and you should have thought to do so.”

Here is how she put it:

Here, the Court finds Plaintiffs had ample opportunity to obtain discovery regarding abbreviations and acronyms of Defendant companies, and the burden or expense to Defendants in having to comply with Plaintiffs’ request regarding abbreviations and acronyms outweighs its likely benefit. … First, Plaintiffs had two separate opportunities to suggest that Defendants search for abbreviations and acronyms of the Defendant companies; initially, before Defendant’s produced documents; and second, during negotiations between the parties on agreed-upon expanded search terms. In the spirit of the conclusions made at the Sedona Conference, and in light of the transparent discussion among counsel of the search terminology and subsequent agreement on the search method, the Court finds it unreasonable for Defendant to re-search documents they have already searched and produced.

Second, after meeting and conferring with Plaintiffs, and relying on their agreement with Plaintiffs regarding search terms, Defendants have already searched and produced a significant number of documents, thereby incurring significant expenses during this limited discovery period. Further, as articulated by Defendants, the new search terms Plaintiffs have proposed would require some Defendants to review tens of thousands of additional documents that would likely yield only a very small number of additional responsive documents. Therefore, the Court finds a re-search of documents Defendants have already searched and produced is overly burdensome.

Without knowing more about the precise abbreviations and acronyms involved in this case, it’s hard to judge whether this was a fair outcome. But it sure strikes me the wrong way.

In the normal course, search queries should use stemming, fuzzy searching, phonetic searching and similar techniques to find similar or related words. But what if a term is off the map, so to speak? What if internal emails regularly referred to the company CEO as “Col.” because he had long ago been an officer in the military? How would the opposing party know, in advance of any discovery, to search for this? No stemming or fuzzy searching would pick that up. On the other hand, if you’re suing General Motors and neglect to search for GM, then you have no one to blame but yourself.
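The predictable half of this problem is easy to automate. The sketch below is an illustration, not a tool used in the case. It derives an initialism from each party name and merges in any aliases you already know about. The function names and the `known_aliases` mapping are assumptions for the example. Nicknames like “Col.” cannot be derived this way and have to come from the client or from early discovery.

```python
import re

def acronym(phrase):
    """Build an initialism from the capitalized words of a phrase."""
    words = re.findall(r"[A-Za-z]+", phrase)
    return "".join(w[0].upper() for w in words if w[0].isupper())

def expand_terms(names, known_aliases=None):
    """Return each name plus its derived initialism and any supplied aliases."""
    known_aliases = known_aliases or {}
    terms = []
    for name in names:
        candidates = [name, acronym(name)] + known_aliases.get(name, [])
        for term in candidates:
            # Skip single-letter results and duplicates.
            if len(term) > 1 and term not in terms:
                terms.append(term)
    return terms

names = ["Federal Trade Commission", "General Motors"]
aliases = {"General Motors": ["G.M."]}  # supplied by someone who knows the parties
print(expand_terms(names, aliases))
# ['Federal Trade Commission', 'FTC', 'General Motors', 'GM', 'G.M.']
```

A list like this is a starting point for the meet-and-confer, not a substitute for asking the other side how its people actually refer to one another.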

The lesson here, I suppose, is to remember that you don’t know what you don’t know. In the case of acronyms and abbreviations, anticipate and ask–so you don’t find yourself having to C.Y.A. later on.

In Re: National Association of Music Merchants, Musical Instruments and Equipment Antitrust Litigation, MDL No. 2121 (Dec. 19, 2011).

In E-Discovery, Even Google Needs Help with Search: Oracle Case is Lesson in the Complexity of Privilege Search

Lest anyone underestimate the complexity of privilege searching in e-discovery, consider the recent case in which none other than search giant Google got tripped up. Although the company’s privilege searches screened out an email labeled “Attorney Work Product,” the searches failed to catch nine drafts of the same email, autosaved by the author’s email program.

Worse yet, the email was a true “smoking gun.” In fact, the email was so potentially inculpatory that the judge remarked that it and the Magna Carta (common law) would be all that the opponent’s counsel would need to win its case. (It is interesting to note that this was not the only incriminating document, and this document is even more damaging to Google when paired with a 2005 email written by Andy Rubin, Google vice president in charge of Android.)

In this post, we’ll give you some background on the case and then discuss the lessons it teaches about privilege searching in e-discovery.

A Dispute Over Java

Since August 2010, Google has been locked in litigation with Oracle America over whether its Android operating system violates Oracle’s Java patents and copyrights. Within the month before Oracle filed the lawsuit, lawyers for both companies met to discuss the alleged infringement. Ten days later, Google General Counsel Kent Walker convened an internal staff meeting to discuss the matter further. One of the attendees was Google software engineer Tim Lindholm.

A week later, as a follow-up to that meeting and a prelude to another, Lindholm wrote an email addressed to Google senior counsel Ben Lee and Andy Rubin. The email was labeled “Attorney Work Product” and “Google Confidential.” In part, it said:

What we’ve actually been asked to do (by Larry and Sergei) is to investigate what technical alternatives exist to Java for Android and Chrome. We’ve been over a bunch of these, and think they all suck. We conclude that we need to negotiate a license for Java under the terms we need.

During the four minutes it took Lindholm to write this email, his computer auto-saved it nine times. Each time, the “to” and “cc” fields were still blank and the email was not labeled with the footer. Only the tenth and final version identified the recipients and included the labels for “work product” and “confidential.”

During discovery after the lawsuit commenced, Google produced the first eight drafts of the email to Oracle. It held back the ninth draft and the final version, listing both on its privilege log. Google later told the court that its electronic scanning mechanisms “did not catch those drafts before production” because the drafts did not contain the confidentiality or privilege headings and did not list any addressees.

In court documents, Google said, “Due to the volume and speed of production in this case, Google has been forced to rely on electronic screening mechanisms, which identify potentially privileged documents based in part on sender and recipient information, as well as privilege-related keywords.” It appears that these drafts slipped through because there was no “eyes on” review of the documents unless they hit on privilege terms.
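To see why a screen like that misses an autosaved draft, consider a toy version. Google did not disclose its actual rules, so everything here is an assumption: the address list, the keyword list and the sample messages are invented for illustration.

```python
# Hypothetical screening inputs; the real rules were never made public.
ATTORNEY_ADDRESSES = {"counsel@example.com"}
PRIVILEGE_KEYWORDS = ("attorney work product", "attorney-client", "privileged")

def is_potentially_privileged(email):
    """Flag an email if an attorney is a party to it or a privilege keyword appears."""
    people = {email.get("from", "")} | set(email.get("to", [])) | set(email.get("cc", []))
    if people & ATTORNEY_ADDRESSES:
        return True
    body = email.get("body", "").lower()
    return any(keyword in body for keyword in PRIVILEGE_KEYWORDS)

# Invented sample messages: same substance, different metadata.
final_version = {
    "from": "engineer@example.com",
    "to": ["counsel@example.com"],
    "body": "Draft text about licensing.\nAttorney Work Product",
}
autosaved_draft = {
    "from": "engineer@example.com",
    "to": [],  # recipients not yet filled in when the draft was saved
    "body": "Draft text about licensing.",
}
print(is_potentially_privileged(final_version))    # True
print(is_potentially_privileged(autosaved_draft))  # False
```

The draft carries the same damaging sentence but none of the signals the screen looks for, so it never reaches a human reviewer.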

The Emails Come to Light

On July 21, 2011, in two separate hearings in the case – one a telephonic hearing before U.S. Magistrate Judge Donna M. Ryu to compel Lindholm’s deposition and the other a Daubert hearing before U.S. District Judge William Alsup – Oracle referenced one of the email drafts. In the Daubert hearing, Oracle read part of it into the record. Google’s attorneys responded by addressing the substance of the email, but they did not object to it as privileged or confidential.

Judge Alsup readily picked up on the email’s significance. At one point during the hearing, he said to Google, “You are going to be on the losing end of this document” with “profound implications for a permanent injunction.” At another point, he said that a good trial lawyer could use the simple combination of the Lindholm email and the Magna Carta to win Oracle’s case and get an injunction. And to add insult to injury, all this took place with several news reporters in attendance, without objection.

Judge William Alsup

After those hearings, later the same day, Google sent notice to Oracle that the draft email constituted “protected material” under a protective order/clawback agreement and asked that it no longer refer to it in public. The next day, asserting that the draft was “unintentionally produced privileged material,” Google clawed it back. Soon after, Google clawed back all draft versions of the email.

Google’s next move was to ask Judge Alsup to redact the transcript of the Daubert hearing to remove all references to the Lindholm document. The judge denied the request, ruling that the email was protected neither by the attorney-client privilege nor the work-product doctrine.

Meanwhile, Oracle filed a motion with Judge Ryu to compel Google to re-produce the emails it clawed back. Judge Ryu granted Oracle’s motion, finding that the email was neither privileged nor otherwise protected. Google appealed that finding to Judge Alsup, who entered an order on Oct. 20, 2011, affirming the magistrate judge. “The Lindholm email and drafts will not be treated as protected by attorney-client privilege or work-product immunity,” he concluded.

Google now says it will appeal Judge Alsup’s ruling.

What This Means for Privilege Search

Although much can be said about whether or not these documents were privileged, whether the privilege was waived, and whether the actions of the attorneys in the hearings were well thought out, we will focus on the implications for search.

Producing Autosave Documents. One big question this case raises is whether producing parties should ever include “autosave” drafts of documents. Many software packages delete the autosave information when the user saves the document or sends the email. The answer will depend on the circumstances of the case, but it is a question counsel should always consider.

Clawback Agreements under Rule 502. The only sure way to protect privileged documents is to review them. While it was the goal of Rule 502 of the Federal Rules of Evidence to obviate the need to review every document for privilege, this case once again shows that “you can’t un-ring a bell.” Once the other side sees a damaging privileged document, it’s almost impossible to control the damage.

Keyword Searching and Highlighting. Keyword searching would have flagged this document as Potentially Privileged only if the search terms had included “negotiat*” or “licens*”. Many firms don’t like to use such broad terms because they flag too many documents as Potentially Privileged. At Catalyst, we have developed a list of several hundred words commonly used in legal documents, and many clients like to use them in their “Potentially Privileged” searches. However, doing so will identify a large number of “false hit” documents that are not privileged. One approach we recommend is to use term highlighting so that all such terms are easy to see as the document is reviewed. Even that wouldn’t have helped in this case because it appears that only documents that hit on privilege search terms were reviewed for privilege.
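For readers who want to see what a trailing-wildcard term does, here is a small sketch. It is not the Catalyst search engine. It translates a term such as “negotiat*” into a regular expression and lists the words a reviewer would see highlighted. The helper names are hypothetical.

```python
import re

def wildcard_to_regex(term):
    """Translate a trailing-* search term into a case-insensitive whole-word regex."""
    stem = re.escape(term.rstrip("*"))
    suffix = r"\w*" if term.endswith("*") else ""
    return re.compile(r"\b" + stem + suffix + r"\b", re.IGNORECASE)

def highlight_hits(text, terms):
    """Return every word in the text that matches one of the search terms."""
    hits = []
    for term in terms:
        hits.extend(m.group(0) for m in wildcard_to_regex(term).finditer(text))
    return hits

text = "We conclude that we need to negotiate a license for Java under the terms we need."
print(highlight_hits(text, ["negotiat*", "licens*", "privileged"]))
# ['negotiate', 'license']
```

The same breadth that catches this sentence will also catch every routine business email that mentions a license, which is why many firms are reluctant to use terms like these.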

Near Dupe Analysis. We generally recommend using email threading and near-dupe analysis software. While the principal benefit is speeding the review, a second important benefit is to prevent inconsistent coding. In this case, when the attorneys were reviewing the final version of the emails for privilege, they would also have seen the autosaved drafts and could “tag all” as privileged. Further, we always recommend an “inconsistent coding” analysis, part of which is to identify documents in the same set of near dupes/email threads that are inconsistently coded for privilege. This would have caught the autosaved drafts before production in this case, where non-hits weren’t reviewed.

(It is interesting to note that in the most famous case of inadvertent production of a privileged, “smoking gun” document, Mt. Hawley Insurance Co. v. Felman Production, it appears that the damaging document was a dupe of a document withheld and on the privilege log. This underscores the importance of doing an inconsistent coding analysis! On this point, see also our earlier post: Bad Facts Make Bad Law: ‘Mt. Hawley’ A Step Backward for Rule 502(b)).
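An inconsistent coding check is simple once the near-dupe groups exist. The sketch below assumes a near-dupe or threading tool has already assigned each document a group ID. The field names `group` and `privileged` and the sample records are invented for illustration.

```python
from collections import defaultdict

def inconsistent_groups(docs):
    """Return near-dupe groups whose members disagree on privilege coding."""
    coding = defaultdict(set)
    members = defaultdict(list)
    for doc in docs:
        coding[doc["group"]].add(doc["privileged"])
        members[doc["group"]].append(doc["id"])
    # A group with more than one distinct coding value is inconsistent.
    return {group: members[group] for group in coding if len(coding[group]) > 1}

docs = [
    {"id": "final", "group": "G1", "privileged": True},
    {"id": "draft-9", "group": "G1", "privileged": True},
    {"id": "draft-8", "group": "G1", "privileged": False},
    {"id": "memo-a", "group": "G2", "privileged": False},
    {"id": "memo-b", "group": "G2", "privileged": False},
]
print(inconsistent_groups(docs))
# {'G1': ['final', 'draft-9', 'draft-8']}
```

Every group the check returns should go back to a reviewer before production, since at least one member is coded differently from its siblings.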

Other inconsistent coding checks. In addition to duplicate, near-dupe and attachment family inconsistent coding checks, other methods we recommend to look for similar documents are “More Like This” and Key Document Clustering. In this case, this was such an important document that the attorneys should have used analytics to find similar documents as a QC measure.

Predictive Coding. Would predictive coding have helped? It’s hard to say. If the final versions of the emails had been used as seed documents, or tagged as privilege in the set of documents in the privilege sample set, there’s a good chance that the autosaved versions would have been highly ranked for privilege. On the other hand, if they were not in the set chosen to be sampled and not among the seed documents, then they might have slipped through.

The Honor System. It is also worth noting that in most cases the claims of privilege for documents listed on the privilege log are very seldom challenged. The attorneys producing the documents are essentially on the honor system. Had the “autosave” emails not slipped through, Oracle would not have had any way to challenge the claim of privilege for the withheld emails. The claim of privilege in this case was certainly arguable, but it is the practice of many parties to withhold whole families of documents even when only one document is privileged and otherwise to err on the side of claiming privilege.

The Bottom Line

Recently, IDG News Service reporter James Niccolai wrote a thoughtful examination of the Google-Oracle privilege fiasco, How Google Was Tripped Up By a Bad Search. As he noted, the irony in the case is that “the e-mail might never have seen the light of day if the search tools used to identify documents covered by attorney-client privilege had done their job.” In any other case, that might not be considered ironic. But this, after all, is Google.

Whether these draft emails slipped through as a result of human or technological error, Google attributed the error to “electronic scanning tools.” Google has not provided further details and Judge Alsup’s opinion contained no further explanation.

Even so, as we’ve reviewed in this post, the case is not without lessons for e-discovery professionals. No human or technology can offer 100 percent protection against inadvertent production of privileged documents. However, some of the steps we’ve outlined above will certainly help minimize the risk.

‘Mt. Hawley’ Affirmed and Claim Dismissed: District Judge Again Puts His Stamp of Approval on Troubling Rulings

For over a year, we have been writing about a West Virginia decision (and its progeny) that we believe went too far in making new e-discovery law. The original decision, issued May 18, 2010, was styled Mt. Hawley Insurance Co. v. Felman Production. You can read my original post at: Bad Facts Make Bad Law: ‘Mt. Hawley’ A Step Backward for Rule 502(b).

In that decision, Magistrate Judge Mary E. Stanley held that Felman had waived attorney-client privilege by inadvertently producing a smoking-gun email to counsel suggesting that it might be helpful to their insurance claim for business interruption to backdate several orders from clients. If the orders had come in while the machinery in question was under repair, that might provide support for their $38 million insurance claim. You have to love their chutzpah at the very least.

The 'smoking-gun' email involved a furnace such as this one, at Felman's West Virginia facility.

In my original post, I suggested that bad facts (outright fraud it seemed to me) might be responsible for what I thought was bad law. After all, the production had been overseen by a highly reputable law firm (which had no involvement in this email). Counsel had not only been diligent in trying to screen out privileged documents, but it had gone far beyond what we have typically seen elsewhere. Indeed, counsel cited over 20 steps they had taken, including a variety of review and sampling efforts:

  1. Negotiated the ESI stipulation with defendants.
  2. Hired an ESI collection vendor, Innovative Discovery.
  3. Discussed with Felman’s IT department the company’s computer network structure and identified potential sources of relevant ESI.
  4. Visited Felman’s West Virginia plant to coordinate and oversee ESI collection.
  5. Decided to collect data using forensic imaging.
  6. Directed the vendor to collect ESI from the current server and the backup server.
  7. Collected 1,638 gigabytes of data.
  8. Downloaded emails from 29 custodians for processing by its law firm, Venable.
  9. Hired a new vendor to process Felman’s Oracle and Soloman databases.
  10. Identified the first six workstations to be processed and learned that each contained more data than anticipated.
  11. Examined methods to cull non-relevant materials.
  12. Selected search terms to retrieve documents responsive to defendants’ document requests.
  13. Tested the search terms against the Felman emails and added additional search terms.
  14. Tested the search terms, including the additional terms, against the Felman emails, tagged responsive documents, and set them aside for privilege review.
  15. Produced 17,064 Excel spreadsheets.
  16. Selected privilege search terms to identify materials which are potentially privileged and relevant.
  17. Set aside potentially privileged materials for individualized document-by-document review for relevancy and privilege.
  18. Tested the privilege search terms against Felman’s emails.
  19. Retrieved native files of all images and examined thumbnails.
  20. Conducted “eyes-on” review of all documents identified both as relevant and potentially privileged.
  21. Decided to use a vendor to complete the processing of Felman’s emails.
  22. Produced ESI in native or TIF format, with 36 fields of metadata.
  23. Produced more than 346 gigabytes of data without sampling for relevancy, over-inclusiveness or under-inclusiveness.

Counsel got nailed simply because several of the Concordance indexes they used turned out to be corrupt. As a result, privilege searches didn’t turn up anything for the documents in those indexes. Since counsel didn’t attempt to review every one of the millions of documents they produced, several key documents slipped through the net. Reading between the lines, Magistrate Judge Stanley seemed to fault counsel for not sampling the documents they produced without review to see whether any might be privileged.
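For readers who want to see what a basic check of this kind might look like, here is a minimal sketch in Python. It is an illustration only, not what the court required or what counsel's tools actually did. The index names, document counts, sample size and function names are all hypothetical. It does two things: it flags any index where the privilege search returned no hits at all (which may point to a broken index rather than a clean one), and it draws a random sample of unreviewed documents for eyes-on review.

```python
import random

def draw_review_sample(doc_ids, sample_size, seed=0):
    """Return a simple random sample of document IDs for eyes-on privilege review."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn later
    ids = list(doc_ids)
    k = min(sample_size, len(ids))  # never ask for more documents than exist
    return rng.sample(ids, k)

def indexes_with_no_hits(hits_by_index):
    """Flag indexes where the privilege search returned nothing at all.

    hits_by_index maps a (hypothetical) index name to a tuple of
    (documents_in_index, privilege_hits). Zero hits across a non-empty
    index may mean the index is broken, not that it is free of privilege.
    """
    return sorted(
        name
        for name, (docs, hits) in hits_by_index.items()
        if docs > 0 and hits == 0
    )

# Hypothetical numbers, for illustration only.
hits = {
    "index_a": (120_000, 950),
    "index_b": (95_000, 0),
    "index_c": (80_000, 610),
}
suspect = indexes_with_no_hits(hits)
sample = draw_review_sample(range(100_000), 400)
print(suspect, len(sample))
```

Neither check proves a production is clean. The point is only that a search returning nothing from tens of thousands of documents is a result worth questioning before the documents go out the door.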

We wrote about subsequent decisions as well as other commentary about the decision here:

Now, in what has to be the final straw in this saga, the presiding judge in the case, U.S. District Judge Robert C. Chambers, has taken the ultimate step by issuing further sanctions and dismissing the lost-business claim: Felman Production v. Industrial Risk Insurers, 2011 U.S. Dist. LEXIS 112161 (Sept. 29, 2011).

Let us look at the court’s reasoning.

Bad Discovery Practices

U.S. District Judge Robert C. Chambers

You can’t read any of the Mt. Hawley decisions without being reminded that neither the magistrate judge nor the district judge was happy with Felman’s discovery practices. As in other cases we have chronicled, you won’t win favor with the courts by backing up the trucks and dumping a ton of irrelevant electronic files on the opposition. When you go the next step and mark every one of them confidential, regardless of content, you only worsen your position. The judge was particularly incensed to see pictures of kitties with a big confidential stamp on them. Kitties. Yes, kitties. Awww, how cute. Oh, but there were a couple of naked men in the photos too (although not with the kitties, thank heavens).

There were also missing files and no real attempt to issue a litigation hold by Privat, the Ukrainian company that was at the controls in this case. Indeed, it appears that Felman actively dissembled with respect to its true owners, a fun-loving bunch of Ukrainians with little respect for the discovery process. They got caught when one of the inadvertently produced documents showed that they were running the show. Oh what a tangled web they wove!

As Judge Chambers explained:

Felman’s failure to comply with Judge Stanley’s August 19 and October 19, 2010 Orders was inevitable in light of the lack of care Felman exercised. … Felman did not provide litigation hold memos to the West Virginia Felman staff until four months after this case was filed. Felman also admitted that the Ukrainian custodians were not instructed to preserve their documents.

This led to the destruction of documents when the Privat representatives sold their computers before receiving a document request. Convenient, to say the least.

All this led to a motion for sanctions—either a dismissal outright or dismissal of the business interruption claims plus an adverse inference instruction regarding the missing documents.

The judge made short work of the motion. While not dismissing the case outright, he did dismiss the $38 million business interruption claim. He started by discussing spoliation, citing Magistrate Judge Grimm in Victor Stanley, Inc. v. Creative Pipe, Inc., 269 F.R.D. 497, 522 (D. Md. 2010).

A party subject to [the duty to preserve] must “identify, locate, and maintain information that is relevant to specific, predictable, and identifiable litigation.” In ascertaining whether a party has fulfilled its duty to preserve, a court must “determine reasonableness under the circumstances … [which] in turn depends on whether what was done—or not done—was proportional to that case and consistent with clearly established applicable standards.”

The court went on to find specific culpability—gross negligence—in Felman’s failure to issue litigation holds and otherwise take steps to preserve evidence.

In the end, the court didn’t dismiss all claims but rather threw out the big one—for business interruption. It left the other claims and counterclaims to be tried.

The sanctionable conduct of Felman and the resulting prejudice to Defendants merits dismissal of the business interruption claim because the unavailable evidence and improper conduct related predominantly to it. As to Defendant’s counterclaim, the unavailable evidence is less prejudicial and an adverse inference instruction is an adequate remedy.

The court also went on to award attorneys’ fees in the bargain.

What Happens Next?

My guess is that we won’t hear any more from the parties in this case. The business interruption claim was the heart of Felman’s demand in the first place. An adverse inference instruction is a powerful tool to address the rest of the claims, at least to the extent Felman oversteps the bounds of its insurance policy. Settlement is the likely next step in this matter.

What about a malpractice claim? Given the Felman party antics, I wouldn’t put it past them. They might claim malpractice for producing the damaging documents. With a bogus claim for $38 million at stake, who knows what they might do. At the least, this would present interesting evidentiary questions for the firm’s malpractice carrier. I wouldn’t want to be in the settlement meeting.

But my concern is for other cases more than for how this one wraps up. With 23 steps being ruled as not enough, what will be adequate? Perhaps the problem could have been remedied by some simple sampling procedures, but that isn’t clear from anything I read. Perhaps enough other judges will choose to ignore it so that it becomes weak precedent. “A derelict on the waters of the law,” as famed Supreme Court Justice Felix Frankfurter once said. That’s my vote. Send the bad guys home but leave e-discovery law alone.

Does Sampling Case Set a ‘Dangerous Precedent’?

At her On the Case blog for Thomson Reuters, Alison Frankel has an intriguing report about a U.S. magistrate judge’s order in an e-discovery dispute that has prompted the U.S. Chamber of Commerce to leap into the fray, warning that the order, if allowed to stand, will set “a dangerous precedent” and will be of “profound significance to businesses in America.” In what Frankel describes as a “venture into the weeds of a federal district court discovery dispute,” the Chamber has filed an amicus brief in U.S. District Court in Manhattan asking a federal judge to overturn the order. The Washington Legal Foundation and the International Association of Defense Counsel have also weighed in as amici.

Making this brouhaha even more notable is that this case is still only in the preliminary stage of litigation. It is a putative class action yet to be certified as such and one where the judge has stayed all discovery pending a decision on certification of the class.

Where, then, do the amici see such profound danger? The danger, they contend, lies in the magistrate judge’s refusal to allow the defendant to use sampling in order to limit the scope of its preservation obligation.

Thousands of Hard Drives

The case, Pippins v. KPMG, is a wage-and-hour dispute in which plaintiffs challenge KPMG’s treatment of accounting associates in its audit practice as exempt employees under federal and New York labor laws. While awaiting a ruling on whether the case will be certified as a class action, KPMG asked the court to issue a protective order limiting the scope of its duty to preserve.

Rather than preserve the hard drives of thousands of former employees who would fall under a nationwide class action, KPMG instead sought to preserve only a random sample of 100 hard drives. Against that sample of hard drives, plaintiffs could then apply keyword searches to determine whether they contain information relevant to the case.

Motivating KPMG to make this request was cost. It estimated that the potential class of plaintiffs could number 7,500 and that the cost to preserve each drive would be $600. It had already spent $1.5 million preserving the drives of 2,500 former audit associates and faced another $3 million bill to preserve the remainder.
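The arithmetic behind those figures is easy to reproduce. The short Python sketch below uses only the numbers KPMG gave, plus one assumption of mine that the filings do not spell out here: one hard drive per potential class member. The variable and function names are illustrative.

```python
COST_PER_DRIVE = 600        # KPMG's estimated cost to preserve one drive, in dollars
POTENTIAL_CLASS = 7_500     # KPMG's estimate of the potential class size
ALREADY_PRESERVED = 2_500   # drives of former audit associates preserved so far
PROPOSED_SAMPLE = 100       # size of the random sample KPMG proposed to keep

def preservation_cost(drives, cost_per_drive=COST_PER_DRIVE):
    """Cost of preserving a given number of drives (assumes one drive per person)."""
    return drives * cost_per_drive

spent = preservation_cost(ALREADY_PRESERVED)
remaining = preservation_cost(POTENTIAL_CLASS - ALREADY_PRESERVED)
total = spent + remaining
sample_only = preservation_cost(PROPOSED_SAMPLE)

print(f"Already spent:  ${spent:,}")
print(f"Still to spend: ${remaining:,}")
print(f"Full class:     ${total:,}")
print(f"Sample of 100:  ${sample_only:,}")
```

On those assumptions, the numbers line up with KPMG's account: $1.5 million spent, $3 million still to come, and $4.5 million in all, against $60,000 for a 100-drive sample.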

Given this cost, KPMG argued that the court should apply a proportionality test to reconcile its duty to preserve discovery material with the burden it would face in light of this cost.

Plaintiffs objected to KPMG’s request on three grounds:

  • Allowing KPMG to destroy hard drives at this early stage of the litigation would be premature and would prevent the parties from crafting a more-informed plan for preservation and production and from having the information they need to generate search terms.
  • Using keyword searches alone, as KPMG proposed, is an “outmoded method of data recovery” that does not reflect context or capture all relevant materials.
  • Keyword searches, alone, would not be sufficient for plaintiffs to cull information from the drives about employees’ hours worked and work product.

Even with these objections, the plaintiffs were not opposed to the use of sampling. In fact, the parties negotiated extensively and even went through mediation in an attempt to come to terms on sampling. Only when they could not agree did KPMG seek the protective order.

(As an aside, I am told by Jim Eidelman, principal consultant in Catalyst’s Search & Analytics Consulting group, that the plaintiffs are correct that keyword searching won’t do the job. Plaintiffs’ counsel would want to look at the time stamps on the files to see when the employees were working and at the content of the documents to show the nature of the work, he says.)

Too Many Unknowns to Permit Destruction

On Oct. 7, 2011, U.S. Magistrate Judge James L. Cott issued a memorandum and order denying KPMG’s motion for a protective order. In his memorandum, Judge Cott stepped through a detailed analysis of KPMG’s duty to preserve the hard drives and of the applicability of a proportionality test. Yet the gist of his analysis seemed to boil down to one factor — it was just too early in the case to let KPMG destroy all those hard drives.

[C]ourts in this district have cautioned against the application of a proportionality test as it relates to preservation. … At this point in the litigation, it is unclear whether an application of a proportionality test would weigh in favor of a protective order. Certainly KPMG’s preservation of the hard drives is not without considerable expense. … However, KPMG has not been able to establish conclusively that the materials contained on the hard drives are of either “little value” or “not unique.” … Until discovery proceeds and the parties can resolve what materials are contained on the hard drives and whether those materials are responsive to Plaintiffs’ document requests, it would be premature to permit the destruction of any hard drives. Moreover, the parties should be able to make such a determination promptly once the Motion to Certify is resolved and the stay of discovery is lifted. Because it is not possible to predict when that determination will be made, it is similarly difficult to conclude what KPMG’s costs of preservation will be on an ongoing basis. With so many unknowns involved at this stage in the litigation, permitting KPMG to destroy the hard drives is simply not appropriate at this time.

In concluding his memorandum, Judge Cott made clear that the parties remained free to negotiate an agreement on a method for preserving only a sample of the hard drives. But unless the parties could reach agreement, and until the contours of the class could be defined, he ordered KPMG to continue its preservation efforts.

Case of ‘Exceptional Importance’

That order is now pending review by U.S. District Judge Colleen McMahon, before whom the Chamber and the other amici have filed their briefs. Noting that it rarely files amicus briefs outside of appellate courts and does so only in cases of “exceptional importance,” the Chamber says, “This is such a case.”

The Magistrate Judge’s opinion reached an unprecedented conclusion here: that, faced with an uncertified class or collective action alleging that employees were not properly compensated for overtime, KPMG, at considerable expense, has to rip out and retain every single hard drive from every computer that any member of the putative class or collective may have used before leaving the company. This KPMG must do, said the Judge, even though there is a database that directly recorded the employees’ hours, and even though virtually all of the data on the hard drives would be irrelevant to the case.

The Magistrate Judge made two errors of law that led to this novel conclusion. First, he held that the duty to preserve electronically stored information was not limited by any test of proportionality. Second, he held that every member of the proposed plaintiff class or collective action was a “key player” for purposes of discovery and the retention of electronic information. Both holdings are wrong, unprecedented, and—if affirmed here and followed by other courts—would be highly detrimental to the conduct of civil litigation under the Federal Rules.

Are the consequences of this ruling as dire as the Chamber predicts? Probably not. As I noted above, the nub of the ruling is that destruction is forever. But the case illustrates on a broad scale the burden companies face in meeting their preservation obligations.

What stands out to me is that all of this could have been avoided if the parties had been able to reach agreement on sampling. To their credit, they tried–even taking the issue to mediation. But while this case is characterized as one about proportionality, preservation and sampling, perhaps it is really an object lesson in the importance of cooperation.