Catalyst Repository Systems - Powering Complex Legal Matters

E-Discovery Search Blog

Technology, Techniques and Best Practices

Courts Should Consider Search Technology, Say New Penn. E-Discovery Rules

The Supreme Court of Pennsylvania

The Supreme Court of Pennsylvania has adopted new e-discovery rules that expressly distance Pennsylvania practice from federal e-discovery jurisprudence and instead emphasize “traditional principles of proportionality under Pennsylvania law.” Notably, the new rules provide that, when weighing proportionality, parties and courts should consider electronic search and sampling technology, among other factors.

The court promulgated the new e-discovery rules June 6 as amendments to the Pennsylvania Rules of Civil Procedure. They take effect Aug. 1, 2012.

The most significant change is to Rule 4009.1, governing requests for the production of documents and things. The current rule defines “documents” as including:

electronically created data, and other compilations of data from which information can be obtained, translated, if necessary, by the respondent party or person upon whom the request or subpoena is served through detection or recovery devices into reasonably usable form.

The amendment deletes this entire phrase and replaces it with the simpler phrase, “electronically stored information.” The amended rule will now read:

Any party may serve a request upon a party pursuant to Rules 4009.11 and 4009.12 or a subpoena upon a person not a party pursuant to Rules 4009.21 through 4009.27 to produce and permit the requesting party, or someone acting on the party’s behalf, to inspect and copy any designated documents (including writings, drawings, graphs, charts, photographs, and electronically stored information), or to inspect, copy, test or sample any tangible things or electronically stored information, which constitute or contain matters within the scope of Rules 4003.1 through 4003.6 inclusive and which are in the possession, custody or control of the party or person upon whom the request or subpoena is served, and may do so one or more times.

But while the rule adopts the phrase used in the federal rules, the official comment makes clear that the court’s intent is not to adopt federal e-discovery law:

Though the term “electronically stored information” is used in these rules, there is no intent to incorporate the federal jurisprudence surrounding the discovery of electronically stored information. The treatment of such issues is to be determined by traditional principles of proportionality under Pennsylvania law as discussed in further detail below.

One other significant change is the addition of a new subparagraph (b) to Rule 4009.1, which addresses the form of production. The new rule says that the party requesting ESI may specify the format in which it is to be produced, to which the responding party may object. If the requesting party does not specify a format, then the ESI may be produced “in the form in which it is ordinarily maintained or in a reasonably usable form.”

Proportionality Should Prevail

The official comment to the amended rules emphasizes the importance of proportionality in determining the scope of discovery obligations. The overarching goal of the rules, the comment says, is to ensure that discovery is conducted in a manner that is “consistent with the just, speedy and inexpensive determination and resolution of litigation disputes.” To that end, the comment continues, courts faced with discovery disputes should consider five factors:

  1. The nature and scope of the litigation, including the importance and complexity of the issues and the amounts at stake.
  2. The relevance of ESI and its importance to the court’s adjudication in the given case.
  3. The cost, burden, and delay that may be imposed on the parties to deal with ESI.
  4. The ease of producing ESI and whether substantially similar information is available with less burden.
  5. Any other factors relevant under the circumstances.

The comment goes on to identify what it describes as “tools for addressing” ESI. It says:

Parties and courts may consider tools such as electronic searching, sampling, cost sharing, and non-waiver agreements to fairly allocate discovery burdens and costs. When utilizing non-waiver agreements, parties may wish to incorporate those agreements into court orders to maximize protection vis-à-vis third parties.

This language leaves much open to interpretation. Even so, it clearly encourages courts and parties to take technology into consideration when weighing discovery burdens and costs. Implicit in this, it seems fair to say, is the court’s recognition that search, sampling and tools such as predictive coding can significantly reduce both the burden and cost of e-discovery.

With these new rules, Pennsylvania’s Supreme Court has made clear its intent to chart its own route on e-discovery, independent of federal jurisprudence. It will be interesting to see how this course develops. Even so, in their own way, these new rules add to the growing body of law that recognizes the increasingly essential link between sophisticated technology and cost-effective e-discovery.

Catalyst’s Jim Eidelman Discusses Predictive Coding in ‘Law Technology News’

Now that U.S. District Judge Andrew L. Carter Jr. has affirmed the groundbreaking predictive coding order issued by U.S. Magistrate Judge Andrew J. Peck in Da Silva Moore v. Publicis Groupe, Law Technology News reporter Evan Koblentz went back and spoke to leading professionals in the legal technology field for their reactions. You can read his story here: Take Two: Reactions to ‘Da Silva Moore’ Predictive Coding Order.

Jim Eidelman

One of the people Koblentz quotes is Catalyst’s own Jim Eidelman, senior search and analytics consultant on the Catalyst Search & Analytics Consulting team. These court decisions gave predictive coding “a legitimacy that was needed,” Eidelman told Koblentz. But before predictive coding can fully enter the mainstream, engineers need to work out some of the technology’s limitations, he said.

“Obviously it is all about the process, the sampling, and the use of common sense,” Eidelman said. “Some documents can only be found other ways, and predictive coding isn’t a universal solution. Clearly multi-mode searching and review is required in every case, with or without da Silva.”

Eidelman goes on to discuss what he says is “one of the big defensibility issues nobody is talking about.” That issue is pre-culling using keyword searching — something that can leave relevant documents behind and taint the process.

“Other big issues are sampling methodologies, how multiple issues are handled, and attorney-client privilege,” Eidelman says in the article. “There is still so much to be worked out. We are just at the infancy of the machine learning applied to e-discovery documents, even though ‘relevance feedback’ has been used in other areas for decades.”

Read the full article with reactions from a number of technology professionals at Law Technology News.

Should the ‘Daubert’ Standard Apply to Predictive Coding? We May Know Soon

It’s been a month since U.S. Magistrate Judge Andrew J. Peck issued his seminal opinion on predictive coding, Da Silva Moore v. Publicis Groupe, and it continues to make waves. Notably, it appears that U.S. District Judge Andrew L. Carter Jr. will weigh in on the issue. On March 13, he entered an order granting plaintiffs’ request to submit additional briefing on their objections to Judge Peck’s order.

A key issue Judge Carter may need to address is one given short shrift in coverage of and commentary on Judge Peck’s opinion. Understandably, most of the commentary focused on the fact that Judge Peck’s opinion marked a milestone — the first judicial opinion to recognize that computer-assisted review is an acceptable way to search for electronically stored information.

But in the course of that opinion, Judge Peck made another significant ruling. He concluded that Federal Rule of Evidence 702 and the Supreme Court’s decision in Daubert v. Merrell Dow Pharmaceuticals do not apply to a court’s acceptance of a predictive-coding protocol.

Rule 702 and Daubert give trial judges the responsibility to act as “gatekeepers” to exclude unreliable scientific and technical expert testimony. Judge Peck reasoned that these did not apply to the Da Silva Moore case because no one was trying to put anything into evidence. Here is how he explained it:

If MSL sought to have its expert testify at trial and introduce the results of its ESI protocol into evidence, Daubert and Rule 702 would apply. Here, in contrast, the tens of thousands of emails that will be produced in discovery are not being offered into evidence at trial as the result of a scientific process or otherwise. The admissibility of specific emails at trial will depend upon each email itself (for example, whether it is hearsay, or a business record or party admission), not how it was found during discovery.

Rule 702 and Daubert simply are not applicable to how documents are searched for and found in discovery.

You may recall that before Judge Peck issued his written opinion in this case on Feb. 22, he made oral rulings at the motion hearing on Feb. 8. On Feb. 22, just as Judge Peck was issuing his written opinion, the plaintiffs filed objections to his Feb. 8 rulings. One of their central arguments was that Judge Peck erred in disregarding his gatekeeper role under Daubert.

Because predictive coding is a new and novel technology, they argued, Judge Peck should have required expert testimony regarding its reliability or appropriateness. They cite Magistrate Judge Paul Grimm’s well-known ruling in Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251, 260 n.10 (D. Md. 2008), where he said, “[R]esolving contested issues of whether a particular search and information retrieval method was appropriate … involves scientific, technical or specialized information.” Relying on this, the plaintiffs argued:

[A]t no point did the Magistrate review any evidence to support his decision. The Magistrate took no judicial notice of any documents or studies that support the reliability of MSL’s method, nor did he receive any affidavits or declarations from purported experts that supported the methodology of MSL’s method. To his credit, the Magistrate did ask the parties to bring the ESI experts they had hired to advise them regarding the creation of an ESI protocol. These experts, however, were never sworn in, and thus the statements they made in court at the hearings were not sworn testimony made under penalty of perjury. The Magistrate judge never asked for or evaluated the qualifications of these experts, nor were the parties given an opportunity to question or cross-examine the experts in order for the Court to make a finding regarding the reliability of the experts’ opinions. Thus, the Magistrate’s decision relies only on the arguments made by counsel.

On March 7, the defendants responded to plaintiffs’ objections. With regard to the Daubert issue, they took the same position as Judge Peck–that Rule 702 and Daubert do not apply to the methods used to take discovery, but only kick in when evidence is presented at trial.

Plaintiffs simply are incorrect in their assertion that Victor Stanley requires expert testimony regarding the methodology selected by a party to search for electronically stored information. Rather, this case only requires that the selected methodology was carefully planned by qualified persons, contains provisions for quality assurance, and is supported by persons with the requisite qualifications and experience.

After receiving the defendants’ response, the plaintiffs wrote to Judge Carter on March 9 asking for leave to file their own response.

[W]hile Plaintiffs were denied an opportunity to respond to Magistrate Judge Peck’s written opinion, MSL had the benefit of filing its opposition approximately two weeks after the written opinion had been issued. Indeed, MSL’s opposition brief and supporting expert declarations not only reference, but also largely rely upon Magistrate Judge Peck’s observations regarding Plaintiffs’ Objection.

Judge Carter granted the plaintiffs’ request on March 13. (The brief was due March 19.) That means that we can expect him to issue a ruling of his own. It seems unavoidable that any ruling he issues will address the core issue of the appropriateness of computer-assisted review, at least in this case. Most likely, he will also have to address this secondary issue of the applicability of Daubert. If he does, in fact, squarely address these issues–and regardless of whether he agrees with Judge Peck–his ruling will be yet another milestone for predictive coding.

Judge Peck Provides a Primer on Computer-Assisted Review

Magistrate Judge Andrew J. Peck issued a landmark decision in Monique Da Silva Moore v. MSL Group, filed on Feb. 24, 2012. This much-blogged-about decision made headlines as being the first judicial opinion to approve the process of “predictive coding,” which is one of the many terms people use to describe computer-assisted coding.

Well, Judge Peck did just that. As he hinted during his presentations at LegalTech, this was the first time a court had the opportunity to consider the propriety of computer-assisted coding. Without hesitation, Judge Peck ushered us into the next generation of e-discovery review—people assisted by a friendly robot. That set the e-discovery blogosphere buzzing, as Bob Ambrogi pointed out in an earlier post.

I recommend reading the decision (and its accompanying predictive-coding protocol) not for its result but for its reasoning. This is one of the best sources I have seen on the reasons for and processes underlying predictive coding. Indeed, Judge Peck provided a primer on how to conduct predictive coding that is must reading for anyone wanting to get up to speed on this process.

What is Computer-Assisted Review?

Judge Peck started by quoting from his earlier article in Law Technology News:

By computer-assisted coding, I mean tools (different vendors use different names) that use sophisticated algorithms to enable the computer to determine relevance, based on interaction with (i.e. training by) a human reviewer.

As Judge Peck concluded: “This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.”

Why Do We Need Computer-Assisted Review?

The answer for Judge Peck was simple: Other methods of finding relevant documents are expensive and less effective. As he explained:

  • The objective of e-discovery is to identify as many relevant documents as possible while reviewing as few non-relevant documents as possible.
  • Linear review is often too expensive. And although manual review is often seen as the “gold standard,” studies show that the computerized searches underlying predictive coding are at least as accurate as human review, if not more accurate.
  • Studies also show a high rate of disagreement among human reviewers as to whether a document is relevant. In most cases, the difference is attributable to human error or fatigue.
  • Keyword searches used to reduce data sets also miss a large percentage of relevant documents. The typical practice of opposing parties choosing keywords resembles a game of “Go Fish,” as Ralph Losey once pointed out.
  • Keyword searches are often over-inclusive, finding large numbers of irrelevant documents that increase review costs. They can also be under-inclusive, missing relevant documents. In one key study, the recall rate was just 20%.

Ultimately, Judge Peck reminded us of the goals underlying the Federal Rules of Civil Procedure. Perfection is not required. The goal is the “just, speedy, and inexpensive determination” of lawsuits.

Judge Peck concluded that the use of predictive coding was appropriate in this case for the following reasons:

  1. The parties’ agreement.
  2. The vast amount of ESI (over 3 million documents).
  3. The superiority of computer-assisted review over manual review or keyword searches.
  4. The need for cost effectiveness and proportionality.
  5. The transparent process proposed by the parties.

The last point was perhaps the most important factor leading to the decision: “MSL’s transparency in its proposed ESI search protocol made it easier for the Court to approve the use of predictive coding.”

How Does the Process Work?

The court attached the parties’ proposed protocol to the opinion. While it does not represent the only way to do computer-assisted review, it provides a helpful look into how the process works.

  1. The process in this case began with attorneys developing an understanding of the files and identifying a small number that would function as an initial seed set representative of the categories to be reviewed and coded. There are a number of ways to develop the seed set, including the use of search tools and other filters, interviews, key custodian review, etc. You can see more on this subject below.
  2. Opposing counsel should be advised of the hit counts and keyword searches used to develop the seed set and invited to submit their own keywords. They should also be provided with the resulting seed documents and allowed to review and comment on the coding done on the seed documents.
  3. The seed sets are then used to begin the predictive coding process. Each seed set (one per issue being reviewed) is used to begin training the software.
  4. The software uses each seed set to identify and prioritize all similar documents across the complete corpus under review. Counsel then review at least 500 of the computer-selected documents to confirm that the computer is properly categorizing them. This is a calibration process (a schematic of the full loop appears after this list).
  5. Transparency requires that opposing counsel be given a chance to review all non-privileged documents used in the calibration process. If the parties disagree on tagging, they meet and confer to resolve the dispute.
  6. At the conclusion of the training process, the system then identifies relevant documents from the larger set. These documents are reviewed manually for production. In this case, the producing party reserved the right to seek relief should too many documents be identified.
  7. Accuracy during the process should be tested and quality controlled by both judgmental and statistical sampling.
  8. Statistical sampling involves a small set of documents randomly selected from the total files to be tested. That allows the parties to project error rates from the sample.
  9. Here, the parties agreed on a series of issues that will, of necessity, vary in other cases. The key point is that the parties agree on the issues and test the coding during the process.
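
To make the loop concrete, here is a minimal schematic in Python. To be clear, this is my sketch of the workflow the protocol describes, not any vendor’s actual system: the train, score and human_review parameters are hypothetical stand-ins for the software’s training engine, its relevance scoring and attorney review.

```python
def run_predictive_coding(corpus, seed_judgments, train, score, human_review,
                          rounds=7, batch=500):
    """Schematic of the protocol's loop: train on judged documents, have
    attorneys review computer-selected documents to calibrate, and retrain."""
    judgments = dict(seed_judgments)            # doc_id -> True/False (seed set)
    model = train(judgments)
    for _ in range(rounds):                     # the protocol suggested 7 rounds
        unjudged = [d for d in corpus if d not in judgments]
        unjudged.sort(key=lambda d: score(model, d), reverse=True)
        for doc in unjudged[:batch]:            # review at least 500 selections
            judgments[doc] = human_review(doc)  # the calibration step
        model = train(judgments)                # fold new judgments back in
    return model, judgments
```

The essential point is the feedback cycle: each round of attorney judgments becomes training data for the next round.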

Random Samples

It is important to create an initial random sample from the entire document set. The parties used a 95% confidence level with a 2% margin of error. They determined that the sample size should be 2,399 documents. You can figure this out using one of the publicly available sample-size calculators, such as Raosoft, which we often use.
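
If you would rather compute the number than use a web calculator, the standard sample-size formula for a proportion reproduces it closely. This is a generic statistics sketch, not anything taken from the protocol itself:

```python
import math

def sample_size(z: float = 1.96, margin: float = 0.02, p: float = 0.5,
                population: int | None = None) -> int:
    """Documents to sample to estimate a proportion at the given confidence
    (z-score) and margin of error; p = 0.5 is the conservative worst case."""
    n = (z ** 2) * p * (1 - p) / margin ** 2      # infinite-population estimate
    if population is not None:
        n = n / (1 + (n - 1) / population)        # finite-population correction
    return math.ceil(n)

print(sample_size())                       # 2401
print(sample_size(population=3_000_000))   # 2400 against a 3-million-doc corpus
```

The bare formula gives 2,401; the parties’ 2,399 presumably reflects their particular calculator’s z-value or rounding, a difference with no practical significance.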

Seed Sets

The protocol goes on to describe a number of ways to generate seed sets including:

  • Agreed-upon search terms.
  • Judgmental analysis.
  • Concept search.

The parties frequently sampled the results from searches to evaluate their effectiveness.

There is at least a good blog post to be written about seed sets. Some computer-assisted coding systems, like the one used in this case, start their process with seed sets. The notion is that attorneys understand the case, know what is and is not relevant, and can train the system to recognize relevant documents more effectively than it could starting with no seed documents.

Others think this is a mistake. They believe that, however well-meaning, the attorneys will bias the system to find what they think is relevant and get self-reinforcing results. In this regard, they are suggesting that the attorneys will make the same mistakes found in keyword searches—thinking that you know which words will be most effective at finding your documents.

Systems following this logic urge the user to start from scratch, telling the system what is and is not relevant based on reviewing documents. As you do that, the system begins developing its own profile of relevant documents and builds out the searches. The belief is that the system may create a better search through this process than it might if you bias it with your seed documents.

There is a middle ground here as well. Many of the latter systems (no seed) will allow you to submit a limited number of seed documents as part of the training process. That may represent the best of both worlds or it may not, depending on your beliefs. The important point is that there are different approaches to computer-assisted review. This protocol shows you one approach only.

Training Iterations

The process involves a number of computer runs to find responsive documents. The parties started with a first set of potentially relevant documents based on analysis of the seed set. After that review, the computer was asked to consider the new tagging and find a second set for testing. Then a third and a fourth.

The protocol suggested that the parties run through this process seven times. The key is to watch the change in the number of relevant documents predicted by the system after each round of training. Once that change dropped below a delta of 5%, the parties had the option to stop. The notion is that the system has become stable by that point, with further review unlikely to uncover many more relevant documents.
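
The opinion does not spell out exactly how the delta is measured; one plausible reading, sketched below with invented round counts, compares the system’s predicted-relevant count from one training round to the next:

```python
def is_stable(prev_count: int, curr_count: int, threshold: float = 0.05) -> bool:
    """True once the change in the predicted-relevant document count between
    training rounds falls below the threshold (5% in this protocol)."""
    return prev_count > 0 and abs(curr_count - prev_count) / prev_count < threshold

# Invented counts after successive training rounds:
counts = [40_000, 52_000, 57_000, 58_500, 59_100]
for prev, curr in zip(counts, counts[1:]):
    print(f"{prev:,} -> {curr:,}:", "stable" if is_stable(prev, curr) else "keep training")
```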

Finishing the Process

Once training is complete and the system is “stable,” we move from computer-assisted to human-powered review. At that point, the producing party reviews all of the potentially responsive documents and produces accordingly.

Final QC Protocol

As a final stage, the parties need to focus on the potentially non-responsive documents—the ones the system says to ignore. The parties select a random sample (2,399 documents again) to see how many were, in fact, responsive.

These same documents (non-privileged ones) must be produced to the opposing party for review. If that party finds too many responsive documents in the sample or otherwise objects, it is time for a meet-and-confer to resolve the dispute. Failing that, you can always go to the court and fight it out.
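
Information-retrieval practitioners call this measurement “elusion”: the rate at which responsive documents turn up in the pile the system discarded. As a rough sketch (the function names and numbers here are mine, not the protocol’s):

```python
import random

def draw_qc_sample(non_responsive_ids: list, sample_size: int = 2399) -> list:
    """Random QC sample drawn from the documents marked non-responsive."""
    return random.sample(non_responsive_ids, min(sample_size, len(non_responsive_ids)))

def elusion_rate(responsive_found: int, sample_size: int = 2399) -> float:
    """Estimated fraction of the discard pile that is actually responsive."""
    return responsive_found / sample_size

# E.g., reviewers find 12 responsive documents in the 2,399-document sample:
print(f"{elusion_rate(12):.2%}")  # 0.50% -- whether that is low enough is for the parties
```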

Is This the Bible on Predictive Coding?

Certainly not. There are a lot of ways to approach this process. However, first opinions on any topic carry a lot of weight. We chose a profession that is guided by precedent, and these are first tracks on this new and exciting subject. The suggested procedures make sense to me and provide a starting point for your predictive coding efforts. This opinion and its accompanying protocol are important reading whether you are proposing or opposing the process for your next case.

 

In A Milestone for Predictive Coding, Judge Peck Says, ‘Go Ahead, Dive In!’

Lawyers and predictive coding are like kids around the swimming hole — no one wants to be the first to dive in for fear the water is cold or it harbors scary creatures. But once someone takes the lead, dives in and declares the water fine, everyone else is quick to follow.

That is why U.S. Magistrate Judge Andrew J. Peck’s opinion published Friday marks a major milestone for the use of predictive coding in e-discovery. It is the first judicial opinion in which a court has expressly approved the use of computer-assisted review.

Just last October, a prescient Judge Peck published an article in which he described (metaphorically speaking) the lawyers standing around the predictive-coding swimming hole:

To my knowledge, no reported case (federal or state) has ruled on the use of computer-assisted coding. While anecdotally it appears that some lawyers are using predictive coding technology, it also appears that many lawyers (and their clients) are waiting for a judicial decision approving of computer-assisted review.

Well, on Friday he gave them what they were waiting for. “This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases,” he wrote in Da Silva Moore v. Publicis Groupe.

Predictive Coding Delivers Precision and Value

We wrote about this case just two weeks ago, recounting Judge Peck’s on-the-record colloquy about predictive coding with counsel and their e-discovery consultants. Near the end of that hearing, after discussing the issue at length, Judge Peck stated, “This may be for the benefit of the greater bar, but I may wind up issuing an opinion on some of what we did today.” If he did issue an opinion regarding predictive coding, I said then, it would no doubt be a milestone in the industry’s adoption of the technique.

In the opinion, Judge Peck readily concedes that his decision to allow predictive coding in this case was relatively easy, given that the parties agreed to its use. Their disagreement hinged on how to implement predictive coding, not whether it should be used.

Even so, he forcefully makes the case for predictive coding and computer-assisted review. In a section of the opinion titled, “Further Analysis and Lessons for the Future,” he describes why predictive coding delivers greater precision and value than either manual review or keyword searching.

The good old-fashioned method of linear, manual review is simply too expensive to be practical in cases with large numbers of documents, Judge Peck notes. In Da Silva Moore, for example, there are over 3 million emails. He then goes on to challenge the “myth” that manual review is the “gold standard.”

Moreover, while some lawyers still consider manual review to be the “gold standard,” that is a myth, as statistics clearly show that computerized searches are at least as accurate, if not more so, than manual review. Herb Roitblat, Anne Kershaw, and Patrick Oot of the Electronic Discovery Institute conducted an empirical assessment to “answer the question of whether there was a benefit to engaging in a traditional human review or whether computer systems could be relied on to produce comparable results,” and concluded that “[o]n every measure, the performance of the two computer systems was at least as accurate (measured against the original review) as that of human re-review.”

Likewise, Wachtell, Lipton, Rosen & Katz litigation counsel Maura Grossman and University of Waterloo professor Gordon Cormack studied data from the Text REtrieval Conference (TREC) Legal Track and concluded that: “[T]he myth that exhaustive manual review is the most effective – and therefore the most defensible – approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.” … The technology-assisted reviews in the Grossman-Cormack article also demonstrated significant cost savings over manual review: “The technology-assisted reviews require, on average, human review of only 1.9% of the documents, a fifty-fold savings over exhaustive manual review.” (Citations omitted.)

The Limits of Keyword Searching

Next, Judge Peck reviews the limitations of keyword searching. Keyword searching has its place, he acknowledges, to cull data to a more manageable volume and to identify documents for predictive-coding seed sets. But far too often, he writes, “the way lawyers choose keywords is the equivalent of the child’s game of ‘Go Fish.’” (Judge Peck cites Ralph Losey’s blog for the analogy. We’ve used it here in a different context.)

The requesting party guesses which keywords might produce evidence to support its case without having much, if any, knowledge of the responding party’s “cards” (i.e., the terminology used by the responding party’s custodians). Indeed, the responding party’s counsel often does not know what is in its own client’s “cards.”

Another problem with keyword searching is that it is often over-inclusive, finding large numbers of irrelevant documents, Judge Peck writes. Even worse, keyword searches are not very effective, he points out. He cites the landmark e-discovery study by David C. Blair and M.E. Maron, An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System, 28 COMMC’NS OF THE ACM 289, 295 (1985). In that study, the attorneys were confident that their searches had found more than 75% of the responsive documents. But they were wrong. In fact, the searches had only found 20% of the relevant documents. More recent TREC Legal Track studies have corroborated these results.
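
Two terms of art do the work in these studies: recall, the share of all relevant documents a search finds, and precision, the share of retrieved documents that are relevant. A toy computation with invented numbers in the spirit of the Blair-Maron result:

```python
def recall(relevant_found: int, relevant_total: int) -> float:
    """Fraction of all relevant documents that the search retrieved."""
    return relevant_found / relevant_total

def precision(relevant_found: int, total_retrieved: int) -> float:
    """Fraction of retrieved documents that are actually relevant."""
    return relevant_found / total_retrieved

# A keyword search retrieves 10,000 documents, 2,000 of them relevant,
# out of 10,000 truly relevant documents in the collection:
print(f"recall:    {recall(2_000, 10_000):.0%}")     # 20% -- most evidence missed
print(f"precision: {precision(2_000, 10_000):.0%}")  # 20% -- most hits are noise
```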

Given the weaknesses in both manual review and keyword searching, Judge Peck concludes:

Computer-assisted review appears to be better than the available alternatives, and thus should be used in appropriate cases. While this Court recognizes that computer-assisted review is not perfect, the Federal Rules of Civil Procedure do not require perfection. … Courts and litigants must be cognizant of the aim of Rule 1, to “secure the just, speedy, and inexpensive determination” of lawsuits. Fed. R. Civ. P. 1. That goal is further reinforced by the proportionality doctrine set forth in Rule 26(b)(2)(C).

Returning to the case at bar, Judge Peck lists the factors that led him to conclude that predictive coding was appropriate:

  1. The parties’ agreement.
  2. The vast amount of ESI to be reviewed (over 3 million documents).
  3. The superiority of computer-assisted review to the available alternatives (i.e., linear manual review or keyword searches).
  4. The need for cost effectiveness and proportionality under Rule 26(b)(2)(C).
  5. The transparent process proposed by defendants.

With reference to the transparency of the process, Judge Peck stresses that this was a key factor in his decision, describing himself as a strong supporter of the Sedona Conference Cooperation Proclamation. While not all e-discovery counsel will be as transparent as counsel were in this case, “such transparency allows the opposing counsel (and the Court) to be more comfortable with computer-assisted review, reducing fears about the so-called ‘black box’ of the technology. This Court highly recommends that counsel in future cases be willing to at least discuss, if not agree to, such transparency in the computer-assisted review process.”

Other ‘Lessons for the Future’

Magistrate Judge Andrew Peck

Judge Peck wraps up his “Lessons for the Future” with four additional points:

  • It is unlikely that courts will be able to approve a party’s proposal as to when review and production can stop until the computer-assisted review software has been trained and the results are quality-control verified. Only at that point can the parties and the court see where there is a clear drop-off from highly relevant to marginally relevant to not-likely-to-be-relevant documents.
  • Staging of discovery by starting with the sources and custodians that are most likely to be relevant–without prejudice to the requesting party seeking more after conclusion of that first stage review–is a way to control discovery costs.
  • In many cases, the requesting party’s client will have knowledge of the producing party’s records, either because of an employment relationship or because of other dealings between the parties. Lawyers should be sure to use their client’s knowledge of the opposing party’s custodians and document sources. Think of cooperation to mean “strategic proactive disclosure of information,” Judge Peck advises. “If you are knowledgeable about and tell the other side who your key custodians are and how you propose to search for the requested documents, opposing counsel and the court are more apt to agree to your approach.”
  • It was helpful that the parties had their e-discovery vendors present at the court hearings where the ESI protocol was discussed, Judge Peck says. “Even where as here counsel is very familiar with ESI issues, it is very helpful to have the parties’ e-discovery vendors (or in-house IT personnel or in-house e-discovery counsel) present at court conferences where ESI issues are being discussed. It also is important for the vendors and/or knowledgeable counsel to be able to explain complicated e-discovery concepts in ways that make it easily understandable to judges who may not be tech-savvy.”

What This Means for E-Discovery

No one need wonder what this opinion means for the future of e-discovery, because Judge Peck answers that question himself.

The opinion does not mean that computer-assisted review must be used in all cases, he says. Nor should the opinion be considered an endorsement of any particular vendor or of any particular review tool.

What the Bar should take away from this Opinion is that computer-assisted review is an available tool and should be seriously considered for use in large-data-volume cases where it may save the producing party (or both parties) significant amounts of legal fees in document review. Counsel no longer have to worry about being the “first” or “guinea pig” for judicial acceptance of computer-assisted review. As with keywords or any other technological solution to e-discovery, counsel must design an appropriate process, including use of available technology, with appropriate quality control testing, to review and produce relevant ESI while adhering to Rule 1 and Rule 26(b)(2)(C) proportionality. Computer-assisted review now can be considered judicially-approved for use in appropriate cases.

In other words, Judge Peck has taken the dive into the scary pool of predictive coding and now he is beckoning to all of us, “Come on in, the water’s fine!”

An On-the-Record Colloquy about Predictive Coding With Judge Peck

U.S. Magistrate Judge Andrew J. Peck

We all talk all the time about predictive coding, but it is not often that you get perspective on it direct from the battle-scarred trenches of high-stakes litigation. Over at Law Technology News, editor Sean Doherty reports on a recent hearing before U.S. Magistrate Judge Andrew J. Peck of the Southern District of New York in which Judge Peck ordered the parties to adopt a protocol for e-discovery that includes the use of predictive coding. It appears to be the first federal case to formally endorse the use of predictive coding, Doherty writes.

The case, Monique Da Silva Moore v. Publicis Groupe, is a class action alleging widespread discrimination against women employed by one of the world’s “big four” advertising conglomerates. In a sometimes contentious Feb. 8 teleconference with Judge Peck — of which LTN has published the transcript — the parties debate sanctions and various other e-discovery issues before getting down to the brass tacks of predictive coding. It was a hearing that started with Judge Peck showing little patience for the parties’ inability to cooperate. At one point, the exasperated judge declares to plaintiffs’ counsel:

Stop. Please. I take judicial notice of the fact that you don’t like the defendants. Stop whining and let’s talk substance. I don’t care how we got here and I’m not giving anyone money today. In the future not only will there be sanctions for whoever wins or loses these discovery disputes — and so far you’re one for two, I think — there will be sanctions payable to the clerk of court for wasting my time because you can’t cooperate.

But as the hearing progressed, Judge Peck seemed to appreciate the fact that there were two e-discovery consultants on the call, Paul Neale, CEO of DOAR Litigation Consulting, for the plaintiffs, and David Baskin, a vice president at Recommind, for the defendants. The two consultants had participated in negotiations among the parties to draft a protocol for using predictive coding to identify documents relevant to the discrimination claim.

Still, the transcript is replete with examples of how convoluted the conversation can get when it comes to search parameters and the use of predictive coding. Consider the following exchange about search with regard to the emails of “comparators” — men who performed jobs comparable to those of the women plaintiffs.

THE COURT: This is a case where the plaintiffs worked at the company. What is it that you expect to see in the comparators’ email that is relevant? Describe the concepts to me. Frankly, I don’t disagree that whether they are comparators or not is a relevant issue, but I don’t see why, if you want to find out what their job duties were and these people have no stake in the case, you don’t just take their deposition.

MS. BAINS: We do want to take their depositions. To answer your question about the specific things we would be looking for, for example, one of the plaintiffs testified about her job duties, including client contact. We would look for client contact in the comparators.

THE COURT: That’s ridiculous. That means basically forget sophisticated searches, any email from one of these comparators to or from a client is relevant?

MS. BAINS: I mean on the substantive issues regarding contacts.

THE COURT: How do you train a computer for that? How do you do a key word on that? I’m having a very hard time seeing what it is you expect. You’ve got the plaintiffs’ emails. If you don’t have their emails, you have their memory of them. If comparator whoever, Kelly Dencker, I don’t know if that is a he Kelly or a she Kelly, but if Kelly wrote to a client and said, I’d like to meet with you next week to discuss the following presentation, that’s what you’re looking for?

MS. BAINS: That would be part of it.

THE COURT: What else? You keep giving me this is part of it. If you want me to order this done, you’ve got to tell me how it is that it could be done in a reasonable way.

MS. BAINS: I think we could treat the comparators as a separate search.

THE COURT: Then what is that search going to be? Also, by the way, we’ve gone from throw the comparators into the bundle but do a little key word screening first to reduce volume to now we are at the let’s do the comparators separate, and I’m still not hearing how you’re going to search through their emails separately.

MS. BAINS: One of our allegations is that they were given opportunities, including job assignments, etc., that plaintiffs weren’t.

THE COURT: That is basically every substantive email, every business email they have. All right, comparators are out at this time without prejudice to you coming up with some scientific way to get at this. Otherwise, take the deposition and go from there.

From there, the transcript evolves into a lengthy colloquy about the parties’ draft protocol for predictive coding. Whereas Judge Peck seemed impatient earlier in the hearing, here he engages in an in-depth back-and-forth with the two search consultants and counsel, trying to fully understand each side’s relative positions. Here is an excerpt:

MR. BASKIN: Judge, from what I understand, the request is not to do the random sample iterations, finish the iterations. I’m still not understanding.

THE COURT: What they are saying is each time you run it, whether it’s 7 or less, and it may be two different things to satisfy yourself on the defense side and something else to satisfy the plaintiffs, but whether you do the 500 best documents or not, the 500 and possibly more, Mr. Neale was suggesting that on each iteration there is a random sample drawn and the computer will have coded some of those as relevant and some of them as not relevant; and if it is miscoding the documents that are not relevant, then there’s a problem.

MR. BASKIN: Let me clarify. The computer doesn’t code documents. The computer suggests documents that are potentially relevant or similar.

THE COURT: Same thing.

MR. BASKIN: What happens is during the seven iterations, all the defense attorneys are going to do is refine the documents that they are looking at. After the seven iterations, what you are getting is a sum of it all. Then you are performing a random sample. Doing random samples in between makes no sense. The actual sum of the seven iterations will just be the sum of that. You are refining and learning.

THE COURT: What Mr. Neale is saying is that you might not have to do it seven times and that the sooner you find out how well the seed set or the training has worked, the better.

MR. BASKIN: What’s going to happen, at least from what I understand the request to be, is that you do one iteration, which is 500, then you do 2399 samples, then you do another iteration, do another 2399. I think they are looking for the 7 times 2400 plus the 500 each. We are looking at 21,000.

MR. NEALE: That’s not what we are suggesting. We are actually suggesting that each iteration be one sample randomly selected of 2399, indicating which of those the system would have flagged as relevant so we know the difference in the way in which it is being categorized.

MR. ANDERS: I would think, too, we are now just completely missing the power of the system. What we were going to review at each iteration are the different concept groups where the computer is taking not only documents it thinks are relevant but it has clustered them together and we can now focus on what is relevant to this case. By reverting back to a random sample after each iteration, we are losing out on all the ranking and all the other functionality of this system. It doesn’t seem to make sense to me.

THE COURT: I’m not sure I understand the seven iterations. As I understand computer-assisted review, you want to train the system and stabilize it.

MR. BASKIN: If I may. What happens when you seed the particular category is you take documents, you review them. The relevant documents are now teaching the system that these are good documents.

THE COURT: Right.

MR. BASKIN: It also takes the irrelevant documents and says these are not good documents. It continues to add more relevant documents and less irrelevant documents into the iterations. The seven iterations will then refine that set and continue to add the responsive documents to each category. At the end of that, after seven iterations, you will have not only positive responsive documents, also the nonresponsive documents, but the last set of computer-suggested documents the system suggests. From that point the defense is saying we can then verify with a 95 percent plus or minus 2 of 2399 to see if there is anything else that the system did not find.

THE COURT: Let me make sure I understand the iterations then. Is the idea that you are looking at different things in each iteration?

MR. BASKIN: Correct. It’s learning from the input by the attorneys. That’s the difference. That’s why the random sample makes no sense.

MR. NEALE: I don’t doubt that that is how Recommind proposes to do it. Other systems are, however, –

THE COURT: We are stuck with their black box.

MR. NEALE: — fine to do it.

MR. BASKIN: It’s not a black box. We actually show everything that we are doing.

THE COURT: I’m using “black box” in the legal tech way of talking. Let’s try it this way, then we’ll see where it goes. To the extent there is a difference between plaintiffs’ expert and the defendants’ on what to do — and to the extent I’m coming down on your side now, on the defense side, that doesn’t give you a free pass — random sample or supplemented random sample, once you tell me and them the system is trained, it’s in great shape, and there are not going to be very many documents, there will be some but there are not going to be many, coded as irrelevant that really are relevant, and certainly there are not going to be any documents coded as irrelevant that are smoking guns or game changers, if it turns out that that is proved wrong, then you may at great expense have to redo everything and do it more like the way Mr. Neale wants to do it or whatever.

For the moment, since I think I understand the training process, and going random is not necessarily going to help at that stage, and since Mr. Neale and the lawyers for the plaintiffs are going to be involved with you at all of these stages, let’s see how it develops.

Eventually, the parties work through to an agreement on the predictive-coding protocol. But before the discussion is concluded, Judge Peck gets in what he describes as a “tweak” about the Recommind predictive coding patent that caused such a stir when it was announced last year. Here is how it went:

MR. BASKIN: The system could return 300 documents in the first iteration. At that point you can’t do 2399. I’m actually impartial. I designed the system. I work for the company, and I’m not getting paid for this. I just wanted to let you know that 7 iterations from a quality perspective is better to the plaintiff.

MR. NEALE: It is also inconsistent with your patent, which suggests that you do the iterations until the system tells you it’s got it right. Speaking to the limit on that without having done it is not consistent with your own patent and with what is generally accepted as best practice.

THE COURT: They also claim to have a patent on the word “predictive coding” or a trademark or a copyright. We know where that went in the industry. But I’m just tweaking you.

Near the end of the transcript, as the hearing is wrapping up, Judge Peck states, “This may be for the benefit of the greater bar, but I may wind up issuing an opinion on some of what we did today.” If he does issue an opinion regarding predictive coding, it would no doubt be a milestone in the industry’s adoption of the technique.

‘Mt. Hawley’ Affirmed and Claim Dismissed: District Judge Again Puts His Stamp of Approval on Troubling Rulings

For over a year, we have been writing about a West Virginia decision (and its progeny) that we believe went too far in making new e-discovery law. The original decision, issued May 18, 2010, was styled Mt. Hawley Insurance Co. v. Felman Production. You can read my original post at: Bad Facts Make Bad Law: ‘Mt. Hawley’ A Step Backward for Rule 502(b).

In that decision, Magistrate Judge Mary E. Stanley held that Felman had waived attorney-client privilege by inadvertently producing a smoking-gun email to counsel suggesting that it might be helpful to their insurance claim for business interruption to backdate several orders from clients. If the orders had come in while the machinery in question was under repair, that might provide support for their $38 million insurance claim. You have to love their chutzpah, at the very least.

The 'smoking-gun' email involved a furnace such as this one, at Felman's West Virginia facility.

In my original post, I suggested that bad facts (outright fraud, it seemed to me) might be responsible for what I thought was bad law. After all, the production had been overseen by a highly reputable law firm (which had no involvement in this email). Counsel had not only been diligent in trying to screen out privileged documents, but had gone far beyond what we have typically seen elsewhere. Indeed, counsel cited over 20 steps they had taken, including a variety of review and sampling efforts:

  1. Negotiated the ESI stipulation with defendants.
  2. Hired an ESI collection vendor, Innovative Discovery.
  3. Discussed with Felman’s IT department the company’s computer network structure and identified potential sources of relevant ESI.
  4. Visited Felman’s West Virginia plant to coordinate and oversee ESI collection.
  5. Decided to collect data using forensic imaging.
  6. Directed the vendor to collect ESI from the current server and the backup server.
  7. Collected 1,638 gigabytes of data.
  8. Downloaded emails from 29 custodians for processing by its law firm, Venable.
  9. Hired a new vendor to process Felman’s Oracle and Solomon databases.
  10. Identified the first six workstations to be processed and learned that each contained more data than anticipated.
  11. Examined methods to cull non-relevant materials.
  12. Selected search terms to retrieve documents responsive to defendants’ document requests.
  13. Tested the search terms against the Felman emails and added additional search terms.
  14. Tested the search terms, including the additional terms, against the Felman emails, tagged responsive documents, and set them aside for privilege review.
  15. Produced 17,064 Excel spreadsheets.
  16. Selected privilege search terms to identify materials which are potentially privileged and relevant.
  17. Set aside potentially privileged materials for individualized document-by-document review for relevancy and privilege.
  18. Tested the privilege search terms against Felman’s emails.
  19. Retrieved native files of all images and examined thumbnails.
  20. Conducted “eyes-on” review of all documents identified both as relevant and potentially privileged.
  21. Decided to use a vendor to complete the processing of Felman’s emails.
  22. Produced ESI in native or TIF format, with 36 fields of metadata.
  23. Produced more than 346 gigabytes of data without sampling for relevancy, over-inclusiveness or under-inclusiveness.

Counsel got nailed simply because several of the Concordance indexes they used turned out to be corrupt. As a result, privilege searches didn’t turn up anything for the documents in those indexes. Since counsel didn’t attempt to review every one of the millions of documents they produced, several key documents slipped through the net. Reading between the lines, Magistrate Judge Stanley seemed to lay blame on the fact that they did not appear to sample the documents that they produced but never reviewed to see if any might be privileged.

We wrote about subsequent decisions, as well as other commentary on the original ruling, in earlier posts on this blog.

Now, in what has to be the final straw in this saga, the presiding judge in the case, U.S. District Judge Robert C. Chambers, has taken the ultimate step by issuing further sanctions and dismissing the lost-business claim: Felman Production v. Industrial Risk Insurers, 2011 U.S. Dist. LEXIS 112161 (Sept. 29, 2011).

Let us look at the court’s reasoning.

Bad Discovery Practices

U.S. District Judge Robert C. Chambers

You can’t read any of the Mt. Hawley decisions without being reminded that both the magistrate judge and the district judge were unhappy with Felman’s discovery practices. Among other practices we have chronicled, you won’t win favor with the courts by backing up the trucks and dumping a ton of irrelevant electronic files on the opposition. When you go the next step and mark every one of them confidential, regardless of content, you only worsen your position. The judge was particularly incensed to see pictures of kitties with a big confidential stamp on them. Kitties. Yes, kitties. Awww, how cute. Oh, but there were a couple of naked men in the photos too (although not with the kitties, thank heavens).

There were also missing files and no real attempt to issue a litigation hold by Privat, the Ukrainian company that was at the controls in this case. Indeed, it appears that Felman actively dissembled with respect to its true owners, a fun-loving bunch of Ukrainians with little respect for the discovery process. They got caught when one of the inadvertently produced documents showed that they were running the show. Oh, what a tangled web they wove!

As Judge Chambers explained:

Felman’s failure to comply with Judge Stanley’s August 19 and October 19, 2010 Orders was inevitable in light of the lack of care Felman exercised. … Felman did not provide litigation hold memos to the West Virginia Felman staff until four months after this case was filed. Felman also admitted that the Ukrainian custodians were not instructed to preserve their documents.

This led to the destruction of documents when the Privat representatives sold their computers before receiving a document request. Convenient, to say the least.

All this led to a motion for sanctions—either a dismissal outright or dismissal of the business interruption claims plus an adverse inference instruction regarding the missing documents.

The judge made short work of the motion. While not dismissing the case outright, he did dismiss the $38 million business interruption claim. He started by discussing spoliation, citing Magistrate Judge Grimm in Victor Stanley, Inc. v. Creative Pipe, Inc., 269 F.R.D. 497, 522 (D. Md. 2010).

A party subject to [the duty to preserve] must “identify, locate, and maintain information that is relevant to specific, predictable, and identifiable litigation.” In ascertaining whether a party has fulfilled its duty to preserve, a court must “determine reasonableness under the circumstances … [which] in turn depends on whether what was done—or not done—was proportional to that case and consistent with clearly established applicable standards.”

The court went on to find specific culpability—gross negligence—in Felman’s failure to issue litigation holds and otherwise take steps to preserve evidence.

In the end, the court didn’t dismiss all claims but rather threw out the big one—for business interruption. It left the other claims and counterclaims to be tried.

The sanctionable conduct of Felman and the resulting prejudice to Defendants merits dismissal of the business interruption claim because the unavailable evidence and improper conduct related predominantly to it. As to Defendant’s counterclaim, the unavailable evidence is less prejudicial and an adverse inference instruction is an adequate remedy.

The court also went on to award attorneys’ fees in the bargain.

What Happens Next?

My guess is that we won’t hear any more from the parties in this case. The business interruption claim was the heart of Felman’s demand in the first place. An adverse inference instruction is a powerful tool to address the rest of the claims, at least to the extent Felman oversteps the bounds of its insurance policy. Settlement is the likely next step in this matter.

What about a malpractice claim? Given the Felman parties’ antics, I wouldn’t put it past them. They might claim malpractice for producing the damaging documents. With a bogus claim for $38 million at stake, who knows what they might do. At the least, this would present interesting evidentiary questions for the firm’s malpractice carrier. I wouldn’t want to be in the settlement meeting.

But my concern is for other cases more than for how this one wraps up. With 23 steps being ruled as not enough, what will be adequate? Perhaps the problem could have been remedied by some simple sampling procedures, but that isn’t clear from anything I read. Perhaps enough other judges will choose to ignore it so that it becomes weak precedent. “A derelict on the waters of the law,” as famed Supreme Court Justice Felix Frankfurter once said. That’s my vote. Send the bad guys home but leave e-discovery law alone.

Does Sampling Case Set a ‘Dangerous Precedent’?

At her On the Case blog for Thomson Reuters, Alison Frankel has an intriguing report about a U.S. magistrate judge’s order in an e-discovery dispute that has prompted the U.S. Chamber of Commerce to leap into the fray, warning that the order, if allowed to stand, will set “a dangerous precedent” and will be of “profound significance to businesses in America.” In what Frankel describes as a “venture into the weeds of a federal district court discovery dispute,” the Chamber has filed an amicus brief in U.S. District Court in Manhattan asking a federal judge to overturn the order. The Washington Legal Foundation and the International Association of Defense Counsel have also weighed in as amici.

Making this brouhaha even more notable is that this case is still only in the preliminary stage of litigation. It is a putative class action yet to be certified as such and one where the judge has stayed all discovery pending a decision on certification of the class.

Where, then, do the amici see such profound danger? The danger, they contend, lies in the magistrate judge’s refusal to allow the defendant to use sampling in order to limit the scope of its preservation obligation.

Thousands of Hard Drives

The case, Pippins v. KPMG, is a wage-and-hour dispute in which plaintiffs challenge KPMG’s treatment of accounting associates in its audit practice as exempt employees under federal and New York labor laws. While awaiting a ruling on whether the case will be certified as a class action, KPMG asked the court to issue a protective order limiting the scope of its duty to preserve.

Rather than preserve the hard drives of thousands of former employees who would fall under a nationwide class action, KPMG instead sought to preserve only a random sample of 100 hard drives. Against that sample of hard drives, plaintiffs could then apply keyword searches to determine whether they contain information relevant to the case.

Motivating KPMG to make this request was cost. It estimated that the potential class of plaintiffs could number 7,500 and that the cost to preserve each drive would be $600. It had already spent $1.5 million preserving the drives of 2,500 former audit associates and faced another $3 million bill to preserve the remainder.
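The arithmetic behind those figures is straightforward, as this back-of-the-envelope Python check shows:

    class_size     = 7_500   # potential class members, per KPMG's estimate
    cost_per_drive = 600     # dollars to preserve each drive
    preserved      = 2_500   # drives already preserved

    print(f"Total exposure: ${class_size * cost_per_drive:,}")                 # $4,500,000
    print(f"Already spent:  ${preserved * cost_per_drive:,}")                  # $1,500,000
    print(f"Still to come:  ${(class_size - preserved) * cost_per_drive:,}")   # $3,000,000

In other words, KPMG was staring at a preservation bill of some $4.5 million for drives that might never be searched.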

Given this cost, KPMG argued that the court should apply a proportionality test, reconciling its duty to preserve discoverable material with the burden that preservation would impose.

Plaintiffs objected to KPMG’s request on three grounds:

  • Allowing KPMG to destroy hard drives at this early stage of the litigation would be premature and would prevent the parties from crafting a more-informed plan for preservation and production and from having the information they need to generate search terms.
  • Using keyword searches alone, as KPMG proposed, is an “outmoded method of data recovery” that does not reflect context or capture all relevant materials.
  • Keyword searches, alone, would not be sufficient for plaintiffs to cull information from the drives about employees’ hours worked and work product.

Even with these objections, the plaintiffs were not opposed to the use of sampling. In fact, the parties negotiated extensively and even went through mediation in an attempt to come to terms on sampling. Only when they could not agree did KPMG seek the protective order.

(As an aside, I am told by Jim Eidelman, principal consultant in Catalyst’s Search & Analytics Consulting group, that the plaintiffs are correct that keyword searching won’t do the job. Plaintiffs’ counsel would want to look at the time stamps on the files to see when the employees were working and at the content of the documents to show the nature of the work, he says.)

Too Many Unknowns to Permit Destruction

On Oct. 7, 2011, U.S. Magistrate Judge James L. Cott issued a memorandum and order denying KPMG’s motion for a protective order. In his memorandum, Judge Cott stepped through a detailed analysis of KPMG’s duty to preserve the hard drives and of the applicability of a proportionality test. Yet the gist of his analysis seemed to boil down to one factor — it was just too early in the case to let KPMG destroy all those hard drives.

[C]ourts in this district have cautioned against the application of a proportionality test as it relates to preservation. … At this point in the litigation, it is unclear whether an application of a proportionality test would weigh in favor of a protective order. Certainly KPMG’s preservation of the hard drives is not without considerable expense. … However, KPMG has not been able to establish conclusively that the materials contained on the hard drives are of either “little value” or “not unique.” … Until discovery proceeds and the parties can resolve what materials are contained on the hard drives and whether those materials are responsive to Plaintiffs’ document requests, it would be premature to permit the destruction of any hard drives. Moreover, the parties should be able to make such a determination promptly once the Motion to Certify is resolved and the stay of discovery is lifted. Because it is not possible to predict when that determination will be made, it is similarly difficult to conclude what KPMG’s costs of preservation will be on an ongoing basis. With so many unknowns involved at this stage in the litigation, permitting KPMG to destroy the hard drives is simply not appropriate at this time.

In concluding his memorandum, Judge Cott made clear that the parties remained free to negotiate an agreement on a method for preserving only a sample of the hard drives. But unless the parties could reach agreement, and until the contours of the class could be defined, he ordered KPMG to continue its preservation efforts.

Case of ‘Exceptional Importance’

That order is now pending review by U.S. District Judge Colleen McMahon, where the Chamber and the other amici have filed their briefs. Noting that it rarely files amicus briefs outside of appellate courts and does so only in cases of “exceptional importance,” the Chamber says, “This is such a case.”

The Magistrate Judge’s opinion reached an unprecedented conclusion here: that, faced with an uncertified class or collective action alleging that employees were not properly compensated for overtime, KPMG, at considerable expense, has to rip out and retain every single hard drive from every computer that any member of the putative class or collective may have used before leaving the company. This KPMG must do, said the Judge, even though there is a database that directly recorded the employees’ hours, and even though virtually all of the data on the hard drives would be irrelevant to the case.

The Magistrate Judge made two errors of law that led to this novel conclusion. First, he held that the duty to preserve electronically stored information was not limited by any test of proportionality. Second, he held that every member of the proposed plaintiff class or collective action was a “key player” for purposes of discovery and the retention of electronic information. Both holdings are wrong, unprecedented, and—if affirmed here and followed by other courts—would be highly detrimental to the conduct of civil litigation under the Federal Rules.

Are the consequences of this ruling as dire as the Chamber predicts? Probably not. As I noted above, the nub of the ruling is that it was simply too early in the case to permit destruction, and destruction is forever. But the case illustrates on a broad scale the burden companies face in meeting their preservation obligations.

What stands out to me is that all of this could have been avoided if the parties had been able to reach agreement on sampling. To their credit, they tried, even taking the issue to mediation. But while this case is characterized as one about proportionality, preservation and sampling, perhaps it is really an object lesson in the importance of cooperation.

Smart Sampling in E-Discovery: Reduce Document Review Costs Without Compromising Results

The cover of the October 2011 Tennessee Bar Journal features an article, Smart Sampling in E-Discovery: Reduce Document Review Costs Without Compromising Results. The article was co-authored by John Tredennick, Catalyst founder and CEO, and Tom Turner, president and co-founder of Document Solutions Inc.

The article addresses a problem that is becoming more common in e-discovery: As we increasingly rely on search technology to help identify relevant documents and to exclude privileged documents, how can we be assured that our searches are not overlooking key documents?

One answer to that problem is data sampling, the authors write. Sampling is the process whereby the producing party reviews a sample set of documents and extrapolates the results to the entire document population.
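To make the extrapolation step concrete, here is a minimal sketch in Python with invented numbers (not drawn from the article): review a random sample, compute the rate of responsive documents found, and attach a standard 95 percent confidence interval before projecting to the full population:

    import math

    population = 1_000_000   # total documents in the collection (hypothetical)
    n          = 1_500       # documents reviewed in the sample
    hits       = 120         # sampled documents found responsive

    p = hits / n                                 # sample proportion: 8.0%
    margin = 1.96 * math.sqrt(p * (1 - p) / n)   # 95% margin of error (normal approximation)

    low, high = p - margin, p + margin
    print(f"Estimated responsive rate: {p:.1%} +/- {margin:.1%}")
    print(f"Projected responsive documents: {int(low * population):,} to {int(high * population):,}")

The same calculation runs in the other direction: before reviewing anything, a party can solve for the sample size needed to achieve a desired margin of error.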

In the article, Tredennick and Turner explain why sampling should be used in e-discovery. They look at recent court decisions that suggest that sampling may not just be useful but also required in many circumstances. Finally, they review how sampling works at different stages of e-discovery and discuss the most effective techniques.

Read the article online at the Tennessee Bar Journal website or download the PDF.