A TAR is Born: Continuous Active Learning Brings Increased Savings While Solving Real-World Review Problems

In July 2014, attorney Maura Grossman and professor Gordon Cormack introduced a new protocol for Technology Assisted Review that they showed could cut review time and costs substantially. Called Continuous Active Learning (“CAL”), this new approach differed from traditional TAR methods because it employed continuous learning throughout the review, rather than the one-time training used by most TAR technologies.


Barbra Streisand in ‘A Star is Born’

Their peer-reviewed research paper, “Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery,” also showed that using random documents was the least effective method for training a TAR system. Overall, they showed that CAL solved a number of real-world problems that had bedeviled review managers using TAR 1.0 protocols.

Not surprisingly, their research caused a stir. Some heralded its common-sense findings about continuous learning and the inefficiency of using random seeds for training. Others challenged the results, arguing that one-time training is good enough and that using random seeds eliminates bias. We were pleased that it confirmed our earlier research and legitimized our approach, which we call TAR 2.0.

Indeed, thanks to the Grossman/Cormack study, a new star was born.

How Does CAL Work?

CAL turns out to be much easier to understand and implement than the more complicated protocols associated with traditional TAR reviews.

TAR 1.0: One-Time Training


A Typical TAR 1.0 Review Process

A TAR 1.0 review is typically built around the following steps:

  1. A subject matter expert (SME), often a senior lawyer, reviews and tags a sample of randomly selected documents to use as a “control set” for training.
  2. The SME then begins a training process using Simple Passive Learning or Simple Active Learning. In either case, the SME reviews documents and tags them relevant or non-relevant.
  3. The TAR engine uses these judgments to build a classification/ranking algorithm that will find other relevant documents. It tests the algorithm against the control set to gauge its accuracy.
  4. Depending on the testing results, the SME may be asked to do more training to help improve the classification/ranking algorithm.
  5. This training and testing process continues until the classifier is “stable.” That means its search algorithm is no longer getting better at identifying relevant documents in the control set.

Even though training is iterative, the process is finite. Once the TAR engine has learned what it can about the control set, that’s it. You turn it loose to rank the larger document population (which can take hours to complete) and then divide the ranked documents into those to review and those to set aside. There is no opportunity to feed reviewer judgments back to the TAR engine to make it smarter.
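
For readers who think in code, here is a minimal Python sketch of that workflow. The Classifier object, its methods, and the choose_cutoff helper are purely illustrative stand-ins for a TAR 1.0 engine, not any vendor’s actual API, and the stopping rule is simplified:

```python
import random

def tar_one_dot_o_review(collection, sme_label, sample_size=500, batch_size=50):
    """Illustrative sketch of a one-time-training (TAR 1.0) workflow.
    `collection` is a list of documents; `sme_label` stands in for the
    subject matter expert's relevance call on a single document."""
    # 1. Draw a random control set and have the SME tag it.
    control_set = random.sample(collection, sample_size)
    control_labels = {doc.id: sme_label(doc) for doc in control_set}

    classifier = Classifier()        # hypothetical TAR 1.0 ranking engine
    previous_score = 0.0
    while True:
        # 2-3. The SME tags a training batch; the engine rebuilds its model.
        batch = classifier.select_training_batch(collection, batch_size)
        classifier.train({doc.id: sme_label(doc) for doc in batch})

        # 4-5. Test against the control set; stop once accuracy plateaus ("stability").
        score = classifier.evaluate(control_labels)
        if score - previous_score < 0.01:
            break
        previous_score = score

    # One-time ranking of the full population; no further learning happens.
    ranked = classifier.rank(collection)
    cutoff = choose_cutoff(ranked)   # hypothetical helper: depth needed for target recall
    return ranked[:cutoff]           # the set the review team will actually read
```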

TAR 2.0: Continuous Active Learning

In contrast, the CAL protocol merges training with review in a continuous process. Start by finding as many good documents as you can through keyword search, interviews, or any other means at your disposal. Then let your TAR 2.0 engine rank the documents and get the review team going.


Continuous Active Learning: A TAR 2.0 Process

As the review progresses, judgments from the review team are fed back to the TAR 2.0 engine as seeds for further training. Each time the reviewers ask for a new batch, the documents are selected based on the latest ranking. To the extent the ranking has improved through those additional judgments, reviewers receive better documents than they otherwise would have.
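
To make the loop concrete, here is a minimal Python sketch of a CAL cycle. The Classifier object, the stopping_criteria_met check, and the document attributes are illustrative assumptions, not any particular product’s API:

```python
def cal_review(collection, reviewer_label, seed_docs, batch_size=50):
    """Illustrative sketch of a continuous active learning (TAR 2.0) loop.
    `seed_docs` are relevant examples found by keyword search, interviews or
    other means; `reviewer_label` stands in for the review team's judgment."""
    classifier = Classifier()                 # hypothetical TAR 2.0 ranking engine
    judgments = {doc.id: True for doc in seed_docs}
    reviewed = set(judgments)

    while not stopping_criteria_met(judgments, collection):   # assumed helper
        # Re-rank with everything learned so far, then batch out the
        # highest-ranked unreviewed documents to the team.
        classifier.train(judgments)
        ranked = [d for d in classifier.rank(collection) if d.id not in reviewed]
        batch = ranked[:batch_size]

        # Reviewer calls flow straight back into training: review is training.
        for doc in batch:
            judgments[doc.id] = reviewer_label(doc)
            reviewed.add(doc.id)

    return judgments
```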

How Much Can I Save With CAL?

Grossman and Cormack compared CAL against two leading TAR 1.0 protocols, Simple Passive Learning (SPL) and Simple Active Learning (SAL), across eight different matters. Without exception, they found that CAL allowed them to find relevant documents more quickly than either of the traditional approaches:


On average across all matters, CAL found 75% of the relevant documents in the collection after reviewing just 1% of the total documents. In contrast, a review team would have to look at over 30% of the collection with SPL (which relies on random sampling for TAR training). They would have to review about 15% of the collection with SAL (which lets the computer select some of the documents for training).

Using typical review rates and charges, CAL saved from $115,000 to as much as $257,000 over TAR 1.0 protocols. The time saved from having fewer documents to review is equally compelling, especially when facing tight discovery deadlines.
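
To see where numbers like these come from, here is a back-of-the-envelope calculation. The collection size, review speed, and hourly rate below are our own illustrative assumptions, not figures from the study:

```python
# Back-of-the-envelope savings estimate. The collection size, review speed and
# hourly rate are illustrative assumptions, not figures from the study.
collection_size = 800_000      # documents in the collection (assumed)
docs_per_hour   = 55           # reviewer throughput (assumed)
cost_per_hour   = 60           # blended reviewer rate in dollars (assumed)

# Share of the collection reviewed to reach ~75% recall, per the averages above.
review_depth = {"CAL": 0.01, "SAL": 0.15, "SPL": 0.30}

for protocol, depth in review_depth.items():
    docs  = collection_size * depth
    hours = docs / docs_per_hour
    cost  = hours * cost_per_hour
    print(f"{protocol}: review {docs:>9,.0f} docs, ~{hours:,.0f} hours, ~${cost:,.0f}")

# With these assumptions, SPL costs roughly $253,000 more than CAL and SAL
# roughly $122,000 more, squarely within the savings range cited above.
```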

Solving Real-World Problems

Along with substantial cost and time savings, the CAL protocol solves several other real-world problems that have hindered the wider use of TAR.

1.    Continuous Learning Yields Better Results

TAR 1.0 protocols are built on one-time training. Once that training is complete, the system has no way to get smarter about your documents. The team simply has to review all the documents ranked above the initial cutoff.

Grossman and Cormack show that CAL trumps one-time training. As the review progresses, the TAR 2.0 algorithm keeps learning about your documents. The team finds relevant documents more quickly, which saves substantially in time and review costs.

2.    More Efficient than Random Training

TAR 1.0 required that you train the system using randomly selected documents. Proponents argued that if attorneys select training documents by other means (keyword searching, for example), their judgments could unwittingly bias the system.

Grossman and Cormack showed that random is the least effective approach for TAR training. Where relevant documents are few and far between (often the case with review collections), the SME may have to click through thousands of documents before finding enough relevant ones to train the system. That results in slower and more costly review.

To address bias concerns, TAR 2.0 systems include random or specially selected documents (e.g., contextually diverse documents) in the review batches so the review goes beyond the initial training documents. This way, the team spends most of its time on highly relevant documents, while the system ensures reviewers see enough other documents for the review to be complete.
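
As a rough illustration of how a review batch might mix top-ranked documents with exploratory ones, consider the sketch below. The 10% exploration share and the function names are assumptions made for illustration only; actual systems tune this differently:

```python
import random

def compose_batch(ranked_unreviewed, diverse_pool, batch_size=50, explore_share=0.10):
    """Illustrative batch composition: mostly top-ranked documents, plus a small
    slice of random or contextually diverse documents so the review is not
    limited to whatever the initial seed documents happened to resemble.
    The 10% exploration share is an assumed, tunable figure."""
    n_explore = max(1, int(batch_size * explore_share))
    n_exploit = batch_size - n_explore
    batch = list(ranked_unreviewed[:n_exploit])
    batch += random.sample(diverse_pool, min(n_explore, len(diverse_pool)))
    return batch
```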

3.    Eliminates Need for Subject Matter Experts

TAR 1.0 required that an SME train the system before review could begin. Often, a senior lawyer would spend hours reviewing thousands of documents before the review team could start. Review managers chafed at waiting weeks for the SME to find time to review what were often irrelevant or marginal documents.

Grossman and Cormack showed that review-team training in a CAL protocol is far more effective. CAL systems base training on all the documents that have been reviewed, which means that natural variations by reviewers will not have a significant impact. As part of a quality control process, TAR 2.0 systems can present outliers to an SME for correction. Using reviewers to train the system makes the review cheaper (since experts bill at higher rates). It also lets review start right away, without waiting for the busy expert.

4.    Seamlessly Handles Rolling Uploads

One impractical limitation of TAR 1.0 was the need to collect all documents before beginning training. Because early systems trained against a randomly selected control set, that set had to be drawn from the entire collection to be valid. If you later received additional documents, the control set was no longer valid and, in most cases, you had to start the training over from scratch.

By contrast, CAL systems continuously rank all of the documents in your collection. When you add new documents, they simply join in the ranking process. Since training/review is continuous, new documents are quickly integrated along with the others in the ranking mix.

5.    Excels at Low Richness Collections

TAR 1.0 systems choke on low-richness collections. One reason is the requirement that you train using randomly selected documents. When richness (the percentage of relevant documents in the collection) is low, it is hard to find relevant examples for training. Your SME may have to click through tens of thousands of documents. In some cases, the system won’t work at all.
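
The arithmetic behind that problem is simple. Using assumed numbers (1% richness and 200 relevant training examples wanted), the expected random-review burden looks like this:

```python
# How many randomly selected documents must an SME review just to find enough
# training examples when richness is low? Both figures below are assumed.
richness = 0.01           # 1% of the collection is relevant
needed_positives = 200    # relevant examples wanted for training

expected_random_reviews = needed_positives / richness
print(f"Expected random documents to review: {expected_random_reviews:,.0f}")
# ~20,000 documents clicked through before ranking can even begin.
```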

CAL systems excel at low-richness collections, Grossman and Cormack concluded, often reducing the review population by 95% or more. This is because CAL starts by finding relevant documents through any means possible (keyword searches, witness interviews, analytics or otherwise). Those documents are fed to the TAR engine, which uses them to rank the document population. Reviewers start right away on the most likely relevant documents.

A Star is Born?

In the 1976 movie, an aging Kris Kristofferson helps newcomer Barbra Streisand break into show business. We quickly realize that her talent will soon eclipse that of the older star, whose career is in decline. Their love story is touching, but it doesn’t change the inevitable result. As the old star fades, the new one shines ever more brightly.

TAR is not show business, but there is a new star taking center stage. That star is Continuous Active Learning and it is pushing out the older TAR protocols. Grossman and Cormack have shown that it promises to save even more on review costs than its TAR 1.0 predecessors, while removing many of the limitations of those earlier protocols. A new TAR is born.

Ralph Losey: Catalyst’s Multidisciplinary Team is ‘The Ideal E-Discovery Team’


Ralph Losey

Rarely do we use this blog to blow our own horn. But when a widely acknowledged leader in the field of e-discovery singles out Catalyst for having “the ideal e-discovery team,” we cannot let it pass unmentioned.

That is exactly what happened in a recent post by Ralph Losey at his e-Discovery Team blog. The topic of his post was visualizing data in a predictive coding project. He begins by discussing how an e-discovery team should be composed. It should include not just lawyers and technologists, as is often the case, but also scientists, Ralph says. Further, the lawyers on the team should be sophisticated about search. Finally, the lawyers should not simply be part of the team, but they should lead it.

He then continues:

For legal search to be done properly, it must not only include lawyers, the lawyers must lead. Ideally, a lawyer will be in charge, not in a domineering way (my way or the highway), but in a cooperative multi-disciplinary team sort of way. That is one of the strong points I see at Catalyst. Their team includes tons of engineers/technologists, like any vendor, but also scientists, and lawyers. Plus, and here is the key part, the CEO is an experienced search lawyer. That means not only a law degree, but years of legal experience as a practicing attorney doing discovery and trials. A fully multidisciplinary team with an experienced search lawyer as leader is, in my opinion, the ideal e-discovery team. Not only for vendors, but for corporate e-discovery teams, and, of course, law firms.

Even if Ralph had never mentioned Catalyst in this quote, we would emphatically endorse the point he makes. In the final analysis, discovery is a legal and judicial process, despite all the technology that supports it these days. Having lawyers at the helm who understand the law and the legal process is critical to ensuring it is done right.

Thanks for the kind words, Ralph. You are one of the true pioneers in our industry.

Measuring Recall in E-Discovery Review, Part Two: No Easy Answers

In Part One of this two-part post, I introduced readers to statistical problems inherent in proving the level of recall reached in a Technology Assisted Review (TAR) project. Specifically, I showed that the confidence intervals around an asserted recall percentage could be sufficiently large with typical sample sizes as to undercut the basic assertion used to justify your TAR cutoff.

In our hypothetical example, we had to acknowledge that while our point estimate suggested we had found 75% of the relevant documents in the collection, it was possible that we had found a far lower percentage. For example, with a sample size of 600 documents, the lower bound of our confidence interval was 40%. If we increased the sample size to 2,400 documents, the lower bound increased only to 54%. And if we upped our sample to 9,500 documents, we got the lower bound to 63%.

Even assuming that a 63% lower bound is enough, we would have a lot of documents to sample. Using basic assumptions about cost and productivity, we concluded that we might spend 95 hours reviewing our sample at a cost of about $20,000. If the sample didn’t prove out our hoped-for recall level (or if we received more documents to review), we might have to run the sample several times. That is a problem.
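
For readers who want to see roughly where those lower bounds come from, here is a simplified sketch using a normal-approximation confidence interval. It assumes about 1% richness and a 75% recall point estimate; Part One’s exact figures rest on its own sampling design, so the outputs here only approximate them:

```python
from math import sqrt

def recall_lower_bound(sample_size, richness, observed_recall, z=1.96):
    """Rough 95% lower bound on recall using a normal approximation. The sample
    is drawn from the whole collection, so only sample_size * richness of its
    documents are relevant, and that small count drives the interval width."""
    relevant_in_sample = sample_size * richness
    half_width = z * sqrt(observed_recall * (1 - observed_recall) / relevant_in_sample)
    return observed_recall - half_width

# Assumed inputs: ~1% richness and a 75% recall point estimate.
for n in (600, 2_400, 9_500):
    print(n, round(recall_lower_bound(n, richness=0.01, observed_recall=0.75), 2))
# -> roughly 0.40, 0.58 and 0.66: the interval stays wide until the sample is very large.
```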

Is there a better and cheaper way to prove recall in a statistically sound manner? In this Part Two, I will take a look at some of the other approaches people have put forward and see how they match up. However, as Maura Grossman and Gordon Cormack warned in “Comments on ‘The Implications of Rule 26(g) on the Use of Technology-Assisted Review’” and Bill Dimm amplified in a later post on the subject, there is no free lunch. Continue reading

TAR 2.0 Capabilities Allow Use in Even More E-Discovery Tasks

Recent advances in Technology Assisted Review (“TAR 2.0”) include the ability to deal with low richness, rolling collections, and flexible inputs in addition to vast improvements in speed. [1] These improvements now allow TAR to be used effectively in many more discovery workflows than its traditional “TAR 1.0” use in classifying large numbers of documents for production.

To better understand this, it helps to begin by examining in more detail the kinds of tasks we face. Broadly speaking, document review tasks fall into three categories:[2]

  • Classification. This is the most common form of document review, in which documents are sorted into buckets such as responsive or non-responsive so that we can do something different with each class of document. The most common example here is a review for production.
  • Protection. This is a higher level of review in which the purpose is to protect certain types of information from disclosure. The most common example is privilege review, but this also encompasses trade secrets and other forms of confidential, protected, or even embarrassing information, such as personally identifiable information (PII) or confidential supervisory information (CSI).
  • Knowledge Generation. The goal here is learning what stories the documents can tell us and discovering information that could prove useful to our case. A common example of this is searching and reviewing documents received in a production from an opposing party or searching a collection for documents related to specific issues or deposition witnesses. Continue reading

TAR in the Courts: A Compendium of Case Law about Technology Assisted Review


Magistrate Judge Andrew Peck

It is less than three years since the first court decision approving the use of technology assisted review in e-discovery. “Counsel no longer have to worry about being the ‘first’ or ‘guinea pig’ for judicial acceptance of computer-assisted review,” U.S. Magistrate Judge Andrew J. Peck declared in his groundbreaking opinion in Da Silva Moore v. Publicis Groupe.

Judge Peck did not open a floodgate of judicial decisions on TAR. To date, there have been fewer than 20 such decisions and not one from an appellate court.

However, what he did do — just as he said — was to set the stage for judicial acceptance of TAR. Not a single court since has questioned the soundness of Judge Peck’s decision. To the contrary, courts uniformly cite his ruling with approval.

That does not mean that every court orders TAR in every case. The one overarching lesson of the TAR decisions to date is that each case stands on its own merits. Courts look not only to the efficiency and effectiveness of TAR, but also to issues of proportionality and cooperation.

What follows is a summary of the cases to date involving TAR. Each includes a link to the full-text decision, so that you can read for yourself what the court said. Continue reading

How Corporate Counsel are Integrating E-Discovery Technologies to Help Manage Litigation Costs

The newsletter Digital Discovery & e-Evidence just published an article by Catalyst founder and CEO John Tredennick, “Taking Control: How Corporate Counsel are Integrating eDiscovery Technologies to Help Manage Litigation Costs.” In the article, John explains why savvy corporate counsel are using the multi-matter repository and technology assisted review to manage cases and control costs. Continue reading

How Much Can I Save with CAL? A Closer Look at the Grossman/Cormack Research Results

As most e-discovery professionals know, two leading experts in technology assisted review, Maura R. Grossman and Gordon V. Cormack, recently presented the first peer-reviewed scientific study on the effectiveness of several TAR protocols, “Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery,” to the annual conference of the Special Interest Group on Information Retrieval, a part of the Association for Computing Machinery (ACM).

Perhaps the most important conclusion of the study was that an advanced TAR 2.0 protocol, continuous active learning (CAL), proved to be far more effective than the two standard TAR 1.0 protocols used by most of the early products on the market today—simple passive learning (SPL) and simple active learning (SAL). Continue reading

The Seven Percent Solution: The Case of the Confounding TAR Savings


“Which is it to-day,” [Watson] asked, “morphine or cocaine?”

[Sherlock] raised his eyes languidly from the old black-letter volume which he had opened. 
“It is cocaine,” he said, “a seven-per-cent solution. Would you care to try it?”

- The Sign of the Four, Sir Arthur Conan Doyle (1890)

Back in the mid-to-late 1800s, many touted cocaine as a wonder drug, providing not only stimulation but a wonderful feeling of clarity as well. Doctors prescribed the drug in a seven percent solution of water. Although Watson did not approve, Sherlock Holmes felt the drug helped him focus and shut out the distractions of the real world. He came to regret his addiction in later novels, as cocaine moved out of the mainstream.

This story is about a different type of seven percent solution, with no cocaine involved. Rather, we will be talking about the impact of another kind of stimulant, one that saves a surprising amount of review time and costs. This is the story of how a seemingly small improvement in review richness can make a big difference for your e-discovery budget. Continue reading

Measuring Recall in E-Discovery Review, Part One: A Tougher Problem Than You Might Realize

A critical metric in Technology Assisted Review (TAR) is recall, the percentage of the relevant documents in the collection that the review actually finds. One of the most compelling reasons for using TAR is the promise that a review team can achieve a desired level of recall (say, 75% of the relevant documents) after reviewing only a small portion of the total document population (say, 5%). The savings come from not having to review the remaining 95% of the documents. The argument is that the remaining documents (the “discard pile”) contain so few relevant documents, against so many irrelevant ones, that further review is not economically justified. Continue reading
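
A quick arithmetic sketch shows why the discard pile is so hard to justify reviewing. All of the figures below are illustrative assumptions, not numbers from the post:

```python
# Illustrative discard-pile arithmetic; all figures are assumed, not from the post.
collection     = 1_000_000   # total documents
richness       = 0.01        # 1% relevant, i.e. 10,000 relevant documents
target_recall  = 0.75        # recall claimed at the cutoff
reviewed_share = 0.05        # review stops after 5% of the collection

relevant_total   = collection * richness
relevant_found   = relevant_total * target_recall        # 7,500 found
discard_pile     = collection * (1 - reviewed_share)     # 950,000 set aside
relevant_missed  = relevant_total - relevant_found       # 2,500 left behind
discard_richness = relevant_missed / discard_pile        # ~0.26%

print(f"Relevant documents left in the discard pile: {relevant_missed:,.0f}")
print(f"Discard-pile richness: {discard_richness:.2%}")
# At that richness, reviewers would read roughly 380 discard-pile documents to
# find each additional relevant one, which is why further review is hard to justify.
```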

Another Court Formally Endorses the Use of Technology Assisted Review

Given the increasing prevalence of technology assisted review in e-discovery, it seems hard to believe that it was just 19 months ago that TAR received its first judicial endorsement. That endorsement came, of course, from U.S. Magistrate Judge Andrew J. Peck in his landmark ruling in Da Silva Moore v. Publicis Groupe, 287 F.R.D. 182 (S.D.N.Y. 2012), adopted sub nom. Moore v. Publicis Groupe SA, No. 11 Civ. 1279 (ALC)(AJP), 2012 WL 1446534 (S.D.N.Y. Apr. 26, 2012), in which he stated, “This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.”

Other courts have since followed suit, and now there is another to add to the list: the U.S. Tax Court. Continue reading