Author Archives: John Tredennick


About John Tredennick

A nationally known trial lawyer and longtime litigation partner at Holland & Hart, John founded Catalyst in 2000 and is responsible for its overall direction, voice and vision.

Well before founding Catalyst, John was a pioneer in the field of legal technology. He was editor-in-chief of the multi-author, two-book series Winning With Computers: Trial Practice in the Twenty-First Century (ABA Press 1990, 1991). Both were ABA best sellers focused on the use of computers in litigation. At the same time, he wrote How to Prepare for, Take and Use a Deposition at Trial (James Publishing 1990), which he and his co-author continued to supplement for several years. He also wrote Lawyer’s Guide to Spreadsheets (Glasser Publishing 2000) and Lawyer’s Guide to Microsoft Excel 2007 (ABA Press 2009).

John has been widely honored for his achievements. In 2013, the American Lawyer named him one of the top six “E-Discovery Trailblazers” in its special issue on the “Top Fifty Big Law Innovators” of the past fifty years. In 2012, he was named to the FastCase 50, which recognizes the smartest, most courageous innovators, techies, visionaries and leaders in the law. London’s CityTech magazine named him one of the “Top 100 Global Technology Leaders.” In 2009, he was named the Ernst & Young Entrepreneur of the Year for Technology in the Rocky Mountain Region, and the Colorado Software and Internet Association named him its Top Technology Entrepreneur.

John is the former chair of the ABA’s Law Practice Management Section. For many years, he was editor-in-chief of the ABA’s Law Practice Management magazine, a monthly publication focusing on legal technology and law office management. More recently, he founded and edited Law Practice Today, a monthly ABA webzine that focuses on legal technology and management. Over two decades, John has written scores of articles on legal technology and spoken on the subject to audiences on four of the five continents. In his spare time, you will find him competing on the national equestrian show jumping circuit.

How Many Documents in a Gigabyte? Revisiting an E-Discovery Mystery

Recently, Bob Ambrogi, our director of communications, published a post called “Our 10 Most Popular Blog Posts of 2015 (So Far).” To my surprise, one of my 2011 posts topped the list: “Shedding Light on an E-Discovery Mystery: How Many Documents in a Gigabyte?” Another on the same topic ranked fourth: “How Many Documents in a Gigabyte? An Updated Answer to that Vexing Question.”

Hmmm. Clearly, a lot of us want to know the answer to this question. I have received a number of comments on both posts, in writing and in conversation, which always makes the writing worthwhile. The researchers at RAND told me the analysis was also of interest to them when they were putting together their study on e-discovery costs. Continue reading

Latest Grossman-Cormack Research Supports Using Review Teams for TAR Training

A key debate in the battle between TAR 1.0 (one-time training) and TAR 2.0 (continuous active learning) is whether you need a “subject matter expert” (SME) to do the training. With first-generation TAR engines, this was considered a given. Training had to be done by an SME, which many interpreted as a senior lawyer intimately familiar with the underlying case. Indeed, the big question in the TAR 1.0 world was whether you could use several SMEs to spread the training load and get the work done more quickly.

SME training presented practical problems for TAR 1.0 users—primarily because the SME had to look at a lot of documents before review could begin. You started with a “control” set, often 500 documents or more, to be used as a reference for training. Then, the SME needed to review thousands of additional documents to train the system. After that, the SME had to review and tag another 500 documents to test the effectiveness of the training. All told, the SME could expect to look at and judge 3,000 to 5,000 or more documents before the review could start. Continue reading
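To put that training burden in rough hours-and-dollars terms, here is a minimal back-of-the-envelope sketch in Python. The document counts track the ranges described above; the review rate and billing rate are assumptions for illustration, not figures from any particular engagement.

    # Rough estimate of the SME's pre-review workload under a TAR 1.0 protocol.
    # Document counts follow the ranges discussed above; the review rate and
    # billing rate are illustrative assumptions only.
    control_set = 500                # reference set judged before training begins
    validation_set = 500             # post-training set used to test the model
    review_rate = 60                 # assumed documents per hour for a senior lawyer
    hourly_rate = 500                # assumed billing rate, in dollars

    for training_docs in (2_000, 4_000):   # one-time training rounds vary by engine
        total = control_set + training_docs + validation_set
        hours = total / review_rate
        print(f"{total:,} documents -> {hours:.0f} hours, ~${hours * hourly_rate:,.0f}")

On those assumptions, the SME spends roughly 50 to 85 hours judging documents before the first reviewer ever opens a batch, which is exactly the bottleneck described above.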

The Luck of the Irish: TAR Approved by Irish High Court

I do not know if any leprechauns appeared in this case, but the Irish High Court found the proverbial pot of gold under the TAR rainbow in Irish Bank Resolution Corp. vs. Quinn—the first decision outside the U.S. to approve the use of Technology Assisted Review for civil discovery.

The protocol at issue in the March 3, 2015, decision was TAR 1.0 (Clearwell). For that reason, some of the points addressed by the court will be immaterial for legal professionals who use the more-advanced TAR 2.0 and Continuous Active Learning (CAL). Even so, the case makes for an interesting read, both for its description of the TAR process at issue and for its ultimate outcome. Continue reading

Killing Two Birds With One Stone: Latest Grossman/Cormack Research Shows that CAL is Effective Across Multiple Issues

No actual birds were harmed in the making of this blog post!

Since the advent of Technology Assisted Review (aka TAR, predictive coding or computer-assisted review), one of the open questions has been whether you have to run a separate TAR process for each item in a document request. As litigation professionals know, it is rare to have only one numbered request in a Rule 34 pleading. Rather, you can expect to see scores of requests (typically as many as the local rules allow). Continue reading

Why are People Talking About CAL? Because it Lets Lawyers Get Back to Practicing Law

I have been on the road quite a bit lately, attending and speaking at several e-discovery events. Most recently I was at the midyear meeting of the Sedona Conference Working Group 1 in Dallas, and before that I was a speaker at both the University of Florida’s 3rd Annual Electronic Discovery Conference and the 4th Annual ASU-Arkfeld E-Discovery and Digital Evidence Conference.

In my travels and elsewhere, I continue to see a marked increase in talk about the new TAR 2.0 protocol, Continuous Active Learning (CAL). I have been seeing increasing interest in CAL ever since the July 2014 release of the Grossman/Cormack study, “Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery.” Continue reading

Using Continuous Active Learning to Solve the ‘Transparency’ Issue in TAR

Technology assisted review has a transparency problem. Notwithstanding TAR’s proven savings in both time and review costs, many attorneys hesitate to use it because courts require “transparency” in the TAR process. 

Specifically, when courts approve requests to use TAR, they often set the condition that counsel disclose the TAR process they used and which documents they used for training. In some cases, the courts have gone so far as to allow opposing counsel to kibitz during the training process itself. Continue reading

Your TAR Temperature is 98.6 — That’s A Pretty Hot Result

Our Summit partner, DSi, has a large financial institution client that had allegedly been defrauded by a borrower. The details aren’t important to this discussion, but assume the borrower employed a variety of creative accounting techniques to make its financial position look better than it really was. And, as is often the case, the problems were missed by the accounting and other financial professionals conducting due diligence. Indeed, there were strong factual suggestions that one or more of the professionals were in on the scam.

As the fraud came to light, litigation followed. Perhaps in retaliation, or simply to mount a counteroffensive, the defendant borrower hit the bank with lengthy document requests. After collection and best-efforts culling, our client was still left with more than 2.1 million documents that might be responsive. Neither the time deadlines nor the budget allowed for manual review of that volume of documents. Keyword search offered some help, but the problem remained: what to do with 2.1 million potentially responsive documents? Continue reading
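The arithmetic behind that problem is worth a quick sketch. Only the 2.1 million document count comes from the matter; the reviewer pace and per-document cost below are assumptions for illustration.

    # Why linear review of 2.1 million documents was never a realistic option.
    documents = 2_100_000
    docs_per_hour = 50        # assumed contract-reviewer pace for responsiveness calls
    cost_per_document = 1.00  # assumed blended review cost per document, in dollars

    hours = documents / docs_per_hour
    reviewer_months = hours / 160      # roughly 160 working hours per reviewer-month
    print(f"{hours:,.0f} review hours (~{reviewer_months:,.0f} reviewer-months), "
          f"~${documents * cost_per_document:,.0f}")

Even at those conservative rates, the numbers land in the tens of thousands of review hours and seven figures of review spend, which is why a TAR workflow was the only practical path forward.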

A TAR is Born: Continuous Active Learning Brings Increased Savings While Solving Real-World Review Problems

In July 2014, attorney Maura Grossman and professor Gordon Cormack introduced a new protocol for Technology Assisted Review that they showed could cut review time and costs substantially. Called Continuous Active Learning (“CAL”), this new approach differed from traditional TAR methods because it employed continuous learning throughout the review, rather than the one-time training used by most TAR technologies.
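To make that distinction concrete, here is a minimal sketch of the CAL loop: train on the judgments gathered so far, rank the unreviewed documents, have reviewers judge the top of the ranking, fold those judgments back in and repeat. It uses scikit-learn purely as a stand-in for a TAR engine, and it illustrates the shape of the protocol rather than the Grossman/Cormack implementation or any particular product.

    # A minimal continuous active learning (CAL) loop. Every batch of human
    # judgments is fed back into the model, which then re-ranks what remains.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    def cal_review(documents, review_batch, seed_labels, batch_size=100):
        """documents: list of document texts.
        review_batch: callable that takes a list of document indices and returns
        the human relevance judgments (1 or 0) for them, in the same order.
        seed_labels: dict {doc_index: judgment} from a small starting seed set."""
        vectors = TfidfVectorizer().fit_transform(documents)
        labels = dict(seed_labels)
        while len(labels) < len(documents):
            y = np.array([labels[i] for i in labels])
            if len(set(y.tolist())) < 2:          # the model needs both classes
                break
            model = LogisticRegression(max_iter=1000).fit(vectors[list(labels)], y)
            unreviewed = [i for i in range(len(documents)) if i not in labels]
            scores = model.predict_proba(vectors[unreviewed])[:, 1]
            # Review the highest-ranked unreviewed documents next.
            top = np.argsort(scores)[::-1][:batch_size]
            batch = [unreviewed[i] for i in top]
            for idx, judgment in zip(batch, review_batch(batch)):
                labels[idx] = judgment
        return labels   # in practice, stop once batches stop yielding relevant docs

The key contrast with one-time training is that there is no separate control set or training phase that must finish before review starts; the reviewers' own calls continuously retrain the ranking.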

Barbra Streisand in ‘A Star is Born’

Their peer-reviewed research paper, “Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery,” also showed that using random documents was the least effective method for training a TAR system. Overall, they showed that CAL solved a number of real-world problems that had bedeviled review managers using TAR 1.0 protocols.

Not surprisingly, their research caused a stir. Some heralded its common-sense findings about continuous learning and the inefficiency of using random seeds for training. Others challenged the results, arguing that one-time training is good enough and that using random seeds eliminates bias. We were pleased that it confirmed our earlier research and legitimized our approach, which we call TAR 2.0. Continue reading

Measuring Recall in E-Discovery Review, Part Two: No Easy Answers

In Part One of this two-part post, I introduced readers to statistical problems inherent in proving the level of recall reached in a Technology Assisted Review (TAR) project. Specifically, I showed that the confidence intervals around an asserted recall percentage could be sufficiently large with typical sample sizes as to undercut the basic assertion used to justify your TAR cutoff.

In our hypothetical example, we had to acknowledge that while our point estimate suggested we had found 75% of the relevant documents in the collection, it was possible that we had actually found a far lower percentage. For example, with a sample size of 600 documents, the lower bound of our confidence interval was 40%. If we increased the sample size to 2,400 documents, the lower bound only rose to 54%. And if we upped our sample to 9,500 documents, we got the lower bound to 63%.

Even assuming that a 63% lower bound is enough, we would have a lot of documents to sample. Using basic assumptions about cost and productivity, we concluded that we might spend 95 hours reviewing our sample, at a cost of about $20,000. If the sample didn’t prove out our hoped-for recall level (or if we received more documents to review), we might have to run the sample several times. That is a problem.
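For readers who want to see the mechanics, here is a minimal sketch of one common way to derive a recall lower bound from a simple random sample of the collection, using an exact (Clopper-Pearson) upper bound on richness. Every number below is hypothetical, and this is not necessarily the exact method or set of assumptions behind the figures quoted from Part One, so treat the output as illustrative only.

    # Deriving a pessimistic (lower-bound) recall estimate from a random sample.
    from scipy.stats import beta

    def prevalence_upper_bound(hits, sample_size, confidence=0.95):
        """One-sided Clopper-Pearson upper bound on collection richness."""
        if hits == sample_size:
            return 1.0
        return beta.ppf(confidence, hits + 1, sample_size - hits)

    def recall_lower_bound(found_relevant, collection_size, hits, sample_size):
        """Documents actually found, divided by the largest number of relevant
        documents the collection could plausibly hold given the sample."""
        max_relevant = prevalence_upper_bound(hits, sample_size) * collection_size
        return found_relevant / max_relevant

    # Hypothetical: 1,000,000 documents, roughly 1% richness, and a review that
    # found 7,500 relevant documents (about a 75% recall point estimate).
    print(recall_lower_bound(7_500, 1_000_000, hits=6, sample_size=600))

    # The sampling effort itself is expensive. At an assumed 100 documents per
    # hour and an assumed $200 per hour, a 9,500-document sample costs roughly:
    hours = 9_500 / 100
    print(f"{hours:.0f} hours, ~${hours * 200:,.0f}")

The pattern is the same one described above: a small sample leaves a wide gap between the point estimate and the defensible lower bound, and closing that gap means paying for a much larger sample.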

Is there a better and cheaper way to prove recall in a statistically sound manner? In this Part Two, I will take a look at some of the other approaches people have put forward and see how they match up. However, as Maura Grossman and Gordon Cormack warned in “Comments on ‘The Implications of Rule 26(g) on the Use of Technology-Assisted Review’” and Bill Dimm amplified in a later post on the subject, there is no free lunch. Continue reading

How Much Can I Save with CAL? A Closer Look at the Grossman/Cormack Research Results

As most e-discovery professionals know, two leading experts in technology assisted review, Maura R. Grossman and Gordon V. Cormack, recently presented the first peer-reviewed scientific study on the effectiveness of several TAR protocols, “Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery,” to the annual conference of the Special Interest Group on Information Retrieval, a part of the Association for Computing Machinery (ACM).

Perhaps the most important conclusion of the study was that an advanced TAR 2.0 protocol, continuous active learning (CAL), proved to be far more effective than the two standard TAR 1.0 protocols used by most of the early products on the market today—simple passive learning (SPL) and simple active learning (SAL). Continue reading