A critical metric in Technology Assisted Review (TAR) is recall, which is the percentage of relevant documents actually found from the collection. One of the most compelling reasons for using TAR is the promise that a review team can achieve a desired level of recall (say 75% of the relevant documents) after reviewing only a small portion of the total document population (say 5%). The savings come from not having to review the remaining 95% of the documents. The argument is that the remaining documents (the “discard pile”) include so few that are relevant (against so many irrelevant documents) that further review is not economically justified.
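To make the arithmetic concrete, here is a minimal Python sketch. The collection size (1,000,000 documents) and the count of relevant documents (10,000) are hypothetical figures chosen purely for illustration; only the 75% recall and 5% review figures come from the example above. The function names are likewise illustrative.

```python
def recall(found_relevant: int, total_relevant: int) -> float:
    """Fraction of all relevant documents that the review actually found."""
    return found_relevant / total_relevant


def discard_pile_richness(total_docs: int, reviewed_docs: int,
                          total_relevant: int, found_relevant: int) -> float:
    """Fraction of the unreviewed documents (the discard pile) that are relevant."""
    remaining_docs = total_docs - reviewed_docs
    remaining_relevant = total_relevant - found_relevant
    return remaining_relevant / remaining_docs


if __name__ == "__main__":
    # Hypothetical collection: these two numbers are assumptions, not source data.
    total_docs = 1_000_000
    total_relevant = 10_000

    reviewed_docs = total_docs * 5 // 100          # review 5% of the collection
    found_relevant = total_relevant * 75 // 100    # find 75% of the relevant documents

    print(f"Recall: {recall(found_relevant, total_relevant):.0%}")
    richness = discard_pile_richness(total_docs, reviewed_docs,
                                     total_relevant, found_relevant)
    print(f"Discard pile richness: {richness:.2%}")
```

Under these assumed numbers, the discard pile holds 950,000 documents, of which 2,500 are relevant. That is roughly one relevant document in every 380, which is the kind of ratio behind the claim that further review is not economically justified.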
I am sad to report that Browning Marean passed away last Friday. He will be sorely missed by his partners at DLA Piper, his clients and his many friends and colleagues. I am proud to say that I have been friends with Browning for many years and count myself in his fan club. We go back to the early days of Catalyst and before that even. The time was too short.
Browning served on the Catalyst Advisory Board for the past two years and was always quick to help whenever I asked. You couldn’t ask for a better sounding board or friend.
Many have already posted their thoughts and regrets about the loss of Browning, including our friend Craig Ball, who as usual made the case as eloquently as possible (Browning Marean 1942-2014). Thanks to Chris Dale as well for his comments, Goodbye Old Friend: Farewell to Browning Marean, and photo gallery. And to Ralph Losey: Browning Marean: The Life and Death of a Great Lawyer. And Tom O’Connor: Browning Marean: A Remembrance.
Browning and I go back to the early days, before there was an “E” in front of discovery. He told me once that he got his start on the speaking circuit after hearing one of my talks. It inspired him to see a lawyer up there talking about litigation technology, he said. Having watched Browning leave me in the dirt with his speaking prowess, I was both honored and pleased to have played a small part in getting him going.
I had the privilege of being with Browning on the dais, at conferences and in quiet evening meals from Hong Kong to London and many places in between. Had I realized time was short, there are so many things I would have wanted to say. Alas, that seldom happens and it didn’t here. He wrote me a few weeks ago to say he expected to be back on his feet in September. How I wish that were still true.
Browning: You touched a lot of people over your too few years and made the world a better place. We carry on in your honor.
Rest in peace old friend.
Last month, two of the leading experts on e-discovery, Maura R. Grossman and Gordon V. Cormack, presented a peer-reviewed study on continuous active learning, “Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery,” at the annual conference of the Special Interest Group on Information Retrieval, part of the Association for Computing Machinery (ACM).
In the study, they compared three TAR protocols, testing them across eight different cases. Two of the three protocols, Simple Passive Learning (SPL) and Simple Active Learning (SAL), are typically associated with early approaches to predictive coding, which we call TAR 1.0. The third, Continuous Active Learning (CAL), is a central part of a newer approach to predictive coding, which we call TAR 2.0.
Maura Grossman and Gordon Cormack just released another blockbuster article, “Comments on ‘The Implications of Rule 26(g) on the Use of Technology-Assisted Review,’” 7 Federal Courts Law Review 286 (2014). The article was in part a response to an earlier article in the same journal by Karl Schieneman and Thomas Gricks, in which they asserted that Rule 26(g) imposes “unique obligations” on parties using TAR for document productions and suggested using techniques we associate with TAR 1.0 including:
This past weekend I received an advance copy of a new research paper prepared by Gordon Cormack and Maura Grossman, “Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery.” They have posted an author’s copy here.
The study attempted to answer one of the more important questions surrounding TAR methodology:
The purpose of the article was to report on several successful uses of technology-assisted review. While that was interesting, my attention was drawn to another aspect of the report. Three of the case studies provided data shedding further light on that persistent e-discovery mystery: “How many documents in a gigabyte?”
[This article originally appeared in the Winter 2014 issue of EDDE Journal, a publication of the E-Discovery and Digital Evidence Committee of the ABA Section of Science and Technology Law.]
Although still relatively new, technology-assisted review (TAR) has become a game changer for electronic discovery. This is no surprise. With digital content exploding at unimagined rates, the cost of review has skyrocketed, now accounting for over 70% of discovery costs. In this environment, a process that promises to cut review costs is sure to draw interest, as TAR, indeed, has.
Called by various names (including predictive coding, predictive ranking, and computer-assisted review), TAR has become a central consideration for clients facing large-scale document review. It originally gained favor for use in pre-production reviews, providing a statistical basis to cut review time by half or more. It gained further momentum in 2012, when federal and state courts first recognized the legal validity of the process.
Predictive Ranking, aka predictive coding or technology-assisted review, has revolutionized electronic discovery, at least in mindshare if not actual use. It now dominates the dais for discovery programs, and has since 2012 when the first judicial decisions approving the process came out. Its promise of dramatically reduced review costs is top of mind today for general counsel. For review companies, the worry is about declining business once these concepts really take hold.
While there are several “Predictive Coding for Dummies” books on the market, I still see a lot of confusion among my colleagues about how this process works. To be sure, the mathematics are complicated, but the techniques and workflow are not that difficult to understand. I write this article with the hope of clarifying some of the more basic questions about TAR methodologies.
On Jan. 24, Law Technology News published John’s article, “Five Myths about Technology Assisted Review.” The article challenged several conventional assumptions about the predictive coding process and generated a lot of interest and a bit of dyspepsia too. At the least, it got some good discussions going and perhaps nudged the status quo a bit in the balance.
One writer, Roe Frazer, took issue with our views in a blog post he wrote. Apparently, he tried to post his comments with Law Technology News but was unsuccessful. Instead, he posted his reaction on the blog of his company, Cicayda. We would have responded there but we don’t see a spot for replies on that blog either.
For an industry that lives by the doc but pays by the gig, one of the perennial questions is: “How many documents are in a gigabyte?” Readers may recall that I attempted to answer this question in a post I wrote in 2011, “Shedding Light on an E-Discovery Mystery: How Many Docs in a Gigabyte.”
At the time, most people put the number at 10,000 documents per gigabyte, with a range of between 5,000 and 15,000. We took a look at just over 18 million documents (5+ terabytes) from our repository and found that our numbers were much lower. Despite variations among different file types, our average across all files was closer to 2,500. Many readers told us their experience was similar.
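To see how much the choice of rate matters in practice, here is a rough Python sketch that projects document counts for a collection at each of the rates mentioned above. The 100-gigabyte collection size is a hypothetical figure picked for illustration, and the function and dictionary names are illustrative as well; the per-gigabyte rates are the ones quoted in the text.

```python
# Documents-per-gigabyte rates quoted in the text above.
RATES_PER_GB = {
    "low end of the common range": 5_000,
    "common rule of thumb": 10_000,
    "high end of the common range": 15_000,
    "average we observed": 2_500,
}


def estimate_documents(gigabytes: float, docs_per_gb: int) -> int:
    """Project a document count from collection size and an assumed rate."""
    return round(gigabytes * docs_per_gb)


if __name__ == "__main__":
    collection_gb = 100  # hypothetical collection size, not from the source
    for label, rate in RATES_PER_GB.items():
        count = estimate_documents(collection_gb, rate)
        print(f"{label:>30}: {count:>9,} documents")
```

For this assumed 100-gigabyte collection, the rule-of-thumb rate projects 1,000,000 documents while the 2,500 average projects 250,000, a fourfold difference in the expected review population.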