Ask Catalyst: How Much Storage Do I Need for 600,000 Scanned Documents? (Or, How Many TIFFs in a GB?)

[Editor’s note: This is another post in our “Ask Catalyst” series, in which we answer your questions about e-discovery search and review. To learn more and submit your own question, go here.]  

We received this question:

How much data storage do I need to store 600,000 scanned documents?

Today’s question is answered by John Tredennick, founder and CEO.

Video: How Contextual Diversity in TAR 2.0 Keeps You from Missing Key Pockets of Documents

How do you know what you don’t know when using technology assisted review? As I discussed in a recent post, this is a classic problem when searching a large volume of documents. You could miss documents, topics or terms in a collection simply because you don’t know to search for them.

Contextual Diversity is the solution to that problem. A proprietary TAR 2.0 tool built into Insight Predict, it continuously and actively explores unreviewed documents for concepts or topics that haven’t been seen, ensuring you’ve looked into all corners of the collection.

Ask Catalyst: What Does It Mean to Search By ‘AnyText’ in Insight?

[Editor’s note: This is another post in our “Ask Catalyst” series, in which we answer your questions about e-discovery search and review. To learn more and submit your own question, go here.]  

We received this question:

In Catalyst Insight, what does it mean to search by AnyText?

Today’s question is answered by Patty Daly, managing director, training.

Ask Catalyst: Is There a Fail-Safe To Prevent A Party from Skewing Results for Responsive/Non-responsive Documents?

[Editor’s note: This is another post in our “Ask Catalyst” series, in which we answer your questions about e-discovery search and review. To learn more and submit your own question, go here.]

We received this question:

If the parties are collaborating on what is a responsive/non-responsive document in order to train the system, is there a fail-safe that keeps one party from inappropriately skewing the results?

Today’s question is answered by Thomas Gricks, managing director of professional services.

Ask Catalyst: How Do I Decide The Percentage At Which To Cut Off Search?

[Editor’s note: This is another post in our “Ask Catalyst” series, in which we answer your questions about e-discovery search and review. To learn more and submit your own question, go here.]

We received this question:

If I am the producing party, on what basis do I decide the percentage at which I’m cutting off the search for relevant documents? Does that have to be agreed upon by the parties?

Today’s question is answered by Thomas Gricks, managing director of professional services.

Catalyst’s Chief Scientist Questions Validity of Patent Case Against kCura

Jeremy Pickens

A small company made big news in the e-discovery world last week when it filed a series of patent infringement lawsuits against kCura, developer of the Relativity search and review platform, and several of kCura’s partners, alleging violation of a patent for concept-based visual presentation of search results.

In addition to kCura, the plaintiff, Blackbird Technologies, has filed separate lawsuits against Innovative Discovery, UnitedLex Corporation, System One Holdings, Advanced Discovery, Xact Data Services, TransPerfect, LDiscovery and EvD Inc. (now a subsidiary of Ubic). The lawsuits were all filed June 7 in the U.S. District Court in Delaware.

Ask Catalyst: What Are The Thresholds for Using Technology Assisted Review?

[Editor’s note: This is another post in our “Ask Catalyst” series, in which we answer your questions about e-discovery search and review. To learn more and submit your own question, go here.]

We received this question:

What are the thresholds (in numbers of docs) at which your company will recommend the use of predictive coding? Would this be case dependent or just a percentage of documents (e.g. 100 out of 1,000 documents giving us 10%)?

Today’s question is answered by Dr. Jeremy Pickens, senior applied research scientist.

Ask Catalyst: If TAR 2.0 Discourages Using Random Documents for Training, Aren’t the Results Biased?

[Editor’s note: This is another post in our “Ask Catalyst” series, in which we answer your questions about e-discovery search and review. To learn more and submit your own question, go here.]

We received this question:

TAR 2.0 seems to discourage using randomly selected documents for training. Doesn’t this bias the results? How do I know what I don’t know?

Today’s question is answered by Thomas Gricks, managing director of professional services.

How Many Documents in a Gigabyte: 2016 Edition

Readers of our blog will know that I have a continuing interest in answering the perennial e-discovery question: “How many native documents are in a gigabyte?” I started thinking about this in 2011 and published my first article on the subject based on analysis of 18 million files. Challenging industry assumptions, which ran from 5,000 to as many as 15,000 documents per gigabyte, I concluded that the average across all files, based on that sample, was closer to 2,500.