In discovery specifically, and in legal practice generally, the role of electronically stored information (ESI) and new technology has grown exponentially over the past decade. As a result, it has become a practical reality that effective legal practice and effective discovery require some level of technology literacy and competence, and since 2012, that practical reality has been slowly transforming into a formal requirement.
Our monthly legal eDiscovery news round-up features the FTC prioritizing Privacy Shield renewal, continued adoption of the lawyers’ duty of technical competence, and more GDPR news, as well as recent cases and new XDD content.
Just as a search or a TAR tool is making a series of binary classification decisions, so too are your human reviewers, and the quality of those reviewers’ decisions can be assessed in the same manner you assessed the quality of a search classifier. Depending on the scale of your review project, employing these assessment methods can be more efficient and more informative than informal spot-checking.
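As a sketch of what such an assessment might look like, the snippet below estimates a reviewer’s overturn (error) rate from a second-level QC sample, with a normal-approximation confidence interval. The function name and the sample figures are hypothetical, for illustration only.

```python
import math

def overturn_rate_ci(sample_size: int, overturns: int, z: float = 1.96):
    """Estimate a reviewer's overturn (error) rate from a QC sample,
    with a normal-approximation 95% confidence interval (z = 1.96)."""
    p = overturns / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical QC check: 400 sampled decisions, 28 overturned on second review
rate, low, high = overturn_rate_ci(400, 28)
print(f"Overturn rate: {rate:.1%} (95% CI: {low:.1%} to {high:.1%})")
```

The same calculation works for any reviewer or review team, letting you compare decision quality across the project on a common statistical footing.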
Sampling can be used to test your search classifiers – whether keyword searches, TAR software, or other tools – by calculating their recall (efficacy) and precision (efficiency). Doing so requires a previously reviewed control set, contingency tables, and some simple math.
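The “simple math” from the contingency table can be sketched as follows; the function name and the control-set counts are hypothetical examples, not figures from any actual matter.

```python
def recall_precision(true_pos: int, false_neg: int, false_pos: int):
    """Compute recall (efficacy) and precision (efficiency) from the
    contingency-table counts of a classifier tested against a
    previously reviewed control set."""
    recall = true_pos / (true_pos + false_neg)     # share of all relevant docs the search found
    precision = true_pos / (true_pos + false_pos)  # share of retrieved docs that are relevant
    return recall, precision

# Hypothetical control set: 90 relevant docs retrieved, 30 relevant docs
# missed, and 60 non-relevant docs retrieved
r, p = recall_precision(90, 30, 60)
print(f"Recall: {r:.0%}, Precision: {p:.0%}")  # Recall: 75%, Precision: 60%
```

High recall with low precision means the search is over-inclusive; high precision with low recall means it is missing relevant material.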
Beyond estimating prevalence, there are other opportunities to replace informal sampling of unknown reliability with formal sampling of precise reliability. Imagine iteratively refining searches for your own use, or negotiating with another party about which searches should be used, armed with precise, reliable information about their relative efficacy. Using sampling to test classifiers can facilitate this.
Now that we understand the necessary sampling concepts, let’s apply those concepts to our candy contest and figure out how many red hots we think are in the jellybean jar. To do so, we will need to identify our sampling frame, select our desired confidence level, and select our desired confidence interval.
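A minimal sketch of that estimation step is below, assuming a hypothetical jar of 10,000 candies and a sample of 385 (the common size for 95% confidence with a ±5% interval); the function name and counts are illustrative only.

```python
import math

def estimate_prevalence(sample_size: int, positives: int,
                        population: int, z: float = 1.96):
    """Estimate prevalence from a simple random sample and project the
    95% confidence interval onto the full population as a count range."""
    p = positives / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    low, high = max(0.0, p - margin), min(1.0, p + margin)
    return p, (round(population * low), round(population * high))

# Hypothetical jar: 10,000 candies; a sample of 385 contains 31 red hots
p, (lo, hi) = estimate_prevalence(385, 31, 10_000)
print(f"Estimated prevalence {p:.1%}; roughly {lo} to {hi} red hots in the jar")
```

The interval on the projected count is wide at this sample size; drawing a larger sample narrows it, which is the trade-off discussed above.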
To use sampling to estimate how many red hots are mixed into the jellybean jar, we need to understand some basic sampling concepts, including sampling frame, prevalence, confidence level, and confidence interval, as well as how each affects required sample size.
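The relationship between confidence level, confidence interval, and required sample size can be sketched with the standard sample-size formula for estimating a proportion; the function name below is a hypothetical illustration.

```python
import math

# z-scores for common confidence levels
Z_SCORES = {90: 1.645, 95: 1.96, 99: 2.576}

def required_sample_size(confidence: int, interval: float,
                         prevalence: float = 0.5) -> int:
    """Sample size needed to estimate a proportion at the given
    confidence level and confidence interval (margin of error).
    prevalence=0.5 is the conservative worst case when prevalence
    is unknown."""
    z = Z_SCORES[confidence]
    return math.ceil((z ** 2) * prevalence * (1 - prevalence) / (interval ** 2))

# 95% confidence with a +/-5% interval yields the familiar figure of 385
print(required_sample_size(95, 0.05))  # 385
```

Note how tightening either parameter drives the sample size up quickly: 99% confidence with a ±2% interval, for instance, requires several thousand documents rather than a few hundred.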
Despite years of discussion in the eDiscovery industry about the power and importance of sampling techniques, many practitioners remain unfamiliar with what sampling can accomplish and when, outside of TAR, they might employ it. There are opportunities across the phases of an eDiscovery project to replace guesses based on anecdotal evidence with actual estimates based on formal sampling.
Over the course of this survey, we have taken an in-depth look at a dozen cases and touched briefly on eleven others. Together, these twenty-three cases have provided us with an effective overview of the technology-assisted review case law landscape, including what’s been settled and what remains unresolved. We conclude our survey with a discussion of the five key takeaways we can derive from our review of these cases.
Before we conclude our survey of technology-assisted review cases, there are a handful of additional cases worth noting that did not warrant full posts of their own. Specifically, there are three more early cases, three more recent cases, and five international cases worth noting.
The next prominent technology-assisted review case requiring our attention is Winfield v. City of New York (S.D.N.Y. Nov. 27, 2017). The case wound up before the Magistrate Judge for the resolution of a variety of discovery disputes, including the Plaintiffs’ challenges to the City’s TAR process and their attempts to compel additional transparency regarding that process.