Despite years of discussion in the eDiscovery industry about the power and importance of sampling techniques, many practitioners remain unfamiliar with what those techniques can accomplish and when, outside of technology-assisted review (TAR), to use them. Across the phases of an eDiscovery project, there are opportunities to replace guesses based on anecdotal evidence with actual estimates based on formal sampling.
To use sampling to estimate how many red hots are mixed into the jellybean jar, we need to understand some basic sampling concepts: sampling frame, prevalence, confidence level, and confidence interval, as well as how each affects the required sample size.
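As a rough illustration of how those four concepts interact, the sketch below computes a required sample size with the standard normal-approximation formula for a proportion. The function name `required_sample_size` and its parameters are hypothetical, chosen only for this example. It assumes the confidence interval is expressed as a margin of error (half the interval's width), and it defaults prevalence to 0.5, the most conservative assumption when prevalence is unknown.

```python
import math
from statistics import NormalDist


def required_sample_size(confidence_level, margin_of_error,
                         prevalence=0.5, frame_size=None):
    """Approximate sample size for estimating a proportion.

    Hypothetical helper for illustration only.
    confidence_level: e.g. 0.95 for 95%
    margin_of_error:  half-width of the confidence interval, e.g. 0.05 for +/-5%
    prevalence:       expected proportion; 0.5 is the worst case
    frame_size:       size of the sampling frame, or None to treat it as very large
    """
    # Two-sided z-score for the chosen confidence level (about 1.96 at 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

    # Sample size before accounting for the size of the sampling frame.
    n = (z ** 2) * prevalence * (1 - prevalence) / margin_of_error ** 2

    # Finite population correction: a small frame needs a smaller sample.
    if frame_size is not None:
        n = n / (1 + (n - 1) / frame_size)

    return math.ceil(n)


if __name__ == "__main__":
    print(required_sample_size(0.95, 0.05))                    # large frame
    print(required_sample_size(0.95, 0.02))                    # narrower interval
    print(required_sample_size(0.99, 0.05))                    # higher confidence
    print(required_sample_size(0.95, 0.05, frame_size=10000))  # small frame
```

Under these assumptions, tightening the margin of error or raising the confidence level increases the required sample, while a small sampling frame or a prevalence far from 50% reduces it.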
Now that we understand the necessary sampling concepts, let’s apply them to our candy contest and estimate how many red hots are in the jellybean jar. To do so, we will need to identify our sampling frame and select our desired confidence level and confidence interval.
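Once a sample has been drawn and counted, turning it into an estimate for the whole jar can be sketched as follows. This is a minimal illustration using the simple normal-approximation (Wald) interval, which is one common choice rather than necessarily the method used here. The function name `estimate_count` is hypothetical, and the jar size, sample size, and red hot count in the usage example are made-up numbers, not figures from the contest.

```python
from statistics import NormalDist


def estimate_count(frame_size, sample_size, sample_hits, confidence_level=0.95):
    """Estimate how many items of interest are in the sampling frame.

    Hypothetical helper for illustration only. Returns a tuple
    (point_estimate, low, high) as whole-item counts, using a
    normal-approximation (Wald) interval around the sample proportion.
    """
    # Observed prevalence in the sample.
    p = sample_hits / sample_size

    # Two-sided z-score for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

    # Margin of error around the observed prevalence.
    margin = z * (p * (1 - p) / sample_size) ** 0.5

    # Keep the interval inside the valid range for a proportion.
    low = max(0.0, p - margin)
    high = min(1.0, p + margin)

    # Scale proportions up to counts in the full sampling frame.
    return (round(p * frame_size), round(low * frame_size), round(high * frame_size))


if __name__ == "__main__":
    # Made-up example: a jar of 10,000 candies, a random sample of 370,
    # of which 37 turned out to be red hots.
    point, low, high = estimate_count(10000, 370, 37)
    print(f"Estimated red hots: {point} (between {low} and {high})")
```

The interval is only as trustworthy as the sample is random: every candy in the sampling frame needs an equal chance of being drawn.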