
The Nitty-Gritty and Other Reduplications, Production Fundamentals Series Part 2

2 / 6

A multi-part series on the fundamentals eDiscovery practitioners need to know about the preparation and production of ESI

In “The Final Countdown,” we discussed the importance of production and the primary production formats.  In this Part, we review other important production format considerations.


Beyond deciding on your optimal combination of paper, near-paper, native, and near-native production options, there is a range of more detailed options for you to consider.  Among the most important are options related to load files, metadata, redactions, numbering and endorsements, and paper integration.

Load Files

As we discussed, many productions are accompanied by a load file that contains information about the various documents and images being produced and that makes it possible for those materials to be imported together into a document review platform.  These load files can provide links to native or near-native files, to rendered images, and to extracted text files, and they can contain a variety of fields of metadata and extracted data for each document.

Load files are essentially large spreadsheets, though their specific formatting requirements and field delimiters vary somewhat from system to system.  For example, Summation originated the “DII” load file format and Ipro originated the “LFP” load file format.  For this reason, it is important for the parties to agree on what load file format(s) will be required.

Decisions will also need to be made (or negotiated) regarding what fields the load file should include, how they should be labeled, and what custom fields – if any – should be created.  For example, a field might be included documenting the request number(s) in response to which each document is being produced, or a field might indicate the documents to which a protective order applies.
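To make the idea of a load file with custom fields more concrete, here is a minimal sketch in Python.  It does not reproduce the DII or LFP formats; it assumes a generic pipe-delimited layout, and every field name in it (BegDoc, EndDoc, NativeLink, TextLink, RequestNos, ProtectiveOrder) is a hypothetical stand-in for whatever names the parties actually agree on.

```python
import csv
import io

# Hypothetical field names; real names are whatever the parties negotiate.
FIELDS = ["BegDoc", "EndDoc", "NativeLink", "TextLink",
          "RequestNos", "ProtectiveOrder"]


def write_load_file(records, delimiter="|"):
    """Serialize one row per document into a delimited load file string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = dict(record)
        # Custom multi-value field: request numbers joined with semicolons.
        row["RequestNos"] = ";".join(str(n) for n in record["RequestNos"])
        # Custom flag field: does a protective order apply to this document?
        row["ProtectiveOrder"] = "Y" if record["ProtectiveOrder"] else "N"
        writer.writerow(row)
    return buffer.getvalue()


def read_load_file(text, delimiter="|"):
    """Parse a delimited load file back into one dict per document."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Illustrative records only; the paths and numbers are made up.
records = [
    {"BegDoc": "ABC0000001", "EndDoc": "ABC0000003", "NativeLink": "",
     "TextLink": "TEXT/ABC0000001.txt", "RequestNos": [4, 7],
     "ProtectiveOrder": True},
    {"BegDoc": "ABC0000004", "EndDoc": "ABC0000004",
     "NativeLink": "NATIVES/ABC0000004.xlsx",
     "TextLink": "TEXT/ABC0000004.txt", "RequestNos": [2],
     "ProtectiveOrder": False},
]

load_file = write_load_file(records)
rows = read_load_file(load_file)
```

The receiving party's review platform would perform the equivalent of read_load_file on import, which is why the delimiter, the field names, and the value formats all need to be settled in advance.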

Metadata

As we’ve discussed, metadata has tremendous value, both as potential evidence (e.g., revealing when and by whom something was modified) and as the basis of many filtering, sorting, and searching options within document review tools.  Thus, deciding (or negotiating) what metadata fields (and other extracted data fields), if any, will be included in your production (when producing non-natively) will have both evidentiary and usability impacts.

The EDRM organization’s model XML load file includes the following standard metadata and extracted data fields:

  • File Elements
    • FileName, FilePath, FileSize, Hash
  • Metadata Tags – All Documents
    • Language, StartPage, EndPage, ReviewComment
  • Metadata Tags – Messages
    • From, To, CC, BCC, Subject, Header, DateSent, DateReceived, HasAttachments, AttachmentCount, AttachmentNames, ReadFlag, ImportanceFlag, MessageClass, FlagStatus
  • Metadata Tags – Files
    • FileName, FileExtension, FileSize, DateCreated, DateAccessed, DateModified, DatePrinted, Title, Subject, Author, Company, Category, Keywords, Comments

In addition to deciding what metadata and other extracted data gets included in your production, decisions may also need to be made (or negotiated) about what names will be used for those fields and what formats will be used for the values in them.  Custom fields may need to be discussed as well.  For example, should there be a master date field?  If so, what hierarchy of other date fields should be used to generate that value?  To what time zone should all dates and times be normalized?
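The master date questions above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the hierarchy shown (DateSent, then DateReceived, then DateModified, then DateCreated) is one possible ordering rather than a standard, the values are assumed to arrive as ISO 8601 strings, UTC is assumed as the agreed time zone, and values with no offset are assumed to already be in UTC.

```python
from datetime import datetime, timezone

# Assumed hierarchy: the first populated field wins. The order itself is
# something the parties would decide or negotiate.
DATE_HIERARCHY = ["DateSent", "DateReceived", "DateModified", "DateCreated"]


def master_date(doc, hierarchy=DATE_HIERARCHY):
    """Return a master date normalized to UTC, or "" if no date is present."""
    for field in hierarchy:
        value = doc.get(field)
        if not value:
            continue  # field missing or empty; fall through to the next one
        parsed = datetime.fromisoformat(value)
        if parsed.tzinfo is None:
            # Assumption: a value without an offset is already in UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return ""


email = {"DateSent": "2019-03-04T09:30:00-05:00",
         "DateCreated": "2019-03-01T08:00:00"}
loose_file = {"DateSent": "",
              "DateModified": "2019-02-10T16:45:00+01:00"}

print(master_date(email))       # 2019-03-04 14:30:00
print(master_date(loose_file))  # 2019-02-10 15:45:00
```

Whatever hierarchy and time zone you choose, documenting them alongside the production helps the receiving party interpret the values consistently.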

Redactions

As we discussed, primary production format affects your ability to perform redactions within documents.  Generally speaking, native and near-native files cannot be effectively redacted, while near-paper and paper productions can.  The availability of effective redactions is one of the reasons for the continued popularity of near-paper, image-based productions.

When preparing a production that will involve redactions, you will need to consider how redactions should appear on the page, including whether redaction type (privilege, PII, etc.) affects appearance or requires a label.  Additionally, if extracted document text is being provided (to facilitate searching), the extracted text for documents bearing redactions will have to be either excluded or replaced.  It can be replaced by performing optical character recognition (OCR) on the page images rendered for the document with the redactions applied.
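The exclude-or-replace choice for redacted documents can be expressed as a small decision function.  The sketch below is hypothetical throughout: the dictionary keys (id, has_redactions, extracted_text) are invented for illustration, and the OCR step itself is not performed here.  Instead, the function assumes OCR has already been run on the redacted page images and its output is supplied as a lookup keyed by document ID.

```python
def production_text(doc, ocr_text_by_id=None, exclude_redacted=False):
    """Choose the searchable text to produce for one document.

    Unredacted documents keep their extracted text. Redacted documents
    either get no text at all or get text OCR'd from the redacted images,
    so the original extracted text is never produced for them.
    """
    if not doc["has_redactions"]:
        return doc["extracted_text"]
    if exclude_redacted:
        return ""  # option 1: exclude text for redacted documents
    # Option 2: replace with OCR output from the already-redacted images.
    ocr_text = (ocr_text_by_id or {}).get(doc["id"])
    if ocr_text is None:
        # Fail loudly rather than fall back to the unredacted text.
        raise ValueError(f"No OCR text available for redacted document {doc['id']}")
    return ocr_text


clean = {"id": "D1", "has_redactions": False, "extracted_text": "full text"}
redacted = {"id": "D2", "has_redactions": True, "extracted_text": "secret text"}

print(production_text(clean))                               # full text
print(production_text(redacted, {"D2": "[REDACTED] text"}))  # [REDACTED] text
```

Raising an error when no OCR text exists is a deliberate choice in this sketch: silently falling back to the original extracted text would defeat the redaction.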

Numbering and Endorsements

Also as we discussed, primary production format affects your options with regard to numbering and endorsements.  Paper and near-paper productions allow for per-page Bates numbering, while native and near-native formats generally only allow for per-file numbering to be applied.  Combination approaches require coordinating per-page numbering for some documents with per-file numbering for others.
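Coordinating the two numbering approaches in a combination production might look something like the following Python sketch.  It assumes a single shared number sequence in which image-based documents consume one number per page and native documents consume one number per file; the prefix, padding width, and record keys are all hypothetical, and other numbering schemes are certainly possible.

```python
def assign_bates(documents, prefix="ABC", width=7, start=1):
    """Assign beginning and ending Bates numbers from one shared sequence."""
    next_number = start
    assigned = []
    for doc in documents:
        # Image-based documents are numbered per page; natives per file.
        count = doc["pages"] if doc["format"] == "image" else 1
        beg = next_number
        end = next_number + count - 1
        assigned.append({
            "name": doc["name"],
            "BegDoc": f"{prefix}{beg:0{width}d}",
            "EndDoc": f"{prefix}{end:0{width}d}",
        })
        next_number = end + 1
    return assigned


# Illustrative documents only.
documents = [
    {"name": "memo.pdf", "format": "image", "pages": 3},
    {"name": "budget.xlsx", "format": "native", "pages": 0},
    {"name": "email.msg", "format": "image", "pages": 2},
]

for row in assign_bates(documents):
    print(row["name"], row["BegDoc"], row["EndDoc"])
# memo.pdf ABC0000001 ABC0000003
# budget.xlsx ABC0000004 ABC0000004
# email.msg ABC0000005 ABC0000006
```

The resulting beginning and ending numbers are the kind of values that would typically be carried in the load file for each document.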

Other endorsements, such as confidentiality warnings or protective order language, work the same way.  Paper and near-paper productions can have consistent endorsements applied in the headers and/or footers of each page, while native and near-native productions cannot.  For native and near-native productions (and, often, for near-paper productions too), a custom load file field may be created that documents confidential status, protective order applicability, and other endorsement content for each document.

Paper Integration

Another element to consider is how you will handle natively paper materials collected during discovery along with all of your ESI.  Rather than producing such materials in paper format, you have the option of incorporating them into your electronic production.  This can be accomplished by scanning the documents into page images, performing OCR to extract the available text for searching, and manually entering relevant “metadata” values (e.g., bibliographic and source information).  Depending on their volume, you may have already taken these steps prior to review of the paper records.


Upcoming in this Series

In the next Part, “Who Gets to Decide Production Options,” we will continue our review of production fundamentals with a discussion of what the Federal Rules of Civil Procedure have to say about production formats.


About the Author

Matthew Verga

Director, Education and Content Marketing

Matthew Verga is an electronic discovery expert proficient at leveraging his legal experience as an attorney, his technical knowledge as a practitioner, and his skills as a communicator to make complex eDiscovery topics accessible to diverse audiences. A twelve-year industry veteran, Matthew has worked across every phase of the EDRM and at every level from the project trenches to enterprise program design. He leverages this background to produce engaging educational content to empower practitioners at all levels with knowledge they can use to improve their projects, their careers, and their organizations.
