What is Electronic Data Discovery?
Companies produce piles of paper and electronic documents, on CDs, backup tapes, hard disks, and other media. These files are encoded in many different internal formats, and each format requires an application program that can read and understand it. The goal of Electronic Data Discovery (EDD) is to give the user access to the data. This can be done in several ways.

Repository Methods.

Traditional Method: Print each document or file, index the hard copies, and have your staff sift through the papers. This works for initial data discovery, but it is labor-intensive and provides neither keyword searching nor data analysis.

Manual Method: Collect all the documents into a repository, and require each end-user to install every application program needed to read the different file formats. This method offers no keyword searching. Licensing and installing all the applications for each file format on each end-user's machine can be a maintenance problem, as is training each user to use all of these applications just to be able to read the documents. For example, if one department produces documents using Adobe Photoshop, you would have to include Adobe Photoshop in the list of software products every user must install and learn how to use. This option becomes complex and expensive quickly.

Automated Method: Extract and index the information in these files: index the text, make electronic or digital images of what the printed document would look like, and optionally keep the original documents with the data. If a source electronic document is a bitmap, it can be run through optical character recognition (OCR) to extract text. For other electronic files, special drivers are needed that understand the file format and can extract relevant text and data.
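
The extract-and-index step of the automated method can be illustrated with a small inverted index. This is only a minimal sketch, assuming the text has already been extracted from each file. The document IDs, the sample text, and the function names `tokenize`, `build_index`, and `search` are hypothetical and do not describe any particular EDD product.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Start from the first word's documents, then intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical extracted text, keyed by made-up document IDs.
documents = {
    "memo-001": "Quarterly budget review for the sales department",
    "mail-002": "Please send the budget spreadsheet before Friday",
    "scan-003": "Sales contract signed and returned",
}
index = build_index(documents)
```

Calling `search(index, "sales budget")` returns only the documents that contain both words. A real system would add format-specific extraction, OCR, and ranking on top of a structure like this.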
End-users can search the documents by keywords or by more sophisticated methods (document similarity, for example), and can view what the printed document would look like on their workstation without installing the applications that understand each file format.

MetaData. The additional information in a document, such as the author, the sender's email address, and the date sent, can also be extracted and indexed, either as if it were part of the document text, as separate fields, or both.

Quality Control. Manual processes, such as restoring data from CDs, tapes, and other media or cataloging documents, are subject to human error, especially if the shop processes documents by hand instead of using automated tools. High-quality vendors include quality-control checks to make sure documents do not fall through the cracks. Automated methods and processes reduce the number of failure points, often producing results with fewer errors.

EDD Today. Most companies are not familiar with EDD processes, and do not realize that they need EDD. The companies that currently use EDD mostly outsource the work to service bureaus. EDD services can include:
Who uses EDD?
Applications include litigation support, data warehousing, and document management. With litigation support, turnaround time and completeness are important.
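
As an illustration of the metadata handling described under MetaData, the sketch below pulls the author, sender address, and send date out of an email message using Python's standard `email` package, then shows the "both" option: keeping the values as separate fields and also folding them into the searchable text. The sample message and the function names `extract_fields` and `searchable_text` are hypothetical, and real EDD tools would need format-specific extractors for other file types.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Hypothetical sample message; the blank line separates headers from body.
RAW_MESSAGE = """\
From: Jane Doe <jane@example.com>
To: records@example.com
Subject: Contract draft
Date: Mon, 03 Mar 2003 09:30:00 -0500

The revised contract draft is attached.
"""


def extract_fields(raw):
    """Parse a raw email and return its metadata and body as separate fields."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg["From"])
    return {
        "author": name,
        "from_address": address,
        "date_sent": parsedate_to_datetime(msg["Date"]).isoformat(),
        "subject": msg["Subject"],
        "body": msg.get_payload().strip(),
    }


def searchable_text(fields):
    """Fold metadata into the text so a keyword search also matches it."""
    parts = [fields["author"], fields["from_address"],
             fields["subject"], fields["body"]]
    return " ".join(parts)


fields = extract_fields(RAW_MESSAGE)
```

Keeping the fields separate allows queries such as "everything sent by this address in March", while the combined text lets a plain keyword search find the sender's name as well.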