Judge Peck and Predictive Coding at the Carmel eDiscovery Retreat

US Magistrate Judge Andrew Peck’s keynote speech at the Carmel Valley eDiscovery Retreat was one of the clearest statements yet by a judge that the use of new technology like predictive coding is an acceptable way to conduct search in appropriate civil litigation cases. Though necessarily limited to US courts in terms of direct influence, what he said applies in any jurisdiction requiring electronic discovery. What are you waiting for?

I know what you are waiting for: you think that one day a judge will deliver an opinion or a judgment (depending on your jurisdiction) which says in terms that a particular kind of technology is approved by the court. I know that, because I keep reading articles which say that predictive coding has not been approved, and such statements make sense only if there is a realistic expectation that approval might both be forthcoming and binding outside the court and matter in which it is given. Perhaps you have a mental picture of the occasion: “It is the opinion of this court that the use of predictive coding is a proper and acceptable means of conducting searches under the Federal Rules of Civil Procedure / Civil Procedure Rules / [Insert name of local rules] and, furthermore, that the software provided by [insert your favourite vendor here] is the software of choice in this court”.  Perhaps the judge will go on to praise the car in which he or she drove to work, offer an endorsement for the floor polish used in the court, and give a quick puff, as it were, for his or her favourite brand of cigarette. IT’S NOT GOING TO HAPPEN.

There are various reasons for this, apart from possible breaches of judicial ethical rules. How was the application used? By whom? For what kind of case? What alternatives were or should have been considered? What did it cost and what did it save?

There is something pernicious in this waiting for the impossible. It means that many lawyers and their clients are not focussing on what modern technology can do for their cases but on what some notional judge might think. They are not informing themselves, developing faith in their own judgment and backing that faith with their fees and their actions, but making do with the old way of doing things until a judge has said it is all right to do something different.

One of my plans for a summer which seems to have gone already was to go back through my accounts of the many judicial panels and webinars which I have heard and seen in the last twelve months to pull out the many positive references to technology made by respected judicial figures on both sides of the Atlantic. I might yet get to this, but Judge Peck’s speech at Carmel covered all the necessary ground.

Judge Peck began by taking us back to the days when manual review – then, of course, the only possible method of review – acquired its “mystical” status as the “gold standard”. We all felt that we were pretty good at it, he said, particularly first thing in the morning, but the studies showed that, as with the gold rush, there was not as much gold as one might think. The 1985 Blair and Maron study An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System showed that reviewers felt they had found 70% of the relevant documents when they had in fact found only 20%. The study Document Categorization in Legal E-Discovery: Computer Classification vs. Manual Review by Herb Roitblat, Anne Kershaw and Patrick Oot showed that computer review was at least as accurate as manual review, with 70% agreement between methods.

Maura Grossman and Gordon Cormack’s paper in the Richmond Journal of Law & Technology (JOLT), Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, XVII RICH. J.L. & TECH. 11 (2011), and the TREC Legal Track studies disposed of the idea that exhaustive manual review did an adequate job. The studies showed that technology-assisted review yields better results with lower effort. The Roitblat / Kershaw / Oot study ends with this: “On every measure, the performance of the two computer systems was at least as accurate (measured against the original review) as that of a human re-review. Redoing the same review with more traditional methods as was done during the re-review had no discernible benefit.”

Judge Peck took us through his own William A. Gross Constr. Assocs., Inc. v. American Mfrs. Mut. Ins. Co. No. 07 Civ. 10639, 2009 WL 724954 (S.D.N.Y. March 19, 2009), the one which begins with the well known “wake-up call to the bar in this district”. The problem in that case, as in others, was not the technology used – keyword search – but the failure to agree a sensible list of keywords. The very long suggested keywords list guaranteed a high recall rate with consequent low precision, and the court had to step in. Despite the “wake-up call”, Judge Peck said, parties continue to use keywords in the blind without cooperation. Cooperation requires more than simply agreeing to run the list put up by the other side.
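For readers who want the arithmetic behind “high recall” with “consequent low precision”, here is a minimal Python sketch. The collection size, the document ids and both keyword lists are invented for illustration; none of the figures comes from William A. Gross or from Judge Peck’s speech.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: what share of the documents we pulled back were relevant?
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: what share of the relevant documents did we pull back?
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection of 1,000 documents (ids 0..999); ids 0..49 are relevant.
relevant = set(range(50))

# A very long keyword list hits 800 documents, 48 of them relevant.
broad_hits = set(range(2, 802))
# A shorter, negotiated list hits 100 documents, 40 of them relevant.
narrow_hits = set(range(10, 110))

broad = precision_recall(broad_hits, relevant)
narrow = precision_recall(narrow_hits, relevant)

print(f"broad list:  precision {broad[0]:.0%}, recall {broad[1]:.0%}")    # 6%, 96%
print(f"narrow list: precision {narrow[0]:.0%}, recall {narrow[1]:.0%}")  # 40%, 80%
```

On these made-up numbers the long list finds nearly everything, but only about one document in seventeen of those sent for review is actually relevant. That is the cost which falls on the client when nobody negotiates the list.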

If 2010 was the year of proportionality, Judge Peck said, it seems that 2011 is the year of predictive coding. He explained this as meaning (and it is his meaning which counts here, not any one provider’s definition) that a senior lawyer, using a random sample or pre-filtered set of documents, decides whether they are responsive or not responsive; one goes through the process several times with senior people until the computer is sufficiently trained to apply those conclusions across a wider set or the whole. Some systems, he said, merely discriminate between relevant and non-relevant documents, while others prioritise them for review on a scale from 0 to 100. Such a system might be used to find keywords or for quality testing as well as for making primary selections. The idea is that lawyers do not spend their clients’ money reviewing irrelevant or low-rated documents.
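The loop which Judge Peck described can be sketched in a few lines of Python. This is a toy under stated assumptions, not any provider’s method: the eight documents are invented, the function senior_lawyer_codes is a hypothetical stand-in for a human reviewer, the batches are taken in order where a real exercise would draw a random sample, and the scoring is a simple smoothed word-weighting chosen only because it fits on a page.

```python
import math
from collections import Counter


def train(labelled):
    """Learn per-word log-odds weights from (text, is_responsive) pairs."""
    pos, neg = Counter(), Counter()
    for text, responsive in labelled:
        (pos if responsive else neg).update(set(text.lower().split()))
    n_pos = sum(1 for _, responsive in labelled if responsive)
    n_neg = len(labelled) - n_pos
    # Add-one smoothing keeps weights finite for words seen in only one class.
    return {
        word: math.log((pos[word] + 1) / (n_pos + 2))
        - math.log((neg[word] + 1) / (n_neg + 2))
        for word in set(pos) | set(neg)
    }


def score(weights, text):
    """Rate a document from 0 to 100; 50 means the model has no opinion."""
    total = sum(weights.get(word, 0.0) for word in set(text.lower().split()))
    return round(100 / (1 + math.exp(-total)))


def senior_lawyer_codes(text):
    """Hypothetical stand-in for the senior lawyer's responsiveness call."""
    return "pricing" in text


# Invented documents: the first six are reviewed by hand, the last two are not.
docs = [
    "pricing agreement with competitor confirmed",
    "lunch menu for friday",
    "competitor pricing call notes",
    "office party friday lunch",
    "revised pricing agreement draft",
    "parking permit renewal",
    "agreement on competitor pricing terms",
    "friday parking and lunch arrangements",
]

labelled, weights = [], {}
for batch in (docs[0:2], docs[2:4], docs[4:6]):  # three training passes
    labelled += [(doc, senior_lawyer_codes(doc)) for doc in batch]
    weights = train(labelled)  # retrain after every pass

# Apply the trained model to the unreviewed documents, highest score first.
for doc in sorted(docs[6:], key=lambda d: score(weights, d), reverse=True):
    print(score(weights, doc), doc)
```

With this toy data the unreviewed pricing document is rated well above 50 and the parking-and-lunch one well below it, which is the prioritisation the judge had in mind: the high-rated documents go to the lawyers first and the low-rated ones may never need to be read at all.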

This, he said, turns the traditional review process on its head. Instead of “a factory of contract lawyers” filtering up their selection of documents to senior lawyers, the senior lawyers are doing the initial work using their skill and knowledge in iterative passes until the computer is sufficiently trained.

The reaction of many lawyers is that they do not wish to be the guinea pig when there is no reported decision in which predictive coding (or any technology with similar purpose and effect) has been used. It is possible, Judge Peck said, that there are no decisions because technology of this kind is being used cooperatively, as it should be, and thus not generating grounds for dispute. Perhaps it is being used secretly so that the other side do not know how the decisions are being made (which, I suppose, would indicate a faith in the technology greater than the faith in the court). The most telling single point, in a speech which was full of them, was Judge Peck’s observation that none of the important cases on search – Judge Facciola’s Equity Analytics and O’Keefe, Judge Grimm’s Victor Stanley 1 and his own William A. Gross amongst others – has actually endorsed the use of keywords, which every lawyer accepts as a proper means of searching; all these cases involved criticism of the way that keywords had been used. Why then do we think we will ever get judicial approval for the use of any other technology? [A 2009 article called Wake-Up Call on Slipshod Search Terms by H. Christopher Boehning and Daniel J. Toal helpfully summarises the cases which Judge Peck mentioned.]

We know that breathalysers work, Judge Peck said, but if they are not calibrated properly and if the officer did not use them correctly then that is not helpful. The same applies to much of the technology which is now available.

Judge Peck said that he wants to see that the parties have cooperated. If there is argument about the manner of search, he needs to see what process was used as well as the result; the court may, for example, need to know something about the initial training sets used to train the computer. If push comes to shove, he said, a party may have to show the other side more than merely the outcome, weighing the benefits which can follow from such cooperation against the argument that the process is work product. If you are intending to use predictive coding (and this applies to any such technology), it is best to be up-front about it and to get opponents’ agreement and buy-in, at least by telling them what you are proposing to do – “to a certain extent inviting them to a seat at the table”. If agreement is not forthcoming then you need to get the court to rule quickly. What happens then will perhaps depend on which judge gets the first case on the use of this kind of technology. It would not be helpful to the general cause to find a judge who says “I don’t understand this. Go back and use keywords and manual review”.

That, you might say, is exactly the risk we are not prepared to take. I heard another US judge say recently that knowing your judge (in the sense of knowing his or her likes and dislikes, strengths and weaknesses) is an important element in case preparation. Are you saying that you will ignore the technology solutions – not even investigate their capabilities – because of the possibility of coming before some backwoodsman of the bench for whom the telephone is an innovation? If that is not what you are saying, then what exactly is the argument against going equipped to argue (because you will have investigated the options) that one route to relevance is better than another?

Judge Peck said that if lawyers find it helpful to quote a judge then they might refer to an article he wrote with David Lender of Weil, Gotshal & Manges LLP which refers favourably to the use of predictive coding (see 10 Key E-Discovery Issues In 2011: Expert Insight to Manage Successfully published by Huron Legal) and that proponents might cite before the court the TREC studies and the JOLT paper as a scientific basis for making selections by the use of technology. He sees no reason for not going forward with the use of this kind of technology, with no obvious downside to its use, and every upside, provided that parties use it transparently and cooperatively.

It is important to be clear that the starting point for Judge Peck and the other judicial advocates of technology is not the technology itself but the “just, speedy and inexpensive” requirement in Rule 1 of the Federal Rules of Civil Procedure. The same is true in the UK – Senior Master Whitaker’s judgment in Goodale v The Ministry of Justice refers to “software that will effectively score each document as to its likely relevance and which will enable a prioritisation of categories within the entire documents set”, the aim being “to produce a manageable corpus for human review… the most expensive part of the exercise”. That judgment relied not on some precedent from a higher court but on the duty of the judge and the parties together to arrive at the best way of doing justice.

You may care to look at my report of a podcast Judges and automated coding tools for electronic discovery in which I took part last year with two of the US Magistrate Judges referred to above, Judge Grimm and Judge Facciola, and with Maura Grossman, she of the influential JOLT paper. Amongst the things reported there are Judge Facciola on the subject of getting the court to listen to you and Judge Grimm saying that technology will always outpace us and that someone has to have the courage to go first. My own contribution to the discussion included this:

I would want to know the cost as well as the search implications of rival approaches; the burden passes to the party who argued for a more expensive route.

It is this which lies at the centre of this discussion. The question to be asked in every case is “what is the most proportionate way of managing this case?” It is possible, in some cases, that manual review by the “factory of contract lawyers” to which Judge Peck referred will be the right approach; though that may seem increasingly improbable as volumes increase and as technology improves, it is nevertheless an option to be priced and considered. No judge is going to say that technology or any particular technology must be used; they want to know that the parties have applied their minds to the best way of doing the job. A technology solution like predictive coding must be one of the options considered by the parties and put forward to the court. Lawyers are good at weighing the arguments for alternatives based on cost and other factors; judges, on the whole, are good at picking up new subjects from a standing start – that is what they spent their professional lives doing. The lawyers’ job is to understand and to articulate the options. In 2011, those have to include technology of the kind discussed approvingly by Judge Peck at Carmel alongside other methods of getting the job done proportionately.

Last paragraph edited to make it clearer than the original did that third-party review teams are an option to be considered alongside a technology solution. This is consistent with the theme that the duty of court and parties is to find the best solution from a range of options.


About Chris Dale

I have been an English solicitor since 1980. I run the e-Disclosure Information Project which collects and comments on information about electronic disclosure / eDiscovery and related subjects in the UK, the US, AsiaPac and elsewhere.