I referred a while back to two case studies about the use of the technology known variously as “predictive coding”, “computer-assisted coding” or, more recently, “technology assisted review” or TAR. One of them involved Epiq Systems and the other Millnet. One was a US example involving Baker & McKenzie and the other one came from Eversheds in the UK. I group them together because all four of these names, of service providers and law firms, are familiar ones in the UK. Most of the (by now extensive) literature on the subject of predictive coding involves organisation names which allow non-US lawyers to dismiss the subject as being of no relevance to them. The familiarity of the players in these two case studies may help to dispel this notion, even if one of the cases involves US regulatory proceedings.
The Baker & McKenzie / Epiq IQ Review / Equivio example
I start with an interview in Metropolitan Corporate Counsel, called Predictive Coding = Great eDiscovery Cost and Time Savings, with David Laing, a partner in the Washington, DC office of Baker & McKenzie LLP. The application used was Epiq Systems’ IQ Review, a combination of Equivio’s Relevance software and Epiq’s own applications, pulled together by Epiq’s consultancy services.
David Laing first describes how this technology works. He says:
It uses a limited number of senior attorneys familiar with a matter to review a representative statistical sample of the documents. The predictive coding software then applies the results of that statistical sample to the entire database. Predictive coding provides a way to prioritize documents for review.
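Laing’s three steps – review a representative sample, apply its results to the whole database, prioritise documents for review – correspond to a routine supervised-learning loop. The sketch below is a deliberately naive stand-in (Equivio’s actual algorithms are proprietary, and every name and document here is invented): a toy Naive Bayes scorer trained on a hand-coded sample, then used to rank the unreviewed documents.

```python
from collections import Counter
import math

def train(sample):
    """Train a toy Naive Bayes model on a senior-reviewed sample.
    `sample` is a list of (text, is_relevant) pairs."""
    counts = {True: Counter(), False: Counter()}
    docs = Counter()
    for text, label in sample:
        docs[label] += 1
        counts[label].update(text.lower().split())
    return counts, docs

def relevance_score(model, text):
    """Log-odds that a document is relevant, given the trained sample."""
    counts, docs = model
    score = math.log(docs[True] / docs[False])
    for word in text.lower().split():
        # add-one smoothing so unseen words do not zero things out
        p_rel = (counts[True][word] + 1) / (sum(counts[True].values()) + 1)
        p_irr = (counts[False][word] + 1) / (sum(counts[False].values()) + 1)
        score += math.log(p_rel / p_irr)
    return score

# Step 1: senior attorneys code a representative sample ...
sample = [
    ("price fixing agreement with competitor", True),
    ("merger antitrust review meeting notes", True),
    ("office holiday party catering menu", False),
    ("parking permit renewal form", False),
]
model = train(sample)

# Steps 2 and 3: the software applies the sample's results to the whole
# database and prioritises documents for human review by predicted relevance.
database = [
    "notes on agreement about fixing price levels",
    "catering invoice for the holiday party",
]
ranked = sorted(database, key=lambda d: relevance_score(model, d), reverse=True)
```

The point of the sketch is only the shape of the workflow: a small, expensive, expert-coded sample drives the ordering of a database far too large to read end to end.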
His context is very large cases involving both high volumes and tight deadlines as well as an opponent, the Department of Justice, with the motive, the power and the means to be extremely fussy about what they are sent. The DOJ was, Laing says, “completely satisfied with the response and raised no questions about it”.
The traditional idea is that speed comes at the expense of accuracy, and the use of keyword searches followed by manual review has been taken uncritically by many to be the recognised way of arriving at an acceptable standard of accuracy. Laing says of the traditional approach that “the only way you can provide a recommendation to a client to sign under penalty of perjury a declaration that document production is complete is if an attorney puts an eye on each document”. Both the inadequacy of keyword searches and manual review, and their cost, are by now matters of record.
I will leave you to read David Laing’s description of his firm’s use of Epiq’s IQ Review and its close interrelationship with lawyer input. The jobs in question were so large that traditional methods were plainly out of the question. The principle – of using lawyer skills plus sophisticated technology to prioritise the most relevant documents – applies equally in more everyday cases: who can seriously argue with the benefits of giving the lawyers an early sight of the most relevant documents?
The typical lawyer reaction is to question whether any such system can accurately identify “the most relevant documents”. This can be answered on several levels, all of which really come down to this: even if predictive coding produces results no more accurate than human review (and there are plenty of studies showing that it is very much more accurate), the speed with which it draws its conclusions, the consistency with which it does this, and the multiple opportunities and methods available to cross-check and validate the results, mean that it is very much quicker and easier to confirm and to demonstrate inclusions and omissions and to justify both.
The article concludes with a summary of current thinking on defensibility with references to Judge Peck’s recent article Search, Forward (free registration required). You may care also to read my report of Judge Peck’s Carmel speech. David Laing is undoubtedly correct in his assessment that lawyers, courts and regulators are waiting for a definitive ruling that predictive coding is acceptable. His positive experience of dealing with the DOJ, however, seems more relevant than the vain hope that some judicial Moses will come down from the mountains with stone tablets bearing an unequivocal endorsement. Defensibility follows from general acceptance; more lawyers are showing willingness to make use of this technology for their own purposes and are saving time and money with no discernible loss of quality. We need a few more to have the guts to back their own judgements.
The Eversheds / Millnet / Equivio example
The same theme appears in the exercise which Eversheds undertook with Millnet, again using Equivio>Relevance, which is described in the Orange Rag of 23 November in an article by Dominic Lacy and Jamie Tanner of Eversheds and James Moeskops of Millnet under the title Predicting the future of disclosure. It is the best example I have seen of a step-by-step description of a predictive coding exercise, covering the methods used, the decision “where to make the cut” between documents to be included and those provisionally excluded, the careful audit made by sampling, and the costs implications and limitations. The article emphasises that the exercise was lawyer-led and that its effectiveness turns as much on the skill of the person responsible for the sampling as on the technology. The reference to “iterative adjustments to improve the effectiveness of the predictive coding results” points to something rarely practicable with human review – the opportunity to change one’s mind as a result of sampling and to re-run the process with adjusted parameters. With manual review, a change in the issues, or a hitherto unregarded batch of documents which shifts the focus, means going back over the whole review exercise.
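The cut-and-audit steps lend themselves to a numerical sketch. Everything below – the scores, the cutoff, the sample findings – is invented for illustration; it simply shows how reading a random sample of the provisionally excluded documents scales up to an estimate of what a given cut would leave behind, and why an unacceptably high estimate sends you back to adjust the cutoff and re-run.

```python
import random

random.seed(7)  # fixed seed so the illustration is reproducible

# Hypothetical output of a predictive coding run: (doc_id, relevance_score).
scored = [(f"doc{i}", random.random()) for i in range(10_000)]

# "Where to make the cut": above goes through to human review,
# below is provisionally excluded.
CUTOFF = 0.65
included = [d for d, s in scored if s >= CUTOFF]
excluded = [d for d, s in scored if s < CUTOFF]

# Audit by sampling: a reviewer reads a random sample of the excluded set
# and records how many turn out to be relevant after all.
audit_sample = random.sample(excluded, 400)
relevant_in_sample = 6  # hypothetical reviewer finding

# Scale the sample rate up to estimate what the cut leaves behind.
est_missed = relevant_in_sample / len(audit_sample) * len(excluded)
print(f"Included {len(included)}, excluded {len(excluded)}, "
      f"estimated relevant documents missed: {est_missed:.0f}")

# If that estimate is too high for comfort, lower CUTOFF and repeat -
# the "iterative adjustment" the article describes.
```

This is the change-your-mind loop that manual review cannot easily offer: the cut, the audit and the re-run are cheap to repeat.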
The article ends with a quotation from Senior Master Whitaker’s judgment in Goodale v the Ministry of Justice and with references to CPR Practice Direction 31B which make it clear that it is the duty of the parties to consider and discuss the “tools and techniques” to be used subject to the overriding principles of reasonableness and proportionality.
US lawyers and their clients live in dread of what appears to them to be the arbitrary imposition of sanctions and other adverse consequences following an apparently incomplete discovery exercise. Statistically, this fear is largely misconceived, given how few sanctions decisions there are in which good faith is not in question. The rules in England and Wales impose no lesser duty to give disclosure properly, albeit on a narrower basis and with wider express scope to limit disclosure in a transparent and co-operative way. The key in both jurisdictions is to understand the technology, its strengths and limitations, the costs and savings, and the QA processes, both human and technical, used to validate results.
In both jurisdictions, and in others, the starting point is a simple calculation: what will it cost to adopt one route, what will it cost to go down another, and what is the margin between them? Every assessment made thereafter takes account of that gap, as well as factors such as the value of the claim. If predictive coding offers a cheaper route to an acceptable result, then the burden passes to the opponent to explain why it should not be used.
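That simple calculation can be set out in a few lines. All the figures below are hypothetical – rates, fees and fractions invented for illustration, not drawn from either case study – but the shape of the sum is the point: sample review plus technology plus a reduced manual pass, set against reading everything.

```python
# Hypothetical figures for the "simple calculation": all numbers invented.
docs = 500_000
manual_rate = 50          # documents per reviewer-hour
reviewer_cost = 60.0      # cost per reviewer-hour

# Route one: manual review of everything.
manual_cost = docs / manual_rate * reviewer_cost

# Route two: predictive coding - senior attorneys review a sample,
# a technology fee, then manual review of only the prioritised fraction.
sample_size = 5_000
senior_rate = 40          # documents per senior-attorney-hour
senior_cost = 250.0       # cost per senior-attorney-hour
technology_fee = 40_000.0
reviewed_fraction = 0.2   # proportion the cut sends through to human review

pc_cost = (sample_size / senior_rate * senior_cost
           + technology_fee
           + docs * reviewed_fraction / manual_rate * reviewer_cost)

margin = manual_cost - pc_cost
print(f"Manual: £{manual_cost:,.0f}  Predictive: £{pc_cost:,.0f}  "
      f"Margin: £{margin:,.0f}")
```

On these invented numbers the margin is several hundred thousand pounds; every later argument about proportionality, claim value and burden starts from that gap.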