It is of course a good thing that the eDiscovery software market offers competing solutions to clients. Competition means choice; it also means that software providers must strive to keep invention up and costs down.
It is equally true that choice can become paralysing. Most of us have stared at shelves full of near-identical consumer products and tried to evaluate their competing claims and price differentials; we have the same difficulty when choosing providers of services, as lawyers, insurance companies and the like try to persuade us, with a finite vocabulary, that their service is the one we should buy. For many of us (and it is certainly true of me) the mind eventually closes down and we end up buying none of them.
Predictive coding, and the wider range of technology-assisted review tools of which it is part, adds extra layers: we do not just have rival providers with competing interfaces, technology and marketing, but we also have debates as to how the technology should be used, with (in some cases rather fierce) battles breaking out between the more technologically sophisticated practitioners. Lawyers, perhaps unsurprisingly, take refuge in muttering about “black boxes” and asking questions like “what will the judge think?”
What the judge is actually interested in is whether parties are “finding the documents that matter”. I put that expression in quotation marks because it is, sensibly, the last line of an article by Bob Tennant, Executive Chairman of Recommind, called The year is new. Is predictive coding 3.0?
Recommind has been a leader in this market for many years, influencing thought as well as technological development. It describes its predictive coding technology as “continuous learning”, which it contrasts with “stabilisation models”. Continuous learning is explained in an article on the Recommind blog called Continuous learning in predictive coding: more than just nomenclature, by Alexis Clark, which begins, as I have done, by talking about the confusing range of choices open to buyers.
I don’t intend to get into that debate here. The reason why I point you to Bob Tennant’s article is that it sets out clearly three basic steps to be undertaken in conducting any type of search for eDiscovery purposes: analyse, identify and validate.
As to analysis, one point which appears clearly from Bob Tennant’s article is that predictive coding is not an all-or-nothing, standalone tool. He suggests that you:
Leverage all the analytical tools at your disposal—including keyword search, phrase extraction, concepts and concept search, communication visualizations, metadata-based filtering, estimation sampling, and more.
That ties in with a subsequent paragraph emphasising the ends rather than the means. Bob Tennant says:
Machine learning is, after all, just one part of an efficient review strategy. And review efficiency is ultimately about spending more time with relevant content and less with the irrelevant. This is how you can quickly find the documents that make a difference in your case.
On validation, those who are worried about the question “What will the judge think?” might like to look at this Recommind paper, Predictive coding: evaluating success in determining completeness.
You need to understand something of all this, and Bob Tennant’s article pitches it at the right level. You do not, however, need as deep a technical knowledge as some of the articles on this subject seem to imply. What you do need to do is have a look at these solutions and not be paralysed by indecision because the choices seem too complex.