Keyword searching for e-disclosure documents is not like using Google

There is no one-size-fits-all answer when deciding what keywords (and what else apart from keywords) to use to arrive at the “right” set of documents for disclosure. You have to educate yourself to know what the court expects. There is more to it than finding Paris Hilton with Google.

It comes as a surprise to many that the UK Civil Procedure Rules include a reference to anything so sophisticated as keyword searches. Paragraph 2A.5 of the Practice Direction to Part 31 CPR says this:

It may be reasonable to search some or all of the parties’ electronic storage systems. In some circumstances, it may be reasonable to search for electronic documents by means of keyword searches (agreed as far as possible between the parties) even where a full review of each and every document would be unreasonable. There may be other forms of electronic search that may be appropriate in particular circumstances.

We were discussing this paragraph last night at a meeting of Master Whitaker’s drafting group, in the context of the proposed new e-Disclosure Practice Direction. The point at issue (or one of the points from a meeting lasting four and a half hours) was the need to sanction – indeed, to require in an appropriate case – the use of technology, whilst not implying that technology is all you need. One issue is that the use of keywords is only one of the many technology solutions which may be applied to the task of finding the “right” set of documents – “right” being a neutral term which I use deliberately here (as we cannot do in the rules) to connote compliance with the definition of a disclosable document in a way which is proportionate. Our wording must cover developments in search technology which are as yet unknown. Another issue is that technology alone, however sophisticated, is rarely, if ever, enough. You need a brain and the instructions for using it in this context.

An example came up which you may want to use when you next try to explain to a judge that keyword searching is not quite like searching for “Paris Hilton” in Google. This was not, as you may think, a cheap crack at something which judges can relate to, but an example heard by Master Whitaker and Vince Neicho of Allen & Overy when they were in Hong Kong recently. A very large set of documents returned by other keyword searches was massively reduced by using this young lady’s name as a negative keyword, that is, removing from the set anything which included her name (I understand that she made a popular educational video, much admired by students of inter-personal relationships, and that links to this were included in many e-mails). The story illustrates that a single word or phrase can dramatically alter the results of a search. It would be wrong to remove every e-mail from or to key custodians which referred (perhaps in passing) to Paris Hilton, and just as wrong to leave them all in. Some skill, and an understanding of the context, is needed to leave in those which serve a real purpose in the case whilst eliminating those which consist only of a bare reference to Ms Hilton.
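The mechanics of a negative keyword can be sketched in a few lines of Python. This is an illustration only: the sample e-mails, the function name `apply_keywords` and the naive case-insensitive substring matching are all assumptions made for the example, not a description of how any review platform actually works.

```python
def apply_keywords(documents, include, exclude):
    """Return ids of documents that hit any include term and no exclude term.

    Matching is a naive case-insensitive substring test (an assumption for
    illustration; real tools tokenise, stem and index).
    """
    selected = []
    for doc_id, text in documents.items():
        lowered = text.lower()
        hit = any(term.lower() in lowered for term in include)
        blocked = any(term.lower() in lowered for term in exclude)
        if hit and not blocked:
            selected.append(doc_id)
    return selected


# Hypothetical sample data, invented for this example.
documents = {
    "email-1": "Please review the supply contract before Friday.",
    "email-2": "Have you seen the Paris Hilton video? Link below.",
    "email-3": "Contract meeting moved to the Paris Hilton hotel, see agenda.",
}

# Same positive keywords, first without and then with the negative keyword.
without_negative = apply_keywords(documents, ["contract", "video"], [])
with_negative = apply_keywords(documents, ["contract", "video"], ["paris hilton"])
```

In this invented sample the negative keyword cuts three hits down to one, but it also removes `email-3`, which is about a contract meeting and mentions the name only in passing. That is the judgement call described above: the blunt exclusion is as wrong as leaving everything in.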

This is not necessarily an innate skill. Furthermore, the lack of such skills is not the only reason why parties may use (or avoid using, publicly at least) certain keywords. A particular keyword might steer your opponent towards documents which he might otherwise not find amongst all the others which you have properly disclosed. Is there a duty to point out to an opponent that his proposed keyword list will omit something worth seeing (no, not Paris Hilton again)? The failure may simply be one of judgement rather than of competence or of professional duty.

Parties have a duty to co-operate in disclosure as in everything else and there is, as the extract above from the Part 31 Practice Direction shows, an express duty to try and agree keywords – see Digicel (St Lucia) v Cable & Wireless at paragraph 72 et seq, and in particular paragraphs 80 and 81, for an example of a judge really applying his mind to choosing the right keywords in the context of Paragraph 2A.2 of the PD. Paragraph 80 reads:

If one were to adopt the “leave no stone unturned” approach to disclosure then one would be more ready to add key words to those originally used by the Defendants. However, it will usually be wrong in principle to adopt that approach and, in my judgment, it would be wrong to adopt that approach in the circumstances of this case. One therefore has to consider the proportionality of adding an additional key word. For that purpose one has to form some sort of view as to the possible benefit to the Claimants of adding the key word and the possible burden to the Defendants of doing so. The burden to the Defendants will principally consist of the burden of manually reviewing a large number of irrelevant documents.
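The burden side of that balance can at least be counted. The following is a minimal sketch, assuming a small in-memory set of documents and simple substring matching; the function name `incremental_hits`, the sample documents and the keywords are hypothetical and have nothing to do with the terms argued over in Digicel.

```python
def incremental_hits(documents, existing_terms, new_term):
    """Ids of documents that the new term adds to the existing review set."""

    def matches(text, terms):
        lowered = text.lower()
        return any(term.lower() in lowered for term in terms)

    return [
        doc_id
        for doc_id, text in documents.items()
        # Count only documents not already caught by the existing terms.
        if matches(text, [new_term]) and not matches(text, existing_terms)
    ]


# Hypothetical sample data, invented for this example.
documents = {
    "doc-1": "Invoice 1042 attached for approval.",
    "doc-2": "Payment received, thank you for your order.",
    "doc-3": "Invoice overdue: payment expected last month.",
    "doc-4": "Upgrade your payment plan today!",
}

added = incremental_hits(documents, ["invoice"], "payment")
```

Here the extra keyword brings in two further documents for manual review. The count measures only the burden; whether any of the added documents is worth having is the benefit side, and that still needs a human view.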

Digicel was not about competence – the main point at issue was not the Defendants’ ability to make keyword searches but their judgment in choosing them and their breach of the co-operation requirements of PD 31 CPR. The rules must, amongst other targets, provide a requirement for parties to try and agree the keywords and some encouragement to judges to form a view, as Morgan J did in Digicel, as to what is right – which may (as in that case) result in a list which did not exactly match either of those put before him.

By chance, the subject of keywords, and the competence to use them, came up in an article written by the US commentator Ralph Losey at the week-end called “Inspector Clouseau and the insights of Judge Facciola and Malcolm Gladwell suggest a bright future for e-discovery lawyers”. If that is more like a précis than a title, you can at least deduce from it that Losey and I run on parallel tracks – US Magistrate Judge Facciola comes up frequently in my writing, Malcolm Gladwell’s book Outliers has made an appearance in my pages, and I am a keen promoter of the idea that lawyers who know how to handle electronic documents have a good career ahead of them.

I will leave you to read the article, but it is worth picking out one or two quotations from it which are particularly relevant to the subject on which we spent so much time last night. In a case called William A. Gross Const. Associates, Inc. v. American Mfrs. Mut. Ins. Co., Judge Peck (who spoke at the London IQPC conference last year) said this:

This case is just the latest example of lawyers designing keyword searches in the dark, by the seat of the pants, without adequate (indeed, here, apparently without any) discussion with those who wrote the emails. Prior decisions from Magistrate Judges in the Baltimore-Washington Beltway have warned counsel of this problem, but the message has not gotten through to the Bar in this District.

He goes on:

Of course, the best solution in the entire area of electronic discovery is cooperation among counsel. This Court strongly endorses The Sedona Conference Cooperation Proclamation.

Electronic discovery requires cooperation between opposing counsel and transparency in all aspects of preservation and production of ESI. Moreover, where counsel are using keyword searches for retrieval of ESI, they at a minimum must carefully craft the appropriate keywords, with input from the ESI’s custodians as to the words and abbreviations they use, and the proposed methodology must be quality control tested to assure accuracy in retrieval and elimination of “false positives.” It is time that the Bar – even those lawyers who did not come of age in the computer era – understand this.
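One way to picture the “quality control tested” step is to have a reviewer code a random sample of the keyword hits and see what share is actually relevant. The sketch below is an assumption about how that might be done, not anything prescribed by the court: the function name `sample_precision`, the document ids and the reviewer’s decisions are all invented for the example.

```python
import random


def sample_precision(hit_ids, is_relevant, sample_size, seed=0):
    """Estimate the share of keyword hits that are relevant from a random sample."""
    if not hit_ids or sample_size < 1:
        raise ValueError("need at least one hit and a positive sample size")
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    sample = rng.sample(hit_ids, min(sample_size, len(hit_ids)))
    relevant = sum(1 for doc_id in sample if is_relevant(doc_id))
    return relevant / len(sample)


# Hypothetical data: 20 keyword hits, of which a reviewer marked 5 as relevant.
hits = [f"doc-{i}" for i in range(20)]
reviewer_marked_relevant = {"doc-0", "doc-1", "doc-2", "doc-3", "doc-4"}


def is_relevant(doc_id):
    """Stand-in for a human reviewer's decision on one document."""
    return doc_id in reviewer_marked_relevant


full = sample_precision(hits, is_relevant, sample_size=20)     # every hit reviewed
partial = sample_precision(hits, is_relevant, sample_size=10)  # estimate from half
```

A low figure suggests that the keyword list is pulling in many false positives and needs refining with the custodians’ help. The sample says nothing about relevant documents which the keywords missed altogether; that needs a separate check of the documents which were not hit.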

There is more in the article on competence and in particular on what US Magistrate Judge John Facciola has said about the need to understand what is involved in handling large quantities of documents professionally, competently and cost-efficiently. Those of you who think that this is just something which Americans do will think again after reading Digicel.

Some of Judge Facciola’s pithy (and often spiky) words have appeared in my pages before, but the most telling passage quoted in the article is not from one of his many Opinions but from an interview. Try this as a way of suggesting that lawyers and judges need to equip themselves with some understanding of what is, after all, a basic fact of modern commercial life:

I can’t imagine that 100 years ago lawyers were saying, “Gee, I don’t want to learn any of that stuff about airplanes, because God, I don’t know anything about airplanes,” or 200 years ago, “I don’t know anything about railroads.” One of the great things about being a lawyer and one of the reasons society looks to us for leadership in this area is that we have shown a remarkable ability to adapt the law to the changing society around us. That’s what we’re supposed to be doing. How do we get an exemption from that? There is no answer. There is no magic wand that [we] have that we’re going to push over you and tomorrow you’re going to walk into the office and understand this stuff. You’re going to have to get the books out. You’re going to have to talk to people. You’re going to have to educate yourself in every way. I don’t know, and that’s just as true of judges as it is of lawyers.

Ralph Losey adds “There is no shortcut. You have got to pay your dues or hire somebody else who has”.

The 56 million hits which are garnered by a Google search for “Paris Hilton” give me an idea for illustrating some of the difficulties of searching, which I will come back to in due course.


About Chris Dale

I have been an English solicitor since 1980. I run the e-Disclosure Information Project which collects and comments on information about electronic disclosure / eDiscovery and related subjects in the UK, the US, AsiaPac and elsewhere.
This entry was posted in CPR, Discovery, eDisclosure, eDiscovery, Electronic disclosure, Litigation Support, Part 31 CPR.
