Useful lessons from a Nuix webinar on email archives

I don’t know about you, but my heart doesn’t exactly race at the thought of listening for an hour to a webinar on email archives. On the face of it, email archives fall into the category known as “worthy but dull” – important, vital even, but not the stuff that dreams are made on, even for someone whose professional definition of “interesting” is as esoteric as mine. I listened anyway – and came away informed not only about the difficulties posed by email archives but about the arguments to use when you come across them in eDiscovery.

Nuix’s promotional material for its recent webinar Right, from the start, modestly understated its scope, and by a wide margin. We got the promised explanation of the limitations of native archive search and extraction tools, and Michael Lappin, Nuix’s Director of Archiving Technology, gave us a crisp and clear explanation of the reasons why in-house and external lawyers – not just IT people – need to understand what those limitations are. What we got in addition, however, was an immensely useful explanation from Therese Craparo of Reed Smith of the strategic and tactical benefits which follow when lawyers are able to explain to opponents and the court how those technical implications feed into eDiscovery arguments about scope, timelines and cost. The messages about proportionality and burden were well worth listening to irrespective of your direct involvement with email archives.

Nuix CTO Stephen Stewart moderated with his usual calm focus. If you don’t want my report of the webinar and would rather listen to the whole thing, you will find it here. The slides, which include case references and case studies which supplement the narrative, can be found here.


Leaving aside the IT and broader information management considerations, there are two levels at which the deficiencies of email archives matter to eDiscovery lawyers (I use that term to embrace all those concerned with litigation, with regulatory compliance, with reacting to regulatory intervention, and with internal investigations of all kinds).

One is to do with the completeness of a search and the speed with which it can be accomplished. The other is the ability to explain to opponents and the court what these difficulties are and to make arguments about scope, timelines and proportionality which bring a result – an agreement or an order – which reflects reality; that same set of skills is needed to challenge unacceptable positions taken by opponents.

There is therefore a direct link between the limitations of email archives and arguments to be presented to an opponent, court or regulator. The Nuix webinar addressed both the problem and the strategic and tactical response to the problem.

What is the technical problem with some email archives?

Michael Lappin took us briskly through the problem. An archive, he said, is a disk-based repository behind an organisation’s firewall in which are collected documents from Exchange, Lotus Notes, SharePoint and other sources. The data is compressed and held in proprietary formats. An archive’s primary purpose was not recovery of files meeting specific criteria, and the generally rudimentary tagging and search tools were not adequate for speedy and accurate searches.

Although many users set out with good intentions about the implementation and use of retention schedules, very few actually did this. One source was exchanged for another, with the priority given to freeing up space in the user tools and keeping them running efficiently; there was never time and money enough to remediate the archives either on migration or in reaction to a trigger event.

Some of these systems, sold as “set it and forget it” systems, became unusable, in practice, for any purpose, let alone eDiscovery. Index corruption meant that searches, believed to be over the full corpus, were missing great chunks of it; databases became too big to handle and the whole thing ground to a halt when eDiscovery teams started doing searches; metadata-only indexes meant that email content was never indexed at all, something eDiscovery users may not have been aware of. The same searches over the same content yielded different results, causing lawyers concern over past eDiscovery exercises as well as current ones.

Understanding, managing and explaining the email archive problem in eDiscovery

Therese Craparo of Reed Smith looked at the problem from the position of a lawyer managing eDiscovery demands in circumstances where much of the potentially discoverable data lay in the sort of data archive described by Michael Lappin. Expectations had been raised by two things: one was the increased demands following the 2006 Amendments to the Federal Rules of Civil Procedure and the Zubulake opinions; the other was the improvement in the search technology becoming available for eDiscovery, which altered the general idea as to what was reasonably accessible.

The early arguments about whether data was reasonably accessible involved backup tapes, but a far bigger problem emerged as it became clear that millions of potentially discoverable emails were sitting in email archives which had been designed to be good at keeping stuff but not so good at getting it out. Therese Craparo said most of the arguments were not legal ones but practical ones.

Arguments about undue burden are increasingly overlaid with questions of proportionality, derived both from case law and from the pending new rules. Courts want facts and metrics. It is no good just saying “I don’t know if we can get the data out of the archives”, or relying on unsupported assertions that it would be too expensive to try. Expert evidence is needed on the number of servers and users, and the volume of data, together with realistic estimates of cost and timing.

It is necessary to get the judge to realise that these are real problems and not just lawyers making excuses. It is often helpful, Therese Craparo said, to give some metrics from past cases showing what volumes users have and what it costs to extract data from each user for a given period. It helps to be able to say to the judge “We have done this before – these are the numbers and this is what it is going to cost me”. It also helps to be well-enough informed to give alternative proposals, so that you can say “We cannot do this, but we can do that, and that will give you all you need for the purposes of the case”. The aim is to appear as the reasonable one, so that an opponent’s demands appear unrealistic and unreasonable.
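The arithmetic behind such metrics is simple enough to sketch. What follows is my own illustration and was not shown in the webinar: the class name, the function name and every figure in it are hypothetical placeholders, and a real estimate would use rates measured on past matters and supported by expert evidence.

```python
from dataclasses import dataclass


@dataclass
class ArchiveScope:
    """Hypothetical inputs for a burden estimate; none of these figures come from the webinar."""
    custodians: int                   # number of users whose archived email is requested
    gb_per_custodian_per_year: float  # average archived volume, taken from past matters
    years: float                      # date range covered by the request
    extraction_gb_per_day: float      # measured throughput of the archive's export tool
    cost_per_gb: float                # extraction and processing cost per gigabyte


def estimate_burden(scope: ArchiveScope) -> dict:
    """Turn scope metrics into the volume, time and cost figures a court may ask for."""
    total_gb = scope.custodians * scope.gb_per_custodian_per_year * scope.years
    return {
        "total_gb": total_gb,
        "days": total_gb / scope.extraction_gb_per_day,
        "cost": total_gb * scope.cost_per_gb,
    }


# The request as made, and a narrower alternative proposal (illustrative numbers only).
as_requested = ArchiveScope(custodians=50, gb_per_custodian_per_year=2.0, years=5,
                            extraction_gb_per_day=25.0, cost_per_gb=100.0)
alternative = ArchiveScope(custodians=10, gb_per_custodian_per_year=2.0, years=2,
                           extraction_gb_per_day=25.0, cost_per_gb=100.0)

for label, scope in [("As requested", as_requested), ("Alternative", alternative)]:
    result = estimate_burden(scope)
    print(f"{label}: {result['total_gb']:.0f} GB, "
          f"{result['days']:.1f} days, cost {result['cost']:,.0f}")
```

Putting the request and the alternative side by side in this way is what allows the “we cannot do this, but we can do that” proposal to be supported by numbers instead of assertion.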

The burden is on the demanding party to show the importance of the data they require relative to the burden placed on the giving party. They should be forced to explain why they need this data or that data. Costs-sharing should be considered – a giving party which is on top of its metrics can put itself in a position where it can say “We have nothing to hide and we are happy to give this data, whatever we think of its value, but we think that the plaintiffs should contribute to the cost”.

The standard is reasonableness not completeness

Therese Craparo emphasised that you do not have to search data just because you can if the other side is getting what they need. The automatic reaction of some people to a discovery request is that they should try and comply with it, without stopping to consider whether it is possible to argue that the other side is getting what they need without that hard-to-access data. The standard, she said, is reasonableness not completeness.

Defeating a burden argument

A similar reliance on facts and metrics is required when other parties run a burden argument against you. One should make them focus on the detail in place of generalised assertions and, if necessary, bring in an expert who can look at the other side’s proposals and say “We can do it better”. That, of course, is secondary to addressing the question of whether the data is necessary at all.

What is reasonable today?

As technology evolves, new tools become available which can ease a task which was hitherto thought unduly onerous. It affects the credibility of a party who says that a heavy burden is involved if the other side can show that there is a new way of accessing the data.

It is essential for lawyers to know what new technology exists, what it costs and what it can be used for, not least because they might come up against an opponent who does know these things. Therese Craparo says that she keeps abreast of developments in various ways – she maintains links with her firm’s IT people because the problem is not just about lawyers any more; she attends webinars and conferences; she urged lawyers to listen to the vendors – you don’t have to buy their products, she said, and it can “feel annoying at times when you are busy”, but vendors can give you insight into technology which can solve your clients’ problems.


I listened to this webinar because I was tipped off that a very large number of people had signed up for it, implying a large demand for information about a subject which is not going to go away. The technical detail about email archives was both interesting and important; the bonus lay in the considerable airtime given to Therese Craparo to explain how difficulties with hard-to-access email archives can be managed by those with the right skills, knowledge and, where necessary, outside support from people who can answer the question “Is there a better way of dealing with this?”

There are links to a download of this webinar, and to its slides, near the top of this article. I think you would gain value, as I did, from listening to the recording. The web page about Nuix Intelligent Migration includes the video 3 billion cans of beans, one of the best eDiscovery videos in the market.


About Chris Dale

I have been an English solicitor since 1980. I run the e-Disclosure Information Project which collects and comments on information about electronic disclosure / eDiscovery and related subjects in the UK, the US, AsiaPac and elsewhere.