This is one of a set of posts about the content and the discussion at ILTA 2014 in Nashville. Originally intended as a single post, the result was too long for that and I decided to split it up. See also ILTA 2014 – the context and the logistics.
Big data and analytics are two distinct subjects, both of which were well covered at ILTA. I group them together for these purposes because they were the joint subject of a panel discussion in which I took part. Organised by UBIC, the discussion was called Advanced analytics for the legal profession – big data challenges, analytic solutions and thoughts for the future. My co-panellists were Gerard Britton of Topiary Discovery and Yoshikatsu Shirai, Chief Client Technology Officer at UBIC, and the moderator was UBIC’s Paul Starrett.
My fellow panellists will forgive me if I focus on what I said, partly because I have the benefit of my preparatory notes, and partly because my part was deliberately intended as a summary.
The conventional Big Data discussion is all about big volumes managed by big law firms for big cases for big clients; for most lawyers that tends to make it sound like someone else’s problem. An ILTA audience is more broadly based than most and it seemed important to me to, as it were, democratise the subject.
Big Data is not just a synonym for “lots of data”. Conventionally, it is said to comprise volume, velocity and variety; people tend to forget the fourth “v”, value. Armed with the right analytical tools, one can convert such data from being merely a record of what has happened into a source of prediction of what might happen, using what UBIC calls “behavioural informatics”.
The Internet of things is moving into the industrial Internet, with our homes, cities, stores and industrial machines generating tens or hundreds of thousands of pieces of information. It is not just the big firms who need to understand how to interrogate this data – how long will it be before you need to ask clients or opponents, perhaps in a very ordinary case, to give access to test data or running data relating to a plane crash or some breach or injury caused by engine failure? You should at least remember to ask – the fact that very ordinary companies now have big databases humming away in the background containing potentially relevant data is easily overlooked.
The title of our session, I observed, linked analytics and volume, and it is right to say that we cannot manage large volumes without analytics. We can, however, use analytics usefully over relatively small volumes.
I gave an example given to me by Vince Neicho of Allen & Overy. They are not infrequently sent a DVD of data and asked, effectively, “to identify the problem and advise”. The analytical tools which they use for much bigger litigation, and which they are accustomed to using, give the lawyers a quick jump into smaller cases.
Gerard Britton scaled this up. Unlike litigation, where one at least knows what the issues are, it is not always easy to discern the depth and breadth of the scope of a regulator’s investigation. Analytical tools give lawyers the benefit of getting there first, before the regulator. We can go further: given the ability which these tools give to predict behaviour, they have an important role in compliance – if it is better to get to the data before the regulator does, it is better still to identify a problem before it becomes the subject of a regulatory investigation. The problem, Gerard Britton said, is that whilst discovery can always get budget, budget is not so readily obtained for compliance.
A theme which recurs across several of these post-ILTA articles is this: where are we to find the people who have, or who can acquire, the necessary skills? If the data scientist is suddenly the most attractive person around, the next most important is the recruitment consultant who can help you find the data scientist.