What’s next for non-personal data protections? A legal primer (open data note 1 of 2)

United Kingdom

The fair trade between society and creators of intellectual property

Most intellectual property laws justify the strong monopolistic powers they vest in creators by simultaneously granting society free rights to exploit the creations at some point in the future. The balance must be just right: if the rights are too short-lived or too weak for creators to recoup the benefits, they might not think the investment worthwhile; if the rights are too strong and last too long, it is society which is being short-changed. This is why patent, design right and copyright protections only subsist for a finite period before their expiry. This gives creators a time-limited monopoly over their creation before the general public, libraries, museums, as well as their competitors, can make use of it to advance society or for any other purpose.

We set out below the finely-balanced position between copyright owners and society and contrast this with the more questionable trade between database owners and society. What emerges is that, even if database rights were a useful legal incentive in Europe thirty years ago, it is doubtful whether they still earn their place on the statute books today.

Take Jane Austen’s 1815 novel Emma. After Austen’s death in 1817, the copyrights to several of her works were purchased by an English publisher in 1832 and, according to an analysis by Annika Bautz, the copyright to her novel Emma expired in 1857. Since then, Emma has been adapted by the BBC in 1948, 1960, 1972 and 2009, by NBC in 1954 and 1957, by CBS in 1960 and by ITV in 1996. It has also been adapted for the cinema as the 1995 film Clueless set in Beverly Hills, as a 1996 American comedy starring Gwyneth Paltrow and Ewan McGregor, as the 2010 Indian rom-com Aisha and, most recently, by Focus Features in their 2020 film directed by Autumn de Wilde. There have also been various stage and theatrical adaptations, modern retellings in fiction (including ‘Emma Ever After’ in 2018 and ‘Emma and the Vampires’ in 2010) and even a YouTube series, ‘Emma Approved’, which accumulated over 3 million views within a year. The point is that the copyright in the underlying story of Emma has expired, so each of these adaptations is free to use the work as it wishes. Any new work in each adaptation then receives its own copyright protection.

As we discuss below, database rights can be infringed indirectly. By contrast, before copyright protection expires, it is arguable how much of a character or an expressed idea must be copied for there actually to be a copyright infringement, because such protection typically extends only to the expression of ideas and does not prevent the copying of the ideas themselves or of insubstantial amounts of protected works.

The expiry of copyright protection prevents an owner from perpetually exploiting exclusivity; instead, it provides an opportunity for others to make a living from the work or simply to enjoy it for free. Out-of-copyright works are free in the public domain to be copied verbatim or adapted as desired. The length of time before expiry is, therefore, a contentious topic both for those seeking to retain their exclusive rights and for those in broader society wishing to make use of the works. Against this backdrop, there are several examples of campaigns or legislative interventions (each with varying degrees of success) to shorten or prolong the copyright period. A notable example is the unique legislative exception devised in the UK’s Copyright, Designs and Patents Act 1988 (the “CDPA 1988”) so that Great Ormond Street Hospital, a renowned London children’s hospital, could perpetually benefit from royalties from the ‘Peter Pan’ play by J M Barrie (and any adaptation), notwithstanding its UK copyright expiring in 1987. This has reportedly raised many millions for the hospital over the years and is credited to the successful campaign of former UK prime minister Jim Callaghan (encouraged by his wife, then the chairwoman of the hospital’s special trustees). In most of the world outside of the UK, except in the US where copyright to the play exists until 2023, the work is free from royalties. Across the Atlantic, a frequently-cited copyright protection controversy is the US Copyright Term Extension Act of 1998 (also known as the Sonny Bono Copyright Extension or, contemptuously, as the Mickey Mouse Protection Act given that it prolongs the period before Mickey Mouse enters the public domain). The Sonny Bono Act granted a 20-year extension in 1998 to existing copyrighted works in the United States, resulting in new works owned by an individual being copyright-protected for the life of the author plus 70 years, which is in line with the European position.


General copyright protections can be contrasted with the protections granted to databases. Cast your mind back to June 1992. Mark Zuckerberg was eight years old and Microsoft’s Windows 3.0 had just celebrated its second birthday. June 1992 was also the month when Europe first proposed to provide legal protection to databases. This proposal later became Directive 96/9/EC (the “Database Directive”). The Database Directive harmonises the copyright regime for databases across the European Union and was implemented in the UK in the Copyright and Rights in Databases Regulations 1997 (the “CRD 1997”), intending, as the ECJ notes in British Horseracing Board v William Hill Organisation (C-203/02), ‘to promote and protect investment in data storage and processing systems which contribute to the development of an information market’.

Databases are defined in current EU legislation as ‘a collection of independent works, data or other materials which are arranged in a systematic or methodical way and are individually accessible by electronic or other means’.  Put simply, whilst databases are typically lists, forms or tables holding information, they also include collections of data on websites, intranets, encyclopaedias, purchase order systems, back-office inventory systems, document management systems and even a PDF of a spreadsheet (despite a PDF being akin to a photograph).

Databases can store non-personal data and/or personal data. Personal data is any information relating to an identified or identifiable individual and is protected in the EU by the General Data Protection Regulation. Individuals have rights of information, access, rectification, erasure, restriction, portability and objection in respect of their personal data. Non-personal data, on the other hand, either does not relate to an identified or identifiable individual or was initially personal data which has since been anonymised. Non-personal data, which can be as valuable as personal data, could include weather forecasts, vital for energy production and shipping or supply chain logistics, and data on the performance or maintenance requirements of commercial planes or industrial machinery, vital for safety and economic efficiency. Our discussion here focuses on database rights granted to non-personal data, rather than the additional personal data protections afforded to personal data within databases (and, for that matter, any other potential rights deriving from contract law, the law of confidence or another basis).

Databases are afforded protection in Europe through:

  1. EU-originated or EU-altered national copyright laws. For instance, the CDPA 1988 in the UK contains amendments to give effect to EU obligations under the Database Directive, protecting databases that are original literary works; and/or

  2. EU-wide sui generis database rights pursuant to the Database Directive if there is a substantial investment in obtaining, verifying or presenting the database’s contents.

Copyright protection for databases

To attract copyright protection as a literary work under the CDPA 1988, the work must be original (that is, as sourced from the EU-wide Database Directive, the ‘author’s own intellectual creation’ by reason of the selection and arrangement of data), systematically or methodically arranged, properly regarded as ‘a collection of independent’ works or data and ‘individually accessible’. Whilst this EU-derived provision protects the structure of the database, databases may also be protected nationally by copyright under the CDPA 1988 if they in fact constitute a ‘table or compilation’. These terms are undefined and open to interpretation, but the Oxford English Dictionary defines ‘table’ as ‘a collection of data organized in a notional set of rows and columns’ and a ‘compilation’ is typically seen as a collection of material arranged, selected or collected by an author. As with other copyright protections, because of EU-wide measures to harmonise copyright protection terms under Directive 93/98/EEC (the Copyright Duration Directive), any such protection for these literary works subsists for 70 years from the end of the calendar year in which the author dies.

Database rights

The CRD 1997, applying the Database Directive, defines databases as: ‘a collection of independent’ materials or works or data, ‘individually accessible’ and ‘arranged in a systematic or methodical way’.  It grants unique or ‘sui generis’ database rights where there is a ‘substantial investment in obtaining, verifying or presenting the contents of the database’.

Databases can, therefore, benefit from database rights despite not being original, whereas originality (a word meaning ‘originating from the author’ rather than ‘novel like a patentable invention’) is a prerequisite for copyright protection. What is needed instead is less ‘sweat of the brow’ and more ‘silver from the bank’ to satisfy the ‘substantial investment’ hurdle. More pertinently, these rights can also protect data within databases if the database itself is a product of ‘substantial investment in obtaining, verifying or presenting [its] contents’. As explored below, these protections prevent infringers from ‘extracting’ or even ‘re-utilising’ a substantial part of the contents of the database without consent, or from repeatedly and systematically extracting or re-utilising insubstantial parts which together amount to a substantial part. This gives the rightsholder the ability to exclusively exploit the creation and maintenance of the database and the data within it.

Remember casting your mind back to June 1992, a time long before Facebook and other large technology platforms were established but when Europe first conceived building blocks for the Database Directive?  The fact that this seems a long time ago is important. Whilst database rights may have seemed sensible in June 1992, are they still appropriate now?

Are database rights for non-personal data appropriate given their duration and strength?

Unlike the 70-year period for copyright protection, database rights would appear to expire over four times more quickly, after just 15 years. This seems like a significantly weaker right; however, it is worth considering the ephemeral nature of much data and the ease with which the rights can be made ‘evergreen’. By the time the database is free for all to use after the expiry of these 15 years, the data is very likely past its use-by date. Returning this ‘expired data’ to society provides society with little value, while the rightsholder will have enjoyed massive advantages throughout those 15 years.

Moreover, databases can benefit from multiple new 15-year terms if a substantial change occurs to the database. A substantial change is one that amounts to a substantial new investment, and it may result from additions, deletions or alterations to the database. Therefore, the Database Directive effectively protects the data within databases for a very long period if the database is a product of ‘substantial investment in obtaining, verifying or presenting [its] contents’ and repeated substantial changes. The 15 years, unlike the 20 for a patent, are really the minimum duration, not the maximum. Changes that cumulatively amount to a substantial change will simply prolong the database right’s life. Given the prevalence of dynamic databases and the regularity with which many databases are updated or altered, this has real practical appeal to the makers of databases whose exclusive rights are extended. This seems contrary to the spirit of a limited exclusivity period and hinders the public’s ability eventually to make use of the data as they wish. Is it fair to tip the balance away from society towards the rightsholder in this manner?
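
The ‘evergreen’ effect described above reduces to simple arithmetic, which the following Python sketch illustrates. It is a simplification resting on our own assumptions: the function name `expiry_year` and its parameters are hypothetical, every listed change is assumed to qualify as a ‘substantial change’, and each term is assumed to run for 15 years from the end of the calendar year of the relevant event. Whether a particular update is in fact substantial is a legal question that no formula can answer.

```python
from typing import Iterable

# 15-year database right term, as stated in the text.
TERM_YEARS = 15


def expiry_year(completion_year: int,
                substantial_change_years: Iterable[int] = ()) -> int:
    """Return the calendar year at the end of which protection would lapse.

    Assumption (ours, for illustration only): each year listed in
    `substantial_change_years` is a qualifying substantial change that
    starts a fresh 15-year term for the resulting database.
    """
    # The latest qualifying event (completion or substantial change)
    # determines when the most recent term started.
    latest_event = max([completion_year, *substantial_change_years])
    return latest_event + TERM_YEARS


# A static database completed in 1998, never substantially changed.
static_expiry = expiry_year(1998)

# The same database, substantially changed in 2005 and again in 2015.
evergreen_expiry = expiry_year(1998, [2005, 2015])

print(static_expiry, evergreen_expiry)
```

On these assumptions, a database completed in 1998 and substantially changed in 2005 and 2015 would remain protected until the end of 2030 rather than the end of 2013, and each further substantial change would push that date back again.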

Are database rights for non-personal data appropriate given the ease with which they subsist?

This perceived advantage to rightsholders should be considered alongside the ease of obtaining and enforcing their rights. To enjoy a database right, there must be a ‘substantial investment in obtaining, verifying or presenting the contents of the database’. The rightsholder must be a European Economic Area national or resident (unlike a copyright owner, whose rights are not so limited by virtue of the Berne Convention). The rightsholder may not necessarily be the actual database maker, but could be the person who takes the decision to make and invest in the database creation.

The ‘substantial investment’ condition seems broad with investment defined as ‘financial, human or technical resources’ whilst substantial is in terms of ‘quantity or quality or … both’.  Qualitative investment could be intellectual effort. However, ‘obtaining, verifying or presenting’ does suggest a degree of effort is required and theoretically this limb has been restricted by case law over recent years.  For instance, British Horseracing Board v William Hill Organisation (Case C-203/02) confirmed that effort must be expended in finding or ‘obtaining’ independently created materials and forming a database, as opposed to investing in the materials which then form a database.  This directs the focus to finding (or ‘obtaining’), checking (or ‘verifying’) or displaying (or ‘presenting’) the database, rather than creating the materials that actually form a database.  It also draws a distinction between pre-existing data and newly created data, with the latter unlikely to benefit from database rights.  Put simply, investment must be in the database itself, not the data.

This judicial limit on the scope of ‘substantial investment’ prevents organisations benefiting from database rights where their investment targets the materials which eventually make up the database or, as occurred in Football Dataco v Yahoo! UK (C-604/10) (in which members of this firm acted), where only trivial effort was spent obtaining, verifying or presenting the data. By limiting ‘substantial investment’ to cover the actual data storage and processing systems, not the creation of materials which are capable of subsequently being collected in a database, the ECJ made a clear policy decision based on the purpose of the Database Directive to promote storage systems, not the underlying data. This may seem to deprive of this form of protection those organisations which generally create the data in their databases during the ordinary course of their business and, therefore, do not make substantial investments in obtaining, verifying or, if standard categorisations are used, presenting the data. However, Laddie on the Modern Law of Copyright (5th edn) notes that the courts would likely ‘strive to avoid this unattractive result and that databases created in this way are likely to receive protection’. Moreover, these organisations could simply allocate a discrete portion of the general running costs of their business to the obtaining, verifying or presenting of that data, provided this discrete, ringfenced investment is not trivial.

Distinguishing between creating data and verifying or presenting it is also not easy. In Directmedia v Albert-Ludwigs-Universität Freiburg (C-304/07) a substantial investment was found despite it being expended in the ‘collection’ (as well as the ‘verification’ and ‘presentation’) of the data. In Football Dataco v (1) Stan James; and (2) Sportradar [2013] EWCA Civ 27, where members of this firm represented Stan James, the UK Court of Appeal found that inputting live scores from football matches did fall within the substantial investment limb, despite arguments made that this constituted data creation. The Court made a comparison between recording this football data and ‘a scientist who takes a measurement [who] would be astonished to be told that she was creating data’, considering that the data here was obtained, not created. The Court’s reluctance to deny the subsistence of a database right derives in part from policy considerations, as it noted that ‘[the Database Directive] is concerned with creating a commercial right so as to encourage the creation of valuable databases’.

The Football Dataco case also considered that database rights will likely subsist where ‘substantial investment’ is directed to the collection of pre-existing data but some parts of the data within the database are created as part of that process. It referenced a hypothetical illustration of an academic recording in a database all of Charles Dickens’ references to law and lawyer (i.e. pre-existing data) but adding commentary to those entries (i.e. new data). Unfortunately, however, drawing a distinction between obtaining pre-existing data and creating new data is challenging given the growing prevalence of AI and machine-generated data gathering in today’s data economy, together with the ease with which anybody can create or use data.

Advancements in computing make ‘obtaining, verifying or presenting’ data much easier and cheaper now than it was in 1992. Data visualisation and insight tools, such as Microsoft Power BI, IBM Watson Analytics and Tableau Desktop, democratise the analysis and presentation of data for those who can afford the licence fees, whilst the internet allows anybody to easily publish their data.

Are database rights for non-personal data appropriate given the ease of enforcement?

Enforcing database rights once they are found to subsist is equally straightforward. The infringer need only extract (i.e. permanently or temporarily transfer) or re-utilise (e.g. make publicly available, rent or transmit) a substantial part of the contents of the database without the rightsholder’s consent. An infringement also occurs where there are repeated and systematic extractions or re-utilisations of insubstantial parts of the contents of the database without consent, amounting to a substantial part. This prevents infringers taking the ‘little and often’ approach to extraction and re-utilisation.

The ECJ confirmed in British Horseracing Board v William Hill (Case C-203/02) that ‘extraction’ and ‘re-utilisation’ must be given a very broad interpretation as the Database Directive stipulates that they can be via ‘any means or in any form’, allowing for indirect, as well as direct, means of extraction and re-utilisation. Indirect means could be to create a new database or could involve re-utilising or extracting the results of a rightsholder’s investment but not directly from the original database. The ECJ’s reasoning was that the Database Directive ‘afford[s] protection to the maker of the database and guarantee[s] a return on his investment in the creation and maintenance of the database’, making the purpose of the extraction or whether another competing database is created irrelevant. Practical difficulties are posed when viewing or searching electronic databases on-screen as ‘extraction’ necessarily occurs with data being temporarily transferred and stored locally. ‘Re-utilisation’ was considered and construed broadly in Innoweb BV v Wegener ICT Media (C-202/12) and in Football Dataco v Sportradar (Case C-173/11). In the latter case, the ECJ stated ‘the concept of ‘re-utilisation’ … must … be understood broadly, as extending to any act, not authorised by the maker of the database … of distribution to the public of the whole or a part of the contents of the database … The nature and form of the process used are of no relevance’. It is clear, therefore, that the interpretation of ‘extraction’ and ‘re-utilisation’ favours the rightsholder.

This acceptance of indirect extraction or re-utilisation for database infringements can be contrasted with the typical position under copyright, which does not protect ideas that have not been permanently expressed and does not prevent the copying of insubstantial amounts of protected works. Copyrightable works must be fixed in permanent form. By way of illustration, the UK Court of Appeal found no copying of the expression of ideas, and therefore no copyright infringement, by Dan Brown in his novel The Da Vinci Code despite it taking ideas from central themes of a 1982 non-fiction book.

A qualitative or quantitative assessment is undertaken to determine whether a ‘substantial part’ of the contents of the database has been ‘extracted’ or ‘re-utilised’. The quantitative assessment compares the amount of information used with the whole database, whilst the qualitative assessment considers the scale of the investment in obtaining, verifying or presenting the contents of the database.

Even defending database infringement claims is tricky, as there are few permitted acts. Unlike copyright, there is no fair dealing exception to database right infringement for the purposes of public criticism, review and reporting news/current events. The limited permitted acts for database infringements include (amongst others): fair dealing (where a publicly available database is extracted by a lawful user for the purpose of illustration for teaching or research but not for commercial purposes and the source is indicated); deposit libraries; Parliamentary and judicial proceedings; Royal Commissions and statutory inquiries; and public records open to public inspection.

The ease of establishing subsistence and infringement must be considered together with the concern that database rights allow rightsholders to enjoy an unlimited monopoly over their raw data, particularly where the data’s value is time-bound or the data is contained in a frequently updated dynamic database. This prevents society from benefiting from the spoils of academia and industry, which often rely on open data for their research and development. Our view, therefore, is that these 1992-conceived protections now apply more broadly than is appropriate and do not provide a fair trade to society.

Will Europe reimagine database rights?

In December 2005, the European Commission published its first evaluation of database rights under the Database Directive, aiming to assess whether the aims of the Database Directive had been achieved and whether database rights have adversely affected competition.  This evaluation found that, despite the Database Directive aiming to stimulate database production, the number of databases produced in Europe in 2004 had fallen back to pre-Database Directive levels.  It did, however, conclude that leaving the Database Directive unchanged was appropriate, noting that judicial intervention by the CJEU could curtail the scope of the rights to limit concerns that it negatively affects competition.

The European Commission reached substantially the same conclusion in its second evaluation of the Database Directive in April 2018, which aimed to assess the effectiveness, efficiency, relevance, coherence and EU-added value of the Database Directive against its three aims: (1) to harmonise database protections; (2) to stimulate investment in databases; and (3) to safeguard the balance between database makers and users. As in 2005, the European Commission found no proven impact from the Database Directive on the production of databases in Europe. It noted that, because of CJEU decisions from 2004 (including Fixtures Marketing v Oy Veikkaus (C-46/02), Fixtures Marketing v Svenska Spel (C-338/02), British Horseracing Board v William Hill (C-203/02) and Fixtures Marketing v OPAP (C-444/02)), the scope of the database rights is restricted to the database itself and not more broadly to the data economy, with most websites and machine-generated databases (such as data automatically produced by Internet of Things devices) falling outside its scope. This, the evaluation considers, limits any potential negative effects of stifled competition or unintended data lock-up. Whilst limited reform of the database rights was deemed disproportionate, the Commission did recognise that debate remains about the scope of the rights and that their application to the broader data economy needs monitoring, noting that any meaningful policy intervention would need to be substantial.

With no equivalent database rights outside of Europe, the data economy’s huge advancement since the Database Directive was conceived and the current regime’s apparent lack of a fair trade to society, an alternative approach to database rights seems appropriate and imminent.  In our second article, we explore what is now needed for data.  We will consider the concept and appropriateness of ‘open data’, the idea that data should be freely available to all and the current proposal that Europe’s possible Data Act 2021 could revise the Database Directive. We will also consider whether it may in fact be more appropriate to safeguard data with a combination of contractual arrangements, access controls, implied licensing and repealing database rights. Given the applicability of the current copyright regime to data, we postulate that it is appropriate to adopt this approach whilst encouraging accessibility, scalability and economic growth through embracing open data. We assert that encouraging an open data society, crucial to a competitive economy and evolving society, is made more important than ever in light of the current Covid-19 pandemic.