
Responsible Innovation: Governance of Big Data in the ‘Institutional Void’

In traditional science the boundaries, rules and regulations for what is and isn’t responsible practice are well established. Anything that falls into the blurry middle ground is debated and judged through a systematic procedure. As technology advances, so does the scientist’s power to produce both great benefits and great harms for humankind and the environment. Thus the debate on scientific responsibility has intensified and broadened since the latter part of the 19th century (Jonas, 1984). Scientists’ primary responsibility has always been the discovery and communication of truths (Stilgoe et al, 2013), which traditionally trumps wider moral responsibility if the two come into conflict (Douglas, 2003). One issue brought about by fast-paced innovation is that emerging technologies often don’t fall into traditional categories or frameworks of what is ethical and what is not. Hajer (2003) calls this uncertainty an “institutional void”, as the traditional structures and rules implemented in science cannot govern emerging technologies. Being able to govern innovation is an important part of responsible innovation (Stilgoe et al, 2013). Responsible Innovation is “a process that seeks to promote creativity and opportunities for science and innovation that are socially desirable and undertaken in the public interest” (EPSRC, 2016). This essay will first use social media to demonstrate the difficulties of governing an area which falls into Hajer’s ‘institutional void’. It will then discuss the need for tighter governance of big data in order to ensure responsible innovation. Finally, a potential governance strategy, the ‘New Deal on Data’ (Pentland, 2014), will be discussed in light of these governance issues.

Governing the ‘Institutional Void’: Social Media

Social media’s instantaneous nature and interconnectivity allow content to spread fast and wide, so content posted on social media has the potential to cause serious damage. Such rapid, harmful spread is known as a ‘digital wildfire’ (Webb et al, 2015). Governing digital wildfires is a challenge, as existing governance strategies don’t “map onto” them (Webb et al, 2015). Action can only be taken retrospectively, once harm has already been done. Damaging content (e.g. Tweets) may not appear harmful unless placed within a specific context (i.e. the digital wildfire), making it harder to detect individual pieces of inflammatory material in isolation (Webb et al, 2015). Furthermore, most of the content is not illegal, as people have a right to freedom of speech. Social media sites therefore mostly run a self-governance policy[1], under which very few tweets are reported to police, and even fewer are taken to court (Webb et al, 2015). The problem with adopting a self-governance policy is that the governance processes themselves also need to be governed. Online shaming, for example, can cause more harm than the original harmful content, which is unjust when the punishment doesn’t fit the crime.

Awan (2014) calls for online Islamophobia to be treated the same as street-level Islamophobia. However, the laws and regulations needed to do this have not yet caught up with the online world. For example, there is no legal definition of cyberbullying within UK law (The Cyber Smile Foundation, 2015). Penalties issued by social media sites for a breach of their terms and conditions are minimal, e.g. suspension of an account or deletion of a post. When police do get involved in digital wildfires, they only target individuals who are big protagonists (i.e. over 500 followers on Twitter) (Webb et al, 2015). In summary, there are many holes in the governance of social media, with traditional structures not sufficing in cases like digital wildfires. This case study thus highlights the difficulties of governing within the ‘institutional void’.

Big Data and the Modern World

The explosion of ‘datafication’ in modern life has the potential to bring many benefits, including more personalised services, health monitoring, fraud detection, and more efficient use of resources. However, the growing quantities of data also pose an inherent threat to personal and national security. This movement also raises ethical questions, for example who the data belongs to, and what moral and legal responsibilities the different players have. As with social media, the scale and speed of the data stream pose problems for governance. It’s estimated that every day humans generate 2.5 quintillion bytes of data (IBM, 2011). The increase in data collected is accelerated by the growth of the Internet of Things[2], estimated, as of May 2015, at 4.9 billion connected objects (a 30% increase from 2014) (MIT, 2015). Data has been described as the “new oil of the internet” by Alex Pentland, Toshiba Professor of Media Arts and Sciences at MIT (Pentland, 2014). It’s highly valuable, and those who have it in abundance can profit. However, its potential power and future consequences are unknown. “It’s a new asset class, a new value, a new money. And we don’t have the regulations to treat it like the value class it is” (Pentland, 2014).

Governance and Data Security

The scale of big data can bring great knowledge, power, and money to the companies utilizing it. However, what makes big data attractive is also what makes it dangerous. Managing and exploiting large amounts of data securely has proven to be difficult: “Companies of all sizes and in virtually every industry are struggling to manage the exploding amounts of data” (MIT, 2015). Data security is vital, and yet surprisingly misunderstood. The majority of security spending goes on securing the network (Oracle, 2013), even though breaches of servers occur more commonly than network breaches (MIT, 2015). Furthermore, many companies are using outdated security approaches (MIT, 2015). Thus, on a technical level, many companies are ill-equipped to store big data safely and responsibly.

The threat to personal data is not merely theoretical. A survey by CyberEdge (2014) found that 60% of security practitioners reported successful cyberattacks. The threat was brought to the public’s attention last year when the records of 15 million T-Mobile customers were exposed in a hack of Experian (The Guardian, 2015). The threat to personal security is inseparable from big data use, as “the larger the concentration of sensitive personal data, the more attractive a database is to criminals, both inside and outside a firm” (Ramirez, 2013). Data has its place in modern society, but it needs protection if it is to keep that place. A survey found that “44% of organizations have no formal data governance policy” (Rand Secure Archive, 2013). This emphasises the need for higher-level enforcement. Therefore a ‘safety-guarantee scheme’, together with government-imposed restrictions on the type and amount of data companies can hold at once, needs to be considered and empirically tested.

Whose Responsibility?

In the governance of social media, Awan (2014) calls for a “multifaceted and international approach” when tackling online Islamophobia. Similarly, Webb et al (2015) highlight the importance of collaborative work between existing social media governance structures. This collaboration and shared responsibility seem intuitive as, by its nature, innovation is shared (Richardson, 1999). The unpredictable quality of innovation, which renders it hard to govern, is “inherently linked to its collective nature” (Hellström, 2003). Shared responsibility can lead to greater and wider governance if each player accepts and acts on that responsibility. However, Beck (2000) argues that the complex systems inherently linked to innovation can cause “organised irresponsibility”.

Researchers in science are instructed to work within an ethical framework in the form of the Universal Ethical Code for Scientists (Government Office for Science, 2007). This framework comprises ‘Rigour, Respect and Responsibility’. Part of Rigour includes taking ‘steps to prevent corrupt practices and professional misconduct’. If researchers are using big data, this would include ensuring the data is stored and managed securely (i.e. under the Data Protection Act 1998). Under Respect the ethical code states ‘ensure that your work is lawful and justified’. When handling big data this would refer to gaining informed consent from consumers to use their data, and complying with terms and conditions, e.g. not forwarding data to other companies. Finally, part of Responsibility states ‘seek to discuss the issues that science (and innovation) raises for society’. Therefore it is not just the scientist’s responsibility to use big data responsibly themselves, but also to highlight the issues of security and potential abuse of big data, and to communicate these to the consumer.

More relevant to the use of big data, an Ethical Awareness Framework has been created to help companies and researchers develop their own ethical policies on data storage and analytics (Chessell, 2014). Although this is a positive development, ethical codes are merely guidelines and are not enforced. They can also be interpreted differently by people with different opinions and motives. Researchers often rely on commercial companies for their data, and are consequently often employed by those companies or work under a contract in which the company expects something in return. Enforcement of regulations is therefore needed, as in business there is no one to ensure that ethical codes are being adhered to.

Data Ownership

As well as security, another ethical issue surrounding big data is that of ownership. Currently the assumption is that the companies who collect the data own it (MIT, 2015). However, many, including Pentland (2014), believe this is a misconception and should not be the case. Pentland argues that rules are needed to determine who owns the data, else “consumers will revolt, and regulators will swoop down”. Pentland (2014) has proposed his own strategy for the governance of big data, with the aim to “define the ownership of data and control its flow” and so prevent the abuse and exploitation of the power big data holds. Currently there is little to protect consumers other than, as with social media, terms and conditions, and how often are these read in depth before people click ‘I agree’? Furthermore, many companies make data collection or cookies a condition of using their services. A consumer who wanted to avoid giving away data altogether would be unfairly limited in the services and technologies available to them. Many companies offer incentives for customers to give away their data, such as loyalty card points, making the scheme more attractive. This could be conceptualised as a bribe or, as Pentland sees it, as an indicator that the data belongs to the consumer (as it can be exchanged for something the consumer wants in return). Pentland (2014) claims “people are OK about sharing data if they believe they’ll benefit from it”. Transparent loyalty schemes can generate a successful business model built on trust and mutual dependency, which can subsequently improve brand loyalty.

Big Data Governance: The ‘New Deal on Data’

Pentland, and many like him, are concerned that if big data is not regulated, people will start “selling data out the back, and criminals using it for some enterprise that affects critical systems, and people dying as a result” (Pentland, 2014). To prevent potential future data disasters, Pentland proposed a ‘New Deal on Data’ described as:

“A rebalancing of the ownership of data in favor of the individual whose data is collected. People would have the same rights they now have over their physical bodies and their money…. The New Deal would give people the ability to see what’s being collected and opt out or opt in… Transparency is key.” (Pentland, 2014)

The New Deal on Data has been trialled in Trento, Italy. Full transparency has been shown to increase trust and to help consumers “recognize the value in sharing” (Pentland, 2014), with consumers giving away access to more data under the New Deal rules. Companies are, of course, dubious and reluctant to comply with the New Deal, as it threatens many business models. Pentland (2014) admits that “some businesses may disappear” but denies that the economy will be worse off as a result: “the economy will be healthier if the relationship between companies and consumers is more respectful… that’s much more sustainable and will prevent disasters.” Legislation is emerging from Pentland’s lobbying for the New Deal, including the Consumer Privacy Bill of Rights (2015) and the EU’s data protection directives (MIT, 2015). New legislation to help protect consumers is undoubtedly positive. However, as in the debate on greater social media governance versus the preservation of free speech, the question arises: will tighter data governance hamper innovation? Perhaps in the adjustment period, but in the long run data use will be more sustainable, making it possible for innovation to occur in a healthy, non-exploitative environment. For this to occur there needs to be some sort of data infrastructure: “We need data banks. We need data auditing” (Pentland, 2014). In summary, data use and collection need to be tightly regulated, the way we regulate money or anything else that holds value, otherwise that value will be lost.

Research on social media (e.g. Awan, 2014; Webb et al, 2015) has highlighted an obstacle that big data regulation will also come up against. When dealing with a fast-expanding or evolving technology, the time it takes to get new reforms and legislation through means governance will always be ‘one step behind’. The EPSRC (2016) guidelines for responsible innovation (Anticipate, Reflect, Engage and Act: AREA) call for the government to be proactive in anticipating the consequences and future issues surrounding innovative and emerging technologies (Stilgoe et al, 2013). Although this is good practice, anticipations are no more than educated guesses, and so legislation built on anticipations may be just as redundant as outdated legislation.


Companies and researchers alike are excited by the power of big data and the potential benefits it can bring. Much less thought is given to the potential destruction it can cause. It is therefore up to the government to bring in legislation to ensure customers’ data is protected, and to challenge companies’ rights to ownership. Innovation occurs due to the support of people who believe wholeheartedly in the success of the product or procedure. Thus, when governing emerging technologies, the people involved in their creation are naturally going to be biased towards anticipating success rather than the potential negative consequences for society. Responsible innovation legislation therefore needs to enforce independent evaluations which anticipate not just the successes of the emerging technology, but also the potential damage.

Ethical frameworks suggest that researchers have a role in encouraging responsible innovation. Research using big data is a novel and exciting technique for many fields. However, this is threatened if researchers and companies do not respect the data and the customer’s rights. As Pentland (2014) suggests, if nothing precautionary is done, drastic measures will have to be taken should a disaster occur, and a ‘switch-off’ of all data is not unimaginable. It is therefore scientists’ responsibility to protect not just the data they’re handling, but all data. This should be done by lobbying for procedures to be put in place to ensure others are also responsible. To conclude, big data needs to be taken out of the ‘institutional void’ by putting specific governance structures in place. This will ensure big data use remains in line with the latter part of the definition of responsible innovation: “…innovation that (is) socially desirable and undertaken in the public interest” (EPSRC, 2016).


Awan, I. (2014). Islamophobia and Twitter: A Typology of Online Hate Against Muslims on Social Media.,(2), 133-150.

Beck, U. (2000). Risk society revisited: theory, politics and research programmes., 211-229.

Chessell, M. (2014). Ethics for big data and analytics. IBM. Retrieved from

CyberEdge (2014). “Cyberthreat Defense Report”. CyberEdge Group.

Douglas, H. (2003). The moral responsibilities of scientists (tensions between autonomy and responsibility), 59–68.

EPSRC (retrieved Jan 2016). “Framework for Responsible Innovation”. Retrieved from

Government Office for Science (2007). ‘Rigour, respect, responsibility: A Universal Ethical Code for Scientists’.

Hajer, M. (2003). Policy without polity? Policy analysis and the institutional void.,(2), 175-195.

Hellström, T. (2003). Systemic innovation and risk: technology assessment and the challenge of responsible innovation. ,369–384.

IBM Study. (2011, Oct 21). “StorageNewsletter: Every Day We Create 2.5 Quintillion Bytes of Data.” Retrieved from

Jonas, H., (1984). The Imperative of Responsibility. University of Chicago Press, Chicago.

MIT Technology Custom Review (2015, Jul). Securing the Big Data Life Cycle. Produced in partnership with Oracle. Retrieved from

Narayanan, A., & Shmatikov, V. (2008, May). Robust de-anonymization of large sparse datasets. In(pp. 111-125). IEEE.

Oracle (2013). “An Inside-Out Approach to Enterprise Security,” Oracle/CSO Custom Solutions Group white paper.

Pentland, A. (2014). Interviewed in “With Big Data Comes Big Responsibility”. November 2014

Ramirez, E. (2013). Technology Policy Institute’s 2013 Aspen Forum. Federal Trade Commission.

Rand Secure Archive (2013). “Rand Secure Archive Releases North American Survey Results on Data Governance,” Rand Secure Archive, a division of Rand Worldwide.

Richardson, H. S. (1999). Institutionally divided moral responsibility. Social Philosophy and Policy, 16(02), 218-249.

Stilgoe, J., Owen, R., & Macnaghten, P. (2013). Developing a framework for responsible innovation.,(9), 1568-1580.

The Cyber Smile Foundation (2015). Legal perspective. Retrieved from

The Guardian (2015, Oct 1). “Experian hack exposes 15 million people's personal information.” Retrieved from

Webb, H., Jirotka, M., Stahl, B. C., Housley, W., Edwards, A., Williams, M., ... & Burnap, P. (2016). Digital wildfires: hyper-connectivity, havoc and a global ethos to govern social media.,(3), 193-201.


[1] Self-governance policy on social media involves online users controlling the spread of harmful content by ignoring, reporting, criticising, or blocking content (Webb et al, 2015).

[2] The Internet of Things is made up of everyday objects equipped with sensors, which allow the recording, sending, and receiving of data over the Internet, in the absence of human intervention (MIT, 2015).
