Many of the world’s largest companies have R&D departments which spend fortunes developing new and improved products. Pharmaceuticals is perhaps the highest-value of these industries, but similar work goes on in consumer products (shampoos, washing powders), electronics, crop treatments, paints and coatings, etc.
Such companies often find R&D is best done collaboratively — working with universities, suppliers, and companies in related industries, often in dedicated shared spaces outside the company walls — a process called R&D externalisation. This allows them to access a wider range of expertise, share risk, make their business more flexible, and reduce overheads.
But greater openness carries greater risks. In data-driven R&D environments, ensuring the company investing in the research benefits from the IP produced means managing data carefully. Once different parties and new locations are introduced, handling that hugely valuable data becomes a very different game.
Considerable thought needs to be given to striking the balance between access to research data and security. An understanding of data ownership, the value of different R&D data, how researchers use it, as well as the associated IT infrastructure, is vital to this process.
Agreeing who owns the data
One of the first questions regarding shared data is that of ownership. When research is done in-house, ownership is usually clear. When partners, shared spaces, or co-funded technology are involved, it is less so. Parties must define who owns the final output, as well as the data collected throughout the project and the methods of extracting and analysing it. Failing to do so could mean that you don’t fully own the IP. This is more a legal than an IT issue, but it shapes how data is subsequently managed, stored and shared.
Understand your data in detail
Once you’ve agreed who has rights to data, you need to make sure no-one outside that agreement ends up with it. There are always risks from those who would steal data, but equally a competitor sharing your space may innocently adopt your data into their research.
Frameworks for dealing with such complex data sets must start by mapping out what type of data will be generated and the criticality of that data. This must be properly coded so appropriate security decisions can be made.
At the top end is anything that could affect share price, which should never leave a small, highly secure circle. Not far below that is data which could give competitors advantage, such as experimental data for a new product. This needs to be shared with the team, but not beyond. Further down the chain is raw data, which is often meaningless out of context, and below that, data to be made public.
Inevitably much information will contain multiple levels, and some data will change criticality depending on context and time. For example you could probably publish a string of measurements on viscosity without risk, but the same numbers alongside the original hypothesis could be very valuable to competitors.
This whole process is an extremely complex task of understanding and assessing data and must be carried out by people who thoroughly understand both the nature of data and the scientific or technological processes that are being studied. Only once different types of data have been properly understood and coded according to criticality, can appropriate security be put in place.
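The tiered coding described above can be sketched in code. This is a hypothetical illustration only — the tier names, and the rule that context can raise a data item's effective criticality (as in the viscosity example), are assumptions drawn from the text, not a prescribed scheme:

```python
from enum import IntEnum
from dataclasses import dataclass, field

class Criticality(IntEnum):
    PUBLIC = 0           # data to be made public
    RAW = 1              # raw data, often meaningless out of context
    COMPETITIVE = 2      # could give competitors an advantage
    PRICE_SENSITIVE = 3  # could affect share price; smallest circle only

@dataclass
class DataItem:
    name: str
    criticality: Criticality
    linked_items: list = field(default_factory=list)

def effective_criticality(item: DataItem) -> Criticality:
    """Context can raise criticality: data inherits the highest
    sensitivity of anything it is combined with."""
    levels = [item.criticality] + [
        effective_criticality(i) for i in item.linked_items
    ]
    return max(levels)

# A string of viscosity measurements is harmless on its own...
measurements = DataItem("viscosity series", Criticality.RAW)
# ...but alongside the original hypothesis it inherits that sensitivity.
hypothesis = DataItem("product hypothesis", Criticality.COMPETITIVE)
combined = DataItem("contextualised results", Criticality.RAW,
                    linked_items=[measurements, hypothesis])
# effective_criticality(combined) → Criticality.COMPETITIVE
```

The point of such a model is that security decisions key off the *effective* level, not the level a data item was first assigned.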
Technology for R&D collaboration
Once you’ve understood the data, the next step is choosing the right technology. The challenge is implementing technology which fosters collaboration without risk. This means balancing security and ease of use – if sharing critical data is too arduous, people will cut corners.
Different layers are needed for different levels of criticality. Raw data for example can be shared via fairly open cloud services designed for sharing this type of data between sites. Hypotheses and contextualised results should be held much more securely, with restricted access.
Once your organisation starts the process of deperimeterisation, you will need to supplement standard security measures – anti-virus, malware scanners, firewalls, etc. – with easy-to-use, robust and appropriate user-authentication mechanisms. A single password might be appropriate for accessing raw data, but you will want to protect contextualised results with second- or third-factor authentication.
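One way to make the layered approach concrete is a simple policy table mapping criticality tiers to the number of independent authentication factors required. The tiers and factor counts below are illustrative assumptions, not a standard:

```python
# Illustrative mapping from data criticality to required authentication
# factors. Real requirements come from your own risk assessment.
REQUIRED_FACTORS = {
    "public": 0,           # already released; no gate needed
    "raw": 1,              # a single password may suffice
    "competitive": 2,      # e.g. password plus hardware token
    "price_sensitive": 3,  # e.g. password, token and biometric
}

def access_granted(tier: str, factors_verified: int) -> bool:
    """Grant access only when enough independent factors were verified."""
    return factors_verified >= REQUIRED_FACTORS[tier]

# A password alone opens raw data but not contextualised results.
# access_granted("raw", 1)          → True
# access_granted("competitive", 1)  → False
```

In practice this table would live in the access-control layer of whichever sharing platform each tier uses, so the stronger check is enforced automatically rather than left to user discipline.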
Know how people interact with data
The final challenge is understanding how people use data. Externalisation means that data which once never left lab walls must now be worked on in shared spaces and carried between locations. Researchers used to ultra-secure labs therefore need to take on new levels of responsibility and adopt new behaviours.
An approach that we have used is to develop profiles of types of employees – researchers, project managers, etc. – and build up a picture of their movements: for example, where they carried data storage devices and how they interacted with different systems. This provides a good basis for making informed recommendations against each partner’s existing information security policies.
Examples of sensible recommendations to emerge from this process have included: ensuring there are adequate private spaces, and clear rules for what can be done with each category of data. Users of the shared spaces must have clear answers to questions such as: Can I save this type of data on a shared instrument PC for a short while? Can I use the email account provided by our partner organisation for my organisation’s data? How do I transport 1GB of raw data back to my home laboratory?
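Those day-to-day questions lend themselves to a simple published lookup of (action, data category) pairs, so researchers get a consistent answer instead of guessing. Everything below – the actions, categories and answers – is invented for illustration; the real answers must come from each partner's information security policy:

```python
# Toy policy lookup for shared-space users. Entries are hypothetical.
POLICY = {
    ("save_on_shared_instrument_pc", "raw"): "yes, delete within 24 hours",
    ("save_on_shared_instrument_pc", "competitive"): "no",
    ("use_partner_email", "raw"): "no, use your own organisation's account",
    ("transport_to_home_lab", "raw"): "yes, on an encrypted drive only",
}

def ask(action: str, category: str) -> str:
    """Return the rule for this action and data category, or escalate."""
    return POLICY.get((action, category),
                      "not covered – ask the information security team")

# ask("save_on_shared_instrument_pc", "competitive") → "no"
```

A default of "escalate" matters as much as the entries themselves: gaps in the table should route to a person, never to a silent yes.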
Whilst in many ways R&D externalisation is an information security challenge, it is far more complex than simply stopping data leaking. The task requires a detailed understanding of complex scientific data, how that data could be used, how researchers engage with it, how it could be transported and in what situations it presents a risk. Only then is it possible to create systems that allow it to be worked on in the right format by the right people, whilst protecting it from the wrong people.