CASE STUDY: SCRYPTA AND BIG DATA IN THE SCIENTIFIC FIELD

Scrypta
Jul 12, 2019

Big Data and Blockchain: what results can the union of these two technologies produce in the field of scientific research?

We live in an age in which an incredible amount of data is produced. The spread of increasingly common mobile devices is destined to increase exponentially the volume of data generated each year.

However, Big Data is not just “a lot of data”. What really characterizes it is the variety of ways in which data are produced and transmitted between different social sectors. Moreover, the data are often not structured (that is, easily defined and confined to tables): they can take the form of documents, metadata, geographical positions, values detected by IoT sensors and numerous other forms, from semi-structured to completely unstructured.

The cost of storing this multitude of data with traditional cloud storage vendors is very high. A further problem can arise when the data are shared: they could be lost, corrupted, left out of date or acquired by unauthorized third parties.

Much of the digitally collected data about us is gathered by companies and research agencies, and the processes by which it is interpreted and disseminated are very complex. As a result, data can be partial, corrupt or, if too old, useless.

Those who create databases tend to adapt them to their own methodological and conceptual preferences, thus generating archives that mostly contain data shaped by those preferences. It is also clear that those who are personally in charge of creating databases are better placed to use these infrastructures to their advantage. Furthermore, failing to obtain sufficient data-sharing authorizations can have significant consequences, such as lowering the quality of statistical analysis because updated or cross-referenced data are missing.

Data management is evolving to the point that it could also affect scientific research: data can be extracted, evaluated and transformed into a source of valuable information. In fact, the possibilities for producing and collecting data have multiplied, thanks also to the speed at which knowledge spreads among researchers and between different disciplines, opening up new horizons of collaboration and unexplored frontiers of research.

But the idea that Big Data is revolutionizing the scientific method, if not the whole world, has been wrongly spread. This error stems from the belief that thanks to these new technologies we can know everything, and that there is therefore no longer any need to think about what the true sources of our knowledge are: the data are so plentiful that they can directly show reality for what it is.

What we can actually observe today is that we live in the era of post-truth and fake news, whose motto is “there are no facts, only interpretations” (Friedrich Nietzsche, Frammenti Postumi 1885–1887).

Data is part of the process of constructing scientific knowledge, and big data is an additional resource that can be exploited, without distorting the scientific method.

Sabina Leonelli (professor of philosophy and history of science at the University of Exeter and, since 2014, principal investigator of the ERC project “Data Science”) studies how big data fit into scientific practice and methodology and how they are used to change the relationship between technoscience and society. Part of her project deals precisely with following the path that data travel from their generation to their use, passing through all the intermediate stages of analysis and reworking.

It is worth quoting an excerpt from an interview with her to support what has been said so far:

“The data sources that are circulated, to which people and researchers themselves have access, are very limited and represent only a small part of them, selected on criteria that very often have more to do with those who finance this production of the data rather than that type of knowledge you want to generate.

All the great philosophers have always recognized that there are no “raw” data, that is data not already mediated that gives us information about the world”.

We can therefore state that data are not produced independently of human interpretation but are the result of the choices made by individual research groups.

“When a researcher produces data, he does it through particular instruments and apparatuses, which are built on the basis of very precise theoretical principles and which hold in themselves the trace of all these perspectives. As a result, all the data comes from a certain type of conceptual perspective. Even the way they are organized is often based on theories and expectations of how they could be used”.

Another very important aspect is that these data, precisely because they are collected by heterogeneous systems, cannot be processed with traditional database management techniques. Data are increasingly unstructured and vary very quickly, in both quantity and type, so it is necessary to think about databases suited to unstructured data.

“Very often it is not clear who is then adopting data that are for example created in a scientific laboratory, every time a genetic sequence is produced; or in a doctor’s office, whenever observations are made on what a doctor sees in a patient. What happens to these data once they are put online, for example, by a database or internet service of any kind, such as an internet site?

This is the path that is very important to trace. This means focusing first of all on the way in which data is mobilized through digital technologies and understanding how these work and how they relate to each other. For example, this is very complex with genetic data because there are now thousands of internet sites and databases that absorb this data, exchange them with each other, enrich them and classify them in various ways, and then maybe pass them to another service, to another sector. Trying to trace all these movements is a very complex process. And even more complex is to reconstruct the ways in which the data that are available in these digital worlds are taken and used for specific purposes”.

This is where Blockchain can help Big Data.

When it comes to managing big data in scientific research, the most important challenges to face are security, shareability and interoperability. If the information is isolated and stored on multiple systems that do not allow the regular exchange of information, the data become scarce.

The blockchain grants the possibility to share, track and make the data accessible to all the subjects participating in the chain without the possibility of error and corruption, allowing instant transactions without risks and at low cost.

Big Data can take advantage of an additional level of security thanks to Blockchain DLT (Distributed Ledger Technology). Unlike with traditional methods, information recorded on the network is secure and cannot be changed without the change being noticed. Furthermore, Big Data archiving can be more structured and transparent, which makes data analysis more efficient and easier. Blockchain technology can also help detect fraud or errors: it is possible to trace transactions and data back to their origin and detect any anomalies.
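The idea of tracing data back to their origin and detecting anomalies can be illustrated with a toy hash chain. This is only a minimal sketch in Python and not Scrypta's actual protocol: the names `block_hash`, `append_record` and `first_invalid` are hypothetical, the sample readings are invented for illustration, and a single in-memory list is of course not a distributed ledger.

```python
import copy
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def block_hash(index, payload, prev_hash):
    """Hash a record together with the hash of the record before it."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a record that is linked to the current end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    chain.append({
        "index": index,
        "payload": payload,
        "prev": prev_hash,
        "hash": block_hash(index, payload, prev_hash),
    })


def first_invalid(chain):
    """Return the index of the first inconsistent record, or None."""
    prev_hash = GENESIS
    for block in chain:
        expected = block_hash(block["index"], block["payload"], block["prev"])
        if block["prev"] != prev_hash or block["hash"] != expected:
            return block["index"]
        prev_hash = block["hash"]
    return None


# Hypothetical laboratory readings, used only for illustration.
ledger = []
for reading in (7.2, 7.4, 7.1):
    append_record(ledger, {"sample": "A1", "reading": reading})

intact = first_invalid(ledger)            # untouched chain verifies

tampered = copy.deepcopy(ledger)
tampered[1]["payload"]["reading"] = 9.9   # silently edit one record
broken_at = first_invalid(tampered)       # the edit is located
```

Because each record's hash depends on the previous one, altering an old record without recomputing every later hash leaves a visible inconsistency. In a real distributed ledger, the copies held by other participants are what make such a recomputation impractical.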

Scrypta proposes a reliable solution for managing and sharing Big Data through blockchain applications. The algorithmic protocols allow data to be collected, stored, analyzed and cross-referenced with a guarantee of security and without any personal information being revealed, ensuring adequate interoperability between infrastructures for an efficient exchange of data.

Scrypta is designing a platform specifically created for academic and scientific research that uses blockchain technology. Researchers will be able to record a chain of permanent, valid and immutable records in real time and make them available for all scientific and academic products from the earliest stages of research, including citation / attribution operations.

Using the platform, researchers will be able to demonstrate ownership of their studies and their very existence, expand access to their scientific and academic work, give and receive “real-time” attributions for new works more quickly, and build and demonstrate a record of all their academic contributions.

Huge amounts of data are needed to conduct scientific research. Researchers focus on these information sets and run regular tests under different circumstances to generate reports and statistics, including reports on effectiveness. Based on these reports, the data are studied in order to analyze the results obtained effectively.

In order to make scientific evidence more equitable and transparent, researchers will be able to use specific dApps interconnected with Scrypta’s blockchain technology to produce safe, impartial and indelible studies.

The documents created and used in this process, such as informed consent, research plans, regulations and study protocol, will be marked by timestamps. This means that the documents will have a trustworthy proof, accompanied by specific details concerning their creation. Moreover, thanks to the Scrypta blockchain, the archiving processes, the data collected, the studies carried out and the scientific evidence will be safe from fraud, data loss and accidents. Furthermore, it will be possible to significantly reduce the audit costs.
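The timestamping of documents described above can be sketched as a fingerprint plus a creation time. This is a minimal illustration in Python under stated assumptions, not the Scrypta implementation: `make_proof` and `matches` are hypothetical names, the sample protocol text is invented, and the timestamp here comes from the local clock, so it only becomes trustworthy once the record is anchored in a ledger that others can check.

```python
import hashlib
from datetime import datetime, timezone


def make_proof(document: bytes, label: str, now=None) -> dict:
    """Build a record holding the document's SHA-256 digest and a UTC time."""
    created = now or datetime.now(timezone.utc)
    return {
        "label": label,
        "sha256": hashlib.sha256(document).hexdigest(),
        "timestamp": created.isoformat(),
    }


def matches(document: bytes, proof: dict) -> bool:
    """Check that a document is byte-for-byte the one the proof refers to."""
    return hashlib.sha256(document).hexdigest() == proof["sha256"]


# Hypothetical study protocol, used only for illustration.
protocol = b"Study protocol v1: enrol 120 participants."
proof = make_proof(protocol, "study-protocol")

unchanged = matches(protocol, proof)
altered = matches(protocol.replace(b"120", b"150"), proof)
```

Only the digest and the time need to be published, so the document itself (for example a signed informed consent form) can stay private while its existence at that moment remains checkable.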

Much of the data collected and used globally can come from the scientific community, but it is also possible to integrate it with data from other sources, such as the general public, which thus becomes an active participant in research. Just think of the data collected every day through so-called “health apps”, applications that gather data on users’ habits and health.

It is of fundamental importance to raise people’s awareness of the use of these technologies as they can transform the relationship between science and society.

SCRYPTA - Adaptive Blockchain
Website: www.scryptachain.org
Foundation: https://scrypta.foundation
Block Explorer: https://chainz.cryptoid.info/lyra
Official Github: https://github.com/scryptachain
Twitter: https://twitter.com/scryptachain
Discord: https://discord.me/scryptachain
Telegram: https://t.me/scryptachain_official
e-mail: info@scryptachain.org
