Big Data in the study of marine biodiversity was the focus of Dr. C. Arvanitidis’ lecture

The existence of research infrastructures can lead to great change in the way science is conducted and be a catalyst for the transition to mega-science, science that puts forward hypotheses which are not only local but global.

‘Big data, new media, new methods of management, analysis and interpretation: Achievements and Challenges’ was the topic of the 2nd lecture in the ‘Big Data, New Media, Documentation Issues: Learning from Pioneering Initiatives’ series, given by Dr. C. Arvanitidis, Head of Research, Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research (HCMR).

In his lecture on 20 March 2019 at the National Hellenic Research Foundation, C. Arvanitidis talked about the production, collection, storage and processing of Big Data in the study of marine biodiversity, both in Greece and globally, drawing on his experience as co-ordinator of the research infrastructure LifeWatchGreece and as national representative to LifeWatch ERIC, the corresponding European infrastructure.

The term ‘biodiversity’ covers the variety present within a single species as well as the diversity between species and between ecosystems. At least three disciplines are therefore involved in its study: genetics, taxonomy and ecology. The fact that three scientific communities with different research cultures take part in this research makes the study of biodiversity an extremely complex process. The reality scientists face becomes even more complex if we consider that we live in a time when advances in technology have made it possible to record large amounts of data associated with life on the planet.

This is where research infrastructures become very important for the collection and storage of data. What constitutes a research infrastructure? In the case of biodiversity, C. Arvanitidis explained, the infrastructure consists of the data collected from observations and used to make records, the systems used for storing the data, the research networks, and of course the people who do the work needed to operate it. The existence of such an infrastructure can lead to great change in the way science is conducted and be the catalyst for the transition to mega-science, science that puts forward hypotheses which are not only local but global. The virtual laboratory plays an important role in this transition. To explain what a virtual laboratory is, he used the example of a CT scanner, similar to those used in hospitals but designed for very small samples, whose dimensions may be just a few millimetres.

‘With this device and the use of X-rays, it is possible to obtain a large number of images of each sample studied. These images, when combined, give a three-dimensional digital image of the animal under study and allow us to observe both the outside of the body (morphology) and the inside (anatomy). In other words, we perform a virtual dissection without touching the animal itself. But for this to happen, thousands of images must be captured for a single sample.’

The result of this virtual experiment can then be added to a database and made accessible not only to scientists but to anyone interested. Besides the other changes it can bring to research culture and to collaboration between scientists, this way of working changes the type of scientific documentation. We move from the material type, which covers the organism samples found in libraries and tied to a physical location (museum, laboratory etc.), to the cybertype, which is generated in a virtual laboratory and can be held in a digital database accessible from any corner of the planet.

According to Dr. Arvanitidis, despite its great value for the study of biodiversity, the production of Big Data involves major difficulties. ‘For a start, as we have said, it requires very large storage capacity if we consider, for example, that the data generated by scanning a single sample may amount to a few hundred gigabytes. There have been 650 scans performed as part of LifeWatch Greece, but it has only been possible to upload 17 of them to the infrastructure's storage system. The second difficulty is the size of the resulting data combined with the time a researcher needs to download the files from home or the office. When we talk about files of hundreds of gigabytes or even a terabyte, the difficulty of retrieving them becomes evident.’
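To give a rough sense of the download times involved, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the file sizes follow the 'hundreds of gigabytes or even a terabyte' mentioned in the lecture, while the link speeds (50 Mbit/s for a home connection, 1 Gbit/s for an institutional one) and the function name `transfer_time_hours` are assumptions that were not part of the talk. The estimate is idealised and ignores protocol overhead and congestion.

```python
def transfer_time_hours(size_gb: float, link_mbps: float) -> float:
    """Idealised time, in hours, to move size_gb gigabytes over a link
    running at link_mbps megabits per second (no overhead assumed)."""
    size_megabits = size_gb * 1000 * 8  # decimal gigabytes -> megabits
    seconds = size_megabits / link_mbps
    return seconds / 3600


# Hypothetical scenarios: file sizes in GB and link speeds in Mbit/s.
for size_gb in (300, 1000):
    for link_mbps in (50, 1000):
        hours = transfer_time_hours(size_gb, link_mbps)
        print(f"{size_gb} GB at {link_mbps} Mbit/s: about {hours:.1f} hours")
```

Under these assumed speeds, a single scan of a few hundred gigabytes ties up a home connection for the better part of a day, which is the retrieval difficulty described above.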

Dr. Arvanitidis concluded his lecture by saying that Big Data in biodiversity has significantly changed the way in which research is conducted in this scientific area. It may offer scientists considerable potential, but also creates new challenges, which scientists have identified and are working hard to overcome.

The lecture series ‘Big Data, New Media, Documentation Issues: Learning from Pioneering Initiatives’ is being organised by the National Documentation Centre's Scientific Board with the support of the Interdepartmental Programme of Graduate Studies in ‘Science, Technology, Society - Science and Technology Studies’, National and Kapodistrian University of Athens (

The lectures will interest students, researchers, scientists across disciplines, representatives of the public and private sectors, and anyone engaged in the transformation of big data into valuable documentation. Each lecture will be followed by discussion with the audience. Attendance is free of charge (no registration is required) and attendees will receive a certificate of participation.

The final lecture in the series will be given in English on 8 May 2019 (18.00-20.00), when William Allen, Fellow by Examination in Political and Development Studies, Magdalen College, University of Oxford, and Research Officer, Centre on Migration, Policy, and Society (COMPAS), will talk about ‘The Politics of Big Data, Migration, and Mobility’.