Another field that will benefit greatly from the Big Data revolution in the coming years is environmental research. There is also enormous potential for smart technology in the energy sector, where sensors and intelligent data analysis can help to optimize energy consumption.


Forests, meadows and arable soils are natural stores of carbon and have an important role to play in climate change. As natural carbon stores, they can sequester carbon from the atmosphere, thereby counteracting the effects of global warming. However, the amount of carbon they are able to store depends on the environment around them. Moreover, if temperatures continue to rise, it is possible that forests and soils will begin to release more greenhouse gases into the atmosphere while storing less and less carbon.


Investigating precisely how climate change affects the carbon storage capacity of forests, meadows and soils is the goal of Swiss and European projects like ICOS. These projects use the latest measurement techniques, which generate an almost uninterrupted stream of high-resolution data such as wind speed and gas concentration, measured around 20 times per second. The resulting data are then assessed for quality, combined with climatic data, and ultimately used to calculate the fluxes of greenhouse gases between the ecosystems and the atmosphere.
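
The text does not say how the fluxes are calculated. One widely used approach for measurements of this kind is eddy covariance, where the flux over an averaging window is estimated as the covariance between the fluctuations of vertical wind speed and of gas concentration. The following is a minimal sketch under that assumption; the function name, the sample values and the units are illustrative and not taken from the projects mentioned above.

```python
from statistics import fmean


def eddy_flux(vertical_wind, concentration):
    """Estimate a flux as the covariance of two equally long series.

    Assumption (not from the article): vertical_wind is in m/s and
    concentration is a gas density, both sampled at the same instants
    within one averaging window.
    """
    if len(vertical_wind) != len(concentration) or not vertical_wind:
        raise ValueError("series must be non-empty and of equal length")
    w_mean = fmean(vertical_wind)
    c_mean = fmean(concentration)
    # Mean product of the deviations from each series' own mean.
    return fmean(
        (w - w_mean) * (c - c_mean)
        for w, c in zip(vertical_wind, concentration)
    )


# Made-up sample values: downdrafts (negative wind) carry more gas than
# updrafts, so the net transport is directed toward the surface.
wind = [0.1, -0.1, 0.2, -0.2]
gas = [1.0, 3.0, 0.0, 4.0]
flux = eddy_flux(wind, gas)
```

With these made-up values the result is negative, which under the usual sign convention corresponds to net uptake by the ecosystem. A real processing chain would add steps that this sketch leaves out, such as quality checks and corrections applied before the covariance is computed.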

Thanks to these new measurement techniques, scientists have a mass of new data that also interact in complex ways. Yet this stream of new data brings a number of new challenges. In particular, if we want to understand the impact of climate change on ecosystems, we need to find a way to connect all the different data. This is not an easy task, since the data can be so divergent that they cannot be reconciled without further processing. That is why a worldwide collaboration is needed to standardize the measurements, as well as the data processing and data management systems.

Even so, the huge quantities of data involved and the hundreds of variables and dimensions to explore are still a sticking point in research, especially in the biological and environmental domains. There is a limit to what the human brain can process, and to what can be achieved with traditional statistical methods. Obtaining fresh insights from these complex data sets requires new data analysis techniques that involve machine learning, where algorithms search for recurring patterns in the data.
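
The article does not name a specific algorithm. As one simple illustration of how an algorithm can search for recurring patterns, the sketch below groups observations with k-means clustering, implemented in plain Python. The function name, the choice of k-means, the simplified initialization (the first k points) and the sample values are all assumptions made for this example.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Group points (tuples of floats) into k clusters.

    Simplification for this sketch: the first k points serve as the
    initial centers, which keeps the result deterministic.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster,
        # keeping the old center if the cluster happens to be empty.
        centers = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Made-up observations with two features each; two groups are visible.
observations = [
    (0.0, 0.0), (5.0, 5.0), (0.1, 0.2),
    (5.1, 4.9), (0.2, 0.1), (4.9, 5.2),
]
centers, clusters = kmeans(observations, k=2)
```

On this toy input the two clusters each collect three observations. Real environmental data sets have far more variables and would typically call for an established machine learning library instead of a hand-written loop.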

The challenge for data analysis is how to bring together large quantities of data and offer efficient and accurate machine learning techniques as a service for researchers or, in other words, how to offer insight-as-a-service. This is one of the goals of the Swiss Data Science Center (SDSC), co-launched by ETH Zurich and EPFL in February 2017. The SDSC aims to create a bridge between the scientists who produce large quantities of data and those who develop new data systems and techniques to analyze these data.

Overall, scientific questions inspired by real-life applications are what drive the development of new techniques in data management and data analysis.