Hello, my name is Ye Hu,
from the University of Bonn,
and today, I'm going to talk about Emerging Big Data in Medicinal Chemistry.
In particular, promiscuity analysis is chosen as an exemplary topic
or an interesting application for compound data mining.
And this presentation is co-authored by me and my professor, Dr. Jürgen Bajorath.
As we have noticed,
the big data phenomenon has affected essentially all areas of life.
What exactly is big data and how do we define big data?
In 2001, an analyst, Laney,
from an information technology firm, Gartner, introduced a definition
which states that big data is high-volume,
high-velocity and/or high-variety information assets that demand cost-effective,
innovative forms of information processing that enable enhanced insight,
decision making and process automation.
On the basis of this definition, volume,
velocity, and variety represent characteristic features
or criteria of big data, which are often cited as the '3 Vs'.
More than a decade later,
the '3 Vs' criteria were extended to '5 Vs'.
Veracity and value were added, indicating that one needs to ensure that
the data are correct and/or the analysis performed on the data is also correct.
All these available data would create a lot of value.
More recently, the '7 Vs' criteria were introduced.
Two more 'Vs', visualization and variability, were included.
Visualization is actually the hard part of big data.
Making a large amount of data visible
and intuitive in a comprehensive manner is not easy at all.
And big data are often extremely variable.
There is, of course, no reason to limit big data criteria to only those 'Vs'.
However, they represent the whole range of big data issues well.