I'm Pietro Franceschi and I work at
the Bioinformatics Unit of the Research and Innovation Centre, Fondazione Mach.
This is the second part of my lecture on the processing of LC-MS metabolomics data.
In this second part I will touch on some of
the aspects that are more closely related to XCMS, and then I'll have
a general discussion about designing experiments in
metabolomics and what you should take into account when you design them.
At the end of the previous lecture we were
discussing how important peak picking is.
Basically, you have to look for metabolites, or features.
In your data you look for peaks in a two-dimensional space:
in one dimension you have
the m/z, the mass spectrometric dimension, and the other is the chromatographic dimension.
So it should already be clear to everyone why peak picking is important.
The question is, how do you do it?
The first, naive idea for doing peak picking is the one that is
implemented in matchedFilter, and here I'm showing a plot of how this is performed.
This is a peak-picking algorithm, and the paper where it is presented is, I think,
the one in which the authors of XCMS presented their tool.
What you see in the plot on
the left is an extracted ion trace where you have a peak.
So you have a peak at around 3,600-something,
a peak that has been found, or is visible, in
the mass slice between m/z 268.1 and 268.2.
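
To make the idea of a mass slice concrete, here is a minimal sketch of how an extracted ion trace could be built from centroided data. The function name `extracted_ion_trace`, the flat-array data layout and the numbers are all made up for illustration; this is not how XCMS stores or reads data internally.

```python
import numpy as np

def extracted_ion_trace(rt, mz, intensity, mz_low, mz_high):
    """Sum, scan by scan, the intensity of all centroids with mz_low <= m/z < mz_high.

    rt, mz and intensity are flat arrays with one entry per centroid
    (a hypothetical layout chosen only for this example).
    """
    rt = np.asarray(rt, dtype=float)
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    in_slice = (mz >= mz_low) & (mz < mz_high)
    # One trace point per distinct retention time (i.e. per scan).
    scan_times, scan_index = np.unique(rt, return_inverse=True)
    trace = np.bincount(scan_index[in_slice],
                        weights=intensity[in_slice],
                        minlength=scan_times.size)
    return scan_times, trace

# Made-up centroids: three scans, mass slice 268.1 to 268.2.
rt = [3598.0, 3598.0, 3600.0, 3600.0, 3602.0]
mz = [268.15, 300.00, 268.12, 268.18, 268.25]
intensity = [10.0, 99.0, 40.0, 5.0, 7.0]

times, trace = extracted_ion_trace(rt, mz, intensity, 268.1, 268.2)
```

Scans with no signal inside the slice simply get an intensity of zero, so the trace always has one point per scan.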
All around it you have noise. The basic idea to find a peak,
the simplest idea, is to have a filter function,
that is, a model of the peak,
which is the one shown above,
and to move it over the extracted ion trace until you get a good match.
Basically, what the software is doing is probing
the extracted ion trace with
the filter function and looking for where the overlap is good.
You see this in the central plot, and when the overlap is good,
it will say: yes, I have a peak, and the blue area
is the one that I will use as the intensity of that peak.
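
The sliding-and-matching step can be sketched in a few lines. This is only an illustration under stated assumptions: the peak model is a plain Gaussian, the trace is simulated, and the integration window of plus or minus two sigma is an arbitrary choice. The helper names are hypothetical and this is not the XCMS implementation.

```python
import numpy as np

def gaussian_model(sigma, half_width):
    """Hypothetical peak model: a Gaussian sampled at integer scan offsets."""
    x = np.arange(-half_width, half_width + 1, dtype=float)
    return np.exp(-0.5 * (x / sigma) ** 2)

def matched_filter_score(trace, model):
    """Slide the model over the trace; a high score means a good overlap."""
    return np.correlate(trace, model, mode="same")

# Simulated extracted ion trace: one Gaussian peak at scan 120 plus noise.
rng = np.random.default_rng(0)
scans = np.arange(200)
trace = (100.0 * np.exp(-0.5 * ((scans - 120) / 4.0) ** 2)
         + rng.normal(0.0, 5.0, scans.size))

model = gaussian_model(sigma=4.0, half_width=12)
score = matched_filter_score(trace, model)

# The position where the model matches best is taken as the peak apex.
apex = int(np.argmax(score))

# "Blue area": sum the raw trace in a window around the apex
# (plus or minus 2 sigma here, an assumption made for this sketch).
left, right = apex - 8, apex + 8
area = float(trace[left:right + 1].sum())
```

Note that the intensity is taken from the raw trace, not from the filtered score; the filter is only used to decide where the peak is.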
This filter function can be designed depending on
the characteristics of your chromatography, and I think the authors here
are showing a second-derivative Gaussian filter
because they want to know really well
where the boundaries of the peak are. And this will
be something that is really important afterwards.
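
Why a second-derivative Gaussian helps with the boundaries can be seen in a small sketch. The filter sums to zero, so a flat baseline is cancelled, and the filtered signal is positive under the peak and turns negative on either side; the zero crossings can then be read as peak limits. The code below assumes a noise-free simulated trace, and the helper names are hypothetical; it illustrates the principle rather than reproducing the XCMS code.

```python
import numpy as np

def second_derivative_gaussian(sigma, half_width):
    """Negated second derivative of a Gaussian, sampled at integer offsets."""
    x = np.arange(-half_width, half_width + 1, dtype=float)
    kernel = (1.0 - (x / sigma) ** 2) * np.exp(-0.5 * (x / sigma) ** 2)
    # Subtract the mean so the truncated kernel sums exactly to zero.
    return kernel - kernel.mean()

def peak_bounds(filtered, apex):
    """Walk left and right from the apex while the filtered signal stays positive."""
    left = apex
    while left > 0 and filtered[left - 1] > 0:
        left -= 1
    right = apex
    while right < len(filtered) - 1 and filtered[right + 1] > 0:
        right += 1
    return left, right

# Noise-free simulated trace: Gaussian peak at scan 120 on a flat baseline of 50.
scans = np.arange(200)
trace = 50.0 + 100.0 * np.exp(-0.5 * ((scans - 120) / 4.0) ** 2)

kernel = second_derivative_gaussian(sigma=4.0, half_width=16)
filtered = np.correlate(trace, kernel, mode="same")

apex = int(np.argmax(filtered))
left, right = peak_bounds(filtered, apex)
```

In this example the baseline of 50 does not shift the apex, and the boundaries fall a few scans on either side of it. With real, noisy data the zero crossings move around, so the width of the filter has to be matched to the chromatography.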