How is my data normalized in Spectronaut™?

Why normalization?

The aim of data normalization in proteomics is to correct for variability that comes not from the biological system itself but from the experimental process, mainly multistep sample preparation and LC-MS instrumentation. This matters especially when the samples were processed over a long period of time or measured on different instruments. Such variance can introduce a bias that affects the biological conclusions.

Normalization algorithms available in Spectronaut

Spectronaut can normalize data with one of two available algorithms: local or global normalization. Global normalization normalizes individual runs based on a global (experiment-wide) quantity metric. This metric can be the median, average, or geometric mean of the precursors selected for normalization. The algorithm centers peptide ratios on that experiment-wide metric. Global normalization applies a single normalization factor to all precursors measured within a given run.
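To make the idea of one factor per run concrete, here is a minimal sketch in Python. It is an illustration only, not Spectronaut's implementation: the function name, the toy quantities, and the choice of the median of the run medians as the experiment-wide target are all assumptions.

```python
from statistics import median

def global_normalization(runs):
    """runs maps run name -> {precursor id -> quantity}.

    Returns (factors, normalized). Assumption: the experiment-wide target
    is the median of the per-run medians.
    """
    run_medians = {name: median(q.values()) for name, q in runs.items()}
    target = median(run_medians.values())
    # One factor per run, applied to every precursor in that run.
    factors = {name: target / m for name, m in run_medians.items()}
    normalized = {
        name: {p: v * factors[name] for p, v in q.items()}
        for name, q in runs.items()
    }
    return factors, normalized

# Made-up quantities for three runs.
runs = {
    "run_A": {"pep1": 100.0, "pep2": 200.0, "pep3": 400.0},
    "run_B": {"pep1": 150.0, "pep2": 300.0, "pep3": 600.0},
    "run_C": {"pep1": 50.0, "pep2": 100.0, "pep3": 200.0},
}
factors, normalized = global_normalization(runs)
```

After this step every run in the toy data has the same median quantity, while the ratios between precursors within a run are unchanged.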
Alternatively, Spectronaut can apply local normalization, which is based on the Local Regression Normalization described by Callister et al. 2006. Local normalization assumes that the systematic bias observed in each run depends nonlinearly on the abundance of the measured precursors. The algorithm therefore performs a multitude of linear regression normalizations on grouped precursors and applies a separate normalization factor to each group.
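The abundance-dependent correction can be sketched in a simplified form. The example below is not Spectronaut's algorithm and not the exact method of Callister et al. It is a hypothetical binned variant that assumes log2 quantities, a reference profile per precursor, and equally sized abundance bins, with one ordinary least-squares line fitted per bin.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def local_normalization(run_log, ref_log, n_bins=2):
    """run_log, ref_log: {precursor id -> log2 quantity}.

    Groups precursors by reference abundance, fits the bias
    (run - reference) against abundance within each group, and
    subtracts the fitted bias from the run values.
    """
    ids = sorted(run_log, key=lambda p: ref_log[p])
    size = max(1, math.ceil(len(ids) / n_bins))
    corrected = {}
    for start in range(0, len(ids), size):
        group = ids[start:start + size]
        xs = [ref_log[p] for p in group]
        bias = [run_log[p] - ref_log[p] for p in group]
        slope, intercept = fit_line(xs, bias)
        for p, x in zip(group, xs):
            corrected[p] = run_log[p] - (slope * x + intercept)
    return corrected

# Made-up data: low-abundance precursors read 1.0 too high,
# high-abundance precursors read 0.5 too low.
ref_log = {"p1": 10.0, "p2": 11.0, "p3": 12.0, "p4": 20.0, "p5": 21.0, "p6": 22.0}
run_log = {p: v + (1.0 if v < 15 else -0.5) for p, v in ref_log.items()}
corrected = local_normalization(run_log, ref_log, n_bins=2)
```

A single global factor could not remove both offsets in this toy run, whereas the per-group fits bring each precursor back to its reference value.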

Default normalization in Spectronaut

With the default settings (BGS Factory Settings), Spectronaut applies normalization to the data to minimize the effect of the variability introduced by sample preparation and LC-MS performance. This default normalization assumes that the samples are similar: the majority of the precursors within the samples are not regulated and, among those that are, similar numbers of peptides are up- and down-regulated.

The default normalization is performed at the precursor level across all experimental samples (cross-run normalization). The normalized precursor quantities are subsequently used to derive peptide and protein group quantities, without additional normalization at those levels.

By default, Spectronaut will choose between local and global normalization algorithms based on the number of runs in an experiment. If the number of experimental runs does not exceed 500 (n<500), local normalization will be used, and global normalization will be applied in larger experiments. 

Alternative to default normalization

As usual, the default settings are suitable for most but not all experimental setups. The user can select local or global normalization independently of the experiment size. For most experiments, local and global normalization perform similarly. However, local normalization, due to its more complex algorithm, might require more computational power and time, especially in larger experiments. Examples of experiments where local normalization could perform better are those where proteomes of multiple species are mixed and analyzed together, or samples with very low complexity.

Normalization can also be turned off. This is advisable, for example, when analyzing samples with different levels of complexity.

You can change the default normalization strategy in three ways, as explained in the article "How do I change the default settings?". Briefly, the first two options apply before running the analysis, and the third applies to an already analyzed dataset:

1. Create and save a custom schema with your preferred normalization option by going into the Settings Perspective (Figure 1), or

2. Change the option while setting up the experiment in the Experiment Setup window (Figure 2).

Figure 1. Saving a custom schema in the Settings Perspective


Figure 2. Changing the analysis settings on the Experiment Setup window

3. In an already analyzed dataset, you can right-click on the experiment tab and choose Settings (Figure 3). In the Quantification node, you can change the normalization strategy and click confirm to apply the change. With this option, you can also quickly check how the changed normalization strategy affects the quantification of protein groups and peptides as well as the results of the differential analysis.


Figure 3. Changing the settings of an already analyzed dataset

Further normalization options include selection of the precursors used for normalization. For example, if the experiment is performed on mixed-species samples, the user can choose to normalize on only one of the species by selecting the corresponding FASTA file. In addition, if multiple libraries are used for extraction, a library-based precursor filter can also be applied. Finally, a modification-based filter can be applied. This filter restricts normalization to the precursors that carry the specified modification, which can be especially useful for PTM-enriched samples.
By default, the normalization algorithm will select 10,000 precursor profiles, ranked by level of completeness. However, the completeness of the precursor profiles used for normalization can be further specified by the user. For example, selecting the Q value complete filter restricts normalization to precursors that were consistently identified across the entire experiment.
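The ranking by completeness can be pictured with a short sketch. The function below is hypothetical and only illustrates the idea; the data layout (one list of per-run quantities per precursor, with None for runs where the precursor was not identified) and the parameter names are assumptions, not Spectronaut settings.

```python
def select_normalization_set(profiles, max_profiles=10000, require_complete=False):
    """profiles maps precursor id -> list of per-run quantities (None = missing).

    Returns precursor ids ranked by completeness, most complete first.
    """
    def completeness(values):
        return sum(v is not None for v in values) / len(values)

    # sorted() is stable, so precursors with equal completeness keep input order.
    ranked = sorted(profiles, key=lambda p: completeness(profiles[p]), reverse=True)
    if require_complete:
        # Keep only precursors observed in every run.
        ranked = [p for p in ranked if completeness(profiles[p]) == 1.0]
    return ranked[:max_profiles]

# Made-up profiles across three runs.
profiles = {
    "a": [1.0, None, 3.0],
    "b": [1.0, 2.0, 3.0],
    "c": [None, None, 3.0],
}
top_two = select_normalization_set(profiles, max_profiles=2)
complete_only = select_normalization_set(profiles, require_complete=True)
```

In this toy example the complete profile "b" ranks first, and requiring full completeness leaves it as the only precursor in the normalization set.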

Effect of normalization

To show the effect of normalization, Spectronaut creates a pair of plots that visualize your quantitative data across your runs before and after normalization. You can find them in the Post Analysis Perspective, under the Analysis Overview node and then Normalization (Figure 4).


Figure 4. The Normalization node shows the effect of normalization on your data. The left side shows boxplots of precursor responses before normalization for each run. The right side shows boxplots of the same precursor responses after normalization

Moreover, in the Report Perspective, Spectronaut reports the normalized precursor quantity (EG.Quantity) as well as the applied normalization factor (EG.NormalizationFactor). The same normalization factor is applied to the fragment quantities, so the normalized values (F.NormalizedPeakArea; F.NormalizedPeakHeight) can be compared with the raw measurements (F.PeakArea; F.PeakHeight).
Finally, Spectronaut reports which of the precursors were used for normalization in the report column “EG.UsedInNormalizationSet”.
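An exported report can be checked with a few lines of Python. This is a sketch under stated assumptions: the report is tab-separated, the run is identified by a column named R.FileName (not mentioned above), EG.UsedInNormalizationSet holds True/False text, and the rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up rows standing in for an exported report.
rows = [
    ["R.FileName", "EG.UsedInNormalizationSet", "F.PeakArea", "F.NormalizedPeakArea"],
    ["run_A", "True", "1000", "1200"],
    ["run_A", "False", "500", "600"],
    ["run_B", "True", "2000", "1600"],
]
report_text = "\n".join("\t".join(r) for r in rows)

def summarize(report_file):
    """Per run: count rows flagged as used for normalization and collect
    the distinct normalized/raw fragment area ratios."""
    used = defaultdict(int)
    ratios = defaultdict(set)
    for row in csv.DictReader(report_file, delimiter="\t"):
        run = row["R.FileName"]
        if row["EG.UsedInNormalizationSet"].strip().lower() == "true":
            used[run] += 1
        ratio = float(row["F.NormalizedPeakArea"]) / float(row["F.PeakArea"])
        ratios[run].add(round(ratio, 6))
    return dict(used), {run: sorted(r) for run, r in ratios.items()}

used, ratios = summarize(io.StringIO(report_text))
```

With global normalization you would expect a single ratio per run, since one factor applies to the whole run. With local normalization several distinct ratios per run are plausible.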

Normalization of fractionated samples

An important special case for normalization is pre-fractionated samples. In such cases, normalization should not be done across all runs, but by fraction. The fractions should be annotated accordingly in the Conditions Editor when you set up your experiment (Figure 5).


Figure 5. Annotating the fractions on the Conditions Editor is required for proper normalization
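The difference between normalizing across all runs and normalizing by fraction can be illustrated as follows. This is a hypothetical sketch, not Spectronaut's procedure: the run names, the fraction labels, and the use of the median of the run medians within a fraction as the target are assumptions.

```python
from collections import defaultdict
from statistics import median

def factors_by_fraction(run_medians, fraction_of):
    """run_medians maps run -> median precursor quantity,
    fraction_of maps run -> fraction label.

    Returns one factor per run, computed only against runs
    that belong to the same fraction.
    """
    groups = defaultdict(list)
    for run, fraction in fraction_of.items():
        groups[fraction].append(run)
    factors = {}
    for fraction, members in groups.items():
        target = median(run_medians[r] for r in members)
        for r in members:
            factors[r] = target / run_medians[r]
    return factors

# Made-up medians: fraction F2 is much less abundant than F1 by design.
run_medians = {"s1_F1": 100.0, "s2_F1": 200.0, "s1_F2": 10.0, "s2_F2": 30.0}
fraction_of = {"s1_F1": "F1", "s2_F1": "F1", "s1_F2": "F2", "s2_F2": "F2"}
factors = factors_by_fraction(run_medians, fraction_of)
```

Because each fraction is centered on its own target, the expected abundance difference between fractions is preserved instead of being treated as technical bias.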



Callister SJ, Barry RC, Adkins JN, Johnson ET, Qian W, Webb-Robertson B-JM, Smith RD, and Lipton MS (2006) Normalization Approaches for Removing Systematic Biases Associated with Mass Spectrometry and Label-Free Proteomics. J Proteome Res 5:277–286.