11  Additional information

11.1 Citation

Please cite this book as:

TODO: add citation once published

Please cite the msqrob2 package as:

Goeminne L, Gevaert K, Clement L (2016). “Peptide-level Robust Ridge Regression Improves Estimation, Sensitivity, and Specificity in Data-dependent Quantitative Label-free Shotgun Proteomics.” Molecular & Cellular Proteomics, 15(2), 657-668. doi:10.1074/mcp.m115.055897.

If you opt for a summarisation-based workflow, you can also cite:

Sticker A, Goeminne L, Martens L, Clement L (2020). “Robust Summarization and Inference in Proteome-wide Label-free Quantification.” Molecular & Cellular Proteomics, 19(7), 1209-1219. doi:10.1074/mcp.ra119.001624.

If you use TMT-based workflows, please cite:

Vandenbulcke S, Vanderaa C, Crook O, Martens L, Clement L (2025). “Msqrob2TMT: Robust Linear Mixed Models for Inferring Differential Abundant Proteins in Labeled Experiments with Arbitrarily Complex Design.” Molecular & Cellular Proteomics, 24(7), 101002.
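
The package citation can usually also be retrieved from within R, which is convenient when building a bibliography. The sketch below uses the base R functions `citation()` and `toBibtex()`; the helper name `cite_as_bibtex` is hypothetical, and the entry returned for msqrob2 depends on the installed version, so it may not match the text above exactly.

```r
# Hypothetical helper: return a package's citation as BibTeX lines,
# or NULL (with a message) when the package is not installed.
cite_as_bibtex <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; skipping.")
    return(NULL)
  }
  # citation() reads the citation information shipped with the package;
  # toBibtex() converts it to BibTeX.
  toBibtex(citation(pkg))
}

# Assumption: msqrob2 is installed in the current library.
cite_as_bibtex("msqrob2")
```

The guard with `requireNamespace()` keeps the call from failing on machines where msqrob2 is not available.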

References

Gatto, Laurent, Ruedi Aebersold, Juergen Cox, et al. 2023. “Initial Recommendations for Performing, Benchmarking and Reporting Single-Cell Proteomics Experiments.” Nat. Methods 20 (3): 375–86.
Goeminne, Ludger J E, Kris Gevaert, and Lieven Clement. 2016. “Peptide-Level Robust Ridge Regression Improves Estimation, Sensitivity, and Specificity in Data-Dependent Quantitative Label-Free Shotgun Proteomics.” Mol. Cell. Proteomics 15 (2): 657–68.
Huang, Ting, Meena Choi, Manuel Tzouros, et al. 2020. “MSstatsTMT: Statistical Detection of Differentially Abundant Proteins in Experiments with Isobaric Labeling and Multiple Mixtures.” Mol. Cell. Proteomics 19 (10): 1706–23.
O’Brien, Jonathon J, Anil Raj, Aleksandr Gaun, et al. 2024. “A Data Analysis Framework for Combining Multiple Batches Increases the Power of Isobaric Proteomics Experiments.” Nat. Methods 21 (2): 290–300.
Plubell, Deanna L, Phillip A Wilmarth, Yuqi Zhao, et al. 2017. “Extended Multiplexing of Tandem Mass Tags (TMT) Labeling Reveals Age and High Fat Diet Specific Proteome Changes in Mouse Epididymal Adipose Tissue.” Mol. Cell. Proteomics 16 (5): 873–90.
Ramond, Elodie, Gael Gesbert, Ida Chiara Guerrera, et al. 2015. “Importance of Host Cell Arginine Uptake in Francisella Phagosomal Escape and Ribosomal Protein Amounts.” Mol. Cell. Proteomics 14 (4): 870–81.
Savitski, Mikhail M, Gavain Sweetman, Manor Askenazi, et al. 2011. “Delayed Fragmentation and Optimized Isolation Width Settings for Improvement of Protein Identification and Accuracy of Isobaric Mass Tag Quantification on Orbitrap-Type Mass Spectrometers.” Anal. Chem. 83 (23): 8959–67.
Segers, Alexandre, Cristian Castiglione, Christophe Vanderaa, et al. 2025. “omicsGMF: A Multi-Tool for Dimensionality Reduction, Batch Correction and Imputation Applied to Bulk- and Single Cell Proteomics Data.” bioRxiv, March, 2025.03.24.644996.
Shen, Xiaomeng, Shichen Shen, Jun Li, et al. 2018. “IonStar Enables High-Precision, Low-Missing-Data Proteomics Quantification in Large Biological Cohorts.” Proc. Natl. Acad. Sci. U. S. A. 115 (21): E4767–76.
Staes, An, Teresa Mendes Maia, Sara Dufour, et al. 2024. “Benefit of in Silico Predicted Spectral Libraries in Data‑independent Acquisition Data Analysis Workflows.” Journal of Proteome Research 23 (6): 2078–89. https://doi.org/10.1021/acs.jproteome.4c00048.
Sticker, Adriaan, Ludger Goeminne, Lennart Martens, and Lieven Clement. 2020. “Robust Summarization and Inference in Proteome-Wide Label-Free Quantification.” Mol. Cell. Proteomics 19 (7): 1209–19.
Vandenbulcke, Stijn, Christophe Vanderaa, Oliver Crook, Lennart Martens, and Lieven Clement. 2025. “Msqrob2TMT: Robust Linear Mixed Models for Inferring Differential Abundant Proteins in Labeled Experiments with Arbitrarily Complex Design.” Mol. Cell. Proteomics 24 (7): 101002.

11.2 Data sets

This section lists the sources of the data sets used in the book:

11.2.1 E. coli LFQ spike-in data set

Original study: Shen X, Shen S, Li J, Hu Q, Nie L, Tu C, et al. (2018) IonStar enables high-precision, low-missing-data proteomics quantification in large biological cohorts. Proc. Natl. Acad. Sci. U.S.A. 115, E4767–E4776.

Reanalysis study: Sticker A, Goeminne L, Martens L, Clement L. Robust Summarization and Inference in Proteome-wide Label-free Quantification. Mol Cell Proteomics. 2020;19(7):1209-1219.

Link to data: https://github.com/statOmics/msqrob2data/dda/

Used in Chapter 1 and Chapter 4.

11.2.2 TMT spike-in data set

Original study: Huang T, Choi M, Tzouros M, Golling S, Pandya NJ, Banfai B, et al. MSstatsTMT: Statistical Detection of Differentially Abundant Proteins in Experiments with Isobaric Labeling and Multiple Mixtures. Mol Cell Proteomics. 2020;19(10):1706-1723.

Reanalysis study: Vandenbulcke S, Vanderaa C, Crook O, Martens L, Clement L. Msqrob2TMT: Robust linear mixed models for inferring differential abundant proteins in labeled experiments with arbitrarily complex design. Mol Cell Proteomics. 2025;24(7):101002.

Data source: MassIVE repository (RMSV000000265)

Link to data from archive: https://zenodo.org/records/14767905

Used in Chapter 3.

11.2.3 Francisella data set

Link to data: https://github.com/statOmics/msqrob2data/

11.2.4 Heart data set

Link to data: https://github.com/statOmics/msqrob2data/

11.2.5 Mouse diet data set

TODO

11.3 License

This material is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. You are free to share (copy and redistribute the material in any medium or format) and adapt (remix, transform, and build upon the material) for any purpose, even commercially, as long as you give appropriate credit and distribute your contributions under the same license as the original.

11.4 Session Info

The following packages have been used to generate this document.

sessionInfo()
R version 4.5.2 (2025-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Brussels
tzcode source: internal

attached base packages:
[1] stats4    grid      stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] bookdown_0.46               tidyr_1.3.2                
 [3] scater_1.38.0               scuttle_1.20.0             
 [5] SingleCellExperiment_1.32.0 SummarizedExperiment_1.40.0
 [7] Biobase_2.70.0              GenomicRanges_1.62.1       
 [9] Seqinfo_1.0.0               IRanges_2.44.0             
[11] S4Vectors_0.48.0            BiocGenerics_0.56.0        
[13] generics_0.1.4              MatrixGenerics_1.22.0      
[15] matrixStats_1.5.0           patchwork_1.3.2            
[17] MsDataHub_1.10.0            impute_1.84.0              
[19] ggrepel_0.9.6               ggplot2_4.0.2              
[21] ExploreModelMatrix_1.22.0   dplyr_1.2.0                
[23] ComplexHeatmap_2.26.1       BiocFileCache_3.0.0        
[25] dbplyr_2.5.2                BiocParallel_1.44.0        

loaded via a namespace (and not attached):
  [1] DBI_1.2.3            gridExtra_2.3        httr2_1.2.2         
  [4] rlang_1.2.0          magrittr_2.0.4       shinydashboard_0.7.3
  [7] clue_0.3-66          GetoptLong_1.1.0     otel_0.2.0          
 [10] compiler_4.5.2       RSQLite_2.4.6        png_0.1-8           
 [13] vctrs_0.7.1          pkgconfig_2.0.3      shape_1.4.6.1       
 [16] crayon_1.5.3         fastmap_1.2.0        XVector_0.50.0      
 [19] promises_1.5.0       rmarkdown_2.30       ggbeeswarm_0.7.3    
 [22] purrr_1.2.1          bit_4.6.0            xfun_0.56           
 [25] beachmat_2.26.0      cachem_1.1.0         jsonlite_2.0.0      
 [28] blob_1.3.0           later_1.4.7          DelayedArray_0.36.0 
 [31] irlba_2.3.7          parallel_4.5.2       cluster_2.1.8.2     
 [34] R6_2.6.1             RColorBrewer_1.1-3   limma_3.66.0        
 [37] Rcpp_1.1.1           iterators_1.0.14     knitr_1.51          
 [40] Matrix_1.7-4         httpuv_1.6.16        tidyselect_1.2.1    
 [43] viridis_0.6.5        abind_1.4-8          rstudioapi_0.18.0   
 [46] yaml_2.3.12          doParallel_1.0.17    codetools_0.2-20    
 [49] curl_7.0.0           lattice_0.22-9       tibble_3.3.1        
 [52] shiny_1.13.0         withr_3.0.2          KEGGREST_1.50.0     
 [55] S7_0.2.1             evaluate_1.0.5       circlize_0.4.17     
 [58] ExperimentHub_3.0.0  Biostrings_2.78.0    pillar_1.11.1       
 [61] BiocManager_1.30.27  filelock_1.0.3       DT_0.34.0           
 [64] foreach_1.5.2        shinyjs_2.1.1        BiocVersion_3.22.0  
 [67] scales_1.4.0         xtable_1.8-8         glue_1.8.0          
 [70] tools_4.5.2          AnnotationHub_4.0.0  BiocNeighbors_2.4.0 
 [73] ScaledMatrix_1.18.0  cowplot_1.2.0        AnnotationDbi_1.72.0
 [76] colorspace_2.1-2     beeswarm_0.4.0       BiocSingular_1.26.1 
 [79] vipor_0.4.7          rsvd_1.0.5           cli_3.6.6           
 [82] rappdirs_0.3.4       viridisLite_0.4.3    S4Arrays_1.10.1     
 [85] gtable_0.3.6         rintrojs_0.3.4       digest_0.6.39       
 [88] SparseArray_1.10.8   rjson_0.2.23         htmlwidgets_1.6.4   
 [91] farver_2.1.2         memoise_2.0.1        htmltools_0.5.9     
 [94] lifecycle_1.0.5      httr_1.4.8           GlobalOptions_0.1.3 
 [97] statmod_1.5.1        mime_0.13            bit64_4.6.0-1       
[100] MASS_7.3-65
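
To keep a comparable record for your own analysis, the session information can be written to a text file alongside the results. This is a minimal sketch using base R only; the output file name and the packages queried are arbitrary choices for illustration.

```r
# Capture the printed session information as text and save it to a file.
# The file name and location are arbitrary; adapt them to your project.
out_file <- file.path(tempdir(), "session_info.txt")
writeLines(capture.output(sessionInfo()), out_file)

# Look up the installed version of selected packages
# (replace the names with the packages your analysis relies on).
pkgs <- c("stats", "utils")
versions <- vapply(pkgs, function(p) as.character(packageVersion(p)), character(1))
versions
```

Comparing `versions` with the listing above is a quick way to spot version differences when results do not reproduce.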