Load Libraries
library(TargetDecoy)
library(RCurl)
library(mzID)
Download data in working directory
download.file(
url = "https://raw.githubusercontent.com/statOmics/PDA22GTPB/data/identification/pyroSwissprot.mzid",
destfile = "pyroSwissprot.mzid"
)
Load Data in R
path2File <- "pyroSwissprot.mzid"
mzidSwissprot <- mzID(path2File)
## reading pyroSwissprot.mzid... DONE!
Launch the Shiny Gadget
Explore the search results in the gadget to find the correct names of
the search engine scores in the mzID object.
evalTargetDecoys(mzidSwissprot)
Evaluate target decoy assumptions
Peptide Shaker
evalTargetDecoys(
object = mzidSwissprot,
decoy = "isdecoy",
score = "peptideshaker psm score",
log10 = FALSE)
We observe that:
The histogram shows that Peptide Shaker gives a very good
separation between good targets and bad targets.
The shape of the distribution of the decoy Peptide Shaker PSM scores
seems to be similar to that of the bad target PSM scores.
The PP-plot shows that bad PSM hits are more likely to match
target sequences than decoy sequences! The ratio of decoys to targets
is therefore not a good estimate of the expected fraction of bad target
hits that are returned.
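To make the reading of a PP-plot more concrete, the sketch below builds one by hand from simulated scores in which the target decoy assumptions hold by construction. The score distributions, the sample sizes and all variable names are assumptions made for illustration; this is not the pyroSwissprot data and not the internal code of TargetDecoy.

```r
set.seed(1)

# Hypothetical simulated scores (higher = better match)
decoys       <- rnorm(1000, mean = 0, sd = 1)  # decoy PSMs
bad_targets  <- rnorm(1000, mean = 0, sd = 1)  # incorrect target PSMs
good_targets <- rnorm(1500, mean = 5, sd = 1)  # correct target PSMs
targets      <- c(bad_targets, good_targets)

# Empirical cumulative distribution functions of both score sets
Fd <- ecdf(decoys)
Ft <- ecdf(targets)

# Ratio of the number of decoys to the number of targets
pi0 <- length(decoys) / length(targets)

# PP-plot: target ECDF against decoy ECDF, evaluated at every observed score
grid <- sort(c(decoys, targets))
plot(Fd(grid), Ft(grid), type = "l",
     xlab = "ECDF of decoy scores", ylab = "ECDF of target scores")
abline(a = 0, b = pi0, lty = 2)  # reference line with slope pi0
```

When the assumptions hold, the lower-left part of the curve follows the dashed line. A curve that lies above the line at low scores indicates more bad target hits than decoy hits, which is the pattern described above.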
MSGF+
evalTargetDecoys(
object = mzidSwissprot,
decoy = "isdecoy",
score = "ms-gf:specevalue",
log10 = TRUE)
The plots show that the distribution of the MSGF+ PSM scores is
nicely bimodal.
The separation between good target PSM scores and bad target PSM
scores is less pronounced than for Peptide Shaker, so it is beneficial
to combine MSGF+ with other engines in Peptide Shaker.
We do not see deviations from the target decoy
assumptions.
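The e-value type scores are evaluated with log10 = TRUE. Assuming this option applies a -log10 transformation to the scores, as the TargetDecoy documentation describes, the effect can be sketched on a few made-up e-values:

```r
# Hypothetical e-values (smaller = better match); not taken from the mzID file
evalues <- c(1e-12, 1e-6, 1e-2, 0.5)

# -log10 transformation: small e-values become large scores
transformed <- -log10(evalues)

data.frame(evalue = evalues, score = transformed)
```

After the transformation a higher score corresponds to a better match, so the histograms of the different engines can be read in the same direction.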
OMSSA
evalTargetDecoys(
object = mzidSwissprot,
decoy = "isdecoy",
score = "omssa:evalue",
log10 = TRUE)
The separation between good target PSM scores and bad target PSM
scores is less pronounced for OMSSA than for Peptide Shaker, so it is
beneficial to combine OMSSA with other engines in Peptide Shaker.
We do not see deviations from the target decoy
assumptions.
X!Tandem
evalTargetDecoys(
object = mzidSwissprot,
decoy = "isdecoy",
score = "x!tandem:expect",
log10 = TRUE)
The total number of decoys does not seem to be a good estimate
of the total number of bad target PSM hits.
Indeed, a bad PSM seems to be more likely to match
a target sequence than a decoy sequence!
It is recommended to remove X!Tandem as a candidate search engine
in Peptide Shaker in the search against Swissprot.
The search with X!Tandem is problematic because of the two-pass search
strategy that X!Tandem performs. In the first pass, a rapid search is
performed that allows neither modifications nor missed cleavages. In the
second pass, a new search is conducted solely against the peptides
identified in the first pass, but now with a more complex strategy that
allows for missed cleavages and post-translational modifications.
Performing the refined search against the smaller population of
candidate peptides from the first pass greatly reduces the computational
complexity; however, it comes at the cost that the assumptions of the
target decoy approach (TDA) are violated. Indeed, in the second pass
low-scoring PSMs can switch to a modified PSM, which seems to be the
case for many decoy hits from the first pass. Many of these PSMs switch
to modified target PSMs but still have a relatively low score and are
likely to be bad target PSMs. The number of decoy matches is therefore
no longer representative of the number of bad target matches. This
example shows that a second-pass strategy can be very detrimental to FDR
estimation with the TDA.
This is so problematic that combining X!Tandem with other engines in
Peptide Shaker results in a breakdown of the target decoy assumptions
for Peptide Shaker.
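The effect of decoy hits switching to target hits can be illustrated with a small back-of-the-envelope calculation. All counts and the switching fraction below are hypothetical numbers chosen for illustration; they are not estimated from the X!Tandem results.

```r
# Hypothetical PSM counts after the first pass, where the TDA assumptions hold:
# an incorrect PSM is equally likely to match a target or a decoy sequence
n_good  <- 3000  # correct target PSMs
n_bad   <- 2000  # incorrect target PSMs
n_decoy <- 2000  # decoy PSMs

# Hypothetical fraction of first-pass decoy hits that switch to a
# low-scoring modified target hit in the second pass
switch_frac <- 0.4
n_switched  <- round(switch_frac * n_decoy)

# Counts after the second pass
decoys_after      <- n_decoy - n_switched
bad_targets_after <- n_bad + n_switched
targets_after     <- n_good + bad_targets_after

# TDA estimate (decoys / targets) versus the true fraction of bad targets
fdr_estimated <- decoys_after / targets_after
fdr_true      <- bad_targets_after / targets_after

round(c(estimated = fdr_estimated, true = fdr_true), 3)
```

In this toy setting the decoy count underestimates the number of bad target hits, so the FDR estimated with the TDA is too optimistic.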