1 Load Libraries

library(TargetDecoy)
library(RCurl)
library(mzID)

2 Download data in working directory

download.file( 
  url = "https://raw.githubusercontent.com/statOmics/PDA22GTPB/data/identification/pyroUniprot.mzid",
  destfile = "pyroUniprot.mzid"
  )

3 Load Data in R

path2File <- "pyroUniprot.mzid"
mzidUniprot <- mzID(path2File)
## reading pyroUniprot.mzid... DONE!

4 Launch the Shiny Gadget

Explore the results for search eninge scores to find correct names of search engine scores in the mzID.

evalTargetDecoys(mzidUniprot)

5 Evaluate target decoy assumptions

5.1 Peptide Shaker

evalTargetDecoys(
  object = mzidUniprot, 
  decoy = "isdecoy", 
  score = "peptideshaker psm score",
  log10 = FALSE)

We observe that

  • the histogram shows that Peptide Shaker gives a very good separation between good targets and bad targets.

  • the histogram and PP show that the distribution of bad target peptideshaker PSM scores have the same distribution of decoy peptideshaker PSM scores.

  • The PP-plot shows that the ratio of decoys on targets is a good estimate of the expected fraction of bad target hits that are returned (\(\hat\pi_b=\frac{\# \text{decoys}}{\#\text{targets}}\) is a good estimate of the fraction of bad target hits). We can thus assume that it is equaly likely that a bad PSM hit will match to a target sequence or a decoy sequence.

  • It is not really required to assess the assumptions of the search engines used by peptide shaker because there are no problems.

  • For completeness we still evaluate the different search engines used by peptide shaker.

5.2 MSGF+

evalTargetDecoys(
  object = mzidUniprot, 
  decoy = "isdecoy", 
  score = "ms-gf:specevalue",
  log10 = TRUE)

  • The plots show that the distribution of the MSGF+ PSM scores are nicely bimodal.

  • The separation between good target PSM scores and bad target PSM scores is less pronounced than for peptide shaker. So it is beneficial to include the other engines with peptideshaker.

  • We do not see deviations from the target decoy assumptions.

5.3 Omssa

evalTargetDecoys(
  object = mzidUniprot, 
  decoy = "isdecoy", 
  score = "omssa:evalue",
  log10 = TRUE)

  • The separation between good target PSM scores and bad target PSM scores is less pronounced for omssa than for peptide shaker. So it is beneficial to include the other engines with peptideshaker.

  • We do not see deviations from the target decoy assumptions.

5.4 X!Tandem

evalTargetDecoys(
  object = mzidUniprot, 
  decoy = "isdecoy", 
  score = "x!tandem:expect",
  log10 = TRUE)

  • The separation between good target PSM scores and bad target PSM scores is less pronounced for X!Tandem than for peptide shaker. So it is beneficial to include the other engines with peptideshaker.

  • The decoy PSM score distribution has the same shape as the PSM score distribution of bad targets.

  • There seems to be some evidence that bad PSM hits are more likely to go to target than to decoy sequences. However, this does not lead to problems when combining X!tandem with other engines in peptide shaker.

LS0tCnRpdGxlOiAiVHV0b3JpYWw6IEV2YWx1YXRlIHB5cm9jb2NjdXMgc2VhcmNoZXMgdXNpbmcgdW5pcHJvdCBhbmQgcGVwdGlkZSBzaGFrZXIiCmF1dGhvcjogCiAgLSBuYW1lOiBMaWV2ZW4gQ2xlbWVudAogICAgYWZmaWxpYXRpb246CiAgICAtIEdoZW50IFVuaXZlcnNpdHkKb3V0cHV0OiAKICAgIGh0bWxfZG9jdW1lbnQ6CiAgICAgIGNvZGVfZG93bmxvYWQ6IHRydWUKICAgICAgdGhlbWU6IGZsYXRseQogICAgICB0b2M6IHRydWUKICAgICAgdG9jX2Zsb2F0OiB0cnVlCiAgICAgIGhpZ2hsaWdodDogdGFuZ28KICAgICAgbnVtYmVyX3NlY3Rpb25zOiB0cnVlCmxpbmtjb2xvcjogYmx1ZQp1cmxjb2xvcjogYmx1ZQpjaXRlY29sb3I6IGJsdWUKLS0tCgojIExvYWQgTGlicmFyaWVzIAoKYGBge3J9CmxpYnJhcnkoVGFyZ2V0RGVjb3kpCmxpYnJhcnkoUkN1cmwpCmxpYnJhcnkobXpJRCkKYGBgCgojIERvd25sb2FkIGRhdGEgaW4gd29ya2luZyBkaXJlY3RvcnkKCmBgYHtyfQpkb3dubG9hZC5maWxlKCAKICB1cmwgPSAiaHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL3N0YXRPbWljcy9QREEyMkdUUEIvZGF0YS9pZGVudGlmaWNhdGlvbi9weXJvVW5pcHJvdC5temlkIiwKICBkZXN0ZmlsZSA9ICJweXJvVW5pcHJvdC5temlkIgogICkKYGBgCgojIExvYWQgRGF0YSBpbiBSCgpgYGB7cn0KcGF0aDJGaWxlIDwtICJweXJvVW5pcHJvdC5temlkIgptemlkVW5pcHJvdCA8LSBteklEKHBhdGgyRmlsZSkKYGBgCgojIExhdW5jaCB0aGUgU2hpbnkgR2FkZ2V0IAoKRXhwbG9yZSB0aGUgcmVzdWx0cyBmb3Igc2VhcmNoIGVuaW5nZSBzY29yZXMgdG8gZmluZCBjb3JyZWN0IG5hbWVzIG9mIHNlYXJjaCBlbmdpbmUgc2NvcmVzIGluIHRoZSBteklELgoKYGBge3IgZXZhbD1GQUxTRX0KZXZhbFRhcmdldERlY295cyhtemlkVW5pcHJvdCkKYGBgCgojIEV2YWx1YXRlIHRhcmdldCBkZWNveSBhc3N1bXB0aW9ucyAKCiMjIFBlcHRpZGUgU2hha2VyCgpgYGB7cn0KZXZhbFRhcmdldERlY295cygKICBvYmplY3QgPSBtemlkVW5pcHJvdCwgCiAgZGVjb3kgPSAiaXNkZWNveSIsIAogIHNjb3JlID0gInBlcHRpZGVzaGFrZXIgcHNtIHNjb3JlIiwKICBsb2cxMCA9IEZBTFNFKQpgYGAKCldlIG9ic2VydmUgdGhhdCAKCi0gdGhlIGhpc3RvZ3JhbSBzaG93cyB0aGF0IFBlcHRpZGUgU2hha2VyIGdpdmVzIGEgdmVyeSBnb29kIHNlcGFyYXRpb24gYmV0d2VlbiBnb29kIHRhcmdldHMgYW5kIGJhZCB0YXJnZXRzLiAKCi0gdGhlIGhpc3RvZ3JhbSBhbmQgUFAgc2hvdyB0aGF0IHRoZSBkaXN0cmlidXRpb24gb2YgYmFkIHRhcmdldCBwZXB0aWRlc2hha2VyIFBTTSBzY29yZXMgaGF2ZSB0aGUgc2FtZSBkaXN0cmlidXRpb24gb2YgZGVjb3kgcGVwdGlkZXNoYWtlciBQU00gc2NvcmVzLgoKLSBUaGUgUFAtcGxvdCBzaG93cyB0aGF0IHRoZSByYXRpbyBvZiBkZWNveXMgb24gdGFyZ2V0cyBpcyBhIGdvb2QgZXN0aW1hdGUgb2YgdGhlIGV4cGVjdGVkIGZyYWN0aW9uIG9mIGJhZCB0YXJnZXQgaGl0cyB0aGF0IGFyZSByZXR1cm5lZCAoJFxoYXRccGlfYj1cZnJhY3tcIyBcdGV4dHtkZWNveXN9fXtcI1x0ZXh0e3RhcmdldHN9fSQgaXMgYSBnb29kIGVzdGltYXRlIG9mIHRoZSBmcmFjdGlvbiBvZiBiYWQgdGFyZ2V0IGhpdHMpLiBXZSBjYW4gdGh1cyBhc3N1bWUgdGhhdCBpdCBpcyBlcXVhbHkgbGlrZWx5IHRoYXQgYSBiYWQgUFNNIGhpdCB3aWxsIG1hdGNoIHRvIGEgdGFyZ2V0IHNlcXVlbmNlIG9yIGEgZGVjb3kgc2VxdWVuY2UuICAKCi0gSXQgaXMgbm90IHJlYWxseSByZXF1aXJlZCB0byBhc3Nlc3MgdGhlIGFzc3VtcHRpb25zIG9mIHRoZSBzZWFyY2ggZW5naW5lcyB1c2VkIGJ5IHBlcHRpZGUgc2hha2VyIGJlY2F1c2UgdGhlcmUgYXJlIG5vIHByb2JsZW1zLiAKCi0gRm9yIGNvbXBsZXRlbmVzcyB3ZSBzdGlsbCBldmFsdWF0ZSB0aGUgZGlmZmVyZW50IHNlYXJjaCBlbmdpbmVzIHVzZWQgYnkgcGVwdGlkZSBzaGFrZXIuIAoKIyMgTVNHRisKCgpgYGB7cn0KZXZhbFRhcmdldERlY295cygKICBvYmplY3QgPSBtemlkVW5pcHJvdCwgCiAgZGVjb3kgPSAiaXNkZWNveSIsIAogIHNjb3JlID0gIm1zLWdmOnNwZWNldmFsdWUiLAogIGxvZzEwID0gVFJVRSkKYGBgCgotIFRoZSBwbG90cyBzaG93IHRoYXQKdGhlIGRpc3RyaWJ1dGlvbiBvZiB0aGUgTVNHRisgUFNNIHNjb3JlcyBhcmUgbmljZWx5IGJpbW9kYWwuIAoKLSBUaGUgc2VwYXJhdGlvbiBiZXR3ZWVuIGdvb2QgdGFyZ2V0IFBTTSBzY29yZXMgYW5kIGJhZCB0YXJnZXQgUFNNIHNjb3JlcyBpcyBsZXNzIHByb25vdW5jZWQgdGhhbiBmb3IgcGVwdGlkZSBzaGFrZXIuIFNvIGl0IGlzIGJlbmVmaWNpYWwgdG8gaW5jbHVkZSB0aGUgb3RoZXIgZW5naW5lcyB3aXRoIHBlcHRpZGVzaGFrZXIuIAoKLSBXZSBkbyBub3Qgc2VlIGRldmlhdGlvbnMgZnJvbSB0aGUgdGFyZ2V0IGRlY295IGFzc3VtcHRpb25zLiAKCiMjIE9tc3NhCgpgYGB7cn0KZXZhbFRhcmdldERlY295cygKICBvYmplY3QgPSBtemlkVW5pcHJvdCwgCiAgZGVjb3kgPSAiaXNkZWNveSIsIAogIHNjb3JlID0gIm9tc3NhOmV2YWx1ZSIsCiAgbG9nMTAgPSBUUlVFKQpgYGAKCgotIFRoZSBzZXBhcmF0aW9uIGJldHdlZW4gZ29vZCB0YXJnZXQgUFNNIHNjb3JlcyBhbmQgYmFkIHRhcmdldCBQU00gc2NvcmVzIGlzIGxlc3MgcHJvbm91bmNlZCBmb3Igb21zc2EgdGhhbiBmb3IgcGVwdGlkZSBzaGFrZXIuIFNvIGl0IGlzIGJlbmVmaWNpYWwgdG8gaW5jbHVkZSB0aGUgb3RoZXIgZW5naW5lcyB3aXRoIHBlcHRpZGVzaGFrZXIuIAoKLSBXZSBkbyBub3Qgc2VlIGRldmlhdGlvbnMgZnJvbSB0aGUgdGFyZ2V0IGRlY295IGFzc3VtcHRpb25zLiAKCiMjIFghVGFuZGVtCgpgYGB7cn0KZXZhbFRhcmdldERlY295cygKICBvYmplY3QgPSBtemlkVW5pcHJvdCwgCiAgZGVjb3kgPSAiaXNkZWNveSIsIAogIHNjb3JlID0gInghdGFuZGVtOmV4cGVjdCIsCiAgbG9nMTAgPSBUUlVFKQpgYGAKCgotIFRoZSBzZXBhcmF0aW9uIGJldHdlZW4gZ29vZCB0YXJnZXQgUFNNIHNjb3JlcyBhbmQgYmFkIHRhcmdldCBQU00gc2NvcmVzIGlzIGxlc3MgcHJvbm91bmNlZCBmb3IgWCFUYW5kZW0gdGhhbiBmb3IgcGVwdGlkZSBzaGFrZXIuIFNvIGl0IGlzIGJlbmVmaWNpYWwgdG8gaW5jbHVkZSB0aGUgb3RoZXIgZW5naW5lcyB3aXRoIHBlcHRpZGVzaGFrZXIuIAoKLSBUaGUgZGVjb3kgUFNNIHNjb3JlIGRpc3RyaWJ1dGlvbiBoYXMgdGhlIHNhbWUgc2hhcGUgYXMgdGhlIFBTTSBzY29yZSBkaXN0cmlidXRpb24gb2YgYmFkIHRhcmdldHMuIAoKLSBUaGVyZSBzZWVtcyB0byBiZSBzb21lIGV2aWRlbmNlIHRoYXQgYmFkIFBTTSBoaXRzIGFyZSBtb3JlIGxpa2VseSB0byBnbyB0byB0YXJnZXQgdGhhbiB0byBkZWNveSBzZXF1ZW5jZXMuIEhvd2V2ZXIsIHRoaXMgZG9lcyBub3QgbGVhZCB0byBwcm9ibGVtcyB3aGVuIGNvbWJpbmluZyBYIXRhbmRlbSB3aXRoIG90aGVyIGVuZ2luZXMgaW4gcGVwdGlkZSBzaGFrZXIuIAo=