In this third lab session, we will perform clustering, marker gene detection and cell type annotation on the dataset from Macosko et al.

1 Macosko dataset

In the first lab session (24 November 2021), we quantified and pre-processed the droplet single-cell RNA-seq dataset (Drop-seq protocol) from the publication by Macosko et al., Cell 161, 1202–1214 (https://doi.org/10.1016/j.cell.2015.05.002). In this experiment, Macosko et al. sequenced 49,300 cells from the mouse retina and identified 39 transcriptionally distinct cell populations. The experiment was performed in 7 batches.

In the previous two sessions, we performed the following steps:

  1. Import the Macosko dataset as a SingleCellExperiment object from the scRNAseq Bioconductor package.

  2. Include ENSEMBL gene identifiers in the rowData

  3. Remove very lowly expressed genes

  4. Remove low-quality cells
     4.1. Cells with outlying library size
     4.2. Cells with outlying transcriptome complexity
     4.3. Cells with outlying percentage of mitochondrial reads
     4.4. Empty droplets
     4.5. Doublets

  5. Normalization
     5.1. Compute log-normalized counts
     5.2. Compute scaling factor to correct for differences in library size

  6. Feature selection
     6.1. Genes with high variance
     6.2. Genes with high variance with respect to their mean expression
     6.3. Genes with high deviance
     6.4. Genes with high variance after variance-stabilizing transformation (VST)

  7. Dimensionality reduction
     7.1. Based on the two most variable genes from step 6.2
     7.2. PCA
     7.3. GLM-PCA
     7.4. t-SNE
     7.5. UMAP

During this session, we will add the following steps to this workflow:

  • Clustering (graph-based, k-means and hierarchical clustering)
  • Marker gene detection
  • Cell type annotation (supervised and semi-supervised)
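
As a rough preview of two of the clustering options listed above (k-means and hierarchical clustering), here is a minimal base-R sketch. It runs on simulated two-dimensional data that merely stands in for a cells-by-PCs matrix. It is not the Macosko dataset and not the workflow solution, and the object names (`pcs`, `km`, `hc`, `hc_clusters`) are chosen for illustration only. Graph-based clustering is left out because it needs additional packages.

```r
# Toy stand-in for a cells-by-PCs matrix; simulated data, NOT the Macosko dataset.
set.seed(1)
n <- 50
pcs <- rbind(
  matrix(rnorm(n * 2, mean = 0, sd = 0.5), ncol = 2),  # simulated population A
  matrix(rnorm(n * 2, mean = 5, sd = 0.5), ncol = 2)   # simulated population B
)
colnames(pcs) <- c("PC1", "PC2")

# k-means: the number of clusters has to be chosen up front.
km <- kmeans(pcs, centers = 2, nstart = 10)

# Hierarchical clustering: build a tree on Euclidean distances,
# then cut the tree into two groups.
hc <- hclust(dist(pcs), method = "ward.D2")
hc_clusters <- cutree(hc, k = 2)

# Cross-tabulate the two partitions to see how far they agree.
table(kmeans = km$cluster, hierarchical = hc_clusters)
```

On real data the number of clusters is unknown, so the choice of `centers` and `k` would typically be explored rather than fixed as it is here.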

To guide you through these next steps, we provide an R Markdown template that you can fill out:

lab3_macoskoTemplate

We also provide the solution to this exercise here (sections on clustering, marker gene detection and annotation):

lab3_macoskoWorkflow

