Introduction
We here make use of the dataset published by Anna Cuomo et al. (last author Oliver Stegle), which we will refer to as the iPSC dataset. The paper that describes this dataset is available at https://www.nature.com/articles/s41467-020-14457-z.
In the experiment, the authors used induced pluripotent stem cell (iPSC) lines derived from 125 healthy human donors to study endoderm differentiation. In this process, iPSCs differentiate into endoderm cells over approximately three days. The authors therefore cultured the iPSC lines and allowed them to differentiate for three days. During the experiment, cells were harvested at four time points: day0 (directly at incubation), day1, day2 and day3. Given what is known about endoderm differentiation, these time points should correspond to different cell types: day0 cells are (undifferentiated) iPSCs, day1 cells are mesendoderm cells, day2 cells are “intermediate” cells and day3 cells are fully differentiated endoderm cells.
This dataset was generated using the SMART-Seq2 scRNA-seq protocol.
The final goal of the experiment was to characterize population variation in the process of endoderm differentiation.
Download data
For this lab session, we will work with a subset of the data, i.e., the data for the first 15 donors (in alphabetical order) in the experiment. This subset can be downloaded through the Belnet FileSender link provided by email, https://filesender.belnet.be/?s=download&token=eb8136df-67d3-4869-b2a9-f65767054e81.
The original data (125 donors) can be downloaded from Zenodo (https://zenodo.org/record/3625024). At the bottom of that web page, we can download the files raw_counts.csv.zip and cell_metadata_cols.tsv and store them locally. We do not recommend doing this during the lab session, to avoid overloading the system.
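For readers who do download the full files after the session, the following is a minimal sketch of how a counts table and a cell metadata table could be combined into a SingleCellExperiment object. It writes two tiny stand-in files so that it runs without the real download. The file layout (genes in rows, cells in columns, identifiers in the first column) and the column names `cell`, `day` and `donor` are assumptions for illustration and should be checked against the real files.

```r
library(SingleCellExperiment)

# Write two tiny stand-in files so the sketch runs without the real download.
# ASSUMPTION: genes in rows, cells in columns, gene IDs in the first column.
tmp <- tempdir()
counts_file <- file.path(tmp, "toy_counts.csv")
meta_file <- file.path(tmp, "toy_metadata.tsv")

toy_counts <- data.frame(
  gene = c("geneA", "geneB", "geneC"),
  cell1 = c(0, 5, 2),
  cell2 = c(3, 0, 1)
)
write.csv(toy_counts, counts_file, row.names = FALSE)

toy_meta <- data.frame(
  cell = c("cell1", "cell2"),   # hypothetical column names
  day = c("day0", "day1"),
  donor = c("donorX", "donorY")
)
write.table(toy_meta, meta_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read both tables back in, using the first column as row names.
counts <- as.matrix(read.csv(counts_file, row.names = 1, check.names = FALSE))
meta <- read.delim(meta_file, row.names = 1)

# Put the metadata rows in the same order as the count matrix columns.
stopifnot(setequal(colnames(counts), rownames(meta)))
meta <- meta[colnames(counts), , drop = FALSE]

toy_sce <- SingleCellExperiment(
  assays = list(counts = counts),
  colData = meta
)
toy_sce
```

Reordering the metadata explicitly matters because nothing guarantees that the two files list the cells in the same order.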
Import data
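First we read in the data, which are stored as a SingleCellExperiment object:
library(SingleCellExperiment) # needed to work with and print the object
sce <- readRDS("/Users/jg/Desktop/sce_15_cuomo.rds") # change to your data path
sce # prints a summary of the dimensions, assays, rowData and colData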
Obtaining and including rowData
Assess what is currently stored in the rowData of the SingleCellExperiment object.
Retrieve relevant information from Ensembl BioMart using the biomaRt package. Make sure to select the right values for the dataset and version arguments of the useEnsembl function (these can be retrieved from the Cuomo et al. paper).
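The two steps above can be sketched as follows. This is one possible approach, not the reference solution. The helper names `fetch_annotation` and `add_annotation` are hypothetical, the choice of attributes is an assumption, and the Ensembl version is deliberately left as an argument because it must be looked up in the paper. The query function is defined but not called, since it needs an internet connection; a hand-made annotation table stands in for its result.

```r
library(SingleCellExperiment)

# Query Ensembl for gene-level annotation (hypothetical helper, not called
# below because it needs an internet connection).
# ASSUMPTION: `ensembl_version` must be looked up in the Cuomo et al. paper.
fetch_annotation <- function(gene_ids, ensembl_version) {
  mart <- biomaRt::useEnsembl(
    biomart = "genes",
    dataset = "hsapiens_gene_ensembl",
    version = ensembl_version
  )
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "hgnc_symbol", "chromosome_name"),
    filters = "ensembl_gene_id",
    values = gene_ids,
    mart = mart
  )
}

# Attach an annotation table to rowData, keeping the row order of the object.
# ASSUMPTION: the row names of the object are Ensembl gene identifiers.
add_annotation <- function(sce, annot) {
  idx <- match(rownames(sce), annot$ensembl_gene_id)
  rowData(sce)$hgnc_symbol <- annot$hgnc_symbol[idx]
  rowData(sce)$chromosome_name <- annot$chromosome_name[idx]
  sce
}

# Toy object and hand-made annotation table standing in for the real query.
toy_sce <- SingleCellExperiment(
  assays = list(counts = matrix(
    1:6, nrow = 3,
    dimnames = list(c("ENSG_toy1", "ENSG_toy2", "ENSG_toy3"),
                    c("cell1", "cell2"))
  ))
)
toy_annot <- data.frame(
  ensembl_gene_id = c("ENSG_toy3", "ENSG_toy1"),
  hgnc_symbol = c("SYM3", "SYM1"),
  chromosome_name = c("MT", "1")
)

rowData(toy_sce)   # no columns before annotation
toy_sce <- add_annotation(toy_sce, toy_annot)
rowData(toy_sce)   # genes without a match get NA
```

Using `match()` rather than a merge keeps the annotation aligned with the existing row order, and makes genes that are missing from the query result visible as NA.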
Quality control
Calculate QC variables
Use perCellQCMetrics to compute QC metrics.
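As a minimal sketch of this step, the example below runs perCellQCMetrics from the scuttle package on a small simulated object. The toy counts and the `chromosome_name` column in rowData are assumptions made for illustration; on the real data, the mitochondrial genes would be identified from whatever annotation was added to rowData.

```r
library(SingleCellExperiment)
library(scuttle)

set.seed(1)
# Toy count matrix: 20 genes x 10 cells, standing in for the real data.
toy_counts <- matrix(
  rpois(200, lambda = 5), nrow = 20,
  dimnames = list(paste0("gene", 1:20), paste0("cell", 1:10))
)
toy_sce <- SingleCellExperiment(assays = list(counts = toy_counts))

# ASSUMPTION: a rowData column holds the chromosome of each gene, with
# mitochondrial genes labelled "MT".
rowData(toy_sce)$chromosome_name <- rep(c("MT", "1"), times = c(3, 17))
is_mito <- rowData(toy_sce)$chromosome_name == "MT"

# Per-cell metrics: library size (sum), number of detected genes (detected)
# and the corresponding quantities for the mitochondrial gene subset.
qc <- perCellQCMetrics(toy_sce, subsets = list(Mito = is_mito))
colnames(qc)

# Store the metrics alongside the existing cell metadata.
colData(toy_sce) <- cbind(colData(toy_sce), qc)
```

The name given to the subset (`Mito` here) determines the column names of the result, which is where the `subsets_Mito_percent` variable comes from.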
Exploratory data analysis
QC using adaptive thresholds
Visualize the cells that are going to be removed. Are you happy with the selection criterion, i.e., does it appear that we are only removing technical artefacts or could we be removing biological signal as well?
To do this, try coloring the “detected” versus “subsets_Mito_percent” plot and the “sum” versus “detected” plot by biologically meaningful metadata, such as the day on which the cells were harvested.
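A minimal sketch of adaptive thresholding is given below, using isOutlier from the scuttle package on a hand-made QC table. The numbers and the `day` column are invented for illustration, and the choice of three median absolute deviations (MADs) is a common default rather than a recommendation for this dataset. Cross-tabulating the discarded cells against a biological variable is a quick numerical counterpart to the colored plots.

```r
library(scuttle)

# Toy QC table standing in for the output of perCellQCMetrics().
# ASSUMPTION: all values and the `day` column are invented for illustration.
qc <- data.frame(
  sum = c(9000, 11000, 10000, 9500, 10500, 10200, 9800, 200),
  detected = c(3000, 3200, 3100, 2900, 3300, 3150, 3050, 150),
  subsets_Mito_percent = c(5, 6, 4, 5, 6, 5, 4, 40),
  day = rep(c("day0", "day3"), each = 4)
)

# Flag cells more than 3 MADs from the median, on the side that suggests low
# quality. Library size and detected genes are compared on the log scale.
low_sum <- isOutlier(qc$sum, nmads = 3, type = "lower", log = TRUE)
low_detected <- isOutlier(qc$detected, nmads = 3, type = "lower", log = TRUE)
high_mito <- isOutlier(qc$subsets_Mito_percent, nmads = 3, type = "higher")

# A cell is discarded if it fails any of the three criteria.
qc$discard <- as.logical(low_sum | low_detected | high_mito)

# Thresholds chosen by the procedure for the library size.
attr(low_sum, "thresholds")

# Are the discarded cells spread evenly over the time points?
table(discard = qc$discard, day = qc$day)
```

If the discarded cells concentrate in one time point or one donor, the thresholds may be removing biological signal rather than technical artefacts. On a SingleCellExperiment object, the plots themselves can be colored with the `colour_by` argument of `plotColData` from the scater package.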
Remove empty droplets
What do you think of this step for the analysis of this dataset?
Identifying and removing doublets
What do you think of this step for the analysis of this dataset?
Normalization