The data used in this workflow comes from an RNA-seq experiment where airway smooth muscle cells were treated with dexamethasone, a synthetic glucocorticoid steroid with anti-inflammatory effects (Himes et al. 2014). Glucocorticoids are used, for example, by people with asthma to reduce inflammation of the airways. In the experiment, four human airway smooth muscle cell lines were treated with 1 micromolar dexamethasone for 18 hours. For each of the four cell lines, we have a treated and an untreated sample. For more description of the experiment see the article, PubMed entry 24926665, and for raw data see the GEO entry GSE52778.
In most cases you will have several projects on the same organism, so the same index can be used for all of them. We typically build the index once and reuse it over and over again. I will therefore build the index in a separate script. Therefore, I will construct the Rsubread index for th

# Data

FastQ files with a small subset of the reads can be found on https://github.com/statOmics/SGA2019/tree/data-rnaseq
library("Rsubread")
## Warning: package 'Rsubread' was built under R version 3.6.1
All reads in the subsampled FastQ files map to chromosome 1. We therefore build an index for chromosome 1 of the human genome only, which saves time and disk space. Normally we would build the index from the primary assembly FASTA file. We downloaded the FASTA data for human from Ensembl (http://www.ensembl.org/info/data/ftp/index.html).
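The download step itself is not shown in this workflow. A minimal sketch of how it could be scripted follows. The helper names `ensembl_chr_fasta_url()` and `fetch_if_missing()` are hypothetical, and the Ensembl release number is an assumption because the text does not state which release was used. The directory layout `pub/release-<n>/fasta/homo_sapiens/dna/` is the one used on the Ensembl FTP site.

```r
# Hypothetical helper: build the Ensembl URL for one GRCh38 chromosome FASTA.
# `release` is an assumption; the workflow does not say which release was used.
ensembl_chr_fasta_url <- function(release, chromosome = "1") {
  file_name <- sprintf("Homo_sapiens.GRCh38.dna.chromosome.%s.fa.gz", chromosome)
  sprintf(
    "https://ftp.ensembl.org/pub/release-%d/fasta/homo_sapiens/dna/%s",
    as.integer(release), file_name
  )
}

# Hypothetical helper: download a file only when it is not already on disk.
fetch_if_missing <- function(url, destfile = basename(url)) {
  if (!file.exists(destfile)) {
    download.file(url, destfile, mode = "wb")
  }
  invisible(destfile)
}

# Example call (release 98 is only an illustrative choice):
# homoGenome <- fetch_if_missing(ensembl_chr_fasta_url(release = 98))
```

Skipping the download when the file already exists keeps the script cheap to rerun, which fits the idea of preparing the reference once per organism.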
homoGenome <- "Homo_sapiens.GRCh38.dna.chromosome.1.fa.gz"
dir.create("airway_index", showWarnings = FALSE)  # portable alternative to system("mkdir ...")
indexName <- "airway_index/homo_sapiens_GRCh38_dna_chromosome_1_rsubread"
buildindex(basename = indexName, reference = homoGenome)
##
## ========== _____ _ _ ____ _____ ______ _____
## ===== / ____| | | | _ \| __ \| ____| /\ | __ \
## ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
## ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
## ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
## ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
## Rsubread 1.34.7
##
## //================================= setting ==================================\\
## || ||
## || Index name : homo_sapiens_GRCh38_dna_chromosome_1_rsubread ||
## || Index space : base space ||
## || Index split : no-split ||
## || Repeat threshold : 100 repeats ||
## || Gapped index : no ||
## || ||
## || Free / total memory : 4.3GB / 16.0GB ||
## || ||
## || Input files : 1 file in total ||
## || o Homo_sapiens.GRCh38.dna.chromosome.1.fa.gz ||
## || ||
## \\============================================================================//
##
## //================================= Running ==================================\\
## || ||
## || Check the integrity of provided reference sequences ... ||
## || No format issues were found ||
## || Scan uninformative subreads in reference sequences ... ||
## || 51717 uninformative subreads were found. ||
## || These subreads were excluded from index building. ||
## || Estimate the index size... ||
## || 116%, 0 mins elapsed, rate=595197.2k bps/s ||
## || 124%, 0 mins elapsed, rate=53981.5k bps/s ||
## || 133%, 0 mins elapsed, rate=30560.7k bps/s ||
## || 141%, 0 mins elapsed, rate=21763.3k bps/s ||
## || 149%, 0 mins elapsed, rate=17449.0k bps/s ||
## || 158%, 0 mins elapsed, rate=14809.2k bps/s ||
## || 166%, 0 mins elapsed, rate=13243.2k bps/s ||
## || 174%, 0 mins elapsed, rate=12838.0k bps/s ||
## || 183%, 0 mins elapsed, rate=11304.2k bps/s ||
## || 191%, 0 mins elapsed, rate=10486.0k bps/s ||
## || 199%, 0 mins elapsed, rate=9853.5k bps/s ||
## || 208%, 0 mins elapsed, rate=9327.8k bps/s ||
## || 2.5 GB of memory is needed for index building. ||
## || Build the index... ||
## || 8%, 0 mins elapsed, rate=1419.6k bps/s ||
## || 16%, 0 mins elapsed, rate=1392.3k bps/s ||
## || 24%, 0 mins elapsed, rate=1438.0k bps/s ||
## || 33%, 0 mins elapsed, rate=1521.8k bps/s ||
## || 41%, 1 mins elapsed, rate=1569.6k bps/s ||
## || 49%, 1 mins elapsed, rate=1647.1k bps/s ||
## || 58%, 1 mins elapsed, rate=1825.6k bps/s ||
## || 66%, 1 mins elapsed, rate=1849.0k bps/s ||
## || 74%, 1 mins elapsed, rate=1861.1k bps/s ||
## || 83%, 1 mins elapsed, rate=1874.7k bps/s ||
## || 91%, 2 mins elapsed, rate=1884.4k bps/s ||
## || Save current index block... ||
## || [ 0.0% finished ] ||
## || [ 10.0% finished ] ||
## || [ 20.0% finished ] ||
## || [ 30.0% finished ] ||
## || [ 40.0% finished ] ||
## || [ 50.0% finished ] ||
## || [ 60.0% finished ] ||
## || [ 70.0% finished ] ||
## || [ 80.0% finished ] ||
## || [ 90.0% finished ] ||
## || [ 100.0% finished ] ||
## || ||
## || Total running time: 5.4 minutes. ||
## ||Index airway_index/homo_sapiens_GRCh38_dna_chromosome_1_rsubread was s ... ||
## || ||
## \\============================================================================//
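
Because the index is meant to be built once and reused across projects, it can help to guard the build so that rerunning the script does not repeat the 5-minute step. The sketch below is one way to do that, not part of the original workflow. The helpers `index_exists()` and `build_index_once()` are hypothetical, and the check rests on the assumption that `buildindex()` writes files whose names start with the index basename. The `builder` argument is added only so that the logic can be tried out without Rsubread installed.

```r
# Hypothetical helper: TRUE if the index directory already holds at least one
# file whose name starts with the index basename (a crude existence check).
index_exists <- function(index_name) {
  index_dir <- dirname(index_name)
  if (!dir.exists(index_dir)) {
    return(FALSE)
  }
  any(startsWith(list.files(index_dir), basename(index_name)))
}

# Hypothetical wrapper: build the index only when no index files are found.
# Returns TRUE (invisibly) if a build was run, FALSE if it was skipped.
build_index_once <- function(index_name, reference, builder = NULL) {
  if (index_exists(index_name)) {
    message("Index found, skipping build: ", index_name)
    return(invisible(FALSE))
  }
  dir.create(dirname(index_name), recursive = TRUE, showWarnings = FALSE)
  if (is.null(builder)) {
    builder <- Rsubread::buildindex  # default: the real index builder
  }
  builder(basename = index_name, reference = reference)
  invisible(TRUE)
}

# Example call with the names used above:
# build_index_once(indexName, homoGenome)
```

The prefix check would also match an unrelated index whose name merely begins with the same string, so each index is best kept in its own directory, as is done with `airway_index` here.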