After fertilization but prior to the onset of zygotic transcription, the C. elegans zygote cleaves asymmetrically to create the anterior AB and posterior P1 blastomeres, each of which goes on to generate distinct cell lineages. To understand how patterns of RNA inheritance and abundance arise after this first asymmetric cell division, we pooled hand-dissected AB and P1 blastomeres and performed RNA-seq. The data are available from GEO under accession GSE59943: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE59943
library(tidyverse)
## ── Attaching packages ───────────────────────────────────────────── tidyverse 1.3.0 ──
## ✔ ggplot2 3.3.2 ✔ purrr 0.3.4
## ✔ tibble 3.0.4 ✔ dplyr 1.0.2
## ✔ tidyr 1.1.2 ✔ stringr 1.4.0
## ✔ readr 1.4.0 ✔ forcats 0.5.0
## ── Conflicts ──────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(Rsubread)
library("GEOquery")
## Loading required package: Biobase
## Loading required package: BiocGenerics
## Loading required package: parallel
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
##
## clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
## clusterExport, clusterMap, parApply, parCapply, parLapply,
## parLapplyLB, parRapply, parSapply, parSapplyLB
## The following objects are masked from 'package:dplyr':
##
## combine, intersect, setdiff, union
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## Filter, Find, Map, Position, Reduce, anyDuplicated, append,
## as.data.frame, basename, cbind, colnames, dirname, do.call,
## duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
## lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
## pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
## tapply, union, unique, unsplit, which, which.max, which.min
## Welcome to Bioconductor
##
## Vignettes contain introductory material; view with
## 'browseVignettes()'. To cite Bioconductor, see
## 'citation("Biobase")', and for packages 'citation("pkgname")'.
## Setting options('download.file.method.GEOquery'='auto')
## Setting options('GEOquery.inmemory.gpl'=FALSE)
We will use the Rsubread read mapper because it is available in R on all platforms (Linux, Windows and Mac). For real projects I prefer STAR.
Get the sample information from GEO with getGEO.
gse <- getGEO("GSE59943")
## Found 2 file(s)
## GSE59943-GPL13657_series_matrix.txt.gz
##
## ── Column specification ──────────────────────────────────────────────────────────────
## cols(
## ID_REF = col_character(),
## GSM1462557 = col_character(),
## GSM1462558 = col_character(),
## GSM1462559 = col_character(),
## GSM1462560 = col_character()
## )
## File stored at:
## /var/folders/p1/3js9hvbs473g1klcmqm0d8wm0000gn/T//RtmpdzPbIz/GPL13657.soft
## GSE59943-GPL9269_series_matrix.txt.gz
##
## ── Column specification ──────────────────────────────────────────────────────────────
## cols(
## ID_REF = col_character(),
## GSM1462555 = col_character(),
## GSM1462556 = col_character()
## )
## File stored at:
## /var/folders/p1/3js9hvbs473g1klcmqm0d8wm0000gn/T//RtmpdzPbIz/GPL9269.soft
length(gse)
## [1] 2
getGEO returns two objects because the samples were sequenced on two different platforms (GPL13657 and GPL9269). We combine the sample data from both objects and add a SampleName column so that the information can be linked to that from SRA.
pdata <- rbind(pData(gse[[1]]), pData(gse[[2]]))
pdata$SampleName <- rownames(pdata)
Download the SRA run information. To link the sample information to the sequencing runs, go to the corresponding SRA page and save the run information via the “Send to: File” button. This file can also be used to write a script that downloads the sequencing files from the web. Note that SRA files can be converted to FASTQ files with the fastq-dump tool of sra-tools.
sraInfo <- read.csv("SraRunInfoElegans.csv")
pdata <- merge(pdata, sraInfo, by = "SampleName")
pdata$Run
## [1] "SRR1532959" "SRR1532960" "SRR1532961" "SRR1532962" "SRR1532963"
## [6] "SRR1532964"
The run accession is also the name of the SRA file, so the alignment file names can be linked to the experiment via the SRA run information.
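As a minimal sketch of the download script mentioned above, the run accessions can be pasted into `fastq-dump` commands. The `runs` vector copies the accessions printed by `pdata$Run`, the output directory name `fastQ` and the script name are assumptions, and the commands are only written to a file here, not executed. Running the script would require sra-tools to be installed and on the PATH.

```r
# Run accessions as printed by pdata$Run; in the analysis itself
# this vector would simply be pdata$Run.
runs <- c("SRR1532959", "SRR1532960", "SRR1532961",
          "SRR1532962", "SRR1532963", "SRR1532964")

# Hypothetical output directory for the FASTQ files.
outDir <- "fastQ"

# One fastq-dump call per run: --gzip compresses the output and
# --outdir sets the destination directory.
cmds <- paste("fastq-dump --gzip --outdir", outDir, runs)

# Write the commands to a shell script instead of running them here.
scriptFile <- file.path(tempdir(), "download_fastq.sh")
writeLines(c("#!/bin/sh", cmds), scriptFile)
cmds[1]
```

Writing the commands to a file first makes it easy to inspect them, or to hand them to a cluster scheduler, before any large download starts.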
Download Caenorhabditis_elegans.WBcel235.dna.toplevel.fa.gz from Ensembl. Ensembl provides no primary assembly file for C. elegans because the assembly contains no alternative haplotypes, so the toplevel file is equivalent to the primary assembly.
path <- "~/Downloads/elegans/"
elegansGenome <- paste0(path, "Caenorhabditis_elegans.WBcel235.dna.toplevel.fa.gz")
system(paste0("mkdir ", path, "elegans_index"))
indexName <- paste0(path, "elegans_index/elegans_index_WBcel235_rsubread")
buildindex(basename = indexName, reference = elegansGenome)
##
## ========== _____ _ _ ____ _____ ______ _____
## ===== / ____| | | | _ \| __ \| ____| /\ | __ \
## ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
## ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
## ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
## ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
## Rsubread 2.2.6
##
## //================================= setting ==================================\\
## || ||
## || Index name : elegans_index_WBcel235_rsubread ||
## || Index space : base space ||
## || Index split : no-split ||
## || Repeat threshold : 100 repeats ||
## || Gapped index : no ||
## || ||
## || Free / total memory : 4.2GB / 16.0GB ||
## || ||
## || Input files : 1 file in total ||
## || o Caenorhabditis_elegans.WBcel235.dna.topl ... ||
## || ||
## \\============================================================================//
##
## //================================= Running ==================================\\
## || ||
## || Check the integrity of provided reference sequences ... ||
## || No format issues were found ||
## || Scan uninformative subreads in reference sequences ... ||
## || 6850 uninformative subreads were found. ||
## || These subreads were excluded from index building. ||
## || Estimate the index size... ||
## || 116%, 0 mins elapsed, rate=214724.1k bps/s ||
## || 124%, 0 mins elapsed, rate=49189.3k bps/s ||
## || 133%, 0 mins elapsed, rate=30313.3k bps/s ||
## || 141%, 0 mins elapsed, rate=22588.0k bps/s ||
## || 149%, 0 mins elapsed, rate=18443.3k bps/s ||
## || 158%, 0 mins elapsed, rate=15538.5k bps/s ||
## || 166%, 0 mins elapsed, rate=13745.9k bps/s ||
## || 174%, 0 mins elapsed, rate=12334.8k bps/s ||
## || 183%, 0 mins elapsed, rate=11351.6k bps/s ||
## || 191%, 0 mins elapsed, rate=10591.4k bps/s ||
## || 199%, 0 mins elapsed, rate=9801.3k bps/s ||
## || 208%, 0 mins elapsed, rate=9228.2k bps/s ||
## || 1.9 GB of memory is needed for index building. ||
## || Build the index... ||
## || 8%, 0 mins elapsed, rate=1618.7k bps/s ||
## || 16%, 0 mins elapsed, rate=1784.9k bps/s ||
## || 24%, 0 mins elapsed, rate=1877.7k bps/s ||
## || 33%, 0 mins elapsed, rate=1895.3k bps/s ||
## || 41%, 0 mins elapsed, rate=1878.2k bps/s ||
## || 49%, 0 mins elapsed, rate=1934.8k bps/s ||
## || 58%, 0 mins elapsed, rate=1970.6k bps/s ||
## || 66%, 0 mins elapsed, rate=1982.1k bps/s ||
## || 74%, 0 mins elapsed, rate=2001.9k bps/s ||
## || 83%, 0 mins elapsed, rate=2028.4k bps/s ||
## || 91%, 0 mins elapsed, rate=1999.6k bps/s ||
## || Save current index block... ||
## || [ 0.0% finished ] ||
## || [ 10.0% finished ] ||
## || [ 20.0% finished ] ||
## || [ 30.0% finished ] ||
## || [ 40.0% finished ] ||
## || [ 50.0% finished ] ||
## || [ 60.0% finished ] ||
## || [ 70.0% finished ] ||
## || [ 80.0% finished ] ||
## || [ 90.0% finished ] ||
## || [ 100.0% finished ] ||
## || ||
## || Total running time: 2.2 minutes. ||
## ||Index /Users/lclement/Downloads/elegans/elegans_index/elegans_index_WB ... ||
## || ||
## \\============================================================================//
fastqDir <- paste0(path, "fastQ")
fls <- list.files(fastqDir, "fastq.gz", full = TRUE)
names(fls) <- sub("small.fastq.gz", "", basename(fls))
bamDir <- paste0(path, "bamDir")
system(paste0("mkdir ", bamDir))
bamfls <- paste0(bamDir, "/", names(fls), ".bam")
names(bamfls) <- names(fls)
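The directories above are created through a shell `mkdir` call, which may not behave the same way on every platform. A minimal base R sketch of a shell-free alternative is given below. The names `projectDir`, `indexDir`, `bamDirPortable`, `flsToy` and `bamToy` are hypothetical, and a temporary directory stands in for the real project path.

```r
# Stand-in for the project directory; the analysis uses "~/Downloads/elegans/".
projectDir <- file.path(tempdir(), "elegans")

# file.path() builds paths without manual "/" handling, and dir.create()
# creates directories without calling a shell.
indexDir <- file.path(projectDir, "elegans_index")
bamDirPortable <- file.path(projectDir, "bamDir")
for (d in c(indexDir, bamDirPortable)) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Derive BAM file names from FASTQ file names, following the naming
# scheme used in this document.
flsToy <- c("SRR1532959small.fastq.gz", "SRR1532960small.fastq.gz")
sampleNames <- sub("small.fastq.gz", "", flsToy, fixed = TRUE)
bamToy <- file.path(bamDirPortable, paste0(sampleNames, ".bam"))
basename(bamToy)
```

With `showWarnings = FALSE` the call can be repeated when the directory already exists, so the document can be re-knitted without a warning.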
The offset for the Phred scores is 64. Information on the Illumina quality encoding can be found in the quality control step with FastQC.
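If no FastQC report is at hand, the offset can also be guessed from the quality strings themselves. The sketch below uses a common heuristic: Phred+33 encodings start at ASCII code 33 and Phred+64 encodings at 64, so any quality character with a code below 59 points to Phred+33. The function name `guessPhredOffset` and the quality strings are made up for illustration and are not taken from the data. The heuristic can misclassify a small sample of uniformly high-quality Phred+33 reads.

```r
# Guess the Phred offset from FASTQ quality strings.
# Any character with an ASCII code below 59 is taken to indicate Phred+33;
# otherwise Phred+64 is assumed.
guessPhredOffset <- function(qualityStrings) {
  codes <- unlist(lapply(qualityStrings, utf8ToInt))
  if (min(codes) < 59) 33 else 64
}

# Made-up quality strings for illustration (not taken from the data).
qual64 <- c("hhhhgggfffeeeBBB", "ggggffffeeeedddd")
qual33 <- c("IIIIHHHGGG###!!!", "FFFFEEEEDDDD5555")

guessPhredOffset(qual64)
guessPhredOffset(qual33)
```

With an offset of 64, a Phred score range of [2,40] corresponds to the characters `B` (code 66) to `h` (code 104), as in the first toy example.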
phredOffset <- 64
align(index = indexName,
      readfile1 = fls,
      input_format = "gzFASTQ",
      output_format = "BAM",
      output_file = bamfls,
      phredOffset = phredOffset)
##
## ========== _____ _ _ ____ _____ ______ _____
## ===== / ____| | | | _ \| __ \| ____| /\ | __ \
## ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
## ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
## ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
## ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
## Rsubread 2.2.6
##
## //================================= setting ==================================\\
## || ||
## || Function : Read alignment (RNA-Seq) ||
## || Input file : SRR1532959small.fastq.gz ||
## || Output file : SRR1532959.bam (BAM) ||
## || Index name : elegans_index_WBcel235_rsubread ||
## || ||
## || ------------------------------------ ||
## || ||
## || Threads : 1 ||
## || Phred offset : 64 ||
## || Min votes : 3 / 10 ||
## || Max mismatches : 3 ||
## || Max indel length : 5 ||
## || Report multi-mapping reads : yes ||
## || Max alignments per multi-mapping read : 1 ||
## || ||
## \\============================================================================//
##
## //================ Running (05-Nov-2020 17:34:51, pid=77005) =================\\
## || ||
## || Check the input reads. ||
## || The input file contains base space reads. ||
## || Initialise the memory objects. ||
## || Estimate the mean read length. ||
## || The range of Phred scores observed in the data is [2,40] ||
## || Create the output BAM file. ||
## || Check the index. ||
## || Init the voting space. ||
## || Global environment is initialised. ||
## || Load the 1-th index block... ||
## || The index block has been loaded. ||
## || Start read mapping in chunk. ||
## || 0% completed, 1.4 mins elapsed, rate=20.9k reads per second ||
## || 7% completed, 1.4 mins elapsed, rate=44.8k reads per second ||
## || 13% completed, 1.5 mins elapsed, rate=47.7k reads per second ||
## || 19% completed, 1.5 mins elapsed, rate=48.4k reads per second ||
## || 26% completed, 1.5 mins elapsed, rate=47.3k reads per second ||
## || 33% completed, 1.6 mins elapsed, rate=48.9k reads per second ||
## || 39% completed, 1.6 mins elapsed, rate=50.1k reads per second ||
## || 45% completed, 1.6 mins elapsed, rate=50.9k reads per second ||
## || 52% completed, 1.7 mins elapsed, rate=52.7k reads per second ||
## || 59% completed, 1.7 mins elapsed, rate=54.3k reads per second ||
## || 66% completed, 1.7 mins elapsed, rate=54.7k reads per second ||
## || 70% completed, 1.8 mins elapsed, rate=11.6k reads per second ||
## || 73% completed, 1.8 mins elapsed, rate=12.0k reads per second ||
## || 76% completed, 1.8 mins elapsed, rate=12.4k reads per second ||
## || 80% completed, 1.9 mins elapsed, rate=12.8k reads per second ||
## || 83% completed, 1.9 mins elapsed, rate=13.1k reads per second ||
## || 86% completed, 1.9 mins elapsed, rate=13.4k reads per second ||
## || 89% completed, 1.9 mins elapsed, rate=13.8k reads per second ||
## || 93% completed, 2.0 mins elapsed, rate=14.1k reads per second ||
## || 96% completed, 2.0 mins elapsed, rate=14.4k reads per second ||
## || ||
## || Completed successfully. ||
## || ||
## \\==================================== ====================================//
##
## //================================ Summary =================================\\
## || ||
## || Total reads : 1783870 ||
## || Mapped : 1685472 (94.5%) ||
## || Uniquely mapped : 1632698 ||
## || Multi-mapping : 52774 ||
## || ||
## || Unmapped : 98398 ||
## || ||
## || Indels : 2512 ||
## || ||
## || Running time : 2.0 minutes ||
## || ||
## \\============================================================================//
##
##
## ========== _____ _ _ ____ _____ ______ _____
## ===== / ____| | | | _ \| __ \| ____| /\ | __ \
## ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
## ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
## ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
## ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
## Rsubread 2.2.6
##
## //================================= setting ==================================\\
## || ||
## || Function : Read alignment (RNA-Seq) ||
## || Input file : SRR1532960small.fastq.gz ||
## || Output file : SRR1532960.bam (BAM) ||
## || Index name : elegans_index_WBcel235_rsubread ||
## || ||
## || ------------------------------------ ||
## || ||
## || Threads : 1 ||
## || Phred offset : 64 ||
## || Min votes : 3 / 10 ||
## || Max mismatches : 3 ||
## || Max indel length : 5 ||
## || Report multi-mapping reads : yes ||
## || Max alignments per multi-mapping read : 1 ||
## || ||
## \\============================================================================//
##
## //================ Running (05-Nov-2020 17:36:52, pid=77005) =================\\
## || ||
## || Check the input reads. ||
## || The input file contains base space reads. ||
## || Initialise the memory objects. ||
## || Estimate the mean read length. ||
## || The range of Phred scores observed in the data is [2,40] ||
## || Create the output BAM file. ||
## || Check the index. ||
## || Init the voting space. ||
## || Global environment is initialised. ||
## || Load the 1-th index block... ||
## || The index block has been loaded. ||
## || Start read mapping in chunk. ||
## || 0% completed, 1.4 mins elapsed, rate=22.0k reads per second ||
## || 7% completed, 1.5 mins elapsed, rate=52.3k reads per second ||
## || 14% completed, 1.5 mins elapsed, rate=60.3k reads per second ||
## || 20% completed, 1.5 mins elapsed, rate=62.6k reads per second ||
## || 27% completed, 1.5 mins elapsed, rate=63.6k reads per second ||
## || 34% completed, 1.6 mins elapsed, rate=60.9k reads per second ||
## || 40% completed, 1.6 mins elapsed, rate=61.9k reads per second ||
## || 47% completed, 1.6 mins elapsed, rate=63.6k reads per second ||
## || 54% completed, 1.6 mins elapsed, rate=65.4k reads per second ||
## || 61% completed, 1.7 mins elapsed, rate=66.1k reads per second ||
## || 70% completed, 1.7 mins elapsed, rate=10.9k reads per second ||
## || 73% completed, 1.7 mins elapsed, rate=11.4k reads per second ||
## || 76% completed, 1.8 mins elapsed, rate=11.7k reads per second ||
## || 79% completed, 1.8 mins elapsed, rate=12.1k reads per second ||
## || 83% completed, 1.8 mins elapsed, rate=12.5k reads per second ||
## || 86% completed, 1.8 mins elapsed, rate=12.9k reads per second ||
## || 89% completed, 1.8 mins elapsed, rate=13.2k reads per second ||
## || 93% completed, 1.8 mins elapsed, rate=13.6k reads per second ||
## || 96% completed, 1.9 mins elapsed, rate=13.9k reads per second ||
## || ||
## || Completed successfully. ||
## || ||
## \\==================================== ====================================//
##
## //================================ Summary =================================\\
## || ||
## || Total reads : 1610969 ||
## || Mapped : 1353023 (84.0%) ||
## || Uniquely mapped : 1300506 ||
## || Multi-mapping : 52517 ||
## || ||
## || Unmapped : 257946 ||
## || ||
## || Indels : 2303 ||
## || ||
## || Running time : 1.9 minutes ||
## || ||
## \\============================================================================//
##
##
## ========== _____ _ _ ____ _____ ______ _____
## ===== / ____| | | | _ \| __ \| ____| /\ | __ \
## ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
## ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
## ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
## ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
## Rsubread 2.2.6
##
## //================================= setting ==================================\\
## || ||
## || Function : Read alignment (RNA-Seq) ||
## || Input file : SRR1532961small.fastq.gz ||
## || Output file : SRR1532961.bam (BAM) ||
## || Index name : elegans_index_WBcel235_rsubread ||
## || ||
## || ------------------------------------ ||
## || ||
## || Threads : 1 ||
## || Phred offset : 64 ||
## || Min votes : 3 / 10 ||
## || Max mismatches : 3 ||
## || Max indel length : 5 ||
## || Report multi-mapping reads : yes ||
## || Max alignments per multi-mapping read : 1 ||
## || ||
## \\============================================================================//
##
## //================ Running (05-Nov-2020 17:38:46, pid=77005) =================\\
## || ||
## || Check the input reads. ||
## || The input file contains base space reads. ||
## || Initialise the memory objects. ||
## || Estimate the mean read length. ||
## || The range of Phred scores observed in the data is [2,39] ||
## || Create the output BAM file. ||
## || Check the index. ||
## || Init the voting space. ||
## || Global environment is initialised. ||
## || Load the 1-th index block... ||
## || The index block has been loaded. ||
## || Start read mapping in chunk. ||
## || 0% completed, 1.8 mins elapsed, rate=49.8k reads per second ||
## || 6% completed, 1.9 mins elapsed, rate=64.2k reads per second ||
## || 12% completed, 1.9 mins elapsed, rate=50.9k reads per second ||
## || 19% completed, 2.0 mins elapsed, rate=48.7k reads per second ||
## || 25% completed, 2.0 mins elapsed, rate=49.1k reads per second ||
## || 31% completed, 2.0 mins elapsed, rate=49.3k reads per second ||
## || 38% completed, 2.1 mins elapsed, rate=50.3k reads per second ||
## || 45% completed, 2.1 mins elapsed, rate=51.8k reads per second ||
## || 51% completed, 2.2 mins elapsed, rate=52.8k reads per second ||
## || 58% completed, 2.2 mins elapsed, rate=53.7k reads per second ||
## || 64% completed, 2.2 mins elapsed, rate=54.3k reads per second ||
## || 70% completed, 2.3 mins elapsed, rate=10.6k reads per second ||
## || 73% completed, 2.4 mins elapsed, rate=10.9k reads per second ||
## || 76% completed, 2.4 mins elapsed, rate=11.2k reads per second ||
## || 80% completed, 2.5 mins elapsed, rate=11.5k reads per second ||
## || 83% completed, 2.5 mins elapsed, rate=11.8k reads per second ||
## || 87% completed, 2.5 mins elapsed, rate=12.1k reads per second ||
## || 90% completed, 2.6 mins elapsed, rate=12.4k reads per second ||
## || 93% completed, 2.6 mins elapsed, rate=12.6k reads per second ||
## || 96% completed, 2.7 mins elapsed, rate=12.8k reads per second ||
## || ||
## || Completed successfully. ||
## || ||
## \\==================================== ====================================//
##
## //================================ Summary =================================\\
## || ||
## || Total reads : 2117313 ||
## || Mapped : 2049912 (96.8%) ||
## || Uniquely mapped : 2005307 ||
## || Multi-mapping : 44605 ||
## || ||
## || Unmapped : 67401 ||
## || ||
## || Indels : 7289 ||
## || ||
## || Running time : 2.7 minutes ||
## || ||
## \\============================================================================//
##
##
## ========== _____ _ _ ____ _____ ______ _____
## ===== / ____| | | | _ \| __ \| ____| /\ | __ \
## ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
## ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
## ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
## ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
## Rsubread 2.2.6
##
## //================================= setting ==================================\\
## || ||
## || Function : Read alignment (RNA-Seq) ||
## || Input file : SRR1532962small.fastq.gz ||
## || Output file : SRR1532962.bam (BAM) ||
## || Index name : elegans_index_WBcel235_rsubread ||
## || ||
## || ------------------------------------ ||
## || ||
## || Threads : 1 ||
## || Phred offset : 64 ||
## || Min votes : 3 / 10 ||
## || Max mismatches : 3 ||
## || Max indel length : 5 ||
## || Report multi-mapping reads : yes ||
## || Max alignments per multi-mapping read : 1 ||
## || ||
## \\============================================================================//
##
## //================ Running (05-Nov-2020 17:41:28, pid=77005) =================\\
## || ||
## || Check the input reads. ||
## || The input file contains base space reads. ||
## || Initialise the memory objects. ||
## || Estimate the mean read length. ||
## || The range of Phred scores observed in the data is [2,40] ||
## || Create the output BAM file. ||
## || Check the index. ||
## || Init the voting space. ||
## || Global environment is initialised. ||
## || Load the 1-th index block... ||
## || The index block has been loaded. ||
## || Start read mapping in chunk. ||
## || 0% completed, 1.4 mins elapsed, rate=6.6k reads per second ||
## || 6% completed, 1.4 mins elapsed, rate=32.6k reads per second ||
## || 12% completed, 1.4 mins elapsed, rate=45.6k reads per second ||
## || 19% completed, 1.5 mins elapsed, rate=51.3k reads per second ||
## || 25% completed, 1.5 mins elapsed, rate=54.7k reads per second ||
## || 31% completed, 1.5 mins elapsed, rate=56.7k reads per second ||
## || 38% completed, 1.5 mins elapsed, rate=57.7k reads per second ||
## || 45% completed, 1.6 mins elapsed, rate=59.2k reads per second ||
## || 52% completed, 1.6 mins elapsed, rate=60.2k reads per second ||
## || 58% completed, 1.6 mins elapsed, rate=60.9k reads per second ||
## || 64% completed, 1.6 mins elapsed, rate=62.0k reads per second ||
## || 70% completed, 1.7 mins elapsed, rate=9.9k reads per second ||
## || 73% completed, 1.7 mins elapsed, rate=10.2k reads per second ||
## || 76% completed, 1.7 mins elapsed, rate=10.6k reads per second ||
## || 80% completed, 1.7 mins elapsed, rate=10.9k reads per second ||
## || 83% completed, 1.7 mins elapsed, rate=11.3k reads per second ||
## || 86% completed, 1.8 mins elapsed, rate=11.6k reads per second ||
## || 90% completed, 1.8 mins elapsed, rate=12.0k reads per second ||
## || 93% completed, 1.8 mins elapsed, rate=12.3k reads per second ||
## || 96% completed, 1.8 mins elapsed, rate=12.6k reads per second ||
## || ||
## || Completed successfully. ||
## || ||
## \\==================================== ====================================//
##
## //================================ Summary =================================\\
## || ||
## || Total reads : 1410013 ||
## || Mapped : 1327093 (94.1%) ||
## || Uniquely mapped : 1299154 ||
## || Multi-mapping : 27939 ||
## || ||
## || Unmapped : 82920 ||
## || ||
## || Indels : 4824 ||
## || ||
## || Running time : 1.8 minutes ||
## || ||
## \\============================================================================//
##
##
## ========== _____ _ _ ____ _____ ______ _____
## ===== / ____| | | | _ \| __ \| ____| /\ | __ \
## ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
## ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
## ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
## ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
## Rsubread 2.2.6
##
## //================================= setting ==================================\\
## || ||
## || Function : Read alignment (RNA-Seq) ||
## || Input file : SRR1532963small.fastq.gz ||
## || Output file : SRR1532963.bam (BAM) ||
## || Index name : elegans_index_WBcel235_rsubread ||
## || ||
## || ------------------------------------ ||
## || ||
## || Threads : 1 ||
## || Phred offset : 64 ||
## || Min votes : 3 / 10 ||
## || Max mismatches : 3 ||
## || Max indel length : 5 ||
## || Report multi-mapping reads : yes ||
## || Max alignments per multi-mapping read : 1 ||
## || ||
## \\============================================================================//
##
## //================ Running (05-Nov-2020 17:43:17, pid=77005) =================\\
## || ||
## || Check the input reads. ||
## || The input file contains base space reads. ||
## || Initialise the memory objects. ||
## || Estimate the mean read length. ||
## || The range of Phred scores observed in the data is [2,41] ||
## || Create the output BAM file. ||
## || Check the index. ||
## || Init the voting space. ||
## || Global environment is initialised. ||
## || Load the 1-th index block... ||
## || The index block has been loaded. ||
## || Start read mapping in chunk. ||
## || 0% completed, 1.3 mins elapsed, rate=61.7k reads per second ||
## || 6% completed, 1.3 mins elapsed, rate=68.5k reads per second ||
## || 13% completed, 1.4 mins elapsed, rate=71.0k reads per second ||
## || 20% completed, 1.4 mins elapsed, rate=70.8k reads per second ||
## || 26% completed, 1.4 mins elapsed, rate=70.7k reads per second ||
## || 33% completed, 1.5 mins elapsed, rate=71.3k reads per second ||
## || 40% completed, 1.5 mins elapsed, rate=70.6k reads per second ||
## || 46% completed, 1.6 mins elapsed, rate=69.7k reads per second ||
## || 53% completed, 1.6 mins elapsed, rate=69.9k reads per second ||
## || 60% completed, 1.7 mins elapsed, rate=69.7k reads per second ||
## || 69% completed, 1.8 mins elapsed, rate=17.8k reads per second ||
## || 73% completed, 1.8 mins elapsed, rate=18.2k reads per second ||
## || 76% completed, 1.9 mins elapsed, rate=18.5k reads per second ||
## || 79% completed, 1.9 mins elapsed, rate=18.7k reads per second ||
## || 83% completed, 2.0 mins elapsed, rate=18.9k reads per second ||
## || 86% completed, 2.1 mins elapsed, rate=19.1k reads per second ||
## || 89% completed, 2.1 mins elapsed, rate=19.4k reads per second ||
## || 93% completed, 2.2 mins elapsed, rate=19.6k reads per second ||
## || 96% completed, 2.2 mins elapsed, rate=19.8k reads per second ||
## || ||
## || Completed successfully. ||
## || ||
## \\==================================== ====================================//
##
## //================================ Summary =================================\\
## || ||
## || Total reads : 2736076 ||
## || Mapped : 2627215 (96.0%) ||
## || Uniquely mapped : 2451220 ||
## || Multi-mapping : 175995 ||
## || ||
## || Unmapped : 108861 ||
## || ||
## || Indels : 11317 ||
## || ||
## || Running time : 2.3 minutes ||
## || ||
## \\============================================================================//
##
##
## ========== _____ _ _ ____ _____ ______ _____
## ===== / ____| | | | _ \| __ \| ____| /\ | __ \
## ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
## ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
## ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
## ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
## Rsubread 2.2.6
##
## //================================= setting ==================================\\
## || ||
## || Function : Read alignment (RNA-Seq) ||
## || Input file : SRR1532964small.fastq.gz ||
## || Output file : SRR1532964.bam (BAM) ||
## || Index name : elegans_index_WBcel235_rsubread ||
## || ||
## || ------------------------------------ ||
## || ||
## || Threads : 1 ||
## || Phred offset : 64 ||
## || Min votes : 3 / 10 ||
## || Max mismatches : 3 ||
## || Max indel length : 5 ||
## || Report multi-mapping reads : yes ||
## || Max alignments per multi-mapping read : 1 ||
## || ||
## \\============================================================================//
##
## //================ Running (05-Nov-2020 17:45:34, pid=77005) =================\\
## || ||
## || Check the input reads. ||
## || The input file contains base space reads. ||
## || Initialise the memory objects. ||
## || Estimate the mean read length. ||
## || The range of Phred scores observed in the data is [2,41] ||
## || Create the output BAM file. ||
## || Check the index. ||
## || Init the voting space. ||
## || Global environment is initialised. ||
## || Load the 1-th index block... ||
## || The index block has been loaded. ||
## || Start read mapping in chunk. ||
## || 0% completed, 1.6 mins elapsed, rate=5.4k reads per second ||
## || 7% completed, 1.7 mins elapsed, rate=33.3k reads per second ||
## || 13% completed, 1.7 mins elapsed, rate=43.9k reads per second ||
## || 20% completed, 1.8 mins elapsed, rate=47.0k reads per second ||
## || 27% completed, 1.8 mins elapsed, rate=49.6k reads per second ||
## || 33% completed, 1.9 mins elapsed, rate=49.6k reads per second ||
## || 40% completed, 1.9 mins elapsed, rate=50.4k reads per second ||
## || 46% completed, 2.0 mins elapsed, rate=51.8k reads per second ||
## || 53% completed, 2.0 mins elapsed, rate=53.2k reads per second ||
## || 60% completed, 2.1 mins elapsed, rate=53.9k reads per second ||
## || 70% completed, 2.2 mins elapsed, rate=12.3k reads per second ||
## || 73% completed, 2.3 mins elapsed, rate=12.6k reads per second ||
## || 76% completed, 2.3 mins elapsed, rate=13.1k reads per second ||
## || 80% completed, 2.3 mins elapsed, rate=13.4k reads per second ||
## || 83% completed, 2.4 mins elapsed, rate=13.8k reads per second ||
## || 86% completed, 2.4 mins elapsed, rate=14.2k reads per second ||
## || 90% completed, 2.4 mins elapsed, rate=14.5k reads per second ||
## || 93% completed, 2.4 mins elapsed, rate=14.9k reads per second ||
## || 96% completed, 2.5 mins elapsed, rate=15.2k reads per second ||
## || ||
## || Completed successfully. ||
## || ||
## \\==================================== ====================================//
##
## //================================ Summary =================================\\
## || ||
## || Total reads : 2338023 ||
## || Mapped : 2245445 (96.0%) ||
## || Uniquely mapped : 2184002 ||
## || Multi-mapping : 61443 ||
## || ||
## || Unmapped : 92578 ||
## || ||
## || Indels : 8247 ||
## || ||
## || Running time : 2.5 minutes ||
## || ||
## \\============================================================================//
## SRR1532959.bam SRR1532960.bam SRR1532961.bam
## Total_reads 1783870 1610969 2117313
## Mapped_reads 1685472 1353023 2049912
## Uniquely_mapped_reads 1632698 1300506 2005307
## Multi_mapping_reads 52774 52517 44605
## Unmapped_reads 98398 257946 67401
## Indels 2512 2303 7289
## SRR1532962.bam SRR1532963.bam SRR1532964.bam
## Total_reads 1410013 2736076 2338023
## Mapped_reads 1327093 2627215 2245445
## Uniquely_mapped_reads 1299154 2451220 2184002
## Multi_mapping_reads 27939 175995 61443
## Unmapped_reads 82920 108861 92578
## Indels 4824 11317 8247
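The summary table makes it easy to compare mapping rates across samples. The sketch below copies the total and mapped read counts from the table above into a small matrix (in the analysis itself one could store the value returned by `align()` instead) and computes the percentage of mapped reads. The object names and the 90% threshold are assumptions for illustration only.

```r
# Values copied from the align() summary table above.
alignSummary <- rbind(
  Total_reads  = c(1783870, 1610969, 2117313, 1410013, 2736076, 2338023),
  Mapped_reads = c(1685472, 1353023, 2049912, 1327093, 2627215, 2245445)
)
colnames(alignSummary) <- paste0("SRR15329", 59:64, ".bam")

# Percentage of mapped reads per sample.
mappedPct <- round(100 * alignSummary["Mapped_reads", ] /
                     alignSummary["Total_reads", ], 1)
mappedPct

# Flag samples below a hypothetical 90% threshold.
names(mappedPct)[mappedPct < 90]
```

Only SRR1532960 falls below the threshold, in line with the 84.0% reported in its alignment log.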
fcElegans <- featureCounts(files = bamfls,
    annot.ext = paste0(path, "Caenorhabditis_elegans.WBcel235.98.gtf.gz"),
    isGTFAnnotationFile = TRUE,
    GTF.featureType = "exon",
    GTF.attrType = "gene_id",
    useMetaFeatures = TRUE,
    strandSpecific = 0,
    isPairedEnd = FALSE)
##
## ========== _____ _ _ ____ _____ ______ _____
## ===== / ____| | | | _ \| __ \| ____| /\ | __ \
## ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
## ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
## ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
## ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
## Rsubread 2.2.6
##
## //========================== featureCounts setting ===========================\\
## || ||
## || Input files : 6 BAM files ||
## || o SRR1532959.bam ||
## || o SRR1532960.bam ||
## || o SRR1532961.bam ||
## || o SRR1532962.bam ||
## || o SRR1532963.bam ||
## || o SRR1532964.bam ||
## || ||
## || Annotation : Caenorhabditis_elegans.WBcel235.98.gtf.gz (GTF) ||
## || Dir for temp files : . ||
## || Threads : 1 ||
## || Level : meta-feature level ||
## || Paired-end : no ||
## || Multimapping reads : counted ||
## || Multi-overlapping reads : not counted ||
## || Min overlapping bases : 1 ||
## || ||
## \\============================================================================//
##
## //================================= Running ==================================\\
## || ||
## || Load annotation file Caenorhabditis_elegans.WBcel235.98.gtf.gz ... ||
## || Features : 273641 ||
## || Meta-features : 46904 ||
## || Chromosomes/contigs : 7 ||
## || ||
## || Process BAM file SRR1532959.bam... ||
## || Single-end reads are included. ||
## || Total alignments : 1783870 ||
## || Successfully assigned alignments : 1657500 (92.9%) ||
## || Running time : 0.08 minutes ||
## || ||
## || Process BAM file SRR1532960.bam... ||
## || Single-end reads are included. ||
## || Total alignments : 1610969 ||
## || Successfully assigned alignments : 1323934 (82.2%) ||
## || Running time : 0.05 minutes ||
## || ||
## || Process BAM file SRR1532961.bam... ||
## || Single-end reads are included. ||
## || Total alignments : 2117313 ||
## || Successfully assigned alignments : 2013110 (95.1%) ||
## || Running time : 0.07 minutes ||
## || ||
## || Process BAM file SRR1532962.bam... ||
## || Single-end reads are included. ||
## || Total alignments : 1410013 ||
## || Successfully assigned alignments : 1304759 (92.5%) ||
## || Running time : 0.05 minutes ||
## || ||
## || Process BAM file SRR1532963.bam... ||
## || Single-end reads are included. ||
## || Total alignments : 2736076 ||
## || Successfully assigned alignments : 2568046 (93.9%) ||
## || Running time : 0.17 minutes ||
## || ||
## || Process BAM file SRR1532964.bam... ||
## || Single-end reads are included. ||
## || Total alignments : 2338023 ||
## || Successfully assigned alignments : 2208253 (94.4%) ||
## || Running time : 0.10 minutes ||
## || ||
## || Write the final count table. ||
## || Write the read assignment summary. ||
## || ||
## \\============================================================================//
countTableElegans <- fcElegans$counts
We save the featureCounts output and the sample metadata for future use.
saveRDS(fcElegans, file = "fcElegans.rds")
saveRDS(pdata, file = "elegansMetaData.rds")
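When the saved objects are loaded again in a later session, the metadata rows should be in the same order as the columns of the count table. The sketch below illustrates this with toy stand-ins written to a temporary directory. The objects `fcToy` and `metaToy`, the `label` column and the gene names are hypothetical, and the count columns are assumed to be named after the BAM files.

```r
# Toy stand-ins for the saved objects (not the real data).
fcToy <- list(counts = matrix(
  c(10, 0, 5, 3), nrow = 2,
  dimnames = list(c("geneA", "geneB"),
                  c("SRR1532959.bam", "SRR1532960.bam"))))
metaToy <- data.frame(Run = c("SRR1532960", "SRR1532959"),
                      label = c("sample2", "sample1"),
                      stringsAsFactors = FALSE)

fcFile <- file.path(tempdir(), "fcToy.rds")
metaFile <- file.path(tempdir(), "metaToy.rds")
saveRDS(fcToy, fcFile)
saveRDS(metaToy, metaFile)

# Reload and put the metadata rows in the same order as the count columns.
fc <- readRDS(fcFile)
meta <- readRDS(metaFile)
counts <- fc$counts
colnames(counts) <- sub("\\.bam$", "", colnames(counts))
meta <- meta[match(colnames(counts), meta$Run), ]

# Stop early if counts and metadata do not line up.
stopifnot(identical(colnames(counts), meta$Run))
meta
```

The `stopifnot()` check guards against silently analysing counts with mismatched sample annotation.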