This fits the NB-GAM model as described in Van den Berge et al. [2019]. There are four ways to provide the required input to fitGAM; see Details and the vignette.

fitGAM(counts, ...)

# S4 method for matrix
fitGAM(
  counts,
  sds = NULL,
  pseudotime = NULL,
  cellWeights = NULL,
  conditions = NULL,
  U = NULL,
  genes = seq_len(nrow(counts)),
  weights = NULL,
  offset = NULL,
  nknots = 6,
  verbose = TRUE,
  parallel = FALSE,
  BPPARAM = BiocParallel::bpparam(),
  control = mgcv::gam.control(),
  sce = TRUE,
  family = "nb",
  gcv = FALSE
)

# S4 method for dgCMatrix
fitGAM(
  counts,
  sds = NULL,
  pseudotime = NULL,
  cellWeights = NULL,
  conditions = NULL,
  U = NULL,
  genes = seq_len(nrow(counts)),
  weights = NULL,
  offset = NULL,
  nknots = 6,
  verbose = TRUE,
  parallel = FALSE,
  BPPARAM = BiocParallel::bpparam(),
  control = mgcv::gam.control(),
  sce = TRUE,
  family = "nb",
  gcv = FALSE
)

# S4 method for SingleCellExperiment
fitGAM(
  counts,
  U = NULL,
  genes = seq_len(nrow(counts)),
  conditions = NULL,
  weights = NULL,
  offset = NULL,
  nknots = 6,
  verbose = TRUE,
  parallel = FALSE,
  BPPARAM = BiocParallel::bpparam(),
  control = mgcv::gam.control(),
  sce = TRUE,
  family = "nb",
  gcv = FALSE
)

# S4 method for CellDataSet
fitGAM(
  counts,
  U = NULL,
  genes = seq_len(nrow(counts)),
  weights = NULL,
  offset = NULL,
  nknots = 6,
  verbose = TRUE,
  parallel = FALSE,
  BPPARAM = BiocParallel::bpparam(),
  control = mgcv::gam.control(),
  sce = TRUE,
  family = "nb",
  gcv = FALSE
)

Arguments

counts

The count matrix of expression values, with genes in rows and cells in columns. Can be a matrix or a sparse matrix.

...

parameters including:

sds

an object of class SlingshotDataSet, typically obtained after running Slingshot. If this is provided, pseudotime and cellWeights arguments are derived from this object.

pseudotime

A matrix of pseudotime values. Each row represents a cell and each column represents a lineage.

cellWeights

A matrix of cell weights defining the probability that a cell belongs to a particular lineage. Each row represents a cell and each column represents a lineage. If there is only a single lineage, provide a one-column matrix with all values equal to 1.

conditions

This argument is in beta phase and should be used carefully. If each lineage consists of multiple conditions, this argument can be used to specify the conditions. tradeSeq will then fit a condition-specific smoother for every lineage.

U

The design matrix of fixed effects. The design matrix should not contain an intercept to ensure identifiability.
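As a minimal sketch of a design matrix without an intercept, the snippet below builds U from a hypothetical batch factor (the batch labels are an assumption for illustration, not part of the package data) and drops the intercept column that model.matrix adds by default.

```r
# Hypothetical batch labels for six cells; replace with your own covariate.
batch <- factor(c("A", "A", "B", "B", "C", "C"))

# model.matrix() adds an "(Intercept)" column by default; drop it so that
# U only carries the batch effects, as the argument description requires.
U <- model.matrix(~ batch)[, -1, drop = FALSE]

colnames(U)
dim(U)
```

The rows of U must be in the same order as the cells (columns) of counts.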

genes

The genes on which to run fitGAM. Defaults to all genes. If only a subset of the genes is specified, normalization will be done using all the genes but the smoothers will be computed only for the subset.

weights

A matrix of weights with identical dimensions as the counts matrix. Usually a matrix of zero-inflation weights.

offset

The offset, on the log scale. If NULL, TMM normalization is used to account for differences in sequencing depth; see edgeR::calcNormFactors. Alternatively, this may be a vector with length equal to the number of cells.
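To illustrate the shape a user-supplied offset should have, here is a small sketch on a toy count matrix. It uses the log library size per cell, which is an assumed stand-in chosen for simplicity and not the TMM-based default that fitGAM applies when offset is NULL.

```r
set.seed(1)
# Toy count matrix (an assumption): 50 genes in rows, 20 cells in columns.
counts <- matrix(rpois(50 * 20, lambda = 5), nrow = 50, ncol = 20)

# Log library size per cell: one value per cell, on the log scale.
offset <- log(colSums(counts))

# The offset must have one entry per cell (column of counts).
length(offset) == ncol(counts)
```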

nknots

Number of knots used to fit the GAM. Defaults to 6. It is recommended to use the `evaluateK` function to guide in selecting an appropriate number of knots.
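A possible way to use evaluateK before calling fitGAM is sketched below on the example data shipped with tradeSeq. The range of k and the number of genes are kept small here as an assumption so the sketch runs quickly; they are not recommended settings.

```r
if (requireNamespace("tradeSeq", quietly = TRUE)) {
  library(tradeSeq)
  set.seed(8)
  data(crv, package = "tradeSeq")
  data(countMatrix, package = "tradeSeq")

  # Fit models over a small range of knot numbers on a random subset of
  # genes; the resulting diagnostics help choose nknots for fitGAM.
  icMat <- evaluateK(counts = as.matrix(countMatrix), sds = crv,
                     k = 3:6, nGenes = 50, verbose = FALSE)
}
```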

verbose

Logical, should progress be printed?

parallel

Logical, defaults to FALSE. Set to TRUE if you want to parallelize the fitting.

BPPARAM

Object of class bpparamClass that specifies the back-end to be used for computations. See bpparam in the BiocParallel package for details.
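The parallel and BPPARAM arguments work together. The sketch below assumes a two-worker SnowParam back-end (one option among several that BiocParallel offers) and restricts the fit to the first 20 genes purely to keep the run short.

```r
if (requireNamespace("BiocParallel", quietly = TRUE)) {
  # A socket-based back-end with two workers; available on all platforms.
  BPPARAM <- BiocParallel::SnowParam(workers = 2)

  if (requireNamespace("tradeSeq", quietly = TRUE)) {
    library(tradeSeq)
    set.seed(8)
    data(crv, package = "tradeSeq")
    data(countMatrix, package = "tradeSeq")

    # parallel = TRUE tells fitGAM to distribute the per-gene fits
    # over the back-end given in BPPARAM.
    sce <- fitGAM(counts = as.matrix(countMatrix), sds = crv, nknots = 5,
                  genes = 1:20, parallel = TRUE, BPPARAM = BPPARAM,
                  verbose = FALSE)
  }
}
```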

control

Variables to control fitting of the GAM, see gam.control.

sce

Logical: should output be of SingleCellExperiment class? This is recommended to be TRUE. If the sds argument is specified, it will always be set to TRUE.

family

The assumed distribution for the response. Defaults to "nb".

gcv

(In development). Logical, should a GCV score also be returned?

Value

If sce=FALSE, returns a list of length equal to the number of genes (number of rows of counts). Each element of the list is either a gamObject if the fitting procedure converged, or an error message. If sce=TRUE, returns a SingleCellExperiment object with the tradeSeq results stored in the rowData, colData and metadata.

Details

fitGAM supports four different ways to input the required objects:

  • "Count matrix, matrix of pseudotime and matrix of cellWeights." Input the count matrix using the counts argument, and pseudotime and cellWeights as matrices, each with number of rows equal to number of cells and number of columns equal to number of lineages.

  • "Count matrix and Slingshot input." Input the count matrix using the counts argument and the Slingshot object using the sds argument.

  • "SingleCellExperiment Object after running slingshot on the object." Input SingleCellExperiment Object using counts argument.

  • "CellDataSet object after running the orderCells function." Input CellDataSet Object using counts argument.
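The first input mode in the list above (count matrix plus pseudotime and cellWeights matrices) can be sketched as follows. The sketch assumes the slingshot package is installed and extracts both matrices from the example crv object with slingPseudotime and slingCurveWeights; the subset of 20 genes is only there to keep the run short.

```r
if (requireNamespace("tradeSeq", quietly = TRUE) &&
    requireNamespace("slingshot", quietly = TRUE)) {
  library(tradeSeq)
  set.seed(8)
  data(crv, package = "tradeSeq")
  data(countMatrix, package = "tradeSeq")
  counts <- as.matrix(countMatrix)

  # Both matrices have cells in rows and lineages in columns.
  pseudotime <- slingshot::slingPseudotime(crv, na = FALSE)
  cellWeights <- slingshot::slingCurveWeights(crv)
  stopifnot(nrow(pseudotime) == ncol(counts),
            identical(dim(pseudotime), dim(cellWeights)))

  # Matrix input: no sds argument, pseudotime and cellWeights instead.
  sce <- fitGAM(counts = counts, pseudotime = pseudotime,
                cellWeights = cellWeights, nknots = 5,
                genes = 1:20, verbose = FALSE)
}
```

For a single lineage, cellWeights would instead be a one-column matrix of ones, for example matrix(1, nrow = ncol(counts), ncol = 1).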

Examples

set.seed(8)
data(crv, package = "tradeSeq")
data(countMatrix, package = "tradeSeq")
gamList <- fitGAM(counts = as.matrix(countMatrix),
                  sds = crv,
                  nknots = 5)