We will create a \(4 \times 2\) matrix, starting from a vector of values.
## Define the matrix X, filling matrix "byrow": 4 x 2
X <- matrix(c(3, 6, 1, 2, 0.23, 0.46, -2, -4), nrow = 4, byrow = TRUE)
X
#> [,1] [,2]
#> [1,] 3.00 6.00
#> [2,] 1.00 2.00
#> [3,] 0.23 0.46
#> [4,] -2.00 -4.00
## Alternative way to define the matrix, `byrow = FALSE` is the default
X <- matrix(c(3, 1, 0.23, -2, 6, 2, 0.46, -4), ncol = 2, byrow = FALSE)
X
#> [,1] [,2]
#> [1,] 3.00 6.00
#> [2,] 1.00 2.00
#> [3,] 0.23 0.46
#> [4,] -2.00 -4.00
# Creating a second matrix: 2 x 4, using the square of `X`
Y <- matrix(X^2, nrow = 2, ncol = 4)
Y
#> [,1] [,2] [,3] [,4]
#> [1,] 9 0.0529 36 0.2116
#> [2,] 1 4.0000 4 16.0000
## Check that we really have a matrix
class(X)
#> [1] "matrix" "array"
Generally you can check the arguments of a function, e.g. matrix
, in R with ?matrix
.
The dim
function tells you the dimensions of the matrix, nrow
and ncol
give you the number of rows and columns, respectively.
Note that R sees a matrix as X[row, column]
, so you can subset X
with
X[1, ] # first row of X
#> [1] 3 6
X[, 2] # second column of X
#> [1] 6.00 2.00 0.46 -4.00
X[1:2, ] # first 2 rows of X
#> [,1] [,2]
#> [1,] 3 6
#> [2,] 1 2
Note that, by default, subsetting a single row or column from a matrix returns a vector, not a column- or row-matrix as might be expected. We can see this from the fact that the subset does not have dimensions anymore (an attribute only defined for matrices in R):
dim(X[, 1]) # NULL = non-existent
#> NULL
class(X[, 1]) # not matrix but numeric vector
#> [1] "numeric"
This can lead to unexpected behavior when doing matrix multiplication (see further below). To avoid this, we can set drop = FALSE
in the subsetting brackets.
You can also create matrices by ‘binding’ together vectors (of the same length) column-wise or row-wise with cbind()
and rbind()
, respectively.
# Create vectors from the rows of X
(x1 <- X[1, ])
#> [1] 3 6
(x2 <- X[2, ])
#> [1] 1 2
(x3 <- X[3, ])
#> [1] 0.23 0.46
(x4 <- X[4, ])
#> [1] -2 -4
# Bind them back together row-wise: re-creates X
rbind(x1, x2, x3, x4)
#> [,1] [,2]
#> x1 3.00 6.00
#> x2 1.00 2.00
#> x3 0.23 0.46
#> x4 -2.00 -4.00
# Bind them back together column-wise: (essentially the transpose of X)
cbind(x1, x2, x3, x4)
#> x1 x2 x3 x4
#> [1,] 3 1 0.23 -2
#> [2,] 6 2 0.46 -4
Binding together 2 matrices also works (but be mindful of their dimensions!).
# Row-binding two matrices
# X and Y don't have the same dimensions (4x2 vs. 2x4),
# so binding them directly doesn't work
rbind(X, Y)
#> Error in rbind(X, Y): number of columns of matrices must match (see arg 2)
# For rbind we need the same number of columns
rbind(X, Y[, 1:2]) # works
#> [,1] [,2]
#> [1,] 3.00 6.0000
#> [2,] 1.00 2.0000
#> [3,] 0.23 0.4600
#> [4,] -2.00 -4.0000
#> [5,] 9.00 0.0529
#> [6,] 1.00 4.0000
# For cbind, same number of rows
cbind(X[1:2, ], Y)
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 3 6 9 0.0529 36 0.2116
#> [2,] 1 2 1 4.0000 4 16.0000
Note that R automatically creates row names or column names from the variable names when using rbind()
and cbind()
.
To add row and columns names to an existing matrix, use rownames()
, colnames()
or the more general dimnames()
.
rownames(X) <- c("row_1", "row_2", "row_3", "row_4")
colnames(X) <- c("col_1", "col_2")
X
#> col_1 col_2
#> row_1 3.00 6.00
#> row_2 1.00 2.00
#> row_3 0.23 0.46
#> row_4 -2.00 -4.00
## Using `NULL` removes the dimnames again
dimnames(X) <- NULL
X
#> [,1] [,2]
#> [1,] 3.00 6.00
#> [2,] 1.00 2.00
#> [3,] 0.23 0.46
#> [4,] -2.00 -4.00
diag()
The diag
function can also be used to create identity matrices with specific dimensions
diag(1, nrow = 4, ncol = 4) # 4 x 4
#> [,1] [,2] [,3] [,4]
#> [1,] 1 0 0 0
#> [2,] 0 1 0 0
#> [3,] 0 0 1 0
#> [4,] 0 0 0 1
diag(1, nrow = 2, ncol = 4) # 2 x 4
#> [,1] [,2] [,3] [,4]
#> [1,] 1 0 0 0
#> [2,] 0 1 0 0
diag(1, nrow = 4, ncol = 2) # 4 x 2
#> [,1] [,2]
#> [1,] 1 0
#> [2,] 0 1
#> [3,] 0 0
#> [4,] 0 0
Scalar operations occur element-wise for matrices.
X # for reference
#> [,1] [,2]
#> [1,] 3.00 6.00
#> [2,] 1.00 2.00
#> [3,] 0.23 0.46
#> [4,] -2.00 -4.00
X + 1
#> [,1] [,2]
#> [1,] 4.00 7.00
#> [2,] 2.00 3.00
#> [3,] 1.23 1.46
#> [4,] -1.00 -3.00
X - 100
#> [,1] [,2]
#> [1,] -97.00 -94.00
#> [2,] -99.00 -98.00
#> [3,] -99.77 -99.54
#> [4,] -102.00 -104.00
2 * X
#> [,1] [,2]
#> [1,] 6.00 12.00
#> [2,] 2.00 4.00
#> [3,] 0.46 0.92
#> [4,] -4.00 -8.00
X / 3
#> [,1] [,2]
#> [1,] 1.00000000 2.0000000
#> [2,] 0.33333333 0.6666667
#> [3,] 0.07666667 0.1533333
#> [4,] -0.66666667 -1.3333333
X^2
#> [,1] [,2]
#> [1,] 9.0000 36.0000
#> [2,] 1.0000 4.0000
#> [3,] 0.0529 0.2116
#> [4,] 4.0000 16.0000
Note that ^2
also occurs element-wise, not using matrix multiplication (see further below)!
You can also do operations with vectors, though you should be careful with the dimensions. R
will add the vector to each column of the matrix. When using a vector with a different length than the number of rows (elements in a column), the vector elements will be recycled. When the longer object is not a multiple of the shorter, the operation still occurs but a warning is produced (technically a message
).
X
#> [,1] [,2]
#> [1,] 3.00 6.00
#> [2,] 1.00 2.00
#> [3,] 0.23 0.46
#> [4,] -2.00 -4.00
v4 <- c(1, 2, 3, 4) # vector of length 4
X + v4
#> [,1] [,2]
#> [1,] 4.00 7.00
#> [2,] 3.00 4.00
#> [3,] 3.23 3.46
#> [4,] 2.00 0.00
v2 <- c(1, 2) # vector of length 2, will be recycled
X + v2
#> [,1] [,2]
#> [1,] 4.00 7.00
#> [2,] 3.00 4.00
#> [3,] 1.23 1.46
#> [4,] 0.00 -2.00
v3 <- c(1, 2, 3) # vector or length 3, also recycled with warning
X + v3
#> Warning in X + v3: longer object length is not a multiple of shorter object
#> length
#> [,1] [,2]
#> [1,] 4.00 8.00
#> [2,] 3.00 5.00
#> [3,] 3.23 1.46
#> [4,] -1.00 -2.00
v5 <- c(1, 2, 3, 4, 5) # vector or length 5, also recycled with warning
X + v5
#> Warning in X + v5: longer object length is not a multiple of shorter object
#> length
#> [,1] [,2]
#> [1,] 4.00 11.00
#> [2,] 3.00 3.00
#> [3,] 3.23 2.46
#> [4,] 2.00 -1.00
Define a second matrix Y
with same dimensions as X
.
# We will generate a matrix Y with 2s and 4s at random places (using `sample`)
# Setting a seed allows reproducibility
set.seed(1)
Y <- matrix(sample(c(2, 4), size = nrow(X) * ncol(X), replace = TRUE),
nrow = nrow(X), ncol = ncol(X))
Y
#> [,1] [,2]
#> [1,] 2 4
#> [2,] 4 2
#> [3,] 2 2
#> [4,] 2 2
Using the standard math operators occurs element-wise.
X + Y
#> [,1] [,2]
#> [1,] 5.00 10.00
#> [2,] 5.00 4.00
#> [3,] 2.23 2.46
#> [4,] 0.00 -2.00
X - Y
#> [,1] [,2]
#> [1,] 1.00 2.00
#> [2,] -3.00 0.00
#> [3,] -1.77 -1.54
#> [4,] -4.00 -6.00
X * Y
#> [,1] [,2]
#> [1,] 6.00 24.00
#> [2,] 4.00 4.00
#> [3,] 0.46 0.92
#> [4,] -4.00 -8.00
X / Y
#> [,1] [,2]
#> [1,] 1.500 1.50
#> [2,] 0.250 1.00
#> [3,] 0.115 0.23
#> [4,] -1.000 -2.00
X ^ Y
#> [,1] [,2]
#> [1,] 9.0000 1296.0000
#> [2,] 1.0000 4.0000
#> [3,] 0.0529 0.2116
#> [4,] 4.0000 16.0000
https://www.statmethods.net/advstats/matrix.html
t()
\(\mathbf{X}^T\)
%*%
Note that X
and Y
have the same dimensions, so multiplying them straight away doesn’t work (incompatible dimensions)
## This produces an error because the dimensions are not compatible
X %*% Y
#> Error in X %*% Y: non-conformable arguments
Instead, we can transpose X
and compute \(\mathbf{X}^T \cdot \mathbf{Y}\):
t(X) %*% Y
#> [,1] [,2]
#> [1,] 6.46 10.46
#> [2,] 12.92 20.92
## `crossprod()` is a shortcut for this operation, and is slightly faster
crossprod(X, Y)
#> [,1] [,2]
#> [1,] 6.46 10.46
#> [2,] 12.92 20.92
## `tcrossprod()` calculates X.Y^T
X %*% t(Y)
#> [,1] [,2] [,3] [,4]
#> [1,] 30.0 24.00 18.00 18.00
#> [2,] 10.0 8.00 6.00 6.00
#> [3,] 2.3 1.84 1.38 1.38
#> [4,] -20.0 -16.00 -12.00 -12.00
tcrossprod(X, Y)
#> [,1] [,2] [,3] [,4]
#> [1,] 30.0 24.00 18.00 18.00
#> [2,] 10.0 8.00 6.00 6.00
#> [3,] 2.3 1.84 1.38 1.38
#> [4,] -20.0 -16.00 -12.00 -12.00
The crossprod()
and tcrossprod()
functions also work with a single matrix as input, in which case they calculate \(\mathbf{X}^T\mathbf{X}\) and \(\mathbf{X}\mathbf{X}^T\), respectively:
## X^TX
crossprod(X)
#> [,1] [,2]
#> [1,] 14.0529 28.1058
#> [2,] 28.1058 56.2116
## XX^T
tcrossprod(X)
#> [,1] [,2] [,3] [,4]
#> [1,] 45.00 15.00 3.4500 -30.0
#> [2,] 15.00 5.00 1.1500 -10.0
#> [3,] 3.45 1.15 0.2645 -2.3
#> [4,] -30.00 -10.00 -2.3000 20.0
As already mentioned before, subsetting a matrix with [
returns a vector by default. Doing matrix multiplication with vectors in R can be ambiguous as R will “guess” itself whether the vector should be treated as a column- or row-vector. See ?`%*%`
for details.
The example below illustrates some potentially unexpected results
## Selecting the first column of X, we would expect this to be treated as a column
## vector with dimensions 4 x 1...
X[, 1]
#> [1] 3.00 1.00 0.23 -2.00
dim(X[, 1])
#> NULL
## ...but in R vectors don't have dimensions
## Mathematically, multiplicating the first column of X with X is not possible
## (multiplying a 4x1 with a 4x2 matrix doesn't work)
## So we would expect the following code to throw an error...
X[, 1] %*% X
#> [,1] [,2]
#> [1,] 14.0529 28.1058
## ...but it doesn't and instead returns the result as if X[, 1] is a row-vector
To avoid confusion and when you intend to use single rows or vectors from a matrix as they were single-row or single-column matrices, it’s better to use drop = FALSE
when subsetting.
X1 <- X[, 1, drop = FALSE]
## Now X1 does have dimensions
dim(X1)
#> [1] 4 1
## And the following code throws an error
X1 %*% X
#> Error in X1 %*% X: non-conformable arguments
You can also do matrix multiplication with 2 vectors, in which case again R will guess if they should be treated as column- or row-vectors.
solve()
Inverse: \(\mathbf{X}^{-1}\) (only for square matrices)
## Create new square matrix Z
(Z <- matrix(1:4, nrow = 2, ncol = 2))
#> [,1] [,2]
#> [1,] 1 3
#> [2,] 2 4
## Compute inverse
(Z_inv <- solve(Z))
#> [,1] [,2]
#> [1,] -2 1.5
#> [2,] 1 -0.5
## Check that it's indeed the inverse
Z %*% Z_inv # the 2x2 identity matrix
#> [,1] [,2]
#> [1,] 1 0
#> [2,] 0 1
Warning: DON’T USE X^-1
to compute a matrix inverse!
The notation might be confusing because of the way we mathematically describe the inverse of a matrix (\(\mathbf{X}^{-1}\)). In R, however, X^-1
computes element-wise inverses of each value.
If you’re wondering why the name “solve
” is used for the function to compute a matrix inverse, it’s because solve
can do more than just that. See ?solve
for details.
Calculating the rank of the matrix \(\mathbf X\). There are several ways to do this in R. We’ll use the qr()
function (see ?qr
). An alternative would be to use rankMatrix
from the Matrix package: Matrix::rankMatrix()
.
We see that the rank of X
is lower than its number of columns and rows, meaning the matrix is not of full rank.
Considering the contents of X
, can you see why this is?
Hint: