Indexing into a data structure
Problem
You want to get part of a data structure.
Solution
Elements of a vector, matrix, or data frame can be extracted with numeric indices, with names, or with a boolean (logical) vector of the appropriate length.
In many of the examples below, there are multiple ways of doing the same thing.
Indexing with numbers and names
With a vector:
# A sample vector
v <- c(1,4,4,3,2,2,3)
v[c(2,3,4)]
#> [1] 4 4 3
v[2:4]
#> [1] 4 4 3
v[c(2,4,3)]
#> [1] 4 3 4
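Vectors can also be indexed by name when their elements are named. The following is a minimal sketch; the vector `scores` and its names are made up for illustration and are not part of the examples above.

```r
# Hypothetical named vector (names invented for this example)
scores <- c(alice = 1, bob = 4, carol = 3)

# Index by a single name; the result keeps its name
scores["bob"]
#> bob
#>   4

# Index by several names, in any order
scores[c("carol", "alice")]
#> carol alice
#>     3     1

# Double brackets return the bare value, without the name
scores[["bob"]]
#> [1] 4
```

Single brackets return a (named) sub-vector, while double brackets extract exactly one element.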
With a data frame:
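# Create a sample data frame
# (stringsAsFactors=TRUE makes sex a factor, matching the output shown below;
# since R 4.0.0 the default is FALSE, which would give a character column)
data <- read.table(header=TRUE, stringsAsFactors=TRUE, text='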
subject sex size
1 M 7
2 F 6
3 F 9
4 M 11
')
# Get the element at row 1, column 3
data[1,3]
#> [1] 7
data[1,"size"]
#> [1] 7
# Get rows 1 and 2, and all columns
data[1:2, ]
#>   subject sex size
#> 1       1   M    7
#> 2       2   F    6
data[c(1,2), ]
#>   subject sex size
#> 1       1   M    7
#> 2       2   F    6
# Get rows 1 and 2, and only column 2
data[1:2, 2]
#> [1] M F
#> Levels: F M
data[c(1,2), 2]
#> [1] M F
#> Levels: F M
# Get rows 1 and 2, and only the columns named "sex" and "size"
data[1:2, c("sex","size")]
#>   sex size
#> 1   M    7
#> 2   F    6
data[c(1,2), c(2,3)]
#>   sex size
#> 1   M    7
#> 2   F    6
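The same row-and-column syntax applies to a matrix. The sketch below uses a small hypothetical matrix `m` (its dimension names are invented for illustration) and also shows `drop = FALSE`, which keeps a single selected column as a matrix instead of simplifying it to a vector.

```r
# Hypothetical 2x3 matrix, filled column by column, with made-up dimnames
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

# Element at row 2, column 3, by number and by name
m[2, 3]
#> [1] 6
m["r2", "c"]
#> [1] 6

# A single column is simplified to a (named) vector by default
m[, "b"]
#> r1 r2
#>  3  4

# drop = FALSE keeps the result as a one-column matrix
m[, "b", drop = FALSE]
#>    b
#> r1 3
#> r2 4
```

`drop = FALSE` works the same way when selecting a single column of a data frame.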
Indexing with a boolean vector
With the vector v from above:
v > 2
#> [1] FALSE TRUE TRUE TRUE FALSE FALSE TRUE
v[v>2]
#> [1] 4 4 3 3
v[c(FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE)]
#> [1] 4 4 3 3
With the data frame from above:
# A boolean vector
data$subject < 3
#> [1] TRUE TRUE FALSE FALSE
data[data$subject < 3, ]
#>   subject sex size
#> 1       1   M    7
#> 2       2   F    6
data[c(TRUE,TRUE,FALSE,FALSE), ]
#>   subject sex size
#> 1       1   M    7
#> 2       2   F    6
# It is also possible to get the numeric indices of the TRUEs
which(data$subject < 3)
#> [1] 1 2
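Boolean vectors can be combined with `&` (and) and `|` (or) before they are used as an index. The sketch below rebuilds the sample data frame with `data.frame()` so that it runs on its own; the particular conditions are chosen only for illustration.

```r
# Rebuild the sample data frame so this snippet is self-contained
data <- data.frame(subject = 1:4,
                   sex     = c("M", "F", "F", "M"),
                   size    = c(7, 6, 9, 11))

# Rows where sex is "F" AND size is greater than 6
data[data$sex == "F" & data$size > 6, ]
#>   subject sex size
#> 3       3   F    9

# Rows where size is below 7 OR above 10, keeping two columns by name
data[data$size < 7 | data$size > 10, c("subject", "size")]
#>   subject size
#> 2       2    6
#> 4       4   11

# which() returns the row numbers for the same kind of condition
which(data$sex == "F" & data$size > 6)
#> [1] 3
```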
Negative indexing
Unlike in some other programming languages, a negative index in R does not count backward from the end. Instead, it drops the element at that position, counting the usual way from the beginning.
# Here's the vector again.
v
#> [1] 1 4 4 3 2 2 3
# Drop the first element
v[-1]
#> [1] 4 4 3 2 2 3
# Drop first three
v[-1:-3]
#> [1] 3 2 2 3
# Drop just the last element
v[-length(v)]
#> [1] 1 4 4 3 2 2
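A few related cases, as a hedged sketch: since a negative index drops elements, the last element is reached with `length()` or `tail()`, and negative indices also work on the rows and columns of a data frame. The small data frame `df` is hypothetical and exists only for this example.

```r
v <- c(1,4,4,3,2,2,3)

# Get the last element: index with length(v), not with -1
v[length(v)]
#> [1] 3

# tail() returns the last n elements
tail(v, 2)
#> [1] 2 3

# Drop several arbitrary positions at once (here the 1st and 5th)
v[-c(1, 5)]
#> [1] 4 4 3 2 3

# Hypothetical data frame, for illustration only
df <- data.frame(subject = 1:3, size = c(7, 6, 9))

# Drop the first row, keep all columns
df[-1, ]
#>   subject size
#> 2       2    6
#> 3       3    9
```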
Notes
Also see Getting a subset of a data structure.