Section 10 Subset Data
10.1 Subset a data.frame
Example data.frame:
x <- c(1:10); y <- rep(c('M','F'), each=5); z <- rep(c(T,F), length=10)
DF <- data.frame(Age=x, Sex=y, Vac=z)
Index | Explanation | Example |
---|---|---|
$ | Get the elements of the data.frame for the named column | DF$Age |
[ i, j ] | Single square brackets with the appropriate index for the row(s) and column(s) | DF[1:2,3] |
Positive integer | Select the elements at the given position(s) of the specified dimension | DF[2,] |
Negative integer | Remove the elements at the given position(s) of the specified dimension | DF[-2,]; DF[,-2] |
Zero | Select no elements | DF[0] |
Blank | Select all elements of the specified dimension | DF[2,]; DF[,2] |
Logical values | Select the elements whose corresponding logical value is TRUE | DF[,c(T,F,T)] |
Names | Select the elements with the given name(s) | DF[,c('Age','Vac')] |
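The index types in the table can be checked by looking at the shape of what each one returns. The following is a minimal sketch that rebuilds the same example data.frame (with `TRUE`/`FALSE` and `length.out` spelled out) so that it runs on its own.

```r
# Rebuild the example data.frame so this block is self-contained
DF <- data.frame(Age = 1:10,
                 Sex = rep(c("M", "F"), each = 5),
                 Vac = rep(c(TRUE, FALSE), length.out = 10))

dim(DF[2, ])                        # 1 row, 3 columns
dim(DF[-2, ])                       # 9 rows, 3 columns (row 2 removed)
dim(DF[, -2])                       # 10 rows, 2 columns (Sex removed)
dim(DF[0])                          # 10 rows, 0 columns
names(DF[, c(TRUE, FALSE, TRUE)])   # "Age" "Vac"
names(DF[, c("Age", "Vac")])        # "Age" "Vac"
DF[1:2, 3]                          # TRUE FALSE (a plain logical vector)
```

Note that the last call returns a vector rather than a data.frame, because a single selected column is simplified by default.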
10.2 Example
x <- c(1:10); y <- rep(c('M','F'), each=5); z <- rep(c(T,F), length=10)
DF <- data.frame(Age=x, Sex=y, Vac=z)
DF[2,]
DF[1:2,3]
DF[c(1,3),]
DF[,2]
DF[-2,]
DF[,-2]
DF[-c(1,3),]
DF[0]
DF[]
DF[c(T,F,T),]
DF[,c(T,F,T)]
DF$Age
DF[,'Age']
DF[,c('Age'), drop=F]
DF[,c('Age','Vac')]
DF[c(1:3),'Age']
DF[2]
DF['Age']
DF[DF['Age']>4,]
DF[DF$Age>4,]
DF[DF$Sex=='M',]
DF[DF$Vac==TRUE,]
DF[DF$Vac==T,]
DF[DF$Vac==1,]
DF[c(DF$Sex=='M' & DF$Vac==1),]
DF[c(DF$Age>=4 & DF$Vac==1),]
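The condition-based examples all rely on the same mechanism: a comparison on a column produces one logical value per row, and that vector is then used as the row index. A small sketch of this, using the same example data; the names `keep` and `male_vac` are illustrative only.

```r
DF <- data.frame(Age = 1:10,
                 Sex = rep(c("M", "F"), each = 5),
                 Vac = rep(c(TRUE, FALSE), length.out = 10))

# A comparison on a column yields one TRUE/FALSE per row
keep <- DF$Sex == "M" & DF$Vac   # Vac is already logical, so "== TRUE" is optional
which(keep)                      # 1 3 5
male_vac <- DF[keep, ]           # rows where both conditions hold

# Comparing a logical column with TRUE or with 1 gives the same index
identical(DF[DF$Vac == TRUE, ], DF[DF$Vac == 1, ])   # TRUE
```

Storing the condition in a variable first makes it easy to inspect which rows will be selected before subsetting.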
Note:
You can combine different index types to subset the data:
DF[c(1:3),'Age']
When subsetted with only one index, the data.frame returns the selected column(s), still as a data.frame:
DF[2]
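The difference between one-index and two-index subsetting is easiest to see by checking what kind of object comes back. The sketch below uses the same example data; the `[[` operator is not covered in the table above and is included only for comparison.

```r
DF <- data.frame(Age = 1:10,
                 Sex = rep(c("M", "F"), each = 5),
                 Vac = rep(c(TRUE, FALSE), length.out = 10))

is.data.frame(DF[2])                  # TRUE: one index keeps the data.frame
is.data.frame(DF[, 2])                # FALSE: a single column is simplified
is.data.frame(DF[, 2, drop = FALSE])  # TRUE: drop = FALSE prevents simplification
is.data.frame(DF[[2]])                # FALSE: [[ extracts the column itself
names(DF[2])                          # "Sex"
```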
Some functions can handle NA values in a data.frame explicitly, for example:
complete.cases(DF); order(DF$Age, na.last=FALSE); table(DF, useNA='always')
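Since the example data.frame used so far contains no missing values, the sketch below uses a small hypothetical data.frame (`DF_na`, with made-up values) to show what these NA-related functions return. `table()` is applied to a single column here to keep the output short.

```r
# Hypothetical data with missing values, for illustration only
DF_na <- data.frame(Age = c(25, NA, 31, 40),
                    Sex = c("M", "F", NA, "F"))

complete.cases(DF_na)                # TRUE FALSE FALSE TRUE
DF_na[complete.cases(DF_na), ]       # keeps rows 1 and 4 only
order(DF_na$Age, na.last = FALSE)    # 2 1 3 4 (the NA is sorted first)
table(DF_na$Sex, useNA = "always")   # F: 2, M: 1, <NA>: 1
```

Using `complete.cases()` as a row index is one way to drop every row that has at least one missing value.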