R for Public Health
Department of Community Medicine, MGIMS
23 Sep 2024
<-
->
=
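The three assignment operators above can be compared in a short sketch. The variable names `x`, `y` and `z` are illustrative only and do not come from the slides.

```r
# Leftward assignment: the conventional form in R scripts
x <- 5

# Rightward assignment: value on the left, name on the right
10 -> y

# Equals sign: also assigns at the top level (and names function arguments)
z = x + y

print(c(x = x, y = y, z = z))
```

Most R style guides prefer `<-` for assignment and reserve `=` for function arguments.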
Addition (+)
Subtraction (-)
Multiplication (*)
Division (/)
Exponentiation (^)
Modulus (%%)
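As a minimal sketch of the arithmetic operators listed above, the following applies each one to two example numbers. The values `17` and `5` and the name `results` are assumptions chosen for illustration.

```r
a <- 17
b <- 5

# One named entry per arithmetic operator
results <- c(
  addition       = a + b,
  subtraction    = a - b,
  multiplication = a * b,
  division       = a / b,
  exponentiation = a ^ 2,
  modulus        = a %% b   # remainder after dividing a by b
)

print(results)
```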
Greater than (>)
Less than (<)
Equal to (==)
Greater than or equal to (>=)
Less than or equal to (<=)
Not equal to (!=)
AND (&)
OR (|)
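The comparison and logical operators work element by element on vectors, which is how they are typically used to filter records. The sketch below uses an illustrative `age` vector and made-up cut-offs (18 and 60); neither is taken from the slides.

```r
age <- c(12, 24, 23, 65, 33)

# Each comparison returns one TRUE/FALSE per element
print(age > 23)
print(age < 23)
print(age == 23)
print(age >= 23)
print(age <= 23)
print(age != 23)

# AND: both conditions must hold
adult_under_60 <- age >= 18 & age < 60

# OR: at least one condition must hold
child_or_senior <- age < 18 | age >= 60

print(adult_under_60)
print(child_or_senior)
```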
Integer
Numeric
Character
Logical
Date
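One way to see these five data types is to create a value of each and ask R for its class. The variable names and values below are hypothetical examples, not part of the original material.

```r
n_children    <- 3L                     # integer: note the L suffix
weight_kg     <- 62.5                   # numeric (double)
ward          <- "Ward A"               # character
is_vaccinated <- TRUE                   # logical
visit_date    <- as.Date("2023-12-18")  # Date

# Collect the class of each object in a named vector
types <- c(
  n_children    = class(n_children),
  weight_kg     = class(weight_kg),
  ward          = class(ward),
  is_vaccinated = class(is_vaccinated),
  visit_date    = class(visit_date)
)

print(types)
```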
[1] "2023-12-18"
[1] 2.0 4.0 6.0 8.0 3.0 5.5 4.0
[1] "2023-11-28" "2023-12-22" "2024-09-23"
[1] 3
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "A"  "G"  "M"  "S"  "Y"  "E" 
[2,] "B"  "H"  "N"  "T"  "Z"  "F" 
[3,] "C"  "I"  "O"  "U"  "A"  "G" 
[4,] "D"  "J"  "P"  "V"  "B"  "H" 
[5,] "E"  "K"  "Q"  "W"  "C"  "I" 
[6,] "F"  "L"  "R"  "X"  "D"  "J" 
[1] Male Female Female Male Male Male Female Female Male Female
[11] Male
Levels: Male Female
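The slides show only the printed output for these objects, so the code that produced them is not known. As a hedged sketch, a numeric vector, a Date vector, a character matrix and a factor like the ones printed above could be created as follows; the object names are assumptions.

```r
# Numeric vector
num_vec <- c(2, 4, 6, 8, 3, 5.5, 4)

# Date vector
date_vec <- as.Date(c("2023-11-28", "2023-12-22", "2024-09-23"))

# 6 x 6 character matrix, filled column by column;
# the 26 letters are followed by A-J to reach 36 cells
letter_mat <- matrix(LETTERS[c(1:26, 1:10)], nrow = 6, ncol = 6)

# Factor with an explicit level order
sex <- factor(
  c("Male", "Female", "Female", "Male", "Male", "Male",
    "Female", "Female", "Male", "Female", "Male"),
  levels = c("Male", "Female")
)

print(num_vec)
print(date_vec)
print(letter_mat)
print(sex)
```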
age <- c(12, 24, NA, 23, 65, 33)            # create age vector
gender <- c("M", "F", "F", "M", "M", "F")   # create gender vector
occu <- factor(c(1, 4, 3, 2, 4, 5),         # occupation
               levels = 1:5,
               labels = c("Unemp", "Service", "Student", "Business", "Prof"))
# date of birth
dob <- as.Date(c("1993-01-16", "1963-12-24", "1971-01-05",
                 "1982-11-11", "1984-05-15", "1999-03-07"))
# create data frame
df <- data.frame(age, gender, occu, dob)
  age gender     occu        dob
1  12      M    Unemp 1993-01-16
2  24      F Business 1963-12-24
3  NA      F  Student 1971-01-05
4  23      M  Service 1982-11-11
5  65      M Business 1984-05-15
6  33      F     Prof 1999-03-07
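Once a data frame exists, columns are reached with `$` and rows are filtered with logical conditions. The sketch below rebuilds only the `age` and `gender` columns so that it runs on its own; the names `mean_age`, `women` and `n_missing` are illustrative.

```r
df <- data.frame(
  age    = c(12, 24, NA, 23, 65, 33),
  gender = c("M", "F", "F", "M", "M", "F")
)

# Compact overview of column names and types
str(df)

# NA must be dropped explicitly, otherwise the mean is NA
mean_age <- mean(df$age, na.rm = TRUE)

# Keep only the rows where gender is "F"
women <- df[df$gender == "F", ]

# Count missing ages
n_missing <- sum(is.na(df$age))

print(mean_age)
print(women)
print(n_missing)
```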
[[1]]
  age gender     occu        dob
1  12      M    Unemp 1993-01-16
2  24      F Business 1963-12-24
3  NA      F  Student 1971-01-05
4  23      M  Service 1982-11-11
5  65      M Business 1984-05-15
6  33      F     Prof 1999-03-07

[[2]]
[1] "1993-01-16" "1963-12-24" "1971-01-05" "1982-11-11" "1984-05-15"
[6] "1999-03-07"

[[3]]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "A"  "G"  "M"  "S"  "Y"  "E" 
[2,] "B"  "H"  "N"  "T"  "Z"  "F" 
[3,] "C"  "I"  "O"  "U"  "A"  "G" 
[4,] "D"  "J"  "P"  "V"  "B"  "H" 
[5,] "E"  "K"  "Q"  "W"  "C"  "I" 
[6,] "F"  "L"  "R"  "X"  "D"  "J" 

[[4]]
[1] 2.0 4.0 6.0 8.0 3.0 5.5 4.0
list[n]: a list containing the nth object(s)
list[[n]]: the nth object itself
list[[n]][i]: selecting within the nth object
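The difference between the three forms of list indexing is easiest to see on a small list. The list `my_list` and its contents below are a simplified, hypothetical stand-in for the list printed earlier.

```r
my_list <- list(
  data.frame(id = 1:2, sex = c("M", "F")),
  as.Date(c("1993-01-16", "1963-12-24")),
  matrix(LETTERS[1:4], nrow = 2),
  c(2, 4, 6, 8)
)

# Single brackets: still a list, holding the 4th object
sub_list <- my_list[4]

# Double brackets: the 4th object itself (a numeric vector)
fourth <- my_list[[4]]

# Double brackets, then ordinary indexing inside the object
second_value <- my_list[[4]][2]    # 2nd element of the vector
cell         <- my_list[[3]][2, 1] # row 2, column 1 of the matrix

print(class(sub_list))
print(class(fourth))
print(second_value)
print(cell)
```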
library(dplyr) # provides %>%, select(), mutate() and if_else()
# Avoid
pipe <- df %>% select(age,dob,occu) %>% mutate(age_cat = if_else(age < 20,"Young","Old"))
# Strive for
pipe <- df %>%
  select(age, dob, occu) %>%
  mutate(age_cat = if_else(age < 20, "Young", "Old"))
# Avoid
pipe <- df %>%
select(age, dob, occu) %>%
summarise(age_cat = mean(
age,
na.rm = TRUE)
)
# Strive for
pipe <- df %>%
  select(age, dob, occu) %>%
  summarise(age_cat = mean(
    age,
    na.rm = TRUE)
  )
Each variable is a column; each column is a variable.
Each observation is a row; each row is an observation.
Each value is a cell; each cell is a single value.
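The three tidy data rules above can be illustrated by reshaping a small "wide" table, where the year is hidden in the column names, into a tidy one. This is a minimal base R sketch; the districts, years and case counts are invented for illustration.

```r
# Wide layout: the variable "year" is spread across two column names
wide <- data.frame(
  district   = c("A", "B"),
  cases_2022 = c(10, 7),
  cases_2023 = c(12, 5)
)

# Tidy layout: one column per variable, one row per district-year observation
tidy <- data.frame(
  district = rep(wide$district, times = 2),
  year     = rep(c(2022, 2023), each = nrow(wide)),
  cases    = c(wide$cases_2022, wide$cases_2023)
)

print(wide)
print(tidy)
```

In the tidy version, every cell holds a single value and `year` can be used directly for grouping or plotting.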