Logical indexing and filtering are powerful tools for manipulating data in R. They let you slice and dice your datasets, pulling out exactly what you need. With these techniques, you can easily select specific rows or columns based on conditions.
These skills are crucial for data analysis and cleaning. By mastering logical operators and filtering methods, you'll be able to efficiently subset large datasets, handle missing values, and prepare your data for further analysis or visualization.
Logical Vectors and Operators
Understanding Logical Vectors and Boolean Operations
Logical vectors contain TRUE or FALSE values, plus NA where a value is missing
Boolean operators manipulate logical vectors
NOT (!) reverses logical values
AND (&) returns TRUE if both operands are TRUE
OR (|) returns TRUE if at least one operand is TRUE
Comparison operators create logical vectors
Equal to (==)
Not equal to (!=)
Greater than (>)
Less than (<)
Greater than or equal to (>=)
Less than or equal to (<=)
Vectorized operations apply element-wise to vectors
c(1, 2, 3) > 2
results in c(FALSE, FALSE, TRUE)
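The operators above can be tried on a small vector. The following is a minimal sketch; the vector x and the result names are made up for illustration, and an NA is included to show how a missing value carries through a comparison.

```r
# Hypothetical example vector with one missing value
x <- c(1, 2, 3, NA)

# Comparisons are element-wise; the single value 2 is recycled across x
gt2    <- x > 2     # FALSE FALSE TRUE NA
is_two <- x == 2    # FALSE TRUE FALSE NA

# NOT flips TRUE and FALSE but leaves NA as NA
not_gt2 <- !gt2     # TRUE TRUE FALSE NA

# AND / OR with NA: the result is known only when the other operand decides it
and_na <- FALSE & NA   # FALSE
or_na  <- TRUE | NA    # TRUE

print(gt2)
print(not_gt2)
```

Because a comparison against NA yields NA rather than FALSE, conditions built on data with missing values usually need an explicit is.na() check.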
Advanced Logical Operations
Combine multiple conditions using AND (&) and OR (|) operators
(x > 0) & (x < 10)
checks if x is strictly between 0 and 10 (both endpoints excluded)
(y == "A") | (y == "B")
checks if y is either "A" or "B"
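Short-circuit evaluation applies to the scalar operators && and ||, not to the vectorized & and |
&& stops evaluating if the first condition is FALSE
|| stops evaluating if the first condition is TRUE
Use && and || for single conditions in if() statements; use & and | for element-wise filtering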
Use parentheses to control order of operations
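As a rough illustration of combining conditions, the sketch below uses two hypothetical vectors, x and y. It also shows that in R short-circuiting is a property of the scalar operators && and ||, while & and | always evaluate both sides element-wise.

```r
# Hypothetical example data
x <- c(-5, 3, 12, 7)
y <- c("A", "C", "B", "A")

# Element-wise AND: TRUE only where both comparisons hold
in_range <- (x > 0) & (x < 10)      # FALSE TRUE FALSE TRUE

# Element-wise OR: TRUE where at least one comparison holds
a_or_b <- (y == "A") | (y == "B")   # TRUE FALSE TRUE TRUE

# Scalar && and || short-circuit: the stop() calls are never evaluated
and_short <- FALSE && stop("never reached")   # FALSE
or_short  <- TRUE  || stop("never reached")   # TRUE

# & binds more tightly than |, so parentheses change the result
no_parens   <- TRUE | FALSE & FALSE     # TRUE, read as TRUE | (FALSE & FALSE)
with_parens <- (TRUE | FALSE) & FALSE   # FALSE
```

Writing the parentheses explicitly, as in the last line, avoids relying on operator precedence when conditions get longer.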
Subsetting and Filtering
Basic Subsetting Techniques
Subset operator [] extracts elements from vectors, matrices, or data frames
x[3]
selects the third element of vector x
df[2, 3]
selects the element in the second row and third column of data frame df
which() function returns indices of TRUE values in a logical vector
which(x > 5)
returns positions where x is greater than 5
subset() function selects rows based on logical conditions
subset(df, age > 18)
selects rows where age is greater than 18
Conditional subsetting combines logical vectors with the subset operator
x[x > 0]
selects all positive values in vector x; any NA in x comes back as NA, so use x[which(x > 0)] to drop missing values as well
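The subsetting techniques in this list can be compared side by side. This is a minimal sketch on made-up data; the vector x, the data frame df, and its columns name, age and score are assumptions for illustration only.

```r
# Hypothetical example data
x  <- c(4, -2, 9, 0, 7)
df <- data.frame(
  name  = c("Ann", "Bob", "Cy"),
  age   = c(17, 25, 40),
  score = c(88, 92, 75)
)

third <- x[3]        # third element of x: 9
cell  <- df[2, 3]    # second row, third column: 92

idx <- which(x > 5)  # positions where x is greater than 5: 3 5

# subset() evaluates the condition inside df, so columns need no df$ prefix
adults <- subset(df, age > 18)

# Logical vector used directly as an index
positives <- x[x > 0]   # 4 9 7

print(adults)
```

Note that which() returns integer positions while x > 0 returns a logical vector of the same length as x; either can be passed to the subset operator.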
Advanced Filtering Techniques
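filter() function from the dplyr package filters data frame rows using unquoted column names
filter(df, age > 18)
keeps rows where age is greater than 18 and drops rows where the condition is NA
Combine multiple conditions for complex filtering; filter() treats comma-separated conditions as AND
Use %in% operator for membership tests
y %in% c("A", "B")
checks if each element of y is either "A" or "B"
Apply functions within subsetting for dynamic filtering
x[x > mean(x)]
selects the values of x that are greater than its mean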
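The filtering ideas above might look like this in practice. The sketch uses a made-up data frame with hypothetical columns id, group and value. The main steps use base R so they run without extra packages; the dplyr call at the end is guarded and only runs if dplyr happens to be installed.

```r
# Hypothetical example data
df <- data.frame(
  id    = 1:6,
  group = c("A", "B", "C", "A", "D", "B"),
  value = c(10, 25, 7, 40, 18, 3)
)

# Membership test: keep rows whose group is "A" or "B"
in_ab <- df[df$group %in% c("A", "B"), ]

# Dynamic threshold: keep rows whose value exceeds the column mean
above_mean <- df[df$value > mean(df$value), ]

# Both conditions combined with element-wise AND
combined <- df[df$group %in% c("A", "B") & df$value > mean(df$value), ]

# Equivalent dplyr version, run only when the package is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  combined_dplyr <- dplyr::filter(df, group %in% c("A", "B"), value > mean(value))
  print(combined_dplyr)
}

print(combined)
```

Unlike ==, the %in% operator never returns NA, which makes it a convenient choice when the column being tested may contain missing values.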
Handling Missing Values
Identifying and Working with Missing Data
is.na() function checks for missing values (NA)
Returns TRUE for NA values, FALSE otherwise
is.na(x)
creates a logical vector indicating NA positions in x
Missing value handling strategies
Remove rows with missing values using na.omit() or complete.cases()
na.omit(df)
removes rows with any NA values
Impute missing values with mean, median, or other methods
Subset to exclude or include missing values
df[!is.na(df$x), ]
selects rows where x is not NA
df[is.na(df$y), ]
selects rows where y is NA
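The strategies listed above can be sketched on a small data frame. The data frame df and its columns x and y are made up for illustration, and mean imputation is shown only as one simple option among many.

```r
# Hypothetical example data with missing values in both columns
df <- data.frame(
  x = c(1, NA, 3, 4),
  y = c("a", "b", NA, "d")
)

na_x <- is.na(df$x)   # FALSE TRUE FALSE FALSE

# Two ways to drop rows that contain any NA
complete_rows <- na.omit(df)
same_rows     <- df[complete.cases(df), ]

# Keep or isolate rows based on a single column
x_present <- df[!is.na(df$x), ]   # rows 1, 3, 4
y_missing <- df[is.na(df$y), ]    # row 3

# Simple mean imputation on a copy of x
x_imputed <- df$x
x_imputed[is.na(x_imputed)] <- mean(df$x, na.rm = TRUE)

print(x_imputed)
```

Working on a copy such as x_imputed keeps the original column intact, which makes it easier to compare results before and after imputation.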
Advanced Missing Value Operations
Combine is.na() with logical operators for complex conditions
Use colSums() or rowSums() with is.na() to count missing values
colSums(is.na(df))
counts NA values in each column
Apply na.rm = TRUE in functions to ignore missing values
mean(x, na.rm = TRUE)
calculates mean excluding NA values
Visualize missing data patterns using packages like VIM or naniar
Create heatmaps or bar plots to identify missing data trends
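The counting and na.rm techniques above can be sketched as follows. The data frame df and its columns a, b and c are made up for illustration. Plotting with VIM or naniar is not shown; the per-column proportions computed at the end are the kind of summary a bar plot of missingness would display.

```r
# Hypothetical example data
df <- data.frame(
  a = c(1, NA, 3, NA),
  b = c(NA, 2, 3, NA),
  c = c(1, 2, 3, 4)
)

# is.na() on a data frame gives a logical matrix; TRUE counts as 1 when summed
na_per_col <- colSums(is.na(df))   # a = 2, b = 2, c = 0
na_per_row <- rowSums(is.na(df))   # 1 1 0 2

# Complex condition: rows where both a and b are missing
both_missing <- df[is.na(df$a) & is.na(df$b), ]   # row 4

# Summary statistic that ignores missing values
mean_a <- mean(df$a, na.rm = TRUE)   # 2

# Proportion of missing values per column
na_share <- colMeans(is.na(df))      # a = 0.5, b = 0.5, c = 0

print(na_per_col)
print(na_share)
```

Without na.rm = TRUE, mean(df$a) would return NA because a single missing value makes the result unknown.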