Module 11: Debugging R Functions
For this assignment, I worked on debugging a function that identifies outlier rows in a numeric matrix using the Tukey rule. The original code had a bug that I had to find and fix.
The Original Buggy Code
The function was supposed to flag rows whose value is an outlier in every column. Here's what I started with:
tukey.outlier = function(x) {
  Q1 = quantile(x, 0.25, na.rm = TRUE)
  Q3 = quantile(x, 0.75, na.rm = TRUE)
  IQR = Q3 - Q1
  lower = Q1 - 1.5 * IQR
  upper = Q3 + 1.5 * IQR
  return(x < lower | x > upper)
}
tukey_multiple = function(x) {
  outliers = array(TRUE, dim = dim(x))
  for (j in 1:ncol(x)) {
    outliers[, j] = outliers[, j] && tukey.outlier(x[, j])
  }
  outlier.vec = vector("logical", length = nrow(x))
  for (i in 1:nrow(x)) {
    outlier.vec[i] = all(outliers[i, ])
  }
  return(outlier.vec)
}
Error in outliers[, j] && tukey.outlier(x[, j]) :
'length = 10' in coercion to 'logical(1)'
What Went Wrong
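The problem was on line 4 of tukey_multiple, where I used && instead of &. The double ampersand && expects a single TRUE or FALSE on each side, but here both sides were 10-element vectors (one matrix column each). Older versions of R quietly used only the first element of each side, and R 4.2 added a warning. Since R 4.3.0 this is an error, which is why the call stopped with the message above.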
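To see the difference in isolation, here is a minimal sketch. The vectors a and b are made up for illustration and are not part of the assignment. The last call is wrapped in tryCatch so the script keeps running whether or not the installed R version treats vector operands as an error.

```r
# Two small logical vectors, invented for this illustration
a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

# Element-wise AND: one result per position
a & b
# [1]  TRUE FALSE FALSE

# Scalar AND: each side is a single value
a[1] && b[1]
# [1] TRUE

# With vector operands, && stops with an error on R 4.3.0 or later;
# tryCatch captures the message instead of halting the script
msg <- tryCatch(a && b, error = function(e) conditionMessage(e))
```

On R 4.3.0 or later, msg should hold a "coercion to 'logical(1)'" message like the one in the error above.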
The Fix
I changed && to & for element-wise comparison:
corrected_tukey = function(x) {
  outliers = array(TRUE, dim = dim(x))
  for (j in 1:ncol(x)) {
    outliers[, j] = outliers[, j] & tukey.outlier(x[, j])
  }
  outlier.vec = vector("logical", length = nrow(x))
  for (i in 1:nrow(x)) {
    outlier.vec[i] = all(outliers[i, ])
  }
  return(outlier.vec)
}
Testing the Fix
After fixing the bug, I tested it again:
set.seed(123)
test_mat = matrix(rnorm(50), nrow = 10)
corrected_tukey(test_mat)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[8] FALSE FALSE FALSE
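The function now runs without errors. The output shows that none of the 10 rows has an outlier in every column. That is what I would expect here: with random normal data, outliers in any single column are rare, so a row that is an outlier in all five columns at once would be very unlikely.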
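An all-FALSE result shows that the function runs, but not that it can ever return TRUE. As a rough extra check, the sketch below restates the two functions from this post so it runs on its own, then feeds them a small matrix with one planted extreme row. The matrix planted is hypothetical and was not part of the original assignment.

```r
# Helper and fixed function, restated from the post so this runs standalone
tukey.outlier <- function(x) {
  Q1 <- quantile(x, 0.25, na.rm = TRUE)
  Q3 <- quantile(x, 0.75, na.rm = TRUE)
  IQR <- Q3 - Q1
  x < Q1 - 1.5 * IQR | x > Q3 + 1.5 * IQR
}

corrected_tukey <- function(x) {
  outliers <- array(TRUE, dim = dim(x))
  for (j in 1:ncol(x)) {
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }
  outlier.vec <- vector("logical", length = nrow(x))
  for (i in 1:nrow(x)) {
    outlier.vec[i] <- all(outliers[i, ])
  }
  outlier.vec
}

# Hypothetical test matrix: row 10 is extreme in both columns
planted <- cbind(c(1:9, 100), c(1:9, 200))
result <- corrected_tukey(planted)

# Only the planted row should be flagged
which(result)
# [1] 10
```

For the first column the quartiles are 3.25 and 7.75, so the upper fence is 14.5 and only the value 100 falls outside it. The second column behaves the same way with 200.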
What I Learned
This debugging exercise taught me the difference between the logical operators in R. Use && and || for single values, such as the condition of an if statement, and & and | for element-wise work on vectors. Reading error messages carefully really helps pinpoint where things go wrong.
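One place where && is the right tool is a scalar condition built from several checks, because it stops evaluating at the first FALSE. The sketch below uses a made-up helper, all_positive, which is not from the assignment. It reduces a vector comparison to a single value with all() before combining it with the other checks.

```r
# Hypothetical helper: TRUE only for a non-empty numeric vector of positives
all_positive <- function(x) {
  # && stops at the first FALSE, so all(x > 0) never runs on NULL or text input
  is.numeric(x) && length(x) > 0 && all(x > 0)
}

all_positive(c(1, 2, 3))   # TRUE
all_positive(c(1, -2, 3))  # FALSE
all_positive(NULL)         # FALSE
all_positive("a")          # FALSE
```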