R double for loop: outer or apply?
The question is published on by Tutorial Guruji team.
I have the following code:
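    library(data.table)  # provides copy()

    a <- c(1, 2, 2, 3, 4, 5, 6)
    b <- c(4, 5, 6, 7, 8, 8, 9)
    data <- data.frame(cbind(a, b))
    trial <- copy(data)

    for (j in 1:ncol(trial)) {
      for (i in 2:nrow(trial)) {
        # When a value equals the one above it, replace it with the previous
        # value plus a small fraction of the column's standard deviation.
        if (trial[i, j] == trial[i - 1, j] & !is.na(trial[i, j]) & !is.na(trial[i - 1, j])) {
          trial[i, j] <- trial[i - 1, j] + (0.001 * sd(trial[, j], na.rm = TRUE))
        }
      }
    }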
The code works perfectly, but it is a bit slow on a larger dataset. I thought I could improve speed by using either the apply family or outer. The issues are:
- I know how to replace a single loop with apply, but not two, especially in this case, where I need to replace single values, according to case-specific conditions, with another single value (the lag) plus a multiple of the standard deviation (which I need to compute over the whole column);
- Except for this solved question, I have no experience at all with using outer and vectorised functions instead of loops.
Answer
With data.table
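    library(data.table)

    # Add 0.001 * sd(x) wherever a value equals the one before it.
    # shift(x) is NA for the first element, so that position is left unchanged.
    f <- function(x) ifelse(x == shift(x) & !is.na(shift(x)), x + 0.001 * sd(x, na.rm = TRUE), x)

    setDT(data)[, lapply(.SD, f)]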
With dplyr
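    library(dplyr)

    # lag(x) is NA for the first element, so that position is left unchanged.
    f <- function(x) ifelse(x == lag(x) & !is.na(lag(x)), x + 0.001 * sd(x, na.rm = TRUE), x)

    # across() replaces the deprecated mutate_each(funs(f)); it needs dplyr >= 1.0.0.
    data %>% mutate(across(everything(), f))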
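If you would rather not load a package, the same idea can be sketched in base R with lapply over the columns. This is only an illustration under a few assumptions: `bump_repeats` is a made-up helper name, the previous value is built by hand with `c(NA, head(x, -1))`, and the double loop from the question is repeated only to check that both give the same result on the sample data.

```r
# Sample data from the question.
data <- data.frame(a = c(1, 2, 2, 3, 4, 5, 6),
                   b = c(4, 5, 6, 7, 8, 8, 9))

# bump_repeats() is a hypothetical helper name, not part of any package.
# It adds 0.001 * sd(x) to every value that equals the value just before it.
bump_repeats <- function(x) {
  prev <- c(NA, head(x, -1))                    # previous value; NA for the first element
  same <- !is.na(x) & !is.na(prev) & x == prev  # TRUE only for a real repeat
  x[same] <- x[same] + 0.001 * sd(x, na.rm = TRUE)
  x
}

# lapply() walks the columns; assigning into vectorised[] keeps the data.frame shape.
vectorised <- data
vectorised[] <- lapply(data, bump_repeats)

# The double loop from the question, kept here only for comparison.
looped <- data
for (j in seq_len(ncol(looped))) {
  for (i in 2:nrow(looped)) {
    if (!is.na(looped[i, j]) && !is.na(looped[i - 1, j]) &&
        looped[i, j] == looped[i - 1, j]) {
      looped[i, j] <- looped[i - 1, j] + 0.001 * sd(looped[, j], na.rm = TRUE)
    }
  }
}

# On the sample data both approaches agree.
stopifnot(isTRUE(all.equal(vectorised, looped)))

# A run of three equal values: the vectorised version bumps both repeats.
run <- c(1, 2, 2, 2, 5)
bump_repeats(run) > run
```

One thing to check before swapping the loop for any vectorised version: they are not identical in every case. The loop compares each value with the already updated value above it and recomputes sd() on the partly updated column, while the vectorised versions compare against the original values and use the sd() of the original column. For a run such as 2, 2, 2 the loop changes only the second value, whereas the vectorised function changes the second and the third. If your data has no runs longer than two, as in the example, the results match.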