More efficient way than using a 'for' loop
I feel there's a smarter/more efficient way than this code:
df <- mtcars
df$somename <- as.array(rep(c(0), 32))
for (i in 1:32){
df$somename[i] <- sd(c(df$wt[i], df$qsec[i]))
}
Maybe with %>%? But how?
r
asked Nov 26 '18 at 5:02
Tony D
This is actually a loop as well, but one alternative is
mapply(function(x, y) sd(c(x, y)), df$wt, df$qsec)
or
sapply(1:nrow(df), function(x) sd(c(df$wt[x], df$qsec[x])))
– Ronak Shah
Nov 26 '18 at 5:20
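As a rough sketch of the two alternatives from the comment above, the following stores each result as a column and checks that it agrees with the original loop. Here `vapply()` with `seq_len()` is swapped in for `sapply(1:nrow(df), ...)`, on the assumption that a fixed return type and safe handling of zero-row data are wanted. The column names `sd_loop`, `sd_mapply`, and `sd_vapply` are made up for illustration.

```r
df <- mtcars

# Original loop, preallocating a plain numeric vector
df$sd_loop <- numeric(nrow(df))
for (i in seq_len(nrow(df))) {
  df$sd_loop[i] <- sd(c(df$wt[i], df$qsec[i]))
}

# mapply() walks wt and qsec in parallel
df$sd_mapply <- mapply(function(x, y) sd(c(x, y)), df$wt, df$qsec)

# vapply() over row indices; the numeric(1) template enforces one number per row
df$sd_vapply <- vapply(
  seq_len(nrow(df)),
  function(i) sd(c(df$wt[i], df$qsec[i])),
  numeric(1)
)

# Both alternatives should match the loop
all.equal(df$sd_loop, df$sd_mapply)
all.equal(df$sd_loop, df$sd_vapply)
```

All three still call `sd()` once per row, so any speed difference between them comes from loop overhead rather than from the computation itself.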
3 Answers
An option using purrr::map2
library(tidyverse)
mtcars %>% mutate(somename = map2(wt, qsec, ~sd(c(.x, .y))))
# mpg cyl disp hp drat wt qsec vs am gear carb somename
#1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 9.786358
#2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 10.00203
#3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 11.51877
#4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 11.47281
#5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 9.60251
#6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 11.85111
#7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 8.6762
#8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 11.88646
#9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 13.96536
#10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 10.50761
#11 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 10.93187
#12 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 9.425733
#13 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 9.807571
#14 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 10.05506
#15 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 9.001469
#16 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 8.765296
#17 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 8.538314
#18 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 12.21173
#19 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 11.95364
#20 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 12.77388
#21 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 12.40619
#22 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 9.439876
#23 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 9.804036
#24 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 8.181225
#25 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 9.337345
#26 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 11.99607
#27 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 10.29547
#28 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 10.88025
#29 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 8.01152
#30 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 9.001469
#31 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 7.799388
#32 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 11.18643
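One thing to be aware of: `map2()` returns a list, so `somename` above is a list column, which is likely why the printed values show differing numbers of digits. If a plain numeric column is wanted, `purrr::map2_dbl()` is a possible drop-in replacement. A minimal sketch, assuming the dplyr and purrr packages are installed (the variable name `res` is only for illustration):

```r
library(dplyr)
library(purrr)

# map2_dbl() returns a double vector instead of a list
res <- mtcars %>%
  mutate(somename = map2_dbl(wt, qsec, ~ sd(c(.x, .y))))

# The new column is an ordinary numeric vector
is.numeric(res$somename)
head(res$somename, 3)
```

A numeric column can be passed straight to functions such as `mean()` or `summary()`, whereas a list column would first need `unlist()`.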
Update
I re-ran @42-'s microbenchmark analysis using a larger dataset:
library(microbenchmark)
# Stack 100 copies of mtcars to get a 3200-row data frame
df <- do.call(rbind, lapply(1:100, function(x) mtcars))
res <- microbenchmark(
  orig = {
    df$somename <- as.array(rep(c(0), nrow(df)))
    for (i in 1:nrow(df)) {
      df$somename[i] <- sd(c(df$wt[i], df$qsec[i]))
    }
  },
  tidy = {
    df <- df %>% mutate(somename = map2(wt, qsec, ~sd(c(.x, .y))))
  },
  mapply = {
    df$somename <- mapply(function(x, y) sd(c(x, y)), df$wt, df$qsec)
  },
  rowMeans = {
    df$rm <- rowMeans(df[,c("wt","qsec")])
    df$sd2col <- sqrt( (df$wt - df$rm)^2 + (df$qsec - df$rm)^2 )
  }
)
res
#Unit: microseconds
# expr min lq mean median uq max
# orig 331092.86 349754.808 360716.6501 357229.3920 366635.2820 446581.924
# tidy 168701.28 181079.910 189710.1927 187026.6290 194392.5190 273725.354
# mapply 161711.77 172457.395 179326.5484 177263.3045 183688.5365 266102.901
# rowMeans 228.08 315.854 343.9151 334.8975 358.5915 807.847
library(ggplot2)
autoplot(res)
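The `rowMeans` variant is presumably so much faster because it operates on whole columns at once instead of calling `sd()` once per row. For exactly two values the sample standard deviation reduces to |x - y| / sqrt(2), so the same result can be sketched as a single vectorised expression. The names `sd_closed` and `row_by_row` are made up for illustration, and the shortcut only holds for the two-column case.

```r
df <- mtcars

# For two values x and y: sd(c(x, y)) == abs(x - y) / sqrt(2)
df$sd_closed <- abs(df$wt - df$qsec) / sqrt(2)

# Compare against the row-by-row computation
row_by_row <- mapply(function(x, y) sd(c(x, y)), df$wt, df$qsec)
all.equal(df$sd_closed, row_by_row)
```

With more than two columns the general formula for the sample standard deviation would be needed instead.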
edited Nov 26 '18 at 6:48
answered Nov 26 '18 at 5:07
Maurits Evers
The mapply solution is faster.
– 42-
Nov 26 '18 at 5:33
@42- I agree. But part of OP's question seems to ask for a tidyverse-esque approach.
– Maurits Evers
Nov 26 '18 at 5:33
@chinsoon12 Aaah bugger!! You're absolutely right. Let me re-run...
– Maurits Evers
Nov 26 '18 at 6:35
@chinsoon12 ... and done. Ok, so mapply and tidy scale very similarly, with mapply still being slightly faster.
– Maurits Evers
Nov 26 '18 at 6:40
For this particular example, df$rm <- rowMeans(df[,c("wt","qsec")]) ; df$sd2col <- sqrt( (df$wt - df$rm)^2 + (df$qsec - df$rm)^2 ) would be faster.
– chinsoon12
Nov 26 '18 at 6:42
More of a comment than an answer:
library(microbenchmark)
microbenchmark(
  orig = {
    df <- mtcars
    df$somename <- as.array(rep(c(0), 32))
    for (i in 1:32) {
      df$somename[i] <- sd(c(df$wt[i], df$qsec[i]))
    }
  },
  tidy = {
    mtcars %>% mutate(somename = map2(wt, qsec, ~sd(c(.x, .y))))
  },
  mapply = {
    mapply(function(x, y) sd(c(x, y)), df$wt, df$qsec)
  }
)
#------------------------------------
Unit: microseconds
expr min lq mean median uq max neval cld
orig 5069.391 5161.9270 5555.5886 5236.769 5490.7365 12400.502 100 b
tidy 910.071 943.9685 986.4419 970.541 998.8075 1241.711 100 a
mapply 744.639 761.1875 805.6328 773.426 807.2545 2206.393 100 a
answered Nov 26 '18 at 5:36
42-
Just out of curiosity, I re-ran the benchmark analysis using a larger dataset (see my updated post), and the for loop solution ended up being significantly faster than both the mapply and tidyverse approaches. Not much difference between mapply and purrr::map2.
– Maurits Evers
Nov 26 '18 at 6:22
Ah, I made a blunder of things (definitely time to call it a day ;-). It's fixed now. mapply and tidy scale very similarly with larger datasets (and are faster than orig), with mapply being slightly faster than tidy.
– Maurits Evers
Nov 26 '18 at 6:42
Code:
df$somename <- apply(matrix(c(df$wt, df$qsec), ncol=2), MARGIN = 1, FUN=sd)
Output:
> head(df$somename)
somename
1 9.786358
2 10.002025
3 11.518769
4 11.472808
5 9.602510
6 11.851110
7 8.676200
8 11.886465
9 13.965359
10 10.507607
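A slightly shorter variant of the same idea, offered here as a sketch rather than as part of the answer, selects the two columns by name and converts them with `as.matrix()`, so the column pairing is explicit:

```r
df <- mtcars

# Select the two columns by name; apply() then calls sd() on each row
df$somename <- apply(as.matrix(df[, c("wt", "qsec")]), MARGIN = 1, FUN = sd)

# unname() drops any names carried over from the row names
head(unname(df$somename))
```

Like `mapply`, this still calls `sd()` once per row, so it is mainly a readability choice rather than a speed-up.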
answered Nov 26 '18 at 7:27
Farah Nazifa