How to dynamically calculate the sum for each element of a nested list in R?
I have this data frame:
split.test.input <- data.frame(matrix(ncol=7,nrow=10,
c(rep("a",4),rep("b",4),rep("c",2),1910:1913,1902:1905,1925:1926,
rep("year",4),rep("month",3),rep("year",3),
rep("ITA",4),rep("HVR",2),rep("ITA",2),rep("ESP",2),
rep("GSA 17",5),rep("GSA 1",2),rep("GSA 12",3),
rep("gear 1",4),rep("gear 2",6),75,45,230,89,45,78,96,100,125,200)))
colnames(split.test.input) <- c("species", "year", "Time.unit","country","GSA","Gear","Quantity")
I split it by several variables (dlply comes from the plyr package):
library(plyr)
split.res <- dlply(split.test.input,.(species),
dlply,.(Time.unit),
dlply,.(country),
dlply,.(GSA),
dlply,.(Gear))
Now I would like to compute a statistic (in this case the sum) of Quantity for each element of the list. For example, I extract the first element (a list of a list of a list, etc.):
df.fromSplit <- data.frame(split.res[["a"]][["year"]][["ITA"]][["GSA 17"]][["gear 1"]][["Quantity"]])
colnames(df.fromSplit) <- "a,year,ITA,GSA 17,gear.1" #the name of my variables for the first list
df.fromSplit
a,year,ITA,GSA 17,gear.1
1 75
2 45
3 230
4 89
I would like to calculate the sum for this column:
sum(as.numeric(levels(df.fromSplit[,1])[df.fromSplit[,1]] ))
439
but it's not elegant...
IMPORTANT
I would like to dynamically calculate the sum of Quantity for each element of my list. The result could be (more or less) a data frame, or several data frames (one for each list), such as:
combination sum
a,year,ITA,GSA 17,gear.1 439
b,month,HVR,GSA.1,gear.2 78
[...]
and so on for each combination in the list.
I thought of a for loop that extracts each element of the list and calculates the sum of Quantity for it, but I don't know how to extract each element based on the variables (I have very little experience with lists).
r list dataframe for-loop split
asked Nov 22 '18 at 16:11 by skylobo
3 Answers
Actually, it's hard to imagine a purpose for which such a complicated object as split.res would be needed. What you are asking can be done much more simply.
First, let's convert Quantity to numeric type (currently it's a factor).
split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))
Then simply
tapply(split.test.input$Quantity, apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "), sum)
# a, year, ITA, GSA 17, gear 1 b, month, HVR, GSA 1, gear 2
# 439 78
# b, month, HVR, GSA 17, gear 2 b, month, ITA, GSA 1, gear 2
# 45 96
# b, year, ITA, GSA 12, gear 2 c, year, ESP, GSA 12, gear 2
# 100 325
or
(groups <- apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "))
# [1] "a, year, ITA, GSA 17, gear 1" "a, year, ITA, GSA 17, gear 1"
# [3] "a, year, ITA, GSA 17, gear 1" "a, year, ITA, GSA 17, gear 1"
# [5] "b, month, HVR, GSA 17, gear 2" "b, month, HVR, GSA 1, gear 2"
# [7] "b, month, ITA, GSA 1, gear 2" "b, year, ITA, GSA 12, gear 2"
# [9] "c, year, ESP, GSA 12, gear 2" "c, year, ESP, GSA 12, gear 2"
tapply(split.test.input$Quantity, groups, sum)
Also, since you are already using dlply, you may be interested in something like
ddply(split.test.input, .(species, Time.unit, country, GSA, Gear), summarise, Sum = sum(Quantity))
# species Time.unit country GSA Gear Sum
# 1 a year ITA GSA 17 gear 1 439
# 2 b month HVR GSA 1 gear 2 78
# 3 b month HVR GSA 17 gear 2 45
# 4 b month ITA GSA 1 gear 2 96
# 5 b year ITA GSA 12 gear 2 100
# 6 c year ESP GSA 12 gear 2 325
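The named vector returned by tapply can be reshaped into the two-column layout the question asks for. The following is a minimal sketch in base R, not part of the original answer. It assumes the data are rebuilt with a numeric Quantity column and without the year column; the names df, group_cols and result are hypothetical.

```r
# Rebuild the question's data (year column omitted, Quantity already numeric).
df <- data.frame(
  species   = c(rep("a", 4), rep("b", 4), rep("c", 2)),
  Time.unit = c(rep("year", 4), rep("month", 3), rep("year", 3)),
  country   = c(rep("ITA", 4), rep("HVR", 2), rep("ITA", 2), rep("ESP", 2)),
  GSA       = c(rep("GSA 17", 5), rep("GSA 1", 2), rep("GSA 12", 3)),
  Gear      = c(rep("gear 1", 4), rep("gear 2", 6)),
  Quantity  = c(75, 45, 230, 89, 45, 78, 96, 100, 125, 200),
  stringsAsFactors = FALSE
)

group_cols <- c("species", "Time.unit", "country", "GSA", "Gear")

# One label per row, built by pasting the grouping columns together.
groups <- do.call(paste, c(df[group_cols], sep = ","))

# Sum Quantity within each label, then move names and values into columns.
sums <- tapply(df$Quantity, groups, sum)
result <- data.frame(
  combination = names(sums),
  sum = as.vector(sums),
  stringsAsFactors = FALSE
)
result
```

Selecting the columns by name rather than by position (c(1, 3:6)) is a design choice here: it keeps working if the column order of the real data differs.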
Does it also work when NA values are present? Thank you.
– skylobo
Nov 22 '18 at 17:02
@skylobo, yes, adding na.rm = TRUE after sum would do that.
– Julius Vainora
Nov 22 '18 at 17:04
Thank you. I don't understand the difference between tapply and aggregate; I think the results are different. In my case (my real data frame), tapply is correct.
– skylobo
Nov 23 '18 at 8:57
@skylobo, in our examples the only difference is that I got a vector with tapply, and aggregate gave a data frame. If you get different results, perhaps there is something about variable names or indices (c(1, 3:6) in my case).
– Julius Vainora
Nov 23 '18 at 10:53
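One possible source of differing results between tapply and aggregate is how missing values are handled. This is only an assumption about the asker's real data, which is not shown. The sketch below uses a small hypothetical data frame: the formula interface of aggregate drops rows containing NA by default (na.action = na.omit), while tapply on pasted labels keeps a group whose label contains "NA", and na.rm = TRUE is passed through to sum.

```r
# Hypothetical toy data with one NA in a grouping column and one in Quantity.
df <- data.frame(
  species  = c("a", "a", "b", "b"),
  country  = c("ITA", "ITA", NA, "ESP"),
  Quantity = c(10, NA, 5, 7),
  stringsAsFactors = FALSE
)

# tapply: NA in country becomes the text "NA" inside the pasted label,
# so that group is kept; na.rm = TRUE is forwarded to sum().
groups <- paste(df$species, df$country, sep = ", ")
tap_sums <- tapply(df$Quantity, groups, sum, na.rm = TRUE)

# aggregate (formula interface): rows with NA in any formula variable
# are removed before summing, so the "b, NA" group does not appear.
agg_sums <- aggregate(Quantity ~ species + country, data = df, FUN = sum)

tap_sums
agg_sums
```

If the two functions disagree on real data, comparing the number of groups each one returns is a quick first check.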
Consider aggregate on multiple columns:
split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))
agg_df <- aggregate(Quantity ~ species + Time.unit + country + GSA + Gear,
data=split.test.input, FUN=sum)
agg_df
# species Time.unit country GSA Gear Quantity
# 1 a year ITA GSA 17 gear 1 439
# 2 b month HVR GSA 1 gear 2 78
# 3 b month ITA GSA 1 gear 2 96
# 4 c year ESP GSA 12 gear 2 325
# 5 b year ITA GSA 12 gear 2 100
# 6 b month HVR GSA 17 gear 2 45
If you need a list, run by (an object-oriented wrapper for tapply) with paste(..., collapse=" ") for the combination column:
df_list <- by(split.test.input, split.test.input[c("species", "Time.unit", "country", "GSA", "Gear")],
function(sub) unique(transform(sub,
combination = paste(unique(sub[c("species", "Time.unit", "country", "GSA", "Gear")]), collapse=" "),
sum = sum(sub$Quantity))[c("combination", "sum")])
)
df_list <- Filter(NROW, df_list)
df_list
# [[1]]
# combination sum
# 1 a year ITA GSA 17 gear 1 439
# [[2]]
# combination sum
# 6 b month HVR GSA 1 gear 2 78
# [[3]]
# combination sum
# 7 b month ITA GSA 1 gear 2 96
# [[4]]
# combination sum
# 9 c year ESP GSA 12 gear 2 325
# [[5]]
# combination sum
# 8 b year ITA GSA 12 gear 2 100
# [[6]]
# combination sum
# 5 b month HVR GSA 17 gear 2 45
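If a plain named list of sums is enough, split followed by lapply is a shorter alternative. This is a sketch under the same assumptions as before (numeric Quantity, year column omitted), and the names df, group_cols, pieces and sum_list are hypothetical. It relies on split accepting a list of grouping columns, with drop = TRUE discarding combinations that never occur.

```r
# Rebuild the question's data (year column omitted, Quantity already numeric).
df <- data.frame(
  species   = c(rep("a", 4), rep("b", 4), rep("c", 2)),
  Time.unit = c(rep("year", 4), rep("month", 3), rep("year", 3)),
  country   = c(rep("ITA", 4), rep("HVR", 2), rep("ITA", 2), rep("ESP", 2)),
  GSA       = c(rep("GSA 17", 5), rep("GSA 1", 2), rep("GSA 12", 3)),
  Gear      = c(rep("gear 1", 4), rep("gear 2", 6)),
  Quantity  = c(75, 45, 230, 89, 45, 78, 96, 100, 125, 200),
  stringsAsFactors = FALSE
)

group_cols <- c("species", "Time.unit", "country", "GSA", "Gear")

# Split Quantity by every observed combination of the grouping columns.
# The list names are the column values joined with sep.
pieces <- split(df$Quantity, df[group_cols], drop = TRUE, sep = ",")

# One sum per combination, kept as a named list.
sum_list <- lapply(pieces, sum)
sum_list[["a,year,ITA,GSA 17,gear 1"]]
```

Replacing sum with another function (mean, median, length) gives other per-group statistics without changing the rest.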
We could use the tidyverse:
library(tidyverse)
split.test.input %>%
group_by_at(vars(names(.)[c(1, 3:6)])) %>%
summarise(Quantity = sum(parse_number(Quantity)))
# A tibble: 6 x 6
# Groups: species, Time.unit, country, GSA [?]
# species Time.unit country GSA Gear Quantity
# <fct> <fct> <fct> <fct> <fct> <dbl>
#1 a year ITA GSA 17 gear 1 439
#2 b month HVR GSA 1 gear 2 78
#3 b month HVR GSA 17 gear 2 45
#4 b month ITA GSA 1 gear 2 96
#5 b year ITA GSA 12 gear 2 100
#6 c year ESP GSA 12 gear 2 325
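If a nested list like the one in the question already exists, it can also be walked recursively instead of with a for loop. This is a minimal sketch, not taken from any of the answers. It assumes every level is a named list, every leaf is a data frame, and Quantity is numeric; nested_sums and the toy list are hypothetical.

```r
# Recursively walk a nested named list whose leaves are data frames.
# 'path' accumulates the list names visited on the way down.
nested_sums <- function(x, path = character(0)) {
  if (is.data.frame(x)) {
    # Leaf: report the joined path and the sum of Quantity.
    return(data.frame(
      combination = paste(path, collapse = ","),
      sum = sum(x$Quantity),
      stringsAsFactors = FALSE
    ))
  }
  # Inner node: recurse into each element, extending the path with its name.
  parts <- Map(function(el, nm) nested_sums(el, c(path, nm)), x, names(x))
  do.call(rbind, unname(parts))
}

# Toy nested list, two levels deep (hypothetical data).
toy <- list(
  a = list(year = data.frame(Quantity = c(75, 45, 230, 89))),
  b = list(
    month = data.frame(Quantity = c(45, 78)),
    year  = data.frame(Quantity = 100)
  )
)

res <- nested_sums(toy)
res
```

The data frame check comes first because a data frame is itself a list; testing is.list first would recurse into its columns.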
answered Nov 22 '18 at 16:30 by Julius Vainora (tapply/ddply answer; edited Nov 22 '18 at 17:23)
answered Nov 22 '18 at 16:30 by Parfait (aggregate/by answer; edited Nov 22 '18 at 16:37)
answered Nov 22 '18 at 17:11 by akrun (tidyverse answer)