How to dynamically calculate the sum for each element of a nested list in R?

I have this data frame:

split.test.input <- data.frame(matrix(ncol = 7, nrow = 10,
  c(rep("a", 4), rep("b", 4), rep("c", 2), 1910:1913, 1902:1905, 1925:1926,
    rep("year", 4), rep("month", 3), rep("year", 3),
    rep("ITA", 4), rep("HVR", 2), rep("ITA", 2), rep("ESP", 2),
    rep("GSA 17", 5), rep("GSA 1", 2), rep("GSA 12", 3),
    rep("gear 1", 4), rep("gear 2", 6), 75, 45, 230, 89, 45, 78, 96, 100, 125, 200)))

colnames(split.test.input) <- c("species", "year", "Time.unit", "country", "GSA", "Gear", "Quantity")

I split it by several variables (dlply comes from the plyr package):

library(plyr)

split.res <- dlply(split.test.input, .(species),
      dlply, .(Time.unit),
      dlply, .(country),
      dlply, .(GSA),
      dlply, .(Gear))

Now I would like to compute a statistic (in this case the sum) of Quantity for each element of the list. As an example, I extract the first element (a list of a list of a list, and so on):

df.fromSplit <- data.frame(split.res[["a"]][["year"]][["ITA"]][["GSA 17"]][["gear 1"]][["Quantity"]])

colnames(df.fromSplit) <- "a,year,ITA,GSA 17,gear.1" # the names of my variables for the first list

df.fromSplit
#   a,year,ITA,GSA 17,gear.1
# 1                       75
# 2                       45
# 3                      230
# 4                       89

I would like to calculate the sum of this column:

sum(as.numeric(levels(df.fromSplit[,1])[df.fromSplit[,1]]))
# 439

but it's not elegant...

IMPORTANT

I would like to dynamically calculate the sum of Quantity for each element of my list. The result could be (more or less) a data frame, or several data frames (one for each list), such as:

combination                 sum
a,year,ITA,GSA 17,gear.1    439
b,month,HVR,GSA.1,gear.2    78
[...]

and so on for each combination in the list.

I thought of a for loop that extracts each element of the list and calculates the sum of Quantity for it, but I don't know how to extract each element based on the variables in a loop (I have very little experience with lists).

asked Nov 22 '18 at 16:11

skylobo

r list dataframe for-loop split


3 Answers

Actually, it's hard to imagine a purpose for which such a complicated object as split.res would be needed. What you are asking for can be done much more simply.

First, let's convert Quantity to a numeric type (currently it's a factor).

split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))

Then simply

tapply(split.test.input$Quantity, apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "), sum)
#  a, year, ITA, GSA 17, gear 1  b, month, HVR, GSA 1, gear 2 
#                           439                            78 
# b, month, HVR, GSA 17, gear 2  b, month, ITA, GSA 1, gear 2 
#                            45                            96 
#  b, year, ITA, GSA 12, gear 2  c, year, ESP, GSA 12, gear 2 
#                           100                           325

Or, in two steps, building the group labels first:

(groups <- apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "))
#  [1] "a, year, ITA, GSA 17, gear 1"  "a, year, ITA, GSA 17, gear 1" 
#  [3] "a, year, ITA, GSA 17, gear 1"  "a, year, ITA, GSA 17, gear 1" 
#  [5] "b, month, HVR, GSA 17, gear 2" "b, month, HVR, GSA 1, gear 2" 
#  [7] "b, month, ITA, GSA 1, gear 2"  "b, year, ITA, GSA 12, gear 2" 
#  [9] "c, year, ESP, GSA 12, gear 2"  "c, year, ESP, GSA 12, gear 2" 

tapply(split.test.input$Quantity, groups, sum)
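tapply returns a named vector, while the question asks for a table with a combination column and a sum column. One possible way to reshape it is sketched below on a small made-up data set; the names toy and result are illustrative only and do not come from the question.

```r
# Toy data standing in for split.test.input (values chosen for illustration)
toy <- data.frame(
  species  = c("a", "a", "b"),
  Gear     = c("gear 1", "gear 1", "gear 2"),
  Quantity = c(75, 45, 78),
  stringsAsFactors = FALSE
)

# One label per row, built from the grouping columns
groups <- paste(toy$species, toy$Gear, sep = ", ")

# Named vector of sums, one element per distinct label
sums <- tapply(toy$Quantity, groups, sum)

# Reshape into a two-column data frame: label and total
result <- data.frame(combination = names(sums),
                     sum = as.vector(sums),
                     row.names = NULL,
                     stringsAsFactors = FALSE)
print(result)
```

The same reshaping should apply to the full groups vector above, since only the names and values of the tapply result are used.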

Also, since you are already using dlply, you may be interested in something like

ddply(split.test.input, .(species, Time.unit, country, GSA, Gear), summarise, Sum = sum(Quantity))
#   species Time.unit country    GSA   Gear Sum
# 1       a      year     ITA GSA 17 gear 1 439
# 2       b     month     HVR  GSA 1 gear 2  78
# 3       b     month     HVR GSA 17 gear 2  45
# 4       b     month     ITA  GSA 1 gear 2  96
# 5       b      year     ITA GSA 12 gear 2 100
# 6       c      year     ESP GSA 12 gear 2 325
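If a nested list like split.res has already been built and must be used as is, one option is a recursive walk that descends until it reaches a data frame and records the names met along the way. The sketch below is a hypothetical helper (sum_leaves is not part of plyr or base R), shown on a small hand-built list with two levels instead of five. It assumes every leaf is a data frame with a Quantity column and has not been run against split.res itself.

```r
# Hypothetical helper: recursively sum one column of every data frame found
# at the leaves of a nested list, labelling each sum with the path of names.
sum_leaves <- function(x, path = character(0), column = "Quantity") {
  if (is.data.frame(x)) {
    # Leaf reached: go through character so factor columns are handled too
    total <- sum(as.numeric(as.character(x[[column]])))
    return(data.frame(combination = paste(path, collapse = ","),
                      sum = total,
                      stringsAsFactors = FALSE))
  }
  # Inner node: recurse into each named child and stack the results
  parts <- lapply(names(x), function(nm) sum_leaves(x[[nm]], c(path, nm), column))
  do.call(rbind, parts)
}

# Small nested list built by hand for illustration
nested <- list(
  a = list(year  = data.frame(Quantity = c(75, 45, 230, 89))),
  b = list(month = data.frame(Quantity = c(45, 78)),
           year  = data.frame(Quantity = 100))
)

res <- sum_leaves(nested)
print(res)
```

The depth of the list does not need to be known in advance, which is the "dynamic" part of the question; grouping the original data frame directly is still the simpler route.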

edited Nov 22 '18 at 17:23

answered Nov 22 '18 at 16:30

Julius Vainora


Does it also work in the presence of NA values? Thank you.

– skylobo
Nov 22 '18 at 17:02

@skylobo, yes, adding na.rm = TRUE after sum would do that.

– Julius Vainora
Nov 22 '18 at 17:04

Thank you. I don't understand the difference between tapply and aggregate; I think the results are different. In my case (my real df), tapply is correct.

– skylobo
Nov 23 '18 at 8:57

@skylobo, in our examples the only difference now is that I got a vector with tapply, and aggregate gave a data frame. If you get different results, perhaps there is something about variable names or indices (c(1, 3:6) in my case).

– Julius Vainora
Nov 23 '18 at 10:53


Consider aggregate on multiple columns:

split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))

agg_df <- aggregate(Quantity ~ species + Time.unit + country + GSA + Gear,
                    data=split.test.input, FUN=sum)

agg_df
#   species Time.unit country    GSA   Gear Quantity
# 1       a      year     ITA GSA 17 gear 1      439
# 2       b     month     HVR  GSA 1 gear 2       78
# 3       b     month     ITA  GSA 1 gear 2       96
# 4       c      year     ESP GSA 12 gear 2      325
# 5       b      year     ITA GSA 12 gear 2      100
# 6       b     month     HVR GSA 17 gear 2       45
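One situation where aggregate and tapply can give different results is missing data. With the formula interface, aggregate drops every row that has an NA in any of the variables used (its default na.action is na.omit), whereas tapply on pasted labels turns an NA in a grouping column into the text "NA" and keeps the row. A small sketch on made-up data (the toy data frame is illustrative, not the asker's real data):

```r
toy <- data.frame(
  species  = c("a", "a", "b", "b"),
  country  = c("ITA", "ITA", NA, "HVR"),
  Quantity = c(75, NA, 96, 100),
  stringsAsFactors = FALSE
)

# Formula interface: rows containing any NA are removed before grouping
agg <- aggregate(Quantity ~ species + country, data = toy, FUN = sum)

# tapply on pasted labels: an NA grouping value becomes the string "NA",
# and na.rm = TRUE is passed on to sum()
groups <- paste(toy$species, toy$country, sep = ", ")
tap <- tapply(toy$Quantity, groups, sum, na.rm = TRUE)

print(agg)
print(tap)
```

Here aggregate keeps two groups while tapply keeps three, including a "b, NA" group. Whether that extra group is wanted depends on the data, so this is only one possible explanation for differing results.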

If you need a list, run by (an object-oriented wrapper for tapply) with paste(..., collapse=" ") to build the combination column:

df_list <- by(split.test.input, split.test.input[c("species", "Time.unit", "country", "GSA", "Gear")],
              function(sub) unique(transform(sub,
                                             combination = paste(unique(sub[c("species", "Time.unit", "country", "GSA", "Gear")]), collapse=" "),
                                             sum = sum(sub$Quantity))[c("combination", "sum")])
)

# Drop the empty (NULL) results for combinations that do not occur in the data
df_list <- Filter(NROW, df_list)

df_list

# [[1]]
#                combination sum
# 1 a year ITA GSA 17 gear 1 439

# [[2]]
#                combination sum
# 6 b month HVR GSA 1 gear 2  78

# [[3]]
#                combination sum
# 7 b month ITA GSA 1 gear 2  96

# [[4]]
#                combination sum
# 9 c year ESP GSA 12 gear 2 325

# [[5]]
#                combination sum
# 8 b year ITA GSA 12 gear 2 100

# [[6]]
#                 combination sum
# 5 b month HVR GSA 17 gear 2  45

edited Nov 22 '18 at 16:37

answered Nov 22 '18 at 16:30

Parfait


We could use the tidyverse:

library(tidyverse)

split.test.input %>%
    group_by_at(vars(names(.)[c(1, 3:6)])) %>%
    summarise(Quantity = sum(parse_number(Quantity)))
# A tibble: 6 x 6
# Groups:   species, Time.unit, country, GSA [?]
#  species Time.unit country GSA    Gear   Quantity
#  <fct>   <fct>     <fct>   <fct>  <fct>     <dbl>
#1 a       year      ITA     GSA 17 gear 1      439
#2 b       month     HVR     GSA 1  gear 2       78
#3 b       month     HVR     GSA 17 gear 2       45
#4 b       month     ITA     GSA 1  gear 2       96
#5 b       year      ITA     GSA 12 gear 2      100
#6 c       year      ESP     GSA 12 gear 2      325

answered Nov 22 '18 at 17:11

akrun

404k13196269

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53434775%2fhow-to-calculate-dynamically-the-sum-for-each-element-of-nestled-list-in-r%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

Actually it's hard to imagine a purpose for which such a complicated object as split.res would be needed. What you are asking can be done much simpler.

First, let's convert Quantity to numeric type (currently it's a factor).

split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))

Then simply

tapply(split.test.input$Quantity, apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "), sum)

#  a, year, ITA, GSA 17, gear 1  b, month, HVR, GSA 1, gear 2 

#                           439                            78 

# b, month, HVR, GSA 17, gear 2  b, month, ITA, GSA 1, gear 2 

#                            45                            96 

#  b, year, ITA, GSA 12, gear 2  c, year, ESP, GSA 12, gear 2 

#                           100                           325

(groups <- apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "))

#  [1] "a, year, ITA, GSA 17, gear 1"  "a, year, ITA, GSA 17, gear 1" 

#  [3] "a, year, ITA, GSA 17, gear 1"  "a, year, ITA, GSA 17, gear 1" 

#  [5] "b, month, HVR, GSA 17, gear 2" "b, month, HVR, GSA 1, gear 2" 

#  [7] "b, month, ITA, GSA 1, gear 2"  "b, year, ITA, GSA 12, gear 2" 

#  [9] "c, year, ESP, GSA 12, gear 2"  "c, year, ESP, GSA 12, gear 2" 

tapply(split.test.input$Quantity, groups, sum)

Also, since you already are using dlply, you may be interested in something like

ddply(split.test.input, .(species, Time.unit, country, GSA, Gear), summarise, Sum = sum(Quantity))

  species Time.unit country    GSA   Gear Sum

# 1       a      year     ITA GSA 17 gear 1 439

# 2       b     month     HVR  GSA 1 gear 2  78

# 3       b     month     HVR GSA 17 gear 2  45

# 4       b     month     ITA  GSA 1 gear 2  96

# 5       b      year     ITA GSA 12 gear 2 100

# 6       c      year     ESP GSA 12 gear 2 325

edited Nov 22 '18 at 17:23

answered Nov 22 '18 at 16:30

Julius Vainora

36k76380

Is it works also for the presence of NA values? thank you

– skylobo
Nov 22 '18 at 17:02

@skylobo, yes, adding na.rm = TRUE after sum would do that.

– Julius Vainora
Nov 22 '18 at 17:04

Thank you. I don't understand the difference between tapply and aggregate, I think that the results are different. In my case (my true df) tapply is correct

– skylobo
Nov 23 '18 at 8:57

@skylobo, in our examples now the only difference is that I got a vector with tapply, and aggregate gave a data frame. If you get different results, perhaps there is something about variable names or indices (c(1, 3:6) in my case).

– Julius Vainora
Nov 23 '18 at 10:53

add a comment |

Actually it's hard to imagine a purpose for which such a complicated object as split.res would be needed. What you are asking can be done much simpler.

First, let's convert Quantity to numeric type (currently it's a factor).

split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))

Then simply

tapply(split.test.input$Quantity, apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "), sum)

#  a, year, ITA, GSA 17, gear 1  b, month, HVR, GSA 1, gear 2 

#                           439                            78 

# b, month, HVR, GSA 17, gear 2  b, month, ITA, GSA 1, gear 2 

#                            45                            96 

#  b, year, ITA, GSA 12, gear 2  c, year, ESP, GSA 12, gear 2 

#                           100                           325

(groups <- apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "))

#  [1] "a, year, ITA, GSA 17, gear 1"  "a, year, ITA, GSA 17, gear 1" 

#  [3] "a, year, ITA, GSA 17, gear 1"  "a, year, ITA, GSA 17, gear 1" 

#  [5] "b, month, HVR, GSA 17, gear 2" "b, month, HVR, GSA 1, gear 2" 

#  [7] "b, month, ITA, GSA 1, gear 2"  "b, year, ITA, GSA 12, gear 2" 

#  [9] "c, year, ESP, GSA 12, gear 2"  "c, year, ESP, GSA 12, gear 2" 

tapply(split.test.input$Quantity, groups, sum)

Also, since you already are using dlply, you may be interested in something like

ddply(split.test.input, .(species, Time.unit, country, GSA, Gear), summarise, Sum = sum(Quantity))

  species Time.unit country    GSA   Gear Sum

# 1       a      year     ITA GSA 17 gear 1 439

# 2       b     month     HVR  GSA 1 gear 2  78

# 3       b     month     HVR GSA 17 gear 2  45

# 4       b     month     ITA  GSA 1 gear 2  96

# 5       b      year     ITA GSA 12 gear 2 100

# 6       c      year     ESP GSA 12 gear 2 325

edited Nov 22 '18 at 17:23

answered Nov 22 '18 at 16:30

Julius Vainora

36k76380

Is it works also for the presence of NA values? thank you

– skylobo
Nov 22 '18 at 17:02

@skylobo, yes, adding na.rm = TRUE after sum would do that.

– Julius Vainora
Nov 22 '18 at 17:04

Thank you. I don't understand the difference between tapply and aggregate, I think that the results are different. In my case (my true df) tapply is correct

– skylobo
Nov 23 '18 at 8:57

@skylobo, in our examples now the only difference is that I got a vector with tapply, and aggregate gave a data frame. If you get different results, perhaps there is something about variable names or indices (c(1, 3:6) in my case).

– Julius Vainora
Nov 23 '18 at 10:53

add a comment |

Actually it's hard to imagine a purpose for which such a complicated object as split.res would be needed. What you are asking can be done much simpler.

First, let's convert Quantity to numeric type (currently it's a factor).

split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))

Then simply

tapply(split.test.input$Quantity, apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "), sum)

#  a, year, ITA, GSA 17, gear 1  b, month, HVR, GSA 1, gear 2 

#                           439                            78 

# b, month, HVR, GSA 17, gear 2  b, month, ITA, GSA 1, gear 2 

#                            45                            96 

#  b, year, ITA, GSA 12, gear 2  c, year, ESP, GSA 12, gear 2 

#                           100                           325

(groups <- apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "))

#  [1] "a, year, ITA, GSA 17, gear 1"  "a, year, ITA, GSA 17, gear 1" 

#  [3] "a, year, ITA, GSA 17, gear 1"  "a, year, ITA, GSA 17, gear 1" 

#  [5] "b, month, HVR, GSA 17, gear 2" "b, month, HVR, GSA 1, gear 2" 

#  [7] "b, month, ITA, GSA 1, gear 2"  "b, year, ITA, GSA 12, gear 2" 

#  [9] "c, year, ESP, GSA 12, gear 2"  "c, year, ESP, GSA 12, gear 2" 

tapply(split.test.input$Quantity, groups, sum)

Also, since you already are using dlply, you may be interested in something like

ddply(split.test.input, .(species, Time.unit, country, GSA, Gear), summarise, Sum = sum(Quantity))

  species Time.unit country    GSA   Gear Sum

# 1       a      year     ITA GSA 17 gear 1 439

# 2       b     month     HVR  GSA 1 gear 2  78

# 3       b     month     HVR GSA 17 gear 2  45

# 4       b     month     ITA  GSA 1 gear 2  96

# 5       b      year     ITA GSA 12 gear 2 100

# 6       c      year     ESP GSA 12 gear 2 325

edited Nov 22 '18 at 17:23

answered Nov 22 '18 at 16:30

Julius Vainora

36k76380

Actually it's hard to imagine a purpose for which such a complicated object as split.res would be needed. What you are asking can be done much simpler.

First, let's convert Quantity to numeric type (currently it's a factor).

split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))

Then simply

tapply(split.test.input$Quantity, apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "), sum)

#  a, year, ITA, GSA 17, gear 1  b, month, HVR, GSA 1, gear 2 

#                           439                            78 

# b, month, HVR, GSA 17, gear 2  b, month, ITA, GSA 1, gear 2 

#                            45                            96 

#  b, year, ITA, GSA 12, gear 2  c, year, ESP, GSA 12, gear 2 

#                           100                           325

(groups <- apply(split.test.input[c(1, 3:6)], 1, paste0, collapse = ", "))

#  [1] "a, year, ITA, GSA 17, gear 1"  "a, year, ITA, GSA 17, gear 1" 

#  [3] "a, year, ITA, GSA 17, gear 1"  "a, year, ITA, GSA 17, gear 1" 

#  [5] "b, month, HVR, GSA 17, gear 2" "b, month, HVR, GSA 1, gear 2" 

#  [7] "b, month, ITA, GSA 1, gear 2"  "b, year, ITA, GSA 12, gear 2" 

#  [9] "c, year, ESP, GSA 12, gear 2"  "c, year, ESP, GSA 12, gear 2" 

tapply(split.test.input$Quantity, groups, sum)

Also, since you already are using dlply, you may be interested in something like

ddply(split.test.input, .(species, Time.unit, country, GSA, Gear), summarise, Sum = sum(Quantity))

  species Time.unit country    GSA   Gear Sum

# 1       a      year     ITA GSA 17 gear 1 439

# 2       b     month     HVR  GSA 1 gear 2  78

# 3       b     month     HVR GSA 17 gear 2  45

# 4       b     month     ITA  GSA 1 gear 2  96

# 5       b      year     ITA GSA 12 gear 2 100

# 6       c      year     ESP GSA 12 gear 2 325

edited Nov 22 '18 at 17:23

answered Nov 22 '18 at 16:30

Julius Vainora

36k76380

edited Nov 22 '18 at 17:23

answered Nov 22 '18 at 16:30

Julius Vainora

36k76380

answered Nov 22 '18 at 16:30

Julius Vainora

36k76380

answered Nov 22 '18 at 16:30

Julius Vainora

36k76380

Is it works also for the presence of NA values? thank you

– skylobo
Nov 22 '18 at 17:02

@skylobo, yes, adding na.rm = TRUE after sum would do that.

– Julius Vainora
Nov 22 '18 at 17:04

Thank you. I don't understand the difference between tapply and aggregate, I think that the results are different. In my case (my true df) tapply is correct

– skylobo
Nov 23 '18 at 8:57

@skylobo, in our examples now the only difference is that I got a vector with tapply, and aggregate gave a data frame. If you get different results, perhaps there is something about variable names or indices (c(1, 3:6) in my case).

– Julius Vainora
Nov 23 '18 at 10:53

add a comment |

Is it works also for the presence of NA values? thank you

– skylobo
Nov 22 '18 at 17:02

@skylobo, yes, adding na.rm = TRUE after sum would do that.

– Julius Vainora
Nov 22 '18 at 17:04

Thank you. I don't understand the difference between tapply and aggregate, I think that the results are different. In my case (my true df) tapply is correct

– skylobo
Nov 23 '18 at 8:57

@skylobo, in our examples now the only difference is that I got a vector with tapply, and aggregate gave a data frame. If you get different results, perhaps there is something about variable names or indices (c(1, 3:6) in my case).

– Julius Vainora
Nov 23 '18 at 10:53

Is it works also for the presence of NA values? thank you

– skylobo
Nov 22 '18 at 17:02

@skylobo, yes, adding na.rm = TRUE after sum would do that.

– Julius Vainora
Nov 22 '18 at 17:04

Thank you. I don't understand the difference between tapply and aggregate, I think that the results are different. In my case (my true df) tapply is correct

– skylobo
Nov 23 '18 at 8:57

@skylobo, in our examples now the only difference is that I got a vector with tapply, and aggregate gave a data frame. If you get different results, perhaps there is something about variable names or indices (c(1, 3:6) in my case).

– Julius Vainora
Nov 23 '18 at 10:53

add a comment |

Consider aggregate on multiple columns:

split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))



agg_df <- aggregate(Quantity ~ species + Time.unit + country + GSA + Gear,

                    data=split.test.input, FUN=sum)



agg_df

#   species Time.unit country    GSA   Gear Quantity

# 1       a      year     ITA GSA 17 gear 1      439

# 2       b     month     HVR  GSA 1 gear 2       78

# 3       b     month     ITA  GSA 1 gear 2       96

# 4       c      year     ESP GSA 12 gear 2      325

# 5       b      year     ITA GSA 12 gear 2      100

# 6       b     month     HVR GSA 17 gear 2       45

If needing a list, run by (object-oriented wrapper to tapply) with paste(..., collapse="") for combination column:

df_list <- by(split.test.input, split.test.input[c("species", "Time.unit", "country", "GSA", "Gear")],

              function(sub) unique(transform(sub,

                                             combination = paste(unique(sub[c("species", "Time.unit", "country", "GSA", "Gear")]), collapse=" "),

                                             sum = sum(sub$Quantity))[c("combination", "sum")])

)

df_list <- Filter(NROW, df_list)

df_list



# [[1]]

#                combination sum

# 1 a year ITA GSA 17 gear 1 439



# [[2]]

#                combination sum

# 6 b month HVR GSA 1 gear 2  78



# [[3]]

#                combination sum

# 7 b month ITA GSA 1 gear 2  96



# [[4]]

#                combination sum

# 9 c year ESP GSA 12 gear 2 325



# [[5]]

#                combination sum

# 8 b year ITA GSA 12 gear 2 100



# [[6]]

#                 combination sum

# 5 b month HVR GSA 17 gear 2  45

edited Nov 22 '18 at 16:37

answered Nov 22 '18 at 16:30

Parfait

51k84270

add a comment |

Consider aggregate on multiple columns:

split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))



agg_df <- aggregate(Quantity ~ species + Time.unit + country + GSA + Gear,

                    data=split.test.input, FUN=sum)



agg_df

#   species Time.unit country    GSA   Gear Quantity

# 1       a      year     ITA GSA 17 gear 1      439

# 2       b     month     HVR  GSA 1 gear 2       78

# 3       b     month     ITA  GSA 1 gear 2       96

# 4       c      year     ESP GSA 12 gear 2      325

# 5       b      year     ITA GSA 12 gear 2      100

# 6       b     month     HVR GSA 17 gear 2       45

If needing a list, run by (object-oriented wrapper to tapply) with paste(..., collapse="") for combination column:

df_list <- by(split.test.input, split.test.input[c("species", "Time.unit", "country", "GSA", "Gear")],

              function(sub) unique(transform(sub,

                                             combination = paste(unique(sub[c("species", "Time.unit", "country", "GSA", "Gear")]), collapse=" "),

                                             sum = sum(sub$Quantity))[c("combination", "sum")])

)

df_list <- Filter(NROW, df_list)

df_list



# [[1]]

#                combination sum

# 1 a year ITA GSA 17 gear 1 439



# [[2]]

#                combination sum

# 6 b month HVR GSA 1 gear 2  78



# [[3]]

#                combination sum

# 7 b month ITA GSA 1 gear 2  96



# [[4]]

#                combination sum

# 9 c year ESP GSA 12 gear 2 325



# [[5]]

#                combination sum

# 8 b year ITA GSA 12 gear 2 100



# [[6]]

#                 combination sum

# 5 b month HVR GSA 17 gear 2  45

edited Nov 22 '18 at 16:37

answered Nov 22 '18 at 16:30

Parfait

51k84270

add a comment |

Consider aggregate on multiple columns:

split.test.input$Quantity <- as.numeric(as.character(split.test.input$Quantity))



agg_df <- aggregate(Quantity ~ species + Time.unit + country + GSA + Gear,

                    data=split.test.input, FUN=sum)



agg_df

#   species Time.unit country    GSA   Gear Quantity

# 1       a      year     ITA GSA 17 gear 1      439

# 2       b     month     HVR  GSA 1 gear 2       78

# 3       b     month     ITA  GSA 1 gear 2       96

# 4       c      year     ESP GSA 12 gear 2      325

# 5       b      year     ITA GSA 12 gear 2      100

# 6       b     month     HVR GSA 17 gear 2       45

If needing a list, run by (object-oriented wrapper to tapply) with paste(..., collapse="") for combination column:

df_list <- by(split.test.input, split.test.input[c("species", "Time.unit", "country", "GSA", "Gear")],

              function(sub) unique(transform(sub,

                                             combination = paste(unique(sub[c("species", "Time.unit", "country", "GSA", "Gear")]), collapse=" "),

                                             sum = sum(sub$Quantity))[c("combination", "sum")])

)

df_list <- Filter(NROW, df_list)

df_list



# [[1]]

#                combination sum

# 1 a year ITA GSA 17 gear 1 439



# [[2]]

#                combination sum

# 6 b month HVR GSA 1 gear 2  78



# [[3]]

#                combination sum

# 7 b month ITA GSA 1 gear 2  96



# [[4]]

#                combination sum

# 9 c year ESP GSA 12 gear 2 325



# [[5]]

#                combination sum

# 8 b year ITA GSA 12 gear 2 100



# [[6]]

#                 combination sum

# 5 b month HVR GSA 17 gear 2  45

edited Nov 22 '18 at 16:37

answered Nov 22 '18 at 16:30

Parfait

51k84270


We could use the tidyverse:

library(tidyverse)

split.test.input %>%

    group_by_at(vars(names(.)[c(1, 3:6)])) %>% 

    summarise(Quantity = sum(parse_number(Quantity)))

# A tibble: 6 x 6

# Groups:   species, Time.unit, country, GSA [?]

#  species Time.unit country GSA    Gear   Quantity

#  <fct>   <fct>     <fct>   <fct>  <fct>     <dbl>

#1 a       year      ITA     GSA 17 gear 1      439

#2 b       month     HVR     GSA 1  gear 2       78

#3 b       month     HVR     GSA 17 gear 2       45

#4 b       month     ITA     GSA 1  gear 2       96

#5 b       year      ITA     GSA 12 gear 2      100

#6 c       year      ESP     GSA 12 gear 2      325

answered Nov 22 '18 at 17:11

akrun

404k13196269

