Sum of independent Binomial random variables with different probabilities

12
Suppose I have independent random variables $X_i$ which are distributed binomially via
$$X_i \sim \mathrm{Bin}(n_i, p_i).$$

Are there relatively simple formulae, or at least bounds, available for the distribution of
$$S = \sum_i X_i\,?$$
      probability-distributions random






edited Dec 23 '16 at 23:45 by suomynonA · asked Mar 30 '11 at 19:35 by Lagerbaer

          4 Answers


















9
See the paper "The Distribution of a Sum of Binomial Random Variables" by Ken Butler and Michael Stephens.






answered Mar 30 '11 at 19:40 by PrimeNumber
• I have the same question, and I read the paper (The Distribution of a Sum of Binomial Random Variables by Ken Butler and Michael Stephens). Unfortunately, the approximations are not clear to me (for example, how are the probabilities in Table 2 calculated?). – May, Dec 6 '12 at 22:33

• @May: I had the same problem recently, and I ended up using the explicit formula given in the linked paper. It should be fine if you don't have too many samples. – Michael Kuhn, Dec 7 '12 at 20:26

• @MichaelKuhn: So you used the Poisson binomial distribution function (en.wikipedia.org/wiki/Poisson_binomial_distribution). Unfortunately, I have many samples, so I need to use an approximation. – May, Dec 11 '12 at 0:16

• @May: There seems to be an R package for this distribution: cran.r-project.org/web/packages/poibin/poibin.pdf – Michael Kuhn, Dec 11 '12 at 9:09

• Page Not Found ... – Dor, Sep 13 '15 at 0:30
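
Following up on the poibin pointer in the comments: a minimal sketch of how that package could be used here, assuming the dpoibin/ppoibin interface described in the linked CRAN documentation (the $n_i$ and $p_i$ values are illustrative, not from the thread):

    # S = sum of Bin(n_i, p_i) is Poisson binomial with one success probability
    # per underlying Bernoulli trial (interface assumed from the CRAN docs).
    library(poibin)

    ns <- c(10, 20, 15)
    ps <- c(0.2, 0.5, 0.8)
    pp <- rep(ps, times = ns)   # one entry per trial, length sum(ns)

    dpoibin(kk = 25, pp = pp)   # P(S = 25)
    ppoibin(kk = 25, pp = pp)   # P(S <= 25)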



















4

This answer provides an R implementation of the explicit formula from the paper linked in the accepted answer (The Distribution of a Sum of Binomial Random Variables by Ken Butler and Michael Stephens). (In fact, the code can be used to combine any two independent distributions on the non-negative integers, given as finite probability vectors):



    # Explicitly combine (convolve) two probability distributions, each given
    # as a vector of probabilities (first element = count 0).
    combine.distributions <- function(a, b) {

      # Ensure a is the longer vector; the outer product below then has one
      # row per element of a.
      if (length(a) < length(b)) {
        t <- a
        a <- b
        b <- t
      }

      # Outer product of the two distributions: m[i, j] = P(A = i-1) * P(B = j-1).
      m <- a %*% t(b)

      # Initialize the final result; element 1 = count 0.
      result <- rep(0, length(a) + length(b) - 1)

      # Add the probabilities, shifting row i to the slice of the result
      # vector that starts at total count i-1.
      for (i in 1:nrow(m)) {
        result[i:(ncol(m) + i - 1)] <- result[i:(ncol(m) + i - 1)] + m[i, ]
      }

      result
    }

    a <- dbinom(0:1000, 1000, 0.5)
    b <- dbinom(0:2000, 2000, 0.9)

    ab <- combine.distributions(a, b)
    ab.df <- data.frame(N = 0:(length(ab) - 1), p = ab)

    plot(ab.df$N, ab.df$p, type = "l")
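
As a hypothetical usage sketch (the $n_i$ and $p_i$ values are illustrative), the function can also be folded over any number of binomial PMFs to obtain the full distribution of $S$:

    # Fold combine.distributions over several binomial PMFs to get the PMF of
    # S = sum_i X_i. The n_i and p_i values here are only illustrative.
    ns <- c(10, 20, 15)
    ps <- c(0.2, 0.5, 0.8)

    pmfs  <- Map(function(n, p) dbinom(0:n, n, p), ns, ps)
    s.pmf <- Reduce(combine.distributions, pmfs)

    sum(s.pmf)  # should be 1, up to floating-point error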





answered Dec 13 '12 at 10:03 by Michael Kuhn
3

One short answer is that a normal approximation still works well as long as the variance $\sigma^2 = \sum_i n_i p_i(1-p_i)$ is not too small. Compute the mean $\mu = \sum_i n_i p_i$ and the variance, and approximate $S$ by $N(\mu, \sigma)$.
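
As a minimal R sketch of this approximation (the $n_i$, $p_i$ values and the continuity correction are illustrative additions, not part of the answer above):

    ns <- c(10, 20, 15)                      # illustrative n_i
    ps <- c(0.2, 0.5, 0.8)                   # illustrative p_i

    mu    <- sum(ns * ps)                    # mean of S
    sigma <- sqrt(sum(ns * ps * (1 - ps)))   # standard deviation of S

    # Approximate P(S <= s); the +0.5 is a continuity correction,
    # since S is integer-valued.
    s <- 25
    pnorm(s + 0.5, mean = mu, sd = sigma)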






answered Mar 30 '11 at 20:09 by Douglas Zare
• Unfortunately, I cannot say anything about the variance. In what direction would the normal approximation go? – Lagerbaer, Mar 30 '11 at 22:31

• Do you mean, "is the normal approximation an overestimate or an underestimate?" That depends on the range of values you are considering. Both distributions have total mass $1$. – Douglas Zare, Mar 30 '11 at 23:29

• I'd be interested in an estimate of the expected value. – Lagerbaer, Mar 31 '11 at 3:03

• To use a normal approximation, you have to know the mean and variance. (To use the more complicated approximations in the paper PEV cited, you need more information, such as the first 4 moments.) If you don't know the expected value, then what do you know about these binomial summands? – Douglas Zare, Apr 3 '11 at 4:41

• I know $n_i$ and $p_i$ for each of the summands, and hence I know the expected value and variance of each of the summands. – Lagerbaer, Apr 3 '11 at 4:44



















2

It is possible to get a Chernoff bound using the standard moment generating function method:
$$
\begin{align}
\Pr[S \ge s]
&\le E\left[\exp\left(t \sum_i X_i\right)\right]\exp(-st)
\\&= \prod_i \left(1 + (e^t-1) p_i\right) \exp(-st)
\\&\le \exp\left(\sum_i (e^t-1) p_i - st\right)
\\&= \exp\left(s - \sum_i p_i - s \log\frac{s}{\sum_i p_i}\right),
\end{align}
$$
where we took $t = \log(s/\sum_i p_i)$. (The first step is Markov's inequality applied to $e^{tS}$; each binomial is treated as a sum of Bernoulli trials, so the index $i$ runs over individual trials and $\sum_i p_i = E[S]$.)
This is basically the standard Chernoff bound for equal probabilities, with $np$ replaced by the sum (or by the average, if you set $s = n s'$).

Here we (surprisingly) used only the inequality $1 + x \le e^x$; a slightly stronger bound may be possible without it, it would just be much messier.

Another way to look at the bound is that we bound each variable by a Poisson distribution with the same mean.
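
As a rough numerical sketch (parameters are illustrative; the exact tail reuses combine.distributions from Michael Kuhn's answer above):

    ns <- c(10, 20, 15)
    ps <- c(0.2, 0.5, 0.8)
    mu <- sum(ns * ps)          # E[S] = sum of the per-trial success probabilities

    s <- 35                     # threshold; the bound is useful for s > E[S]
    chernoff <- exp(s - mu - s * log(s / mu))

    # Exact P(S >= s) by convolving the binomial PMFs (index 1 = count 0).
    pmfs  <- Map(function(n, p) dbinom(0:n, n, p), ns, ps)
    s.pmf <- Reduce(combine.distributions, pmfs)
    exact <- sum(s.pmf[(s + 1):length(s.pmf)])

    c(chernoff = chernoff, exact = exact)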






edited Dec 22 '18 at 0:08 · answered Dec 21 '18 at 19:24 by Thomas Ahle