BigQuery - Cannot use count distinct with scoped aggregation












-1















I have a table as



+--------+------------------+-----------+---------+-------------+ 
|visit_id|browsed_categories | num_seen| num_borrows |
+--------+------------------+-----------+---------+-------------+
|1 | fiction,history | 20 | 3 |
|2 | selfhelp,fiction,science | 15 | 3 |
|3 | cooking,kids,home,selfhelp | 7 | 2 |
+--------+------------------------------+---------+-------------+


and am attempting to summarize this table to find if a correlation exists between the number of distinct browsed categories and borrowals.



+-------------+---------------------------------+-------------------------+
| borrow_rate | num_distinct_browsed_categories | distinct_categories |
+-------------+---------------------------------+-------------------------+
| 0 | 3 | cooking,selfhelp,home |
| 1 | 2 | history,fiction |
+-------------+---------------------------------+-------------------------+


My query is as follows:



select
*,
count(distinct(split(all_cats, ','))) as num_distinct_browsed_categories
from
(
select
(num_borrows/num_seen) as borrow_rate,
count(visit_id) as num_visits,
group_concat(browsed_categories, ',') as all_cats
from [table]
group by borrow_rate
)


The query gives me this error:



Cannot use count distinct with scoped aggregation


How should I modify the query to get the desired output?










share|improve this question



























    -1















    I have a table as



    +--------+------------------+-----------+---------+-------------+ 
    |visit_id|browsed_categories | num_seen| num_borrows |
    +--------+------------------+-----------+---------+-------------+
    |1 | fiction,history | 20 | 3 |
    |2 | selfhelp,fiction,science | 15 | 3 |
    |3 | cooking,kids,home,selfhelp | 7 | 2 |
    +--------+------------------------------+---------+-------------+


    and am attempting to summarize this table to find if a correlation exists between the number of distinct browsed categories and borrowals.



    +-------------+---------------------------------+-------------------------+
    | borrow_rate | num_distinct_browsed_categories | distinct_categories |
    +-------------+---------------------------------+-------------------------+
    | 0 | 3 | cooking,selfhelp,home |
    | 1 | 2 | history,fiction |
    +-------------+---------------------------------+-------------------------+


    My query is as follows:



    select
    *,
    count(distinct(split(all_cats, ','))) as num_distinct_browsed_categories
    from
    (
    select
    (num_borrows/num_seen) as borrow_rate,
    count(visit_id) as num_visits,
    group_concat(browsed_categories, ',') as all_cats
    from [table]
    group by borrow_rate
    )


    The query gives me this error:



    Cannot use count distinct with scoped aggregation


    How should I modify the query to get the desired output?










    share|improve this question

























      -1












      -1








      -1








      I have a table as



      +--------+------------------+-----------+---------+-------------+ 
      |visit_id|browsed_categories | num_seen| num_borrows |
      +--------+------------------+-----------+---------+-------------+
      |1 | fiction,history | 20 | 3 |
      |2 | selfhelp,fiction,science | 15 | 3 |
      |3 | cooking,kids,home,selfhelp | 7 | 2 |
      +--------+------------------------------+---------+-------------+


      and am attempting to summarize this table to find if a correlation exists between the number of distinct browsed categories and borrowals.



      +-------------+---------------------------------+-------------------------+
      | borrow_rate | num_distinct_browsed_categories | distinct_categories |
      +-------------+---------------------------------+-------------------------+
      | 0 | 3 | cooking,selfhelp,home |
      | 1 | 2 | history,fiction |
      +-------------+---------------------------------+-------------------------+


      My query is as follows:



      select
      *,
      count(distinct(split(all_cats, ','))) as num_distinct_browsed_categories
      from
      (
      select
      (num_borrows/num_seen) as borrow_rate,
      count(visit_id) as num_visits,
      group_concat(browsed_categories, ',') as all_cats
      from [table]
      group by borrow_rate
      )


      The query gives me this error:



      Cannot use count distinct with scoped aggregation


      How should I modify the query to get the desired output?










      share|improve this question














      I have a table as



      +--------+------------------+-----------+---------+-------------+ 
      |visit_id|browsed_categories | num_seen| num_borrows |
      +--------+------------------+-----------+---------+-------------+
      |1 | fiction,history | 20 | 3 |
      |2 | selfhelp,fiction,science | 15 | 3 |
      |3 | cooking,kids,home,selfhelp | 7 | 2 |
      +--------+------------------------------+---------+-------------+


      and am attempting to summarize this table to find if a correlation exists between the number of distinct browsed categories and borrowals.



      +-------------+---------------------------------+-------------------------+
      | borrow_rate | num_distinct_browsed_categories | distinct_categories |
      +-------------+---------------------------------+-------------------------+
      | 0 | 3 | cooking,selfhelp,home |
      | 1 | 2 | history,fiction |
      +-------------+---------------------------------+-------------------------+


      My query is as follows:



      select
      *,
      count(distinct(split(all_cats, ','))) as num_distinct_browsed_categories
      from
      (
      select
      (num_borrows/num_seen) as borrow_rate,
      count(visit_id) as num_visits,
      group_concat(browsed_categories, ',') as all_cats
      from [table]
      group by borrow_rate
      )


      The query gives me this error:



      Cannot use count distinct with scoped aggregation


      How should I modify the query to get the desired output?







      google-bigquery






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 25 '18 at 5:53









      trshivtrshiv

      1,9771722




      1,9771722
























          1 Answer
          1






          active

          oldest

          votes


















          2














          Below is version for BigQuery Standard SQL



          #standardSQL
          SELECT
          *,
          (SELECT COUNT(DISTINCT cat) FROM UNNEST(SPLIT(all_cats, ',')) cat) AS num_distinct_browsed_categories
          FROM (
          SELECT
          (num_borrows/num_seen) AS borrow_rate,
          COUNT(visit_id) AS num_visits,
          STRING_AGG(browsed_categories, ',') AS all_cats
          FROM `project.dataset.table`
          GROUP BY borrow_rate
          )


          Btw, if for some reason you still bound to BigQuery Legacy SQL - just replace



          count(distinct(split(all_cats, ',')))    


          with



          exact_count_distinct(split(all_cats, ','))   


          in your original query






          share|improve this answer


























          • Thanks! Using exact_count_distinct() function worked.

            – trshiv
            Nov 28 '18 at 13:21











          • Is there a way to check if the "all_cats" column contains any value from a given list of values?

            – trshiv
            Nov 28 '18 at 13:23











          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53465028%2fbigquery-cannot-use-count-distinct-with-scoped-aggregation%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          2














          Below is version for BigQuery Standard SQL



          #standardSQL
          SELECT
          *,
          (SELECT COUNT(DISTINCT cat) FROM UNNEST(SPLIT(all_cats, ',')) cat) AS num_distinct_browsed_categories
          FROM (
          SELECT
          (num_borrows/num_seen) AS borrow_rate,
          COUNT(visit_id) AS num_visits,
          STRING_AGG(browsed_categories, ',') AS all_cats
          FROM `project.dataset.table`
          GROUP BY borrow_rate
          )


          Btw, if for some reason you still bound to BigQuery Legacy SQL - just replace



          count(distinct(split(all_cats, ',')))    


          with



          exact_count_distinct(split(all_cats, ','))   


          in your original query






          share|improve this answer


























          • Thanks! Using exact_count_distinct() function worked.

            – trshiv
            Nov 28 '18 at 13:21











          • Is there a way to check if the "all_cats" column contains any value from a given list of values?

            – trshiv
            Nov 28 '18 at 13:23
















          2














          Below is version for BigQuery Standard SQL



          #standardSQL
          SELECT
          *,
          (SELECT COUNT(DISTINCT cat) FROM UNNEST(SPLIT(all_cats, ',')) cat) AS num_distinct_browsed_categories
          FROM (
          SELECT
          (num_borrows/num_seen) AS borrow_rate,
          COUNT(visit_id) AS num_visits,
          STRING_AGG(browsed_categories, ',') AS all_cats
          FROM `project.dataset.table`
          GROUP BY borrow_rate
          )


          Btw, if for some reason you still bound to BigQuery Legacy SQL - just replace



          count(distinct(split(all_cats, ',')))    


          with



          exact_count_distinct(split(all_cats, ','))   


          in your original query






          share|improve this answer


























          • Thanks! Using exact_count_distinct() function worked.

            – trshiv
            Nov 28 '18 at 13:21











          • Is there a way to check if the "all_cats" column contains any value from a given list of values?

            – trshiv
            Nov 28 '18 at 13:23














          2












          2








          2







          Below is version for BigQuery Standard SQL



          #standardSQL
          SELECT
          *,
          (SELECT COUNT(DISTINCT cat) FROM UNNEST(SPLIT(all_cats, ',')) cat) AS num_distinct_browsed_categories
          FROM (
          SELECT
          (num_borrows/num_seen) AS borrow_rate,
          COUNT(visit_id) AS num_visits,
          STRING_AGG(browsed_categories, ',') AS all_cats
          FROM `project.dataset.table`
          GROUP BY borrow_rate
          )


          Btw, if for some reason you still bound to BigQuery Legacy SQL - just replace



          count(distinct(split(all_cats, ',')))    


          with



          exact_count_distinct(split(all_cats, ','))   


          in your original query






          share|improve this answer















          Below is version for BigQuery Standard SQL



          #standardSQL
          SELECT
          *,
          (SELECT COUNT(DISTINCT cat) FROM UNNEST(SPLIT(all_cats, ',')) cat) AS num_distinct_browsed_categories
          FROM (
          SELECT
          (num_borrows/num_seen) AS borrow_rate,
          COUNT(visit_id) AS num_visits,
          STRING_AGG(browsed_categories, ',') AS all_cats
          FROM `project.dataset.table`
          GROUP BY borrow_rate
          )


          Btw, if for some reason you still bound to BigQuery Legacy SQL - just replace



          count(distinct(split(all_cats, ',')))    


          with



          exact_count_distinct(split(all_cats, ','))   


          in your original query







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Nov 25 '18 at 7:42

























          answered Nov 25 '18 at 7:17









          Mikhail BerlyantMikhail Berlyant

          60.9k43772




          60.9k43772













          • Thanks! Using exact_count_distinct() function worked.

            – trshiv
            Nov 28 '18 at 13:21











          • Is there a way to check if the "all_cats" column contains any value from a given list of values?

            – trshiv
            Nov 28 '18 at 13:23



















          • Thanks! Using exact_count_distinct() function worked.

            – trshiv
            Nov 28 '18 at 13:21











          • Is there a way to check if the "all_cats" column contains any value from a given list of values?

            – trshiv
            Nov 28 '18 at 13:23

















          Thanks! Using exact_count_distinct() function worked.

          – trshiv
          Nov 28 '18 at 13:21





          Thanks! Using exact_count_distinct() function worked.

          – trshiv
          Nov 28 '18 at 13:21













          Is there a way to check if the "all_cats" column contains any value from a given list of values?

          – trshiv
          Nov 28 '18 at 13:23





          Is there a way to check if the "all_cats" column contains any value from a given list of values?

          – trshiv
          Nov 28 '18 at 13:23




















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53465028%2fbigquery-cannot-use-count-distinct-with-scoped-aggregation%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Wiesbaden

          Marschland

          Dieringhausen