MYSQL LEFT JOIN returns unexpected results





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}







-1















I have two tables talk_comments and talk_comment_votes.



I run the following code to select, commentId, numberOfUpvotes, whetherUserUpvoted, numberOfDownvotes, whetherUserDownvoted usin LEFT JOINs to the same table.



SELECT c.id, COUNT(v1.id) as upvotes, COUNT(v2.id) as userUpvoted, COUNT(v3.id) as downvotes, COUNT(v4.id) as userDownvoted FROM talk_comments c
LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id


I have the following data in my talk_comment_votes table



phpMyAdmin database screenshot



So, according to the query, it should select values 2,2,0,1,1 respectively. When I break those JOIN statements and do the queries, it returns the expected results. But, with JOINs, it returns something like the follows.



phpMyAdmin table results



Can I get some help on fixing this?



Thanks.










share|improve this question




















  • 1





    Not the best for performance but try COUNT(DISTINCT ...) the LEFT JOIN's seams to multiply the count.. Best performing and safest method would be to do the counting in separted subqueries when working with multiple join's

    – Raymond Nijland
    Nov 26 '18 at 20:33













  • @RaymondNijland Wow, that returns the correct result. How is it working? Can you explain a little bit? Please add it as an answer so that I can mark it as the answer.

    – Supun Kavinda
    Nov 26 '18 at 20:37











  • Well the JOIN's make multiple records that will be counted. try the same query without the GROUP BY and COUNT() then you will see what is going on there... Or change in the current query the COUNT into GROUP_CONCAT then you will see "duplicated" records.. Which COUNT(DISTINCT ..) filters out..

    – Raymond Nijland
    Nov 26 '18 at 20:38













  • I see that. Thank you for the answer. If it is not the best performance, what would be the best one?

    – Supun Kavinda
    Nov 26 '18 at 20:40











  • By making subqueries more or less like ... SELECT * FROM (SELECT COUNT(*) FROM table) AS alias INNER JOIN table ON alias.column = table.column or a subquery in a join like ... table INNER JOIN (SELECT COUNT(*) FROM table2) AS alias ON table.column = alias.column

    – Raymond Nijland
    Nov 26 '18 at 20:43


















-1















I have two tables talk_comments and talk_comment_votes.



I run the following code to select, commentId, numberOfUpvotes, whetherUserUpvoted, numberOfDownvotes, whetherUserDownvoted usin LEFT JOINs to the same table.



SELECT c.id, COUNT(v1.id) as upvotes, COUNT(v2.id) as userUpvoted, COUNT(v3.id) as downvotes, COUNT(v4.id) as userDownvoted FROM talk_comments c
LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id


I have the following data in my talk_comment_votes table



phpMyAdmin database screenshot



So, according to the query, it should select values 2,2,0,1,1 respectively. When I break those JOIN statements and do the queries, it returns the expected results. But, with JOINs, it returns something like the follows.



phpMyAdmin table results



Can I get some help on fixing this?



Thanks.










share|improve this question




















  • 1





    Not the best for performance but try COUNT(DISTINCT ...) the LEFT JOIN's seams to multiply the count.. Best performing and safest method would be to do the counting in separted subqueries when working with multiple join's

    – Raymond Nijland
    Nov 26 '18 at 20:33













  • @RaymondNijland Wow, that returns the correct result. How is it working? Can you explain a little bit? Please add it as an answer so that I can mark it as the answer.

    – Supun Kavinda
    Nov 26 '18 at 20:37











  • Well the JOIN's make multiple records that will be counted. try the same query without the GROUP BY and COUNT() then you will see what is going on there... Or change in the current query the COUNT into GROUP_CONCAT then you will see "duplicated" records.. Which COUNT(DISTINCT ..) filters out..

    – Raymond Nijland
    Nov 26 '18 at 20:38













  • I see that. Thank you for the answer. If it is not the best performance, what would be the best one?

    – Supun Kavinda
    Nov 26 '18 at 20:40











  • By making subqueries more or less like ... SELECT * FROM (SELECT COUNT(*) FROM table) AS alias INNER JOIN table ON alias.column = table.column or a subquery in a join like ... table INNER JOIN (SELECT COUNT(*) FROM table2) AS alias ON table.column = alias.column

    – Raymond Nijland
    Nov 26 '18 at 20:43














-1












-1








-1








I have two tables talk_comments and talk_comment_votes.



I run the following code to select, commentId, numberOfUpvotes, whetherUserUpvoted, numberOfDownvotes, whetherUserDownvoted usin LEFT JOINs to the same table.



SELECT c.id, COUNT(v1.id) as upvotes, COUNT(v2.id) as userUpvoted, COUNT(v3.id) as downvotes, COUNT(v4.id) as userDownvoted FROM talk_comments c
LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id


I have the following data in my talk_comment_votes table



phpMyAdmin database screenshot



So, according to the query, it should select values 2,2,0,1,1 respectively. When I break those JOIN statements and do the queries, it returns the expected results. But, with JOINs, it returns something like the follows.



phpMyAdmin table results



Can I get some help on fixing this?



Thanks.










share|improve this question
















I have two tables talk_comments and talk_comment_votes.



I run the following code to select, commentId, numberOfUpvotes, whetherUserUpvoted, numberOfDownvotes, whetherUserDownvoted usin LEFT JOINs to the same table.



SELECT c.id, COUNT(v1.id) as upvotes, COUNT(v2.id) as userUpvoted, COUNT(v3.id) as downvotes, COUNT(v4.id) as userDownvoted FROM talk_comments c
LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id


I have the following data in my talk_comment_votes table



phpMyAdmin database screenshot



So, according to the query, it should select values 2,2,0,1,1 respectively. When I break those JOIN statements and do the queries, it returns the expected results. But, with JOINs, it returns something like the follows.



phpMyAdmin table results



Can I get some help on fixing this?



Thanks.







mysql






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 26 '18 at 20:34







Supun Kavinda

















asked Nov 26 '18 at 20:31









Supun KavindaSupun Kavinda

6217




6217








  • 1





    Not the best for performance but try COUNT(DISTINCT ...) the LEFT JOIN's seams to multiply the count.. Best performing and safest method would be to do the counting in separted subqueries when working with multiple join's

    – Raymond Nijland
    Nov 26 '18 at 20:33













  • @RaymondNijland Wow, that returns the correct result. How is it working? Can you explain a little bit? Please add it as an answer so that I can mark it as the answer.

    – Supun Kavinda
    Nov 26 '18 at 20:37











  • Well the JOIN's make multiple records that will be counted. try the same query without the GROUP BY and COUNT() then you will see what is going on there... Or change in the current query the COUNT into GROUP_CONCAT then you will see "duplicated" records.. Which COUNT(DISTINCT ..) filters out..

    – Raymond Nijland
    Nov 26 '18 at 20:38













  • I see that. Thank you for the answer. If it is not the best performance, what would be the best one?

    – Supun Kavinda
    Nov 26 '18 at 20:40











  • By making subqueries more or less like ... SELECT * FROM (SELECT COUNT(*) FROM table) AS alias INNER JOIN table ON alias.column = table.column or a subquery in a join like ... table INNER JOIN (SELECT COUNT(*) FROM table2) AS alias ON table.column = alias.column

    – Raymond Nijland
    Nov 26 '18 at 20:43














  • 1





    Not the best for performance but try COUNT(DISTINCT ...) the LEFT JOIN's seams to multiply the count.. Best performing and safest method would be to do the counting in separted subqueries when working with multiple join's

    – Raymond Nijland
    Nov 26 '18 at 20:33













  • @RaymondNijland Wow, that returns the correct result. How is it working? Can you explain a little bit? Please add it as an answer so that I can mark it as the answer.

    – Supun Kavinda
    Nov 26 '18 at 20:37











  • Well the JOIN's make multiple records that will be counted. try the same query without the GROUP BY and COUNT() then you will see what is going on there... Or change in the current query the COUNT into GROUP_CONCAT then you will see "duplicated" records.. Which COUNT(DISTINCT ..) filters out..

    – Raymond Nijland
    Nov 26 '18 at 20:38













  • I see that. Thank you for the answer. If it is not the best performance, what would be the best one?

    – Supun Kavinda
    Nov 26 '18 at 20:40











  • By making subqueries more or less like ... SELECT * FROM (SELECT COUNT(*) FROM table) AS alias INNER JOIN table ON alias.column = table.column or a subquery in a join like ... table INNER JOIN (SELECT COUNT(*) FROM table2) AS alias ON table.column = alias.column

    – Raymond Nijland
    Nov 26 '18 at 20:43








1




1





Not the best for performance but try COUNT(DISTINCT ...) the LEFT JOIN's seams to multiply the count.. Best performing and safest method would be to do the counting in separted subqueries when working with multiple join's

– Raymond Nijland
Nov 26 '18 at 20:33







Not the best for performance but try COUNT(DISTINCT ...) the LEFT JOIN's seams to multiply the count.. Best performing and safest method would be to do the counting in separted subqueries when working with multiple join's

– Raymond Nijland
Nov 26 '18 at 20:33















@RaymondNijland Wow, that returns the correct result. How is it working? Can you explain a little bit? Please add it as an answer so that I can mark it as the answer.

– Supun Kavinda
Nov 26 '18 at 20:37





@RaymondNijland Wow, that returns the correct result. How is it working? Can you explain a little bit? Please add it as an answer so that I can mark it as the answer.

– Supun Kavinda
Nov 26 '18 at 20:37













Well the JOIN's make multiple records that will be counted. try the same query without the GROUP BY and COUNT() then you will see what is going on there... Or change in the current query the COUNT into GROUP_CONCAT then you will see "duplicated" records.. Which COUNT(DISTINCT ..) filters out..

– Raymond Nijland
Nov 26 '18 at 20:38







Well the JOIN's make multiple records that will be counted. try the same query without the GROUP BY and COUNT() then you will see what is going on there... Or change in the current query the COUNT into GROUP_CONCAT then you will see "duplicated" records.. Which COUNT(DISTINCT ..) filters out..

– Raymond Nijland
Nov 26 '18 at 20:38















I see that. Thank you for the answer. If it is not the best performance, what would be the best one?

– Supun Kavinda
Nov 26 '18 at 20:40





I see that. Thank you for the answer. If it is not the best performance, what would be the best one?

– Supun Kavinda
Nov 26 '18 at 20:40













By making subqueries more or less like ... SELECT * FROM (SELECT COUNT(*) FROM table) AS alias INNER JOIN table ON alias.column = table.column or a subquery in a join like ... table INNER JOIN (SELECT COUNT(*) FROM table2) AS alias ON table.column = alias.column

– Raymond Nijland
Nov 26 '18 at 20:43





By making subqueries more or less like ... SELECT * FROM (SELECT COUNT(*) FROM table) AS alias INNER JOIN table ON alias.column = table.column or a subquery in a join like ... table INNER JOIN (SELECT COUNT(*) FROM table2) AS alias ON table.column = alias.column

– Raymond Nijland
Nov 26 '18 at 20:43












2 Answers
2






active

oldest

votes


















0














I ran a benchmark on queries based on @spencer7593 and @RaymondNijland's 2 answers.



LEFT JOINS wins!



1. Using LEFT JOINS



SELECT c.id, COUNT(DISTINCT v1.id) as upvotes, COUNT(DISTINCT v2.id) as userUpvoted, COUNT(DISTINCT v3.id) as downvotes, COUNT(DISTINCT v4.id) as userDownvoted FROM talk_comments c
LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
WHERE c.id = 2 GROUP BY c.id


Time for 1000 queries: 0.55000805854797s





2. Using Sub Queries



SELECT c.id,c.user_id, c.time,c.body, c.reply_to, 
(SELECT COUNT(v1.id) FROM talk_comment_votes v1 WHERE v1.comment_id = c.id AND v1.status = 1 LIMIT 1) as upvotes,
(SELECT COUNT(v2.id) FROM talk_comment_votes v2 WHERE v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 LIMIT 1) as clientUpvoted,
(SELECT COUNT(v3.id) FROM talk_comment_votes v3 WHERE v3.comment_id = c.id AND v3.status = 2 LIMIT 1) as downvotes,
(SELECT COUNT(v4.id) FROM talk_comment_votes v4 WHERE v4.comment_id = c.id AND v4.status = 2 AND v4.user_id = 1 LIMIT 1) as clientDownvoted
FROM talk_comments c
WHERE c.id = 2 GROUP BY c.id


Time for 1000 queries: 0.95499300956726s



3. Using SUM, IF



SELECT c.id
, SUM(IF(v.status = 1 ,1,0)) AS upvotes
, SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
, SUM(IF(v.status = 2 ,1,0)) AS downvotes
, SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
FROM talk_comments c
LEFT
JOIN talk_comment_votes v
ON v.comment_id = c.id
WHERE c.id = 2
GROUP BY c.id


Time for 1000 queries: 1.2266919612885s



Thank you for all the answers.






share|improve this answer































    0














    I'd use conditional aggregation. A join to a single reference to tall_comment_votes, and then check conditions in expressions.



    SELECT c.id
    , SUM(IF(v.status = 1 ,1,0)) AS upvotes
    , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
    , SUM(IF(v.status = 2 ,1,0)) AS downvotes
    , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
    FROM talk_comments c
    LEFT
    JOIN talk_comment_votes v
    ON v.comment_id = c.id
    WHERE c.id = 2
    GROUP
    BY c.id




    This avoids the problem of the partial cross product, when there are multiple rows returned from v1, v2, v3 and v4.



    The MySQL IF() expression could replaced with a more ANSI standards compliant CASE expression, e.g.



        , SUM(CASE WHEN v.status = 1 THEN 1 ELSE 0 END)  AS upvotes




    FOLLOWUP



    setup test case and observe execution plans and performance



    populate tables



    CREATE TABLE talk_comments (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT);
    CREATE TABLE talk_comment_votes (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, comment_id INT UNSIGNED NOT NULL, user_id INT UNSIGNED NOT NULL, is_anonymous TINYINT(1) UNSIGNED NOT NULL, STATUS TINYINT UNSIGNED, time_ INT UNSIGNED);
    CREATE INDEX talk_comment_votes_IX1 ON talk_comment_votes (comment_id, STATUS, user_id, is_anonymous) ;
    INSERT INTO talk_comments (id) VALUES (1),(2),(3);
    INSERT INTO talk_comment_votes (id, comment_id, user_id, is_anonymous, STATUS, time_) VALUES (1,2,2,0,1,0),(2,1,1,0,1,0),(3,2,1,0,2,NULL),(4,7,1,0,2,NULL),(5,1,14,1,1,NULL),(6,2,14,1,1,NULL);


    query execution plans



    EXPLAIN
    SELECT c.id, COUNT(DISTINCT v1.id) AS upvotes, COUNT(DISTINCT v2.id) AS userUpvoted, COUNT(DISTINCT v3.id) AS downvotes, COUNT(DISTINCT v4.id) AS userDownvoted FROM talk_comments c
    LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
    LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
    LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
    LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
    WHERE c.id = 2 GROUP BY c.id
    ;

    EXPLAIN
    SELECT c.id
    , SUM(IF(v.status = 1 ,1,0)) AS upvotes
    , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
    , SUM(IF(v.status = 2 ,1,0)) AS downvotes
    , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
    FROM talk_comments c
    LEFT
    JOIN talk_comment_votes v
    ON v.comment_id = c.id
    WHERE c.id = 2
    GROUP BY c.id
    ;


    output from explain



    --     id  select_type  table   type    possible_keys           key                     key_len  ref                        rows  Extra        
    -- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ----------------------- ------ -------------
    -- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
    -- 1 SIMPLE v1 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 2 Using index
    -- 1 SIMPLE v2 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index
    -- 1 SIMPLE v3 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 1 Using index
    -- 1 SIMPLE v4 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index



    -- id select_type table type possible_keys key key_len ref rows Extra
    -- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ------ ------ -------------
    -- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
    -- 1 SIMPLE v ref talk_comment_votes_IX1 talk_comment_votes_IX1 4 const 3 Using index


    measured performance:



    100 executions                        round 1      round 2     round 3
    ------------------------------------ ---------- ---------- ---------
    multiple left join, count(distinct 0.123 secs 0.130 secs 0.125 secs
    conditional aggregation sum(if 0.113 secs 0.114 secs 0.111 secs





    share|improve this answer


























    • Thank you very much! It works like a charm.

      – Supun Kavinda
      Nov 28 '18 at 2:19











    • But, according to the performance test, still LEFT JOIN wins

      – Supun Kavinda
      Nov 28 '18 at 2:22











    • I'm a little surprised that the conditional aggregation would be slower; we could include criteria in the ON clause AND v.status IN (1,2), but I wouldn't expect that to make much of a difference, unless there's a slew of rows with status values other than 1 or 2. Of course with large sets, performance will horrendous if appropriate indexes are not defined.

      – spencer7593
      Nov 28 '18 at 19:41













    • A test case shows performance of conditional aggregation is 10% faster.

      – spencer7593
      Nov 28 '18 at 20:10












    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53488595%2fmysql-left-join-returns-unexpected-results%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    0














    I ran a benchmark on queries based on @spencer7593 and @RaymondNijland's 2 answers.



    LEFT JOINS wins!



    1. Using LEFT JOINS



    SELECT c.id, COUNT(DISTINCT v1.id) as upvotes, COUNT(DISTINCT v2.id) as userUpvoted, COUNT(DISTINCT v3.id) as downvotes, COUNT(DISTINCT v4.id) as userDownvoted FROM talk_comments c
    LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
    LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
    LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
    LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
    WHERE c.id = 2 GROUP BY c.id


    Time for 1000 queries: 0.55000805854797s





    2. Using Sub Queries



    SELECT c.id,c.user_id, c.time,c.body, c.reply_to, 
    (SELECT COUNT(v1.id) FROM talk_comment_votes v1 WHERE v1.comment_id = c.id AND v1.status = 1 LIMIT 1) as upvotes,
    (SELECT COUNT(v2.id) FROM talk_comment_votes v2 WHERE v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 LIMIT 1) as clientUpvoted,
    (SELECT COUNT(v3.id) FROM talk_comment_votes v3 WHERE v3.comment_id = c.id AND v3.status = 2 LIMIT 1) as downvotes,
    (SELECT COUNT(v4.id) FROM talk_comment_votes v4 WHERE v4.comment_id = c.id AND v4.status = 2 AND v4.user_id = 1 LIMIT 1) as clientDownvoted
    FROM talk_comments c
    WHERE c.id = 2 GROUP BY c.id


    Time for 1000 queries: 0.95499300956726s



    3. Using SUM, IF



    SELECT c.id
    , SUM(IF(v.status = 1 ,1,0)) AS upvotes
    , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
    , SUM(IF(v.status = 2 ,1,0)) AS downvotes
    , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
    FROM talk_comments c
    LEFT
    JOIN talk_comment_votes v
    ON v.comment_id = c.id
    WHERE c.id = 2
    GROUP BY c.id


    Time for 1000 queries: 1.2266919612885s



    Thank you for all the answers.






    share|improve this answer




























      0














      I ran a benchmark on queries based on @spencer7593 and @RaymondNijland's 2 answers.



      LEFT JOINS wins!



      1. Using LEFT JOINS



      SELECT c.id, COUNT(DISTINCT v1.id) as upvotes, COUNT(DISTINCT v2.id) as userUpvoted, COUNT(DISTINCT v3.id) as downvotes, COUNT(DISTINCT v4.id) as userDownvoted FROM talk_comments c
      LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
      LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
      LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
      LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
      WHERE c.id = 2 GROUP BY c.id


      Time for 1000 queries: 0.55000805854797s





      2. Using Sub Queries



      SELECT c.id,c.user_id, c.time,c.body, c.reply_to, 
      (SELECT COUNT(v1.id) FROM talk_comment_votes v1 WHERE v1.comment_id = c.id AND v1.status = 1 LIMIT 1) as upvotes,
      (SELECT COUNT(v2.id) FROM talk_comment_votes v2 WHERE v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 LIMIT 1) as clientUpvoted,
      (SELECT COUNT(v3.id) FROM talk_comment_votes v3 WHERE v3.comment_id = c.id AND v3.status = 2 LIMIT 1) as downvotes,
      (SELECT COUNT(v4.id) FROM talk_comment_votes v4 WHERE v4.comment_id = c.id AND v4.status = 2 AND v4.user_id = 1 LIMIT 1) as clientDownvoted
      FROM talk_comments c
      WHERE c.id = 2 GROUP BY c.id


      Time for 1000 queries: 0.95499300956726s



      3. Using SUM, IF



      SELECT c.id
      , SUM(IF(v.status = 1 ,1,0)) AS upvotes
      , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
      , SUM(IF(v.status = 2 ,1,0)) AS downvotes
      , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
      FROM talk_comments c
      LEFT
      JOIN talk_comment_votes v
      ON v.comment_id = c.id
      WHERE c.id = 2
      GROUP BY c.id


      Time for 1000 queries: 1.2266919612885s



      Thank you for all the answers.






      share|improve this answer


























        0












        0








        0







        I ran a benchmark on queries based on @spencer7593 and @RaymondNijland's 2 answers.



        LEFT JOINS wins!



        1. Using LEFT JOINS



        SELECT c.id, COUNT(DISTINCT v1.id) as upvotes, COUNT(DISTINCT v2.id) as userUpvoted, COUNT(DISTINCT v3.id) as downvotes, COUNT(DISTINCT v4.id) as userDownvoted FROM talk_comments c
        LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
        LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
        LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
        LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
        WHERE c.id = 2 GROUP BY c.id


        Time for 1000 queries: 0.55000805854797s





        2. Using Sub Queries



        SELECT c.id,c.user_id, c.time,c.body, c.reply_to, 
        (SELECT COUNT(v1.id) FROM talk_comment_votes v1 WHERE v1.comment_id = c.id AND v1.status = 1 LIMIT 1) as upvotes,
        (SELECT COUNT(v2.id) FROM talk_comment_votes v2 WHERE v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 LIMIT 1) as clientUpvoted,
        (SELECT COUNT(v3.id) FROM talk_comment_votes v3 WHERE v3.comment_id = c.id AND v3.status = 2 LIMIT 1) as downvotes,
        (SELECT COUNT(v4.id) FROM talk_comment_votes v4 WHERE v4.comment_id = c.id AND v4.status = 2 AND v4.user_id = 1 LIMIT 1) as clientDownvoted
        FROM talk_comments c
        WHERE c.id = 2 GROUP BY c.id


        Time for 1000 queries: 0.95499300956726s



        3. Using SUM, IF



        SELECT c.id
        , SUM(IF(v.status = 1 ,1,0)) AS upvotes
        , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
        , SUM(IF(v.status = 2 ,1,0)) AS downvotes
        , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
        FROM talk_comments c
        LEFT
        JOIN talk_comment_votes v
        ON v.comment_id = c.id
        WHERE c.id = 2
        GROUP BY c.id


        Time for 1000 queries: 1.2266919612885s



        Thank you for all the answers.






        share|improve this answer













        I ran a benchmark on queries based on @spencer7593 and @RaymondNijland's 2 answers.



        LEFT JOINS wins!



        1. Using LEFT JOINS



        SELECT c.id, COUNT(DISTINCT v1.id) as upvotes, COUNT(DISTINCT v2.id) as userUpvoted, COUNT(DISTINCT v3.id) as downvotes, COUNT(DISTINCT v4.id) as userDownvoted FROM talk_comments c
        LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
        LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
        LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
        LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
        WHERE c.id = 2 GROUP BY c.id


        Time for 1000 queries: 0.55000805854797s





        2. Using Sub Queries



        SELECT c.id,c.user_id, c.time,c.body, c.reply_to, 
        (SELECT COUNT(v1.id) FROM talk_comment_votes v1 WHERE v1.comment_id = c.id AND v1.status = 1 LIMIT 1) as upvotes,
        (SELECT COUNT(v2.id) FROM talk_comment_votes v2 WHERE v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 LIMIT 1) as clientUpvoted,
        (SELECT COUNT(v3.id) FROM talk_comment_votes v3 WHERE v3.comment_id = c.id AND v3.status = 2 LIMIT 1) as downvotes,
        (SELECT COUNT(v4.id) FROM talk_comment_votes v4 WHERE v4.comment_id = c.id AND v4.status = 2 AND v4.user_id = 1 LIMIT 1) as clientDownvoted
        FROM talk_comments c
        WHERE c.id = 2 GROUP BY c.id


        Time for 1000 queries: 0.95499300956726s



        3. Using SUM, IF



        SELECT c.id
        , SUM(IF(v.status = 1 ,1,0)) AS upvotes
        , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
        , SUM(IF(v.status = 2 ,1,0)) AS downvotes
        , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
        FROM talk_comments c
        LEFT
        JOIN talk_comment_votes v
        ON v.comment_id = c.id
        WHERE c.id = 2
        GROUP BY c.id


        Time for 1000 queries: 1.2266919612885s



        Thank you for all the answers.







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 28 '18 at 2:31









        Supun KavindaSupun Kavinda

        6217




        6217

























            0














            I'd use conditional aggregation. A join to a single reference to tall_comment_votes, and then check conditions in expressions.



            SELECT c.id
            , SUM(IF(v.status = 1 ,1,0)) AS upvotes
            , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
            , SUM(IF(v.status = 2 ,1,0)) AS downvotes
            , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
            FROM talk_comments c
            LEFT
            JOIN talk_comment_votes v
            ON v.comment_id = c.id
            WHERE c.id = 2
            GROUP
            BY c.id




            This avoids the problem of the partial cross product, when there are multiple rows returned from v1, v2, v3 and v4.



            The MySQL IF() expression could replaced with a more ANSI standards compliant CASE expression, e.g.



                , SUM(CASE WHEN v.status = 1 THEN 1 ELSE 0 END)  AS upvotes




            FOLLOWUP



            setup test case and observe execution plans and performance



            populate tables



            CREATE TABLE talk_comments (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT);
            CREATE TABLE talk_comment_votes (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, comment_id INT UNSIGNED NOT NULL, user_id INT UNSIGNED NOT NULL, is_anonymous TINYINT(1) UNSIGNED NOT NULL, STATUS TINYINT UNSIGNED, time_ INT UNSIGNED);
            CREATE INDEX talk_comment_votes_IX1 ON talk_comment_votes (comment_id, STATUS, user_id, is_anonymous) ;
            INSERT INTO talk_comments (id) VALUES (1),(2),(3);
            INSERT INTO talk_comment_votes (id, comment_id, user_id, is_anonymous, STATUS, time_) VALUES (1,2,2,0,1,0),(2,1,1,0,1,0),(3,2,1,0,2,NULL),(4,7,1,0,2,NULL),(5,1,14,1,1,NULL),(6,2,14,1,1,NULL);


            query execution plans



            EXPLAIN
            SELECT c.id, COUNT(DISTINCT v1.id) AS upvotes, COUNT(DISTINCT v2.id) AS userUpvoted, COUNT(DISTINCT v3.id) AS downvotes, COUNT(DISTINCT v4.id) AS userDownvoted FROM talk_comments c
            LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
            LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
            LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
            LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
            WHERE c.id = 2 GROUP BY c.id
            ;

            EXPLAIN
            SELECT c.id
            , SUM(IF(v.status = 1 ,1,0)) AS upvotes
            , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
            , SUM(IF(v.status = 2 ,1,0)) AS downvotes
            , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
            FROM talk_comments c
            LEFT
            JOIN talk_comment_votes v
            ON v.comment_id = c.id
            WHERE c.id = 2
            GROUP BY c.id
            ;


            output from explain



            --     id  select_type  table   type    possible_keys           key                     key_len  ref                        rows  Extra        
            -- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ----------------------- ------ -------------
            -- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
            -- 1 SIMPLE v1 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 2 Using index
            -- 1 SIMPLE v2 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index
            -- 1 SIMPLE v3 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 1 Using index
            -- 1 SIMPLE v4 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index



            -- id select_type table type possible_keys key key_len ref rows Extra
            -- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ------ ------ -------------
            -- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
            -- 1 SIMPLE v ref talk_comment_votes_IX1 talk_comment_votes_IX1 4 const 3 Using index


            measured performance:



            100 executions                        round 1      round 2     round 3
            ------------------------------------ ---------- ---------- ---------
            multiple left join, count(distinct 0.123 secs 0.130 secs 0.125 secs
            conditional aggregation sum(if 0.113 secs 0.114 secs 0.111 secs





            share|improve this answer


























            • Thank you very much! It works like a charm.

              – Supun Kavinda
              Nov 28 '18 at 2:19











            • But, according to the performance test, still LEFT JOIN wins

              – Supun Kavinda
              Nov 28 '18 at 2:22











            • I'm a little surprised that the conditional aggregation would be slower; we could include criteria in the ON clause AND v.status IN (1,2), but I wouldn't expect that to make much of a difference, unless there's a slew of rows with status values other than 1 or 2. Of course with large sets, performance will horrendous if appropriate indexes are not defined.

              – spencer7593
              Nov 28 '18 at 19:41













            • A test case shows performance of conditional aggregation is 10% faster.

              – spencer7593
              Nov 28 '18 at 20:10
















            0














            I'd use conditional aggregation. A join to a single reference to tall_comment_votes, and then check conditions in expressions.



            SELECT c.id
            , SUM(IF(v.status = 1 ,1,0)) AS upvotes
            , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
            , SUM(IF(v.status = 2 ,1,0)) AS downvotes
            , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
            FROM talk_comments c
            LEFT
            JOIN talk_comment_votes v
            ON v.comment_id = c.id
            WHERE c.id = 2
            GROUP
            BY c.id




            This avoids the problem of the partial cross product, when there are multiple rows returned from v1, v2, v3 and v4.



            The MySQL IF() expression could replaced with a more ANSI standards compliant CASE expression, e.g.



                , SUM(CASE WHEN v.status = 1 THEN 1 ELSE 0 END)  AS upvotes




            FOLLOWUP



            setup test case and observe execution plans and performance



            populate tables



            CREATE TABLE talk_comments (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT);
            CREATE TABLE talk_comment_votes (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, comment_id INT UNSIGNED NOT NULL, user_id INT UNSIGNED NOT NULL, is_anonymous TINYINT(1) UNSIGNED NOT NULL, STATUS TINYINT UNSIGNED, time_ INT UNSIGNED);
            CREATE INDEX talk_comment_votes_IX1 ON talk_comment_votes (comment_id, STATUS, user_id, is_anonymous) ;
            INSERT INTO talk_comments (id) VALUES (1),(2),(3);
            INSERT INTO talk_comment_votes (id, comment_id, user_id, is_anonymous, STATUS, time_) VALUES (1,2,2,0,1,0),(2,1,1,0,1,0),(3,2,1,0,2,NULL),(4,7,1,0,2,NULL),(5,1,14,1,1,NULL),(6,2,14,1,1,NULL);


            query execution plans



            EXPLAIN
            SELECT c.id, COUNT(DISTINCT v1.id) AS upvotes, COUNT(DISTINCT v2.id) AS userUpvoted, COUNT(DISTINCT v3.id) AS downvotes, COUNT(DISTINCT v4.id) AS userDownvoted FROM talk_comments c
            LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
            LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
            LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
            LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
            WHERE c.id = 2 GROUP BY c.id
            ;

            EXPLAIN
            SELECT c.id
            , SUM(IF(v.status = 1 ,1,0)) AS upvotes
            , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
            , SUM(IF(v.status = 2 ,1,0)) AS downvotes
            , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
            FROM talk_comments c
            LEFT
            JOIN talk_comment_votes v
            ON v.comment_id = c.id
            WHERE c.id = 2
            GROUP BY c.id
            ;


            output from explain



            --     id  select_type  table   type    possible_keys           key                     key_len  ref                        rows  Extra        
            -- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ----------------------- ------ -------------
            -- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
            -- 1 SIMPLE v1 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 2 Using index
            -- 1 SIMPLE v2 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index
            -- 1 SIMPLE v3 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 1 Using index
            -- 1 SIMPLE v4 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index



            -- id select_type table type possible_keys key key_len ref rows Extra
            -- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ------ ------ -------------
            -- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
            -- 1 SIMPLE v ref talk_comment_votes_IX1 talk_comment_votes_IX1 4 const 3 Using index


            measured performance:



            100 executions                        round 1      round 2     round 3
            ------------------------------------ ---------- ---------- ---------
            multiple left join, count(distinct 0.123 secs 0.130 secs 0.125 secs
            conditional aggregation sum(if 0.113 secs 0.114 secs 0.111 secs





            share|improve this answer


























            • Thank you very much! It works like a charm.

              – Supun Kavinda
              Nov 28 '18 at 2:19











            • But, according to the performance test, still LEFT JOIN wins

              – Supun Kavinda
              Nov 28 '18 at 2:22











            • I'm a little surprised that the conditional aggregation would be slower; we could include criteria in the ON clause AND v.status IN (1,2), but I wouldn't expect that to make much of a difference, unless there's a slew of rows with status values other than 1 or 2. Of course with large sets, performance will horrendous if appropriate indexes are not defined.

              – spencer7593
              Nov 28 '18 at 19:41













            • A test case shows performance of conditional aggregation is 10% faster.

              – spencer7593
              Nov 28 '18 at 20:10














            0












            0








            0







            I'd use conditional aggregation. A join to a single reference to tall_comment_votes, and then check conditions in expressions.



            SELECT c.id
            , SUM(IF(v.status = 1 ,1,0)) AS upvotes
            , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
            , SUM(IF(v.status = 2 ,1,0)) AS downvotes
            , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
            FROM talk_comments c
            LEFT
            JOIN talk_comment_votes v
            ON v.comment_id = c.id
            WHERE c.id = 2
            GROUP
            BY c.id




            This avoids the problem of the partial cross product, when there are multiple rows returned from v1, v2, v3 and v4.



            The MySQL IF() expression could replaced with a more ANSI standards compliant CASE expression, e.g.



                , SUM(CASE WHEN v.status = 1 THEN 1 ELSE 0 END)  AS upvotes




            FOLLOWUP



            setup test case and observe execution plans and performance



            populate tables



            CREATE TABLE talk_comments (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT);
            CREATE TABLE talk_comment_votes (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, comment_id INT UNSIGNED NOT NULL, user_id INT UNSIGNED NOT NULL, is_anonymous TINYINT(1) UNSIGNED NOT NULL, STATUS TINYINT UNSIGNED, time_ INT UNSIGNED);
            CREATE INDEX talk_comment_votes_IX1 ON talk_comment_votes (comment_id, STATUS, user_id, is_anonymous) ;
            INSERT INTO talk_comments (id) VALUES (1),(2),(3);
            INSERT INTO talk_comment_votes (id, comment_id, user_id, is_anonymous, STATUS, time_) VALUES (1,2,2,0,1,0),(2,1,1,0,1,0),(3,2,1,0,2,NULL),(4,7,1,0,2,NULL),(5,1,14,1,1,NULL),(6,2,14,1,1,NULL);


            query execution plans



            EXPLAIN
            SELECT c.id, COUNT(DISTINCT v1.id) AS upvotes, COUNT(DISTINCT v2.id) AS userUpvoted, COUNT(DISTINCT v3.id) AS downvotes, COUNT(DISTINCT v4.id) AS userDownvoted FROM talk_comments c
            LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
            LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
            LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
            LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
            WHERE c.id = 2 GROUP BY c.id
            ;

            EXPLAIN
            SELECT c.id
            , SUM(IF(v.status = 1 ,1,0)) AS upvotes
            , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
            , SUM(IF(v.status = 2 ,1,0)) AS downvotes
            , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
            FROM talk_comments c
            LEFT
            JOIN talk_comment_votes v
            ON v.comment_id = c.id
            WHERE c.id = 2
            GROUP BY c.id
            ;


            output from explain



            --     id  select_type  table   type    possible_keys           key                     key_len  ref                        rows  Extra        
            -- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ----------------------- ------ -------------
            -- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
            -- 1 SIMPLE v1 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 2 Using index
            -- 1 SIMPLE v2 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index
            -- 1 SIMPLE v3 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 1 Using index
            -- 1 SIMPLE v4 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index



            -- id select_type table type possible_keys key key_len ref rows Extra
            -- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ------ ------ -------------
            -- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
            -- 1 SIMPLE v ref talk_comment_votes_IX1 talk_comment_votes_IX1 4 const 3 Using index


            measured performance:



            100 executions                        round 1      round 2     round 3
            ------------------------------------ ---------- ---------- ---------
            multiple left join, count(distinct 0.123 secs 0.130 secs 0.125 secs
            conditional aggregation sum(if 0.113 secs 0.114 secs 0.111 secs





            share|improve this answer















            I'd use conditional aggregation. A join to a single reference to tall_comment_votes, and then check conditions in expressions.



            SELECT c.id
            , SUM(IF(v.status = 1 ,1,0)) AS upvotes
            , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
            , SUM(IF(v.status = 2 ,1,0)) AS downvotes
            , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
            FROM talk_comments c
            LEFT
            JOIN talk_comment_votes v
            ON v.comment_id = c.id
            WHERE c.id = 2
            GROUP
            BY c.id




            This avoids the problem of the partial cross product, when there are multiple rows returned from v1, v2, v3 and v4.



            The MySQL IF() expression could replaced with a more ANSI standards compliant CASE expression, e.g.



                , SUM(CASE WHEN v.status = 1 THEN 1 ELSE 0 END)  AS upvotes




            FOLLOWUP



            setup test case and observe execution plans and performance



            populate tables



            CREATE TABLE talk_comments (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT);
            CREATE TABLE talk_comment_votes (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, comment_id INT UNSIGNED NOT NULL, user_id INT UNSIGNED NOT NULL, is_anonymous TINYINT(1) UNSIGNED NOT NULL, STATUS TINYINT UNSIGNED, time_ INT UNSIGNED);
            CREATE INDEX talk_comment_votes_IX1 ON talk_comment_votes (comment_id, STATUS, user_id, is_anonymous) ;
            INSERT INTO talk_comments (id) VALUES (1),(2),(3);
            INSERT INTO talk_comment_votes (id, comment_id, user_id, is_anonymous, STATUS, time_) VALUES (1,2,2,0,1,0),(2,1,1,0,1,0),(3,2,1,0,2,NULL),(4,7,1,0,2,NULL),(5,1,14,1,1,NULL),(6,2,14,1,1,NULL);


            query execution plans



            EXPLAIN
            SELECT c.id, COUNT(DISTINCT v1.id) AS upvotes, COUNT(DISTINCT v2.id) AS userUpvoted, COUNT(DISTINCT v3.id) AS downvotes, COUNT(DISTINCT v4.id) AS userDownvoted FROM talk_comments c
            LEFT JOIN talk_comment_votes v1 ON v1.comment_id = c.id AND v1.status = 1
            LEFT JOIN talk_comment_votes v2 ON v2.comment_id = c.id AND v2.status = 1 AND v2.user_id = 1 AND v2.is_anonymous = 0
            LEFT JOIN talk_comment_votes v3 ON c.id = v3.comment_id AND v3.status = 2
            LEFT JOIN talk_comment_votes v4 ON c.id = v4.comment_id AND v4.status = 2 AND v4.user_id = 1 AND v4.is_anonymous = 0
            WHERE c.id = 2 GROUP BY c.id
            ;

            EXPLAIN
            SELECT c.id
            , SUM(IF(v.status = 1 ,1,0)) AS upvotes
            , SUM(IF(v.status = 1 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userUpvoted
            , SUM(IF(v.status = 2 ,1,0)) AS downvotes
            , SUM(IF(v.status = 2 AND v.user_id = 1 AND v.is_anonymous = 0 ,1,0)) AS userDownvoted
            FROM talk_comments c
            LEFT
            JOIN talk_comment_votes v
            ON v.comment_id = c.id
            WHERE c.id = 2
            GROUP BY c.id
            ;


            output from explain



            --     id  select_type  table   type    possible_keys           key                     key_len  ref                        rows  Extra        
            -- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ----------------------- ------ -------------
            -- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
            -- 1 SIMPLE v1 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 2 Using index
            -- 1 SIMPLE v2 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index
            -- 1 SIMPLE v3 ref talk_comment_votes_IX1 talk_comment_votes_IX1 6 const,const 1 Using index
            -- 1 SIMPLE v4 ref talk_comment_votes_IX1 talk_comment_votes_IX1 11 const,const,const,const 1 Using index



            -- id select_type table type possible_keys key key_len ref rows Extra
            -- ------ ----------- ------ ------ ---------------------- ---------------------- ------- ------ ------ -------------
            -- 1 SIMPLE c const PRIMARY PRIMARY 4 const 1 Using index
            -- 1 SIMPLE v ref talk_comment_votes_IX1 talk_comment_votes_IX1 4 const 3 Using index


            measured performance:



            100 executions                        round 1      round 2     round 3
            ------------------------------------ ---------- ---------- ---------
            multiple left join, count(distinct 0.123 secs 0.130 secs 0.125 secs
            conditional aggregation sum(if 0.113 secs 0.114 secs 0.111 secs






            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 28 '18 at 20:05

























            answered Nov 26 '18 at 22:00









            spencer7593spencer7593

            86.6k118197




            86.6k118197













            • Thank you very much! It works like a charm.

              – Supun Kavinda
              Nov 28 '18 at 2:19











            • But, according to the performance test, still LEFT JOIN wins

              – Supun Kavinda
              Nov 28 '18 at 2:22











            • I'm a little surprised that the conditional aggregation would be slower; we could include criteria in the ON clause AND v.status IN (1,2), but I wouldn't expect that to make much of a difference, unless there's a slew of rows with status values other than 1 or 2. Of course with large sets, performance will horrendous if appropriate indexes are not defined.

              – spencer7593
              Nov 28 '18 at 19:41













            • A test case shows performance of conditional aggregation is 10% faster.

              – spencer7593
              Nov 28 '18 at 20:10



















            • Thank you very much! It works like a charm.

              – Supun Kavinda
              Nov 28 '18 at 2:19











            • But, according to the performance test, still LEFT JOIN wins

              – Supun Kavinda
              Nov 28 '18 at 2:22











            • I'm a little surprised that the conditional aggregation would be slower; we could include criteria in the ON clause AND v.status IN (1,2), but I wouldn't expect that to make much of a difference, unless there's a slew of rows with status values other than 1 or 2. Of course with large sets, performance will horrendous if appropriate indexes are not defined.

              – spencer7593
              Nov 28 '18 at 19:41













            • A test case shows performance of conditional aggregation is 10% faster.

              – spencer7593
              Nov 28 '18 at 20:10

















            Thank you very much! It works like a charm.

            – Supun Kavinda
            Nov 28 '18 at 2:19





            Thank you very much! It works like a charm.

            – Supun Kavinda
            Nov 28 '18 at 2:19













            But, according to the performance test, still LEFT JOIN wins

            – Supun Kavinda
            Nov 28 '18 at 2:22





            But, according to the performance test, still LEFT JOIN wins

            – Supun Kavinda
            Nov 28 '18 at 2:22













            I'm a little surprised that the conditional aggregation would be slower; we could include criteria in the ON clause AND v.status IN (1,2), but I wouldn't expect that to make much of a difference, unless there's a slew of rows with status values other than 1 or 2. Of course with large sets, performance will horrendous if appropriate indexes are not defined.

            – spencer7593
            Nov 28 '18 at 19:41







            I'm a little surprised that the conditional aggregation would be slower; we could include criteria in the ON clause AND v.status IN (1,2), but I wouldn't expect that to make much of a difference, unless there's a slew of rows with status values other than 1 or 2. Of course with large sets, performance will horrendous if appropriate indexes are not defined.

            – spencer7593
            Nov 28 '18 at 19:41















            A test case shows performance of conditional aggregation is 10% faster.

            – spencer7593
            Nov 28 '18 at 20:10





            A test case shows performance of conditional aggregation is 10% faster.

            – spencer7593
            Nov 28 '18 at 20:10


















            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53488595%2fmysql-left-join-returns-unexpected-results%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Wiesbaden

            Marschland

            Dieringhausen