Select only the “most complete” record

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}

I need to solve the following problem.

Let's suppose I have a table with 4 fields called a, b, c, d.

I have the following records:

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |    3    |    4        row 2 

   1   |    2    |         |    4        row 3

   1   |    2    |    3    |             row 4

As it's possible to observe, rows 1,3,4 are "sub-records" of row 2.

What I would like to do is, to extract only 2nd row.

Could you help me please?

Thanks in advance for the answer

EDIT: I need to be more specific.

I could have also the cases:

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |         |    4        row 2 

   1   |         |         |    4        row 3

where I need to extract the 2nd row,

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |    3    |             row 2 

   1   |         |    3    |             row 3

and again I need to extract the 2nd row.

Same for couples,

   a   |    b    |    c    |    d  

-------------------------------------

   1   |         |         |             row 1 

   1   |         |    3    |             row 2 

       |         |    3    |             row 3

and so on for the other examples.

(Of course, it's now always 2nd row)

edited Nov 26 '18 at 16:25

asked Nov 26 '18 at 16:09

M.Orlandi

498

what if there is more than 1 row with same sub-records?

– Vamsi Prabhala
Nov 26 '18 at 16:11

this case shouldn't be contemplated, however I still need only the most complete record

– M.Orlandi
Nov 26 '18 at 16:12

add a comment |

I need to solve the following problem.

Let's suppose I have a table with 4 fields called a, b, c, d.

I have the following records:

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |    3    |    4        row 2 

   1   |    2    |         |    4        row 3

   1   |    2    |    3    |             row 4

As it's possible to observe, rows 1,3,4 are "sub-records" of row 2.

What I would like to do is, to extract only 2nd row.

Could you help me please?

Thanks in advance for the answer

EDIT: I need to be more specific.

I could have also the cases:

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |         |    4        row 2 

   1   |         |         |    4        row 3

where I need to extract the 2nd row,

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |    3    |             row 2 

   1   |         |    3    |             row 3

and again I need to extract the 2nd row.

Same for couples,

   a   |    b    |    c    |    d  

-------------------------------------

   1   |         |         |             row 1 

   1   |         |    3    |             row 2 

       |         |    3    |             row 3

and so on for the other examples.

(Of course, it's now always 2nd row)

edited Nov 26 '18 at 16:25

asked Nov 26 '18 at 16:09

M.Orlandi

498

what if there is more than 1 row with same sub-records?

– Vamsi Prabhala
Nov 26 '18 at 16:11

this case shouldn't be contemplated, however I still need only the most complete record

– M.Orlandi
Nov 26 '18 at 16:12

add a comment |

I need to solve the following problem.

Let's suppose I have a table with 4 fields called a, b, c, d.

I have the following records:

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |    3    |    4        row 2 

   1   |    2    |         |    4        row 3

   1   |    2    |    3    |             row 4

As it's possible to observe, rows 1,3,4 are "sub-records" of row 2.

What I would like to do is, to extract only 2nd row.

Could you help me please?

Thanks in advance for the answer

EDIT: I need to be more specific.

I could have also the cases:

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |         |    4        row 2 

   1   |         |         |    4        row 3

where I need to extract the 2nd row,

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |    3    |             row 2 

   1   |         |    3    |             row 3

and again I need to extract the 2nd row.

Same for couples,

   a   |    b    |    c    |    d  

-------------------------------------

   1   |         |         |             row 1 

   1   |         |    3    |             row 2 

       |         |    3    |             row 3

and so on for the other examples.

(Of course, it's now always 2nd row)

edited Nov 26 '18 at 16:25

asked Nov 26 '18 at 16:09

M.Orlandi

498

I need to solve the following problem.

Let's suppose I have a table with 4 fields called a, b, c, d.

I have the following records:

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |    3    |    4        row 2 

   1   |    2    |         |    4        row 3

   1   |    2    |    3    |             row 4

As it's possible to observe, rows 1,3,4 are "sub-records" of row 2.

What I would like to do is, to extract only 2nd row.

Could you help me please?

Thanks in advance for the answer

EDIT: I need to be more specific.

I could have also the cases:

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |         |    4        row 2 

   1   |         |         |    4        row 3

where I need to extract the 2nd row,

-------------------------------------

   a   |    b    |    c    |    d  

-------------------------------------

   1   |    2    |         |             row 1 

   1   |    2    |    3    |             row 2 

   1   |         |    3    |             row 3

and again I need to extract the 2nd row.

Same for couples,

   a   |    b    |    c    |    d  

-------------------------------------

   1   |         |         |             row 1 

   1   |         |    3    |             row 2 

       |         |    3    |             row 3

and so on for the other examples.

(Of course, it's now always 2nd row)

sql postgresql

edited Nov 26 '18 at 16:25

asked Nov 26 '18 at 16:09

M.Orlandi

498

edited Nov 26 '18 at 16:25

asked Nov 26 '18 at 16:09

M.Orlandi

498

edited Nov 26 '18 at 16:25

asked Nov 26 '18 at 16:09

M.Orlandi

498

asked Nov 26 '18 at 16:09

M.Orlandi

498

asked Nov 26 '18 at 16:09

M.Orlandi

498

what if there is more than 1 row with same sub-records?

– Vamsi Prabhala
Nov 26 '18 at 16:11

this case shouldn't be contemplated, however I still need only the most complete record

– M.Orlandi
Nov 26 '18 at 16:12

add a comment |

what if there is more than 1 row with same sub-records?

– Vamsi Prabhala
Nov 26 '18 at 16:11

this case shouldn't be contemplated, however I still need only the most complete record

– M.Orlandi
Nov 26 '18 at 16:12

what if there is more than 1 row with same sub-records?

– Vamsi Prabhala
Nov 26 '18 at 16:11

this case shouldn't be contemplated, however I still need only the most complete record

– M.Orlandi
Nov 26 '18 at 16:12

add a comment |

5 Answers
5

active

oldest

votes

Using a NOT EXISTS the records that have a better duplicate can be filtered out.

create table abcd (

 a int,

 b int,

 c int,

 d int

);

insert into abcd (a, b, c, d) values

 (1, 2, null, null)

,(1, 2, 3, 4)

,(1, 2, null, 4)

,(1, 2, 3, null)



,(2, 3, null,null)

,(2, 3, null, 5)

,(2, null, null, 5) 



,(3, null, null, null)

,(3, null, 5, null)

,(null, null, 5, null)

SELECT *  

FROM abcd AS t 

WHERE NOT EXISTS 

(

   select 1

   from abcd as d  

   where (t.a is null or d.a = t.a)

     and (t.b is null or d.b = t.b)

     and (t.c is null or d.c = t.c)

     and (t.d is null or d.d = t.d)

     and (case when t.a is null then 0 else 1 end +

          case when t.b is null then 0 else 1 end +

          case when t.c is null then 0 else 1 end +

          case when t.d is null then 0 else 1 end) < 

         (case when d.a is null then 0 else 1 end +

          case when d.b is null then 0 else 1 end +

          case when d.c is null then 0 else 1 end +

          case when d.d is null then 0 else 1 end)

);



 a |    b |    c |    d

-: | ---: | ---: | ---:

 1 |    2 |    3 |    4

 2 |    3 | null |    5

 3 | null |    5 | null

db<>fiddle here

answered Nov 26 '18 at 20:20

LukStorms

14.1k31734

that's exactly what I was looking for! I have only added a select distinct over a,b,c,d instead of select * and it worked perfectly! Thank you

– M.Orlandi
Nov 27 '18 at 11:08

@M.Orlandi Thx for the rep. Just a warning though. This won't filter out special cases like the tupples (4, null, 6, null), (4, 5, null, null), (4, 6, null, null). But the nice thing about this method is that you can select the "lesses" dups by simply removing the NOT. For example if you want to delete the less complete dups.

– LukStorms
Nov 27 '18 at 12:01

add a comment |

You will need to compute a "completion index" for each row. In the example you provided, you might use something along the lines of:

   (CASE WHEN a IS NULL THEN 0 ELSE 1) +

   (CASE WHEN b IS NULL THEN 0 ELSE 1) +

   (CASE WHEN c IS NULL THEN 0 ELSE 1) +

   (CASE WHEN d IS NULL THEN 0 ELSE 1) AS CompletionIndex

Then SELECT the top 1 ordered by CompletionIndex in descending order.

This is obviously not very scalable across a large number of columns. But if you have a large number of sparsely populated columns you might consider a row-based rather than column-based structure for your data. That design would make it much easier to count the number of non-NULL values for each entity.

answered Nov 26 '18 at 16:43

Larry Lustig

40.5k1284130

add a comment |

Most complete rows, by your definition, are the ones with the least null columns:

SELECT * FROM tablename

WHERE (

    (CASE WHEN a IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN b IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN c IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN d IS NULL THEN 0 ELSE 1 END)

) =

(SELECT MAX(

    (CASE WHEN a IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN b IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN c IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN d IS NULL THEN 0 ELSE 1 END)) 

FROM tablename)

edited Nov 26 '18 at 16:46

answered Nov 26 '18 at 16:19

forpas

19.8k4830

Or more simply, since you're checking all columns: WHERE NOT tablename IS NULL

– 404
Nov 26 '18 at 16:20

1

@eurotrash What is this? Is this a valid statement?

– forpas
Nov 26 '18 at 16:24

i have edited my question to be more specific.

– M.Orlandi
Nov 26 '18 at 16:26

@forpas Yes, assuming the table name is tablename, by saying NOT tablename IS NULL it checks the whole row, and the statement will return TRUE only if none of the fields are NULL. So it's a shorter way than specifying every field.

– 404
Nov 26 '18 at 16:27

@eurotrash have you got any documentation about it?

– forpas
Nov 26 '18 at 16:28

|
show 6 more comments

Hmmm . . . I think you can use not exists:

with t as (

      select t.*, row_number() over (order by a) as id

      from t

     )

select t.*

from t

where not exists (select 1

                  from t t2

                  where ((t2.a is not distinct from t.a or t2.a is not null and t.a is null) and

                         (t2.b is not distinct from  t.b or t2.b is not null and t.b is null) and

                         (t2.c is not distinct from  t.c or t2.c is not null and t.c is null) and

                         (t2.d is not distinct from  t.d or t2.d is not null and t.d is null)

                        ) and

                        t2.id <> t.id

                 );

The logic is that no more specific row exists, where the values match

Here is a db<>fiddle.

edited Nov 26 '18 at 16:51

answered Nov 26 '18 at 16:21

Gordon Linoff

794k37318423

Thank you for your answer! i have edited my question to be more specific because this is only a pattern that can happen

– M.Orlandi
Nov 26 '18 at 16:27

@M.Orlandi . . . The additional clarifications in the question helped me refine the logic.

– Gordon Linoff
Nov 26 '18 at 16:30

add a comment |

As mentioned by Gordon Linoff, we do have to use something like not exists too,

Edit Using EXCEPT helps

This might work...

SELECT * from table1 

EXCEPT

(

SELECT t1.*

FROM table1 t1

JOIN table1 t2

ON  COALESCE(t1.a, t2.a, -1) = COALESCE(t2.a, -1)

AND COALESCE(t1.b, t2.b, -1) = COALESCE(t2.b, -1)

AND COALESCE(t1.c, t2.c, -1) = COALESCE(t2.c, -1)

AND COALESCE(t1.d, t2.d, -1) = COALESCE(t2.d, -1)

)

Here, t1 is every subset row.

Note: We are assuming value -1 as sentinel value and it does not occur in any column.

edited Nov 27 '18 at 1:14

answered Nov 26 '18 at 16:55

Sahil Dhoked

373212

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53485031%2fselect-only-the-most-complete-record%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

5 Answers
5

active

oldest

votes

5 Answers
5

active

oldest

votes

Using a NOT EXISTS the records that have a better duplicate can be filtered out.

create table abcd (

 a int,

 b int,

 c int,

 d int

);

insert into abcd (a, b, c, d) values

 (1, 2, null, null)

,(1, 2, 3, 4)

,(1, 2, null, 4)

,(1, 2, 3, null)



,(2, 3, null,null)

,(2, 3, null, 5)

,(2, null, null, 5) 



,(3, null, null, null)

,(3, null, 5, null)

,(null, null, 5, null)

SELECT *  

FROM abcd AS t 

WHERE NOT EXISTS 

(

   select 1

   from abcd as d  

   where (t.a is null or d.a = t.a)

     and (t.b is null or d.b = t.b)

     and (t.c is null or d.c = t.c)

     and (t.d is null or d.d = t.d)

     and (case when t.a is null then 0 else 1 end +

          case when t.b is null then 0 else 1 end +

          case when t.c is null then 0 else 1 end +

          case when t.d is null then 0 else 1 end) < 

         (case when d.a is null then 0 else 1 end +

          case when d.b is null then 0 else 1 end +

          case when d.c is null then 0 else 1 end +

          case when d.d is null then 0 else 1 end)

);



 a |    b |    c |    d

-: | ---: | ---: | ---:

 1 |    2 |    3 |    4

 2 |    3 | null |    5

 3 | null |    5 | null

db<>fiddle here

answered Nov 26 '18 at 20:20

LukStorms

14.1k31734

that's exactly what I was looking for! I have only added a select distinct over a,b,c,d instead of select * and it worked perfectly! Thank you

– M.Orlandi
Nov 27 '18 at 11:08

@M.Orlandi Thx for the rep. Just a warning though. This won't filter out special cases like the tupples (4, null, 6, null), (4, 5, null, null), (4, 6, null, null). But the nice thing about this method is that you can select the "lesses" dups by simply removing the NOT. For example if you want to delete the less complete dups.

– LukStorms
Nov 27 '18 at 12:01

add a comment |

Using a NOT EXISTS the records that have a better duplicate can be filtered out.

create table abcd (

 a int,

 b int,

 c int,

 d int

);

insert into abcd (a, b, c, d) values

 (1, 2, null, null)

,(1, 2, 3, 4)

,(1, 2, null, 4)

,(1, 2, 3, null)



,(2, 3, null,null)

,(2, 3, null, 5)

,(2, null, null, 5) 



,(3, null, null, null)

,(3, null, 5, null)

,(null, null, 5, null)

SELECT *  

FROM abcd AS t 

WHERE NOT EXISTS 

(

   select 1

   from abcd as d  

   where (t.a is null or d.a = t.a)

     and (t.b is null or d.b = t.b)

     and (t.c is null or d.c = t.c)

     and (t.d is null or d.d = t.d)

     and (case when t.a is null then 0 else 1 end +

          case when t.b is null then 0 else 1 end +

          case when t.c is null then 0 else 1 end +

          case when t.d is null then 0 else 1 end) < 

         (case when d.a is null then 0 else 1 end +

          case when d.b is null then 0 else 1 end +

          case when d.c is null then 0 else 1 end +

          case when d.d is null then 0 else 1 end)

);



 a |    b |    c |    d

-: | ---: | ---: | ---:

 1 |    2 |    3 |    4

 2 |    3 | null |    5

 3 | null |    5 | null

db<>fiddle here

answered Nov 26 '18 at 20:20

LukStorms

14.1k31734

that's exactly what I was looking for! I have only added a select distinct over a,b,c,d instead of select * and it worked perfectly! Thank you

– M.Orlandi
Nov 27 '18 at 11:08

@M.Orlandi Thx for the rep. Just a warning though. This won't filter out special cases like the tupples (4, null, 6, null), (4, 5, null, null), (4, 6, null, null). But the nice thing about this method is that you can select the "lesses" dups by simply removing the NOT. For example if you want to delete the less complete dups.

– LukStorms
Nov 27 '18 at 12:01

add a comment |

Using a NOT EXISTS the records that have a better duplicate can be filtered out.

create table abcd (

 a int,

 b int,

 c int,

 d int

);

insert into abcd (a, b, c, d) values

 (1, 2, null, null)

,(1, 2, 3, 4)

,(1, 2, null, 4)

,(1, 2, 3, null)



,(2, 3, null,null)

,(2, 3, null, 5)

,(2, null, null, 5) 



,(3, null, null, null)

,(3, null, 5, null)

,(null, null, 5, null)

SELECT *  

FROM abcd AS t 

WHERE NOT EXISTS 

(

   select 1

   from abcd as d  

   where (t.a is null or d.a = t.a)

     and (t.b is null or d.b = t.b)

     and (t.c is null or d.c = t.c)

     and (t.d is null or d.d = t.d)

     and (case when t.a is null then 0 else 1 end +

          case when t.b is null then 0 else 1 end +

          case when t.c is null then 0 else 1 end +

          case when t.d is null then 0 else 1 end) < 

         (case when d.a is null then 0 else 1 end +

          case when d.b is null then 0 else 1 end +

          case when d.c is null then 0 else 1 end +

          case when d.d is null then 0 else 1 end)

);



 a |    b |    c |    d

-: | ---: | ---: | ---:

 1 |    2 |    3 |    4

 2 |    3 | null |    5

 3 | null |    5 | null

db<>fiddle here

answered Nov 26 '18 at 20:20

LukStorms

14.1k31734

Using a NOT EXISTS the records that have a better duplicate can be filtered out.

create table abcd (

 a int,

 b int,

 c int,

 d int

);

insert into abcd (a, b, c, d) values

 (1, 2, null, null)

,(1, 2, 3, 4)

,(1, 2, null, 4)

,(1, 2, 3, null)



,(2, 3, null,null)

,(2, 3, null, 5)

,(2, null, null, 5) 



,(3, null, null, null)

,(3, null, 5, null)

,(null, null, 5, null)

SELECT *  

FROM abcd AS t 

WHERE NOT EXISTS 

(

   select 1

   from abcd as d  

   where (t.a is null or d.a = t.a)

     and (t.b is null or d.b = t.b)

     and (t.c is null or d.c = t.c)

     and (t.d is null or d.d = t.d)

     and (case when t.a is null then 0 else 1 end +

          case when t.b is null then 0 else 1 end +

          case when t.c is null then 0 else 1 end +

          case when t.d is null then 0 else 1 end) < 

         (case when d.a is null then 0 else 1 end +

          case when d.b is null then 0 else 1 end +

          case when d.c is null then 0 else 1 end +

          case when d.d is null then 0 else 1 end)

);



 a |    b |    c |    d

-: | ---: | ---: | ---:

 1 |    2 |    3 |    4

 2 |    3 | null |    5

 3 | null |    5 | null

db<>fiddle here

answered Nov 26 '18 at 20:20

LukStorms

14.1k31734

answered Nov 26 '18 at 20:20

LukStorms

14.1k31734

answered Nov 26 '18 at 20:20

LukStorms

14.1k31734

answered Nov 26 '18 at 20:20

LukStorms

14.1k31734

that's exactly what I was looking for! I have only added a select distinct over a,b,c,d instead of select * and it worked perfectly! Thank you

– M.Orlandi
Nov 27 '18 at 11:08

@M.Orlandi Thx for the rep. Just a warning though. This won't filter out special cases like the tupples (4, null, 6, null), (4, 5, null, null), (4, 6, null, null). But the nice thing about this method is that you can select the "lesses" dups by simply removing the NOT. For example if you want to delete the less complete dups.

– LukStorms
Nov 27 '18 at 12:01

add a comment |

that's exactly what I was looking for! I have only added a select distinct over a,b,c,d instead of select * and it worked perfectly! Thank you

– M.Orlandi
Nov 27 '18 at 11:08

@M.Orlandi Thx for the rep. Just a warning though. This won't filter out special cases like the tupples (4, null, 6, null), (4, 5, null, null), (4, 6, null, null). But the nice thing about this method is that you can select the "lesses" dups by simply removing the NOT. For example if you want to delete the less complete dups.

– LukStorms
Nov 27 '18 at 12:01

that's exactly what I was looking for! I have only added a select distinct over a,b,c,d instead of select * and it worked perfectly! Thank you

– M.Orlandi
Nov 27 '18 at 11:08

@M.Orlandi Thx for the rep. Just a warning though. This won't filter out special cases like the tupples (4, null, 6, null), (4, 5, null, null), (4, 6, null, null). But the nice thing about this method is that you can select the "lesses" dups by simply removing the NOT. For example if you want to delete the less complete dups.

– LukStorms
Nov 27 '18 at 12:01

add a comment |

You will need to compute a "completion index" for each row. In the example you provided, you might use something along the lines of:

   (CASE WHEN a IS NULL THEN 0 ELSE 1) +

   (CASE WHEN b IS NULL THEN 0 ELSE 1) +

   (CASE WHEN c IS NULL THEN 0 ELSE 1) +

   (CASE WHEN d IS NULL THEN 0 ELSE 1) AS CompletionIndex

Then SELECT the top 1 ordered by CompletionIndex in descending order.

answered Nov 26 '18 at 16:43

Larry Lustig

40.5k1284130

add a comment |

You will need to compute a "completion index" for each row. In the example you provided, you might use something along the lines of:

   (CASE WHEN a IS NULL THEN 0 ELSE 1) +

   (CASE WHEN b IS NULL THEN 0 ELSE 1) +

   (CASE WHEN c IS NULL THEN 0 ELSE 1) +

   (CASE WHEN d IS NULL THEN 0 ELSE 1) AS CompletionIndex

Then SELECT the top 1 ordered by CompletionIndex in descending order.

answered Nov 26 '18 at 16:43

Larry Lustig

40.5k1284130

add a comment |

You will need to compute a "completion index" for each row. In the example you provided, you might use something along the lines of:

   (CASE WHEN a IS NULL THEN 0 ELSE 1) +

   (CASE WHEN b IS NULL THEN 0 ELSE 1) +

   (CASE WHEN c IS NULL THEN 0 ELSE 1) +

   (CASE WHEN d IS NULL THEN 0 ELSE 1) AS CompletionIndex

Then SELECT the top 1 ordered by CompletionIndex in descending order.

answered Nov 26 '18 at 16:43

Larry Lustig

40.5k1284130

You will need to compute a "completion index" for each row. In the example you provided, you might use something along the lines of:

   (CASE WHEN a IS NULL THEN 0 ELSE 1) +

   (CASE WHEN b IS NULL THEN 0 ELSE 1) +

   (CASE WHEN c IS NULL THEN 0 ELSE 1) +

   (CASE WHEN d IS NULL THEN 0 ELSE 1) AS CompletionIndex

Then SELECT the top 1 ordered by CompletionIndex in descending order.

answered Nov 26 '18 at 16:43

Larry Lustig

40.5k1284130

answered Nov 26 '18 at 16:43

Larry Lustig

40.5k1284130

answered Nov 26 '18 at 16:43

Larry Lustig

40.5k1284130

answered Nov 26 '18 at 16:43

Larry Lustig

40.5k1284130

add a comment |

Most complete rows, by your definition, are the ones with the least null columns:

SELECT * FROM tablename

WHERE (

    (CASE WHEN a IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN b IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN c IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN d IS NULL THEN 0 ELSE 1 END)

) =

(SELECT MAX(

    (CASE WHEN a IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN b IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN c IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN d IS NULL THEN 0 ELSE 1 END)) 

FROM tablename)

edited Nov 26 '18 at 16:46

answered Nov 26 '18 at 16:19

forpas

19.8k4830

Or more simply, since you're checking all columns: WHERE NOT tablename IS NULL

– 404
Nov 26 '18 at 16:20

1

@eurotrash What is this? Is this a valid statement?

– forpas
Nov 26 '18 at 16:24

i have edited my question to be more specific.

– M.Orlandi
Nov 26 '18 at 16:26

@forpas Yes, assuming the table name is tablename, by saying NOT tablename IS NULL it checks the whole row, and the statement will return TRUE only if none of the fields are NULL. So it's a shorter way than specifying every field.

– 404
Nov 26 '18 at 16:27

@eurotrash have you got any documentation about it?

– forpas
Nov 26 '18 at 16:28

|
show 6 more comments

Most complete rows, by your definition, are the ones with the least null columns:

SELECT * FROM tablename

WHERE (

    (CASE WHEN a IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN b IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN c IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN d IS NULL THEN 0 ELSE 1 END)

) =

(SELECT MAX(

    (CASE WHEN a IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN b IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN c IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN d IS NULL THEN 0 ELSE 1 END)) 

FROM tablename)

edited Nov 26 '18 at 16:46

answered Nov 26 '18 at 16:19

forpas

19.8k4830

Or more simply, since you're checking all columns: WHERE NOT tablename IS NULL

– 404
Nov 26 '18 at 16:20

1

@eurotrash What is this? Is this a valid statement?

– forpas
Nov 26 '18 at 16:24

i have edited my question to be more specific.

– M.Orlandi
Nov 26 '18 at 16:26

@forpas Yes, assuming the table name is tablename, by saying NOT tablename IS NULL it checks the whole row, and the statement will return TRUE only if none of the fields are NULL. So it's a shorter way than specifying every field.

– 404
Nov 26 '18 at 16:27

@eurotrash have you got any documentation about it?

– forpas
Nov 26 '18 at 16:28

|
show 6 more comments

Most complete rows, by your definition, are the ones with the least null columns:

SELECT * FROM tablename

WHERE (

    (CASE WHEN a IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN b IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN c IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN d IS NULL THEN 0 ELSE 1 END)

) =

(SELECT MAX(

    (CASE WHEN a IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN b IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN c IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN d IS NULL THEN 0 ELSE 1 END)) 

FROM tablename)

edited Nov 26 '18 at 16:46

answered Nov 26 '18 at 16:19

forpas

19.8k4830

Most complete rows, by your definition, are the ones with the least null columns:

SELECT * FROM tablename

WHERE (

    (CASE WHEN a IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN b IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN c IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN d IS NULL THEN 0 ELSE 1 END)

) =

(SELECT MAX(

    (CASE WHEN a IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN b IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN c IS NULL THEN 0 ELSE 1 END) + 

    (CASE WHEN d IS NULL THEN 0 ELSE 1 END)) 

FROM tablename)

edited Nov 26 '18 at 16:46

answered Nov 26 '18 at 16:19

forpas

19.8k4830

edited Nov 26 '18 at 16:46

answered Nov 26 '18 at 16:19

forpas

19.8k4830

answered Nov 26 '18 at 16:19

forpas

19.8k4830

answered Nov 26 '18 at 16:19

forpas

19.8k4830

Or more simply, since you're checking all columns: WHERE NOT tablename IS NULL

– 404
Nov 26 '18 at 16:20

1

@eurotrash What is this? Is this a valid statement?

– forpas
Nov 26 '18 at 16:24

i have edited my question to be more specific.

– M.Orlandi
Nov 26 '18 at 16:26

@forpas Yes, assuming the table name is tablename, by saying NOT tablename IS NULL it checks the whole row, and the statement will return TRUE only if none of the fields are NULL. So it's a shorter way than specifying every field.

– 404
Nov 26 '18 at 16:27

@eurotrash have you got any documentation about it?

– forpas
Nov 26 '18 at 16:28

|
show 6 more comments

Or more simply, since you're checking all columns: WHERE NOT tablename IS NULL

– 404
Nov 26 '18 at 16:20

1

@eurotrash What is this? Is this a valid statement?

– forpas
Nov 26 '18 at 16:24

i have edited my question to be more specific.

– M.Orlandi
Nov 26 '18 at 16:26

@forpas Yes, assuming the table name is tablename, by saying NOT tablename IS NULL it checks the whole row, and the statement will return TRUE only if none of the fields are NULL. So it's a shorter way than specifying every field.

– 404
Nov 26 '18 at 16:27

@eurotrash have you got any documentation about it?

– forpas
Nov 26 '18 at 16:28

Or more simply, since you're checking all columns: WHERE NOT tablename IS NULL

– 404
Nov 26 '18 at 16:20

@eurotrash What is this? Is this a valid statement?

– forpas
Nov 26 '18 at 16:24

i have edited my question to be more specific.

– M.Orlandi
Nov 26 '18 at 16:26

@forpas Yes, assuming the table name is tablename, by saying NOT tablename IS NULL it checks the whole row, and the statement will return TRUE only if none of the fields are NULL. So it's a shorter way than specifying every field.

– 404
Nov 26 '18 at 16:27

@eurotrash have you got any documentation about it?

– forpas
Nov 26 '18 at 16:28

|
show 6 more comments

Hmmm . . . I think you can use not exists:

with t as (

      select t.*, row_number() over (order by a) as id

      from t

     )

select t.*

from t

where not exists (select 1

                  from t t2

                  where ((t2.a is not distinct from t.a or t2.a is not null and t.a is null) and

                         (t2.b is not distinct from  t.b or t2.b is not null and t.b is null) and

                         (t2.c is not distinct from  t.c or t2.c is not null and t.c is null) and

                         (t2.d is not distinct from  t.d or t2.d is not null and t.d is null)

                        ) and

                        t2.id <> t.id

                 );

The logic is that no more specific row exists, where the values match

Here is a db<>fiddle.

edited Nov 26 '18 at 16:51

answered Nov 26 '18 at 16:21

Gordon Linoff

794k37318423

Thank you for your answer! i have edited my question to be more specific because this is only a pattern that can happen

– M.Orlandi
Nov 26 '18 at 16:27

@M.Orlandi . . . The additional clarifications in the question helped me refine the logic.

– Gordon Linoff
Nov 26 '18 at 16:30

add a comment |

Hmmm . . . I think you can use not exists:

with t as (

      select t.*, row_number() over (order by a) as id

      from t

     )

select t.*

from t

where not exists (select 1

                  from t t2

                  where ((t2.a is not distinct from t.a or t2.a is not null and t.a is null) and

                         (t2.b is not distinct from  t.b or t2.b is not null and t.b is null) and

                         (t2.c is not distinct from  t.c or t2.c is not null and t.c is null) and

                         (t2.d is not distinct from  t.d or t2.d is not null and t.d is null)

                        ) and

                        t2.id <> t.id

                 );

The logic is that no more specific row exists, where the values match

Here is a db<>fiddle.

edited Nov 26 '18 at 16:51

answered Nov 26 '18 at 16:21

Gordon Linoff

794k37318423

Thank you for your answer! i have edited my question to be more specific because this is only a pattern that can happen

– M.Orlandi
Nov 26 '18 at 16:27

@M.Orlandi . . . The additional clarifications in the question helped me refine the logic.

– Gordon Linoff
Nov 26 '18 at 16:30

add a comment |

Hmmm . . . I think you can use not exists:

with t as (

      select t.*, row_number() over (order by a) as id

      from t

     )

select t.*

from t

where not exists (select 1

                  from t t2

                  where ((t2.a is not distinct from t.a or t2.a is not null and t.a is null) and

                         (t2.b is not distinct from  t.b or t2.b is not null and t.b is null) and

                         (t2.c is not distinct from  t.c or t2.c is not null and t.c is null) and

                         (t2.d is not distinct from  t.d or t2.d is not null and t.d is null)

                        ) and

                        t2.id <> t.id

                 );

The logic is that no more specific row exists, where the values match

Here is a db<>fiddle.

edited Nov 26 '18 at 16:51

answered Nov 26 '18 at 16:21

Gordon Linoff

794k37318423

Hmmm . . . I think you can use not exists:

with t as (

      select t.*, row_number() over (order by a) as id

      from t

     )

select t.*

from t

where not exists (select 1

                  from t t2

                  where ((t2.a is not distinct from t.a or t2.a is not null and t.a is null) and

                         (t2.b is not distinct from  t.b or t2.b is not null and t.b is null) and

                         (t2.c is not distinct from  t.c or t2.c is not null and t.c is null) and

                         (t2.d is not distinct from  t.d or t2.d is not null and t.d is null)

                        ) and

                        t2.id <> t.id

                 );

The logic is that no more specific row exists, where the values match

Here is a db<>fiddle.

edited Nov 26 '18 at 16:51

answered Nov 26 '18 at 16:21

Gordon Linoff

794k37318423

edited Nov 26 '18 at 16:51

answered Nov 26 '18 at 16:21

Gordon Linoff

794k37318423

answered Nov 26 '18 at 16:21

Gordon Linoff

794k37318423

answered Nov 26 '18 at 16:21

Gordon Linoff

794k37318423

Thank you for your answer! i have edited my question to be more specific because this is only a pattern that can happen

– M.Orlandi
Nov 26 '18 at 16:27

@M.Orlandi . . . The additional clarifications in the question helped me refine the logic.

– Gordon Linoff
Nov 26 '18 at 16:30

add a comment |

Thank you for your answer! i have edited my question to be more specific because this is only a pattern that can happen

– M.Orlandi
Nov 26 '18 at 16:27

@M.Orlandi . . . The additional clarifications in the question helped me refine the logic.

– Gordon Linoff
Nov 26 '18 at 16:30

Thank you for your answer! i have edited my question to be more specific because this is only a pattern that can happen

– M.Orlandi
Nov 26 '18 at 16:27

@M.Orlandi . . . The additional clarifications in the question helped me refine the logic.

– Gordon Linoff
Nov 26 '18 at 16:30

add a comment |

As mentioned by Gordon Linoff, we do have to use something like not exists too,

Edit Using EXCEPT helps

This might work...

SELECT * from table1 

EXCEPT

(

SELECT t1.*

FROM table1 t1

JOIN table1 t2

ON  COALESCE(t1.a, t2.a, -1) = COALESCE(t2.a, -1)

AND COALESCE(t1.b, t2.b, -1) = COALESCE(t2.b, -1)

AND COALESCE(t1.c, t2.c, -1) = COALESCE(t2.c, -1)

AND COALESCE(t1.d, t2.d, -1) = COALESCE(t2.d, -1)

)

Here, t1 is every subset row.

Note: We are assuming value -1 as sentinel value and it does not occur in any column.

edited Nov 27 '18 at 1:14

answered Nov 26 '18 at 16:55

Sahil Dhoked

373212

add a comment |

As mentioned by Gordon Linoff, we do have to use something like not exists too,

Edit Using EXCEPT helps

This might work...

SELECT * from table1 

EXCEPT

(

SELECT t1.*

FROM table1 t1

JOIN table1 t2

ON  COALESCE(t1.a, t2.a, -1) = COALESCE(t2.a, -1)

AND COALESCE(t1.b, t2.b, -1) = COALESCE(t2.b, -1)

AND COALESCE(t1.c, t2.c, -1) = COALESCE(t2.c, -1)

AND COALESCE(t1.d, t2.d, -1) = COALESCE(t2.d, -1)

)

Here, t1 is every subset row.

Note: We are assuming value -1 as sentinel value and it does not occur in any column.

edited Nov 27 '18 at 1:14

answered Nov 26 '18 at 16:55

Sahil Dhoked

373212

add a comment |

As mentioned by Gordon Linoff, we do have to use something like not exists too,

Edit Using EXCEPT helps

This might work...

SELECT * from table1 

EXCEPT

(

SELECT t1.*

FROM table1 t1

JOIN table1 t2

ON  COALESCE(t1.a, t2.a, -1) = COALESCE(t2.a, -1)

AND COALESCE(t1.b, t2.b, -1) = COALESCE(t2.b, -1)

AND COALESCE(t1.c, t2.c, -1) = COALESCE(t2.c, -1)

AND COALESCE(t1.d, t2.d, -1) = COALESCE(t2.d, -1)

)

Here, t1 is every subset row.

Note: We are assuming value -1 as sentinel value and it does not occur in any column.

edited Nov 27 '18 at 1:14

answered Nov 26 '18 at 16:55

Sahil Dhoked

373212

As mentioned by Gordon Linoff, we do have to use something like not exists too,

Edit Using EXCEPT helps

This might work...

SELECT * from table1 

EXCEPT

(

SELECT t1.*

FROM table1 t1

JOIN table1 t2

ON  COALESCE(t1.a, t2.a, -1) = COALESCE(t2.a, -1)

AND COALESCE(t1.b, t2.b, -1) = COALESCE(t2.b, -1)

AND COALESCE(t1.c, t2.c, -1) = COALESCE(t2.c, -1)

AND COALESCE(t1.d, t2.d, -1) = COALESCE(t2.d, -1)

)

Here, t1 is every subset row.

Note: We are assuming value -1 as sentinel value and it does not occur in any column.

edited Nov 27 '18 at 1:14

answered Nov 26 '18 at 16:55

Sahil Dhoked

373212

edited Nov 27 '18 at 1:14

answered Nov 26 '18 at 16:55

Sahil Dhoked

373212

answered Nov 26 '18 at 16:55

Sahil Dhoked

373212

answered Nov 26 '18 at 16:55

Sahil Dhoked

373212

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

K3yJ zT 983YYEA2Yw7yQicq,n02brT,0 pyj SVPJMq72SrpUMQzfG mwwaGXWVih

搜尋此網誌

Ytukyg