Unexpected result of NATURAL JOIN and USING Clause
So I am learning about obtaining data from multiple tables and I have a question regarding NATURAL JOIN and the USING clause. So I have 2 tables that I'm extracting data from; employees and departments.
SQL> describe employees
Name Null? Type
----------------------------------------- -------- ------------------------
EMPLOYEE_ID NUMBER(6)
FIRST_NAME VARCHAR2(20)
LAST_NAME NOT NULL VARCHAR2(25)
EMAIL NOT NULL VARCHAR2(25)
PHONE_NUMBER VARCHAR2(20)
HIRE_DATE NOT NULL DATE
JOB_ID NOT NULL VARCHAR2(10)
SALARY NUMBER(8,2)
COMMISSION_PCT NUMBER(2,2)
MANAGER_ID NUMBER(6)
DEPARTMENT_ID NUMBER(4)
SQL> describe departments
Name Null? Type
----------------------------------------- -------- ------------------------
DEPARTMENT_ID NOT NULL NUMBER(4)
DEPARTMENT_NAME VARCHAR2(30)
MANAGER_ID NUMBER(6)
LOCATION_ID NUMBER(4)
When I use NATURAL JOIN and USING in two different expressions, I have two different outputs. I know that USING matches specifically one column in both tables, but how does this affect the output? How come the expression with USING produces one extra value compared to the NATURAL JOIN?
SELECT department_id, manager_id, last_name, location_id
FROM employees NATURAL JOIN departments
WHERE department_id = 80
ORDER BY location_id desc;
DEPARTMENT_ID MANAGER_ID LAST_NAME LOCATION_ID
------------- ---------- ------------------------- -----------
80 149 Abel 2500
80 149 Grant 2500
80 149 Taylor 2500
SELECT department_id, departments.manager_id, last_name, location_id
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;
DEPARTMENT_ID MANAGER_ID LAST_NAME LOCATION_ID
------------- ---------- ------------------------- -----------
80 149 Zlotkey 2500 <-Additional Line*
80 149 Grant 2500
80 149 Taylor 2500
80 149 Abel 2500
Any help and advice is appreciated!
sql database oracle
add a comment |
So I am learning about obtaining data from multiple tables and I have a question regarding NATURAL JOIN and the USING clause. So I have 2 tables that I'm extracting data from; employees and departments.
SQL> describe employees
Name Null? Type
----------------------------------------- -------- ------------------------
EMPLOYEE_ID NUMBER(6)
FIRST_NAME VARCHAR2(20)
LAST_NAME NOT NULL VARCHAR2(25)
EMAIL NOT NULL VARCHAR2(25)
PHONE_NUMBER VARCHAR2(20)
HIRE_DATE NOT NULL DATE
JOB_ID NOT NULL VARCHAR2(10)
SALARY NUMBER(8,2)
COMMISSION_PCT NUMBER(2,2)
MANAGER_ID NUMBER(6)
DEPARTMENT_ID NUMBER(4)
SQL> describe departments
Name Null? Type
----------------------------------------- -------- ------------------------
DEPARTMENT_ID NOT NULL NUMBER(4)
DEPARTMENT_NAME VARCHAR2(30)
MANAGER_ID NUMBER(6)
LOCATION_ID NUMBER(4)
When I use NATURAL JOIN and USING in two different expressions, I have two different outputs. I know that USING matches specifically one column in both tables, but how does this affect the output? How come the expression with USING produces one extra value compared to the NATURAL JOIN?
SELECT department_id, manager_id, last_name, location_id
FROM employees NATURAL JOIN departments
WHERE department_id = 80
ORDER BY location_id desc;
DEPARTMENT_ID MANAGER_ID LAST_NAME LOCATION_ID
------------- ---------- ------------------------- -----------
80 149 Abel 2500
80 149 Grant 2500
80 149 Taylor 2500
SELECT department_id, departments.manager_id, last_name, location_id
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;
DEPARTMENT_ID MANAGER_ID LAST_NAME LOCATION_ID
------------- ---------- ------------------------- -----------
80 149 Zlotkey 2500 <-Additional Line*
80 149 Grant 2500
80 149 Taylor 2500
80 149 Abel 2500
Any help and advice is appreciated!
sql database oracle
NATURAL JOIN
is one of the "features" of SQL which IMO should never EVER be used.
– Bob Jarvis
Nov 25 '18 at 1:42
So from this example I don't necessarily think NJ was to blame. If I'm not mistaken it actually produced the correct output. Though my understanding of NJ and it's unreliability is small. If you don't mind, could you explain to me or direct me to a resource that demonstrate's in layman's terms how NJ is unreliable?
– jbone
Nov 25 '18 at 4:47
add a comment |
So I am learning about obtaining data from multiple tables and I have a question regarding NATURAL JOIN and the USING clause. So I have 2 tables that I'm extracting data from; employees and departments.
SQL> describe employees
Name Null? Type
----------------------------------------- -------- ------------------------
EMPLOYEE_ID NUMBER(6)
FIRST_NAME VARCHAR2(20)
LAST_NAME NOT NULL VARCHAR2(25)
EMAIL NOT NULL VARCHAR2(25)
PHONE_NUMBER VARCHAR2(20)
HIRE_DATE NOT NULL DATE
JOB_ID NOT NULL VARCHAR2(10)
SALARY NUMBER(8,2)
COMMISSION_PCT NUMBER(2,2)
MANAGER_ID NUMBER(6)
DEPARTMENT_ID NUMBER(4)
SQL> describe departments
Name Null? Type
----------------------------------------- -------- ------------------------
DEPARTMENT_ID NOT NULL NUMBER(4)
DEPARTMENT_NAME VARCHAR2(30)
MANAGER_ID NUMBER(6)
LOCATION_ID NUMBER(4)
When I use NATURAL JOIN and USING in two different expressions, I have two different outputs. I know that USING matches specifically one column in both tables, but how does this affect the output? How come the expression with USING produces one extra value compared to the NATURAL JOIN?
SELECT department_id, manager_id, last_name, location_id
FROM employees NATURAL JOIN departments
WHERE department_id = 80
ORDER BY location_id desc;
DEPARTMENT_ID MANAGER_ID LAST_NAME LOCATION_ID
------------- ---------- ------------------------- -----------
80 149 Abel 2500
80 149 Grant 2500
80 149 Taylor 2500
SELECT department_id, departments.manager_id, last_name, location_id
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;
DEPARTMENT_ID MANAGER_ID LAST_NAME LOCATION_ID
------------- ---------- ------------------------- -----------
80 149 Zlotkey 2500 <-Additional Line*
80 149 Grant 2500
80 149 Taylor 2500
80 149 Abel 2500
Any help and advice is appreciated!
sql database oracle
So I am learning about obtaining data from multiple tables and I have a question regarding NATURAL JOIN and the USING clause. So I have 2 tables that I'm extracting data from; employees and departments.
SQL> describe employees
Name Null? Type
----------------------------------------- -------- ------------------------
EMPLOYEE_ID NUMBER(6)
FIRST_NAME VARCHAR2(20)
LAST_NAME NOT NULL VARCHAR2(25)
EMAIL NOT NULL VARCHAR2(25)
PHONE_NUMBER VARCHAR2(20)
HIRE_DATE NOT NULL DATE
JOB_ID NOT NULL VARCHAR2(10)
SALARY NUMBER(8,2)
COMMISSION_PCT NUMBER(2,2)
MANAGER_ID NUMBER(6)
DEPARTMENT_ID NUMBER(4)
SQL> describe departments
Name Null? Type
----------------------------------------- -------- ------------------------
DEPARTMENT_ID NOT NULL NUMBER(4)
DEPARTMENT_NAME VARCHAR2(30)
MANAGER_ID NUMBER(6)
LOCATION_ID NUMBER(4)
When I use NATURAL JOIN and USING in two different expressions, I have two different outputs. I know that USING matches specifically one column in both tables, but how does this affect the output? How come the expression with USING produces one extra value compared to the NATURAL JOIN?
SELECT department_id, manager_id, last_name, location_id
FROM employees NATURAL JOIN departments
WHERE department_id = 80
ORDER BY location_id desc;
DEPARTMENT_ID MANAGER_ID LAST_NAME LOCATION_ID
------------- ---------- ------------------------- -----------
80 149 Abel 2500
80 149 Grant 2500
80 149 Taylor 2500
SELECT department_id, departments.manager_id, last_name, location_id
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;
DEPARTMENT_ID MANAGER_ID LAST_NAME LOCATION_ID
------------- ---------- ------------------------- -----------
80 149 Zlotkey 2500 <-Additional Line*
80 149 Grant 2500
80 149 Taylor 2500
80 149 Abel 2500
Any help and advice is appreciated!
sql database oracle
sql database oracle
asked Nov 24 '18 at 23:16
jbonejbone
152
152
NATURAL JOIN
is one of the "features" of SQL which IMO should never EVER be used.
– Bob Jarvis
Nov 25 '18 at 1:42
So from this example I don't necessarily think NJ was to blame. If I'm not mistaken it actually produced the correct output. Though my understanding of NJ and it's unreliability is small. If you don't mind, could you explain to me or direct me to a resource that demonstrate's in layman's terms how NJ is unreliable?
– jbone
Nov 25 '18 at 4:47
add a comment |
NATURAL JOIN
is one of the "features" of SQL which IMO should never EVER be used.
– Bob Jarvis
Nov 25 '18 at 1:42
So from this example I don't necessarily think NJ was to blame. If I'm not mistaken it actually produced the correct output. Though my understanding of NJ and it's unreliability is small. If you don't mind, could you explain to me or direct me to a resource that demonstrate's in layman's terms how NJ is unreliable?
– jbone
Nov 25 '18 at 4:47
NATURAL JOIN
is one of the "features" of SQL which IMO should never EVER be used.– Bob Jarvis
Nov 25 '18 at 1:42
NATURAL JOIN
is one of the "features" of SQL which IMO should never EVER be used.– Bob Jarvis
Nov 25 '18 at 1:42
So from this example I don't necessarily think NJ was to blame. If I'm not mistaken it actually produced the correct output. Though my understanding of NJ and it's unreliability is small. If you don't mind, could you explain to me or direct me to a resource that demonstrate's in layman's terms how NJ is unreliable?
– jbone
Nov 25 '18 at 4:47
So from this example I don't necessarily think NJ was to blame. If I'm not mistaken it actually produced the correct output. Though my understanding of NJ and it's unreliability is small. If you don't mind, could you explain to me or direct me to a resource that demonstrate's in layman's terms how NJ is unreliable?
– jbone
Nov 25 '18 at 4:47
add a comment |
2 Answers
2
active
oldest
votes
Because NATURAL JOIN joins on all columns with equal names, hence your NJ query will join on manager_id as well, whereas the USING form will not
If you examine the USING form you’ll see that the manager_id in each table is different. (Make your query SELECT *
and you’ll see..)
When natural joined this difference in manager_id between the two tables will remove the row from the results
It’s probably worth noting that natural join and using should be seldom used. For sure learn about them so you can appreciate their existence, but you should stick to using regular explicit joins for reasons of consistent and clear behaviour - if the app is upgraded in future and two new columns are added to these two tables, named the same but will different data the NJ will fail.
Edit:
Run this:
SELECT *
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;
Look at the two columns showing manager ID. The data on the Zlotkey row will be different
You natural join effectively did this:
SELECT *
FROM employees e JOIN departments d
on e.department_id = d.department_id AND e.manager_id = d.manager_id
WHERE department_id = 80
ORDER BY location_id desc;
And on e.department_id = d.department_id AND e.manager_id = d.manager_id
is not true for the zlotkey row
This is why natural join is risky and possibly useless - just because two columns have the same name doesn’t mean the data in them is related. Indeed in your case your departments have managers and your employees have managers. I also have this setup at my work, but the man who is my manager is not the manager of the department I work in
Thanks for your response! So when you say that there's a difference between manager_id in both tables, what do you mean specifically? When I check manager_id in both tables, all I see is that there are more rows in table employees compared to table departments, but they have the same value. table departments - 7 rows selected. table employees - 51 rows selected.
– jbone
Nov 25 '18 at 0:07
Ok after looking at the row, the manager_id = 100 in table employees and manager_id = 149 in table departments. So because of this difference, NJ leaves this row out? If so, I think I know what happened with USING. So in the USING clause, the column specified was department_id. This caused the row to be produced despite having a different manager_id.
– jbone
Nov 25 '18 at 0:18
Yes, if you see the edit to my answer I point this out toward the end. If you were to write USING(department_id, manager_id) the row would disappear again. Moral of the story: don’t use natural join, because it’s nearly always a bad idea. Saving a few key presses now isn’t worth the pain it’ll cause later. Don’t use USING because it’s not standard SQL
– Caius Jard
Nov 25 '18 at 0:21
Hi I have another question. So to avoid a situation like I had with the USING clause (where I had Zlotkey incorrectly added), I should always make sure that every matching column from each table is set equal to each other (i.e. ON e.department_id = d.department_id AND e.manager_id = d.manager_id)? Since I had only one condition for matching columns the error was produced.
– jbone
Nov 25 '18 at 4:43
2
Looks more like an astute observation than hate to me
– Caius Jard
Jan 8 at 14:54
|
show 6 more comments
My first piece of advice is to apply the 'single responsibility' principle to your schema. Think about the role/persona who would update the data and under what circumstances they would do it. Bottom line: the employees
table has 'too much responsibility'. Employee name and hire date normally change less frequently than salary. Salary would be accessed every pay day. Phone number might be private. Consider creating new tables by splitting up the employees
table based on responsibilities.
Second, think about differentiating between (a) tables that model 'entities' such as employees, and (b) tables that model relationships between those entities. Your two tables currently mix these two concepts. Consider a new table for the org char relationship (who manages whom) and a further table to model the relationship between employees and departments.
Third, think about whether you are giving the same data element the same name throughout your schema e.g. may 'an employee's manger' and 'a department's manager' have different meanings (e.g. in queries involving both you find you have to rename at least one of them to avoid confusion). If so, consider making the name more explicit and don't reply on the table name to give in context.
After redesigning your schema in line with the above advice, you will likely find your queries work as expected. TL;DR if you are finding it hard to write simple SQL DML, consider modifying your SQL DDL.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53463218%2funexpected-result-of-natural-join-and-using-clause%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Because NATURAL JOIN joins on all columns with equal names, hence your NJ query will join on manager_id as well, whereas the USING form will not
If you examine the USING form you’ll see that the manager_id in each table is different. (Make your query SELECT *
and you’ll see..)
When natural joined this difference in manager_id between the two tables will remove the row from the results
It’s probably worth noting that natural join and using should be seldom used. For sure learn about them so you can appreciate their existence, but you should stick to using regular explicit joins for reasons of consistent and clear behaviour - if the app is upgraded in future and two new columns are added to these two tables, named the same but will different data the NJ will fail.
Edit:
Run this:
SELECT *
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;
Look at the two columns showing manager ID. The data on the Zlotkey row will be different
You natural join effectively did this:
SELECT *
FROM employees e JOIN departments d
on e.department_id = d.department_id AND e.manager_id = d.manager_id
WHERE department_id = 80
ORDER BY location_id desc;
And on e.department_id = d.department_id AND e.manager_id = d.manager_id
is not true for the zlotkey row
This is why natural join is risky and possibly useless - just because two columns have the same name doesn’t mean the data in them is related. Indeed in your case your departments have managers and your employees have managers. I also have this setup at my work, but the man who is my manager is not the manager of the department I work in
Thanks for your response! So when you say that there's a difference between manager_id in both tables, what do you mean specifically? When I check manager_id in both tables, all I see is that there are more rows in table employees compared to table departments, but they have the same value. table departments - 7 rows selected. table employees - 51 rows selected.
– jbone
Nov 25 '18 at 0:07
Ok after looking at the row, the manager_id = 100 in table employees and manager_id = 149 in table departments. So because of this difference, NJ leaves this row out? If so, I think I know what happened with USING. So in the USING clause, the column specified was department_id. This caused the row to be produced despite having a different manager_id.
– jbone
Nov 25 '18 at 0:18
Yes, if you see the edit to my answer I point this out toward the end. If you were to write USING(department_id, manager_id) the row would disappear again. Moral of the story: don’t use natural join, because it’s nearly always a bad idea. Saving a few key presses now isn’t worth the pain it’ll cause later. Don’t use USING because it’s not standard SQL
– Caius Jard
Nov 25 '18 at 0:21
Hi I have another question. So to avoid a situation like I had with the USING clause (where I had Zlotkey incorrectly added), I should always make sure that every matching column from each table is set equal to each other (i.e. ON e.department_id = d.department_id AND e.manager_id = d.manager_id)? Since I had only one condition for matching columns the error was produced.
– jbone
Nov 25 '18 at 4:43
2
Looks more like an astute observation than hate to me
– Caius Jard
Jan 8 at 14:54
|
show 6 more comments
Because NATURAL JOIN joins on all columns with equal names, hence your NJ query will join on manager_id as well, whereas the USING form will not
If you examine the USING form you’ll see that the manager_id in each table is different. (Make your query SELECT *
and you’ll see..)
When natural joined this difference in manager_id between the two tables will remove the row from the results
It’s probably worth noting that natural join and using should be seldom used. For sure learn about them so you can appreciate their existence, but you should stick to using regular explicit joins for reasons of consistent and clear behaviour - if the app is upgraded in future and two new columns are added to these two tables, named the same but will different data the NJ will fail.
Edit:
Run this:
SELECT *
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;
Look at the two columns showing manager ID. The data on the Zlotkey row will be different
You natural join effectively did this:
SELECT *
FROM employees e JOIN departments d
on e.department_id = d.department_id AND e.manager_id = d.manager_id
WHERE department_id = 80
ORDER BY location_id desc;
And on e.department_id = d.department_id AND e.manager_id = d.manager_id
is not true for the zlotkey row
This is why natural join is risky and possibly useless - just because two columns have the same name doesn’t mean the data in them is related. Indeed in your case your departments have managers and your employees have managers. I also have this setup at my work, but the man who is my manager is not the manager of the department I work in
Thanks for your response! So when you say that there's a difference between manager_id in both tables, what do you mean specifically? When I check manager_id in both tables, all I see is that there are more rows in table employees compared to table departments, but they have the same value. table departments - 7 rows selected. table employees - 51 rows selected.
– jbone
Nov 25 '18 at 0:07
Ok after looking at the row, the manager_id = 100 in table employees and manager_id = 149 in table departments. So because of this difference, NJ leaves this row out? If so, I think I know what happened with USING. So in the USING clause, the column specified was department_id. This caused the row to be produced despite having a different manager_id.
– jbone
Nov 25 '18 at 0:18
Yes, if you see the edit to my answer I point this out toward the end. If you were to write USING(department_id, manager_id) the row would disappear again. Moral of the story: don’t use natural join, because it’s nearly always a bad idea. Saving a few key presses now isn’t worth the pain it’ll cause later. Don’t use USING because it’s not standard SQL
– Caius Jard
Nov 25 '18 at 0:21
Hi I have another question. So to avoid a situation like I had with the USING clause (where I had Zlotkey incorrectly added), I should always make sure that every matching column from each table is set equal to each other (i.e. ON e.department_id = d.department_id AND e.manager_id = d.manager_id)? Since I had only one condition for matching columns the error was produced.
– jbone
Nov 25 '18 at 4:43
2
Looks more like an astute observation than hate to me
– Caius Jard
Jan 8 at 14:54
|
show 6 more comments
Because NATURAL JOIN joins on all columns with equal names, hence your NJ query will join on manager_id as well, whereas the USING form will not
If you examine the USING form you’ll see that the manager_id in each table is different. (Make your query SELECT *
and you’ll see..)
When natural joined this difference in manager_id between the two tables will remove the row from the results
It’s probably worth noting that natural join and using should be seldom used. For sure learn about them so you can appreciate their existence, but you should stick to using regular explicit joins for reasons of consistent and clear behaviour - if the app is upgraded in future and two new columns are added to these two tables, named the same but will different data the NJ will fail.
Edit:
Run this:
SELECT *
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;
Look at the two columns showing manager ID. The data on the Zlotkey row will be different
You natural join effectively did this:
SELECT *
FROM employees e JOIN departments d
on e.department_id = d.department_id AND e.manager_id = d.manager_id
WHERE department_id = 80
ORDER BY location_id desc;
And on e.department_id = d.department_id AND e.manager_id = d.manager_id
is not true for the zlotkey row
This is why natural join is risky and possibly useless - just because two columns have the same name doesn’t mean the data in them is related. Indeed in your case your departments have managers and your employees have managers. I also have this setup at my work, but the man who is my manager is not the manager of the department I work in
Because NATURAL JOIN joins on all columns with equal names, hence your NJ query will join on manager_id as well, whereas the USING form will not
If you examine the USING form you’ll see that the manager_id in each table is different. (Make your query SELECT *
and you’ll see..)
When natural joined this difference in manager_id between the two tables will remove the row from the results
It’s probably worth noting that natural join and using should be seldom used. For sure learn about them so you can appreciate their existence, but you should stick to using regular explicit joins for reasons of consistent and clear behaviour - if the app is upgraded in future and two new columns are added to these two tables, named the same but will different data the NJ will fail.
Edit:
Run this:
SELECT *
FROM employees JOIN departments
USING (department_id)
WHERE department_id = 80
ORDER BY location_id desc;
Look at the two columns showing manager ID. The data on the Zlotkey row will be different
You natural join effectively did this:
SELECT *
FROM employees e JOIN departments d
on e.department_id = d.department_id AND e.manager_id = d.manager_id
WHERE department_id = 80
ORDER BY location_id desc;
And on e.department_id = d.department_id AND e.manager_id = d.manager_id
is not true for the zlotkey row
This is why natural join is risky and possibly useless - just because two columns have the same name doesn’t mean the data in them is related. Indeed in your case your departments have managers and your employees have managers. I also have this setup at my work, but the man who is my manager is not the manager of the department I work in
edited Nov 25 '18 at 0:20
answered Nov 24 '18 at 23:34
Caius JardCaius Jard
12.1k21240
12.1k21240
Thanks for your response! So when you say that there's a difference between manager_id in both tables, what do you mean specifically? When I check manager_id in both tables, all I see is that there are more rows in table employees compared to table departments, but they have the same value. table departments - 7 rows selected. table employees - 51 rows selected.
– jbone
Nov 25 '18 at 0:07
Ok after looking at the row, the manager_id = 100 in table employees and manager_id = 149 in table departments. So because of this difference, NJ leaves this row out? If so, I think I know what happened with USING. So in the USING clause, the column specified was department_id. This caused the row to be produced despite having a different manager_id.
– jbone
Nov 25 '18 at 0:18
Yes, if you see the edit to my answer I point this out toward the end. If you were to write USING(department_id, manager_id) the row would disappear again. Moral of the story: don’t use natural join, because it’s nearly always a bad idea. Saving a few key presses now isn’t worth the pain it’ll cause later. Don’t use USING because it’s not standard SQL
– Caius Jard
Nov 25 '18 at 0:21
Hi I have another question. So to avoid a situation like I had with the USING clause (where I had Zlotkey incorrectly added), I should always make sure that every matching column from each table is set equal to each other (i.e. ON e.department_id = d.department_id AND e.manager_id = d.manager_id)? Since I had only one condition for matching columns the error was produced.
– jbone
Nov 25 '18 at 4:43
2
Looks more like an astute observation than hate to me
– Caius Jard
Jan 8 at 14:54
|
show 6 more comments
Thanks for your response! So when you say that there's a difference between manager_id in both tables, what do you mean specifically? When I check manager_id in both tables, all I see is that there are more rows in table employees compared to table departments, but they have the same value. table departments - 7 rows selected. table employees - 51 rows selected.
– jbone
Nov 25 '18 at 0:07
Ok after looking at the row, the manager_id = 100 in table employees and manager_id = 149 in table departments. So because of this difference, NJ leaves this row out? If so, I think I know what happened with USING. So in the USING clause, the column specified was department_id. This caused the row to be produced despite having a different manager_id.
– jbone
Nov 25 '18 at 0:18
Yes, if you see the edit to my answer I point this out toward the end. If you were to write USING(department_id, manager_id) the row would disappear again. Moral of the story: don’t use natural join, because it’s nearly always a bad idea. Saving a few key presses now isn’t worth the pain it’ll cause later. Don’t use USING because it’s not standard SQL
– Caius Jard
Nov 25 '18 at 0:21
Hi I have another question. So to avoid a situation like I had with the USING clause (where I had Zlotkey incorrectly added), I should always make sure that every matching column from each table is set equal to each other (i.e. ON e.department_id = d.department_id AND e.manager_id = d.manager_id)? Since I had only one condition for matching columns the error was produced.
– jbone
Nov 25 '18 at 4:43
2
Looks more like an astute observation than hate to me
– Caius Jard
Jan 8 at 14:54
Thanks for your response! So when you say that there's a difference between manager_id in both tables, what do you mean specifically? When I check manager_id in both tables, all I see is that there are more rows in table employees compared to table departments, but they have the same value. table departments - 7 rows selected. table employees - 51 rows selected.
– jbone
Nov 25 '18 at 0:07
Thanks for your response! So when you say that there's a difference between manager_id in both tables, what do you mean specifically? When I check manager_id in both tables, all I see is that there are more rows in table employees compared to table departments, but they have the same value. table departments - 7 rows selected. table employees - 51 rows selected.
– jbone
Nov 25 '18 at 0:07
Ok after looking at the row, the manager_id = 100 in table employees and manager_id = 149 in table departments. So because of this difference, NJ leaves this row out? If so, I think I know what happened with USING. So in the USING clause, the column specified was department_id. This caused the row to be produced despite having a different manager_id.
– jbone
Nov 25 '18 at 0:18
Ok after looking at the row, the manager_id = 100 in table employees and manager_id = 149 in table departments. So because of this difference, NJ leaves this row out? If so, I think I know what happened with USING. So in the USING clause, the column specified was department_id. This caused the row to be produced despite having a different manager_id.
– jbone
Nov 25 '18 at 0:18
Yes, if you see the edit to my answer I point this out toward the end. If you were to write USING(department_id, manager_id) the row would disappear again. Moral of the story: don’t use natural join, because it’s nearly always a bad idea. Saving a few key presses now isn’t worth the pain it’ll cause later. Don’t use USING because it’s not standard SQL
– Caius Jard
Nov 25 '18 at 0:21
Yes, if you see the edit to my answer I point this out toward the end. If you were to write USING(department_id, manager_id) the row would disappear again. Moral of the story: don’t use natural join, because it’s nearly always a bad idea. Saving a few key presses now isn’t worth the pain it’ll cause later. Don’t use USING because it’s not standard SQL
– Caius Jard
Nov 25 '18 at 0:21
Hi I have another question. So to avoid a situation like I had with the USING clause (where I had Zlotkey incorrectly added), I should always make sure that every matching column from each table is set equal to each other (i.e. ON e.department_id = d.department_id AND e.manager_id = d.manager_id)? Since I had only one condition for matching columns the error was produced.
– jbone
Nov 25 '18 at 4:43
Hi I have another question. So to avoid a situation like I had with the USING clause (where I had Zlotkey incorrectly added), I should always make sure that every matching column from each table is set equal to each other (i.e. ON e.department_id = d.department_id AND e.manager_id = d.manager_id)? Since I had only one condition for matching columns the error was produced.
– jbone
Nov 25 '18 at 4:43
2
2
Looks more like an astute observation than hate to me
– Caius Jard
Jan 8 at 14:54
Looks more like an astute observation than hate to me
– Caius Jard
Jan 8 at 14:54
|
show 6 more comments
My first piece of advice is to apply the 'single responsibility' principle to your schema. Think about the role/persona who would update the data and under what circumstances they would do it. Bottom line: the employees
table has 'too much responsibility'. Employee name and hire date normally change less frequently than salary. Salary would be accessed every pay day. Phone number might be private. Consider creating new tables by splitting up the employees
table based on responsibilities.
Second, think about differentiating between (a) tables that model 'entities' such as employees, and (b) tables that model relationships between those entities. Your two tables currently mix these two concepts. Consider a new table for the org char relationship (who manages whom) and a further table to model the relationship between employees and departments.
Third, think about whether you are giving the same data element the same name throughout your schema e.g. may 'an employee's manger' and 'a department's manager' have different meanings (e.g. in queries involving both you find you have to rename at least one of them to avoid confusion). If so, consider making the name more explicit and don't reply on the table name to give in context.
After redesigning your schema in line with the above advice, you will likely find your queries work as expected. TL;DR if you are finding it hard to write simple SQL DML, consider modifying your SQL DDL.
add a comment |
My first piece of advice is to apply the 'single responsibility' principle to your schema. Think about the role/persona who would update the data and under what circumstances they would do it. Bottom line: the employees
table has 'too much responsibility'. Employee name and hire date normally change less frequently than salary. Salary would be accessed every pay day. Phone number might be private. Consider creating new tables by splitting up the employees
table based on responsibilities.
Second, think about differentiating between (a) tables that model 'entities' such as employees, and (b) tables that model relationships between those entities. Your two tables currently mix these two concepts. Consider a new table for the org char relationship (who manages whom) and a further table to model the relationship between employees and departments.
Third, think about whether you are giving the same data element the same name throughout your schema e.g. may 'an employee's manger' and 'a department's manager' have different meanings (e.g. in queries involving both you find you have to rename at least one of them to avoid confusion). If so, consider making the name more explicit and don't reply on the table name to give in context.
After redesigning your schema in line with the above advice, you will likely find your queries work as expected. TL;DR if you are finding it hard to write simple SQL DML, consider modifying your SQL DDL.
add a comment |
My first piece of advice is to apply the 'single responsibility' principle to your schema. Think about the role/persona who would update the data and under what circumstances they would do it. Bottom line: the employees
table has 'too much responsibility'. Employee name and hire date normally change less frequently than salary. Salary would be accessed every pay day. Phone number might be private. Consider creating new tables by splitting up the employees
table based on responsibilities.
Second, think about differentiating between (a) tables that model 'entities' such as employees, and (b) tables that model relationships between those entities. Your two tables currently mix these two concepts. Consider a new table for the org char relationship (who manages whom) and a further table to model the relationship between employees and departments.
Third, think about whether you are giving the same data element the same name throughout your schema e.g. may 'an employee's manger' and 'a department's manager' have different meanings (e.g. in queries involving both you find you have to rename at least one of them to avoid confusion). If so, consider making the name more explicit and don't reply on the table name to give in context.
After redesigning your schema in line with the above advice, you will likely find your queries work as expected. TL;DR if you are finding it hard to write simple SQL DML, consider modifying your SQL DDL.
My first piece of advice is to apply the 'single responsibility' principle to your schema. Think about the role/persona who would update the data and under what circumstances they would do it. Bottom line: the employees
table has 'too much responsibility'. Employee name and hire date normally change less frequently than salary. Salary would be accessed every pay day. Phone number might be private. Consider creating new tables by splitting up the employees
table based on responsibilities.
Second, think about differentiating between (a) tables that model 'entities' such as employees, and (b) tables that model relationships between those entities. Your two tables currently mix these two concepts. Consider a new table for the org char relationship (who manages whom) and a further table to model the relationship between employees and departments.
Third, think about whether you are giving the same data element the same name throughout your schema e.g. may 'an employee's manger' and 'a department's manager' have different meanings (e.g. in queries involving both you find you have to rename at least one of them to avoid confusion). If so, consider making the name more explicit and don't reply on the table name to give in context.
After redesigning your schema in line with the above advice, you will likely find your queries work as expected. TL;DR if you are finding it hard to write simple SQL DML, consider modifying your SQL DDL.
answered Jan 8 at 14:20
onedaywhenonedaywhen
43.8k1079122
43.8k1079122
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53463218%2funexpected-result-of-natural-join-and-using-clause%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
NATURAL JOIN
is one of the "features" of SQL which IMO should never EVER be used.– Bob Jarvis
Nov 25 '18 at 1:42
So from this example I don't necessarily think NJ was to blame. If I'm not mistaken it actually produced the correct output. Though my understanding of NJ and it's unreliability is small. If you don't mind, could you explain to me or direct me to a resource that demonstrate's in layman's terms how NJ is unreliable?
– jbone
Nov 25 '18 at 4:47