BigQuery Global Alias of Wildcard for Self Join
Often I need to join a table to itself, and based on the match predicate, print out all the columns from both tables. e.g
SELECT t1.*, t2.*
FROM `t` t1
JOIN `t` t2
USING(Id)
WHERE xxxx
Issue with this is that you get column name collisions since they are the same source table / cols in each.
Is there a way to avoid this but still use a '*'? I checked the doco and cant see any option in standardSQL
Ideally Id just want a prefix to be applied to the wildcard projections, rather then hand alias each col individually / manually, as a lot of our tables are really wide, and change, and hand cranking is just laborious
e.g Id like to be able to do something like this to prefix each of T2's columns with the string "t2_"
SELECT t1.*, t2.* PREFIX("t2_")
FROM `t` t1
JOIN `t` t2
USING(Id)
WHERE xxxx
Currently there is likely not a solution as clean as this, unless I missed it in the doco (hope so!), if there are any interim SQL solution using other methods please let me know!
Thanks!
|
show 4 more comments
Often I need to join a table to itself, and based on the match predicate, print out all the columns from both tables. e.g
SELECT t1.*, t2.*
FROM `t` t1
JOIN `t` t2
USING(Id)
WHERE xxxx
Issue with this is that you get column name collisions since they are the same source table / cols in each.
Is there a way to avoid this but still use a '*'? I checked the doco and cant see any option in standardSQL
Ideally Id just want a prefix to be applied to the wildcard projections, rather then hand alias each col individually / manually, as a lot of our tables are really wide, and change, and hand cranking is just laborious
e.g Id like to be able to do something like this to prefix each of T2's columns with the string "t2_"
SELECT t1.*, t2.* PREFIX("t2_")
FROM `t` t1
JOIN `t` t2
USING(Id)
WHERE xxxx
Currently there is likely not a solution as clean as this, unless I missed it in the doco (hope so!), if there are any interim SQL solution using other methods please let me know!
Thanks!
what the practical use of having duplicates of all columns in the output?
– Mikhail Berlyant
Nov 21 '18 at 13:30
i think I can see few such use cases :o) but still curious about what exactly yours - if you can share
– Mikhail Berlyant
Nov 21 '18 at 14:01
My use case is comparing the output (delta/differences) of a bq release (e.g a view) against a golden reference table thats certified from the previous release. Form of regression testing of our views. They are the same table output, same columns, but I want to find the differences, and when I do, output all the cols from both golden and current release.
– user2992225
Nov 21 '18 at 14:04
also the reason for flattening is Tableau (where we report on bq regression results) prefers such data flat unfortunately....Elliotts response below achieves the general need except for flattening.
– user2992225
Nov 21 '18 at 14:06
was building some generic table compare SQL using TO_JSON_STRING, was just trying to flatten the output rather then have a top level record/struct....thanks!
– user2992225
Nov 21 '18 at 14:13
|
show 4 more comments
Often I need to join a table to itself, and based on the match predicate, print out all the columns from both tables. e.g
SELECT t1.*, t2.*
FROM `t` t1
JOIN `t` t2
USING(Id)
WHERE xxxx
Issue with this is that you get column name collisions since they are the same source table / cols in each.
Is there a way to avoid this but still use a '*'? I checked the doco and cant see any option in standardSQL
Ideally Id just want a prefix to be applied to the wildcard projections, rather then hand alias each col individually / manually, as a lot of our tables are really wide, and change, and hand cranking is just laborious
e.g Id like to be able to do something like this to prefix each of T2's columns with the string "t2_"
SELECT t1.*, t2.* PREFIX("t2_")
FROM `t` t1
JOIN `t` t2
USING(Id)
WHERE xxxx
Currently there is likely not a solution as clean as this, unless I missed it in the doco (hope so!), if there are any interim SQL solution using other methods please let me know!
Thanks!
Often I need to join a table to itself, and based on the match predicate, print out all the columns from both tables. e.g
SELECT t1.*, t2.*
FROM `t` t1
JOIN `t` t2
USING(Id)
WHERE xxxx
Issue with this is that you get column name collisions since they are the same source table / cols in each.
Is there a way to avoid this but still use a '*'? I checked the doco and cant see any option in standardSQL
Ideally Id just want a prefix to be applied to the wildcard projections, rather then hand alias each col individually / manually, as a lot of our tables are really wide, and change, and hand cranking is just laborious
e.g Id like to be able to do something like this to prefix each of T2's columns with the string "t2_"
SELECT t1.*, t2.* PREFIX("t2_")
FROM `t` t1
JOIN `t` t2
USING(Id)
WHERE xxxx
Currently there is likely not a solution as clean as this, unless I missed it in the doco (hope so!), if there are any interim SQL solution using other methods please let me know!
Thanks!
asked Nov 21 '18 at 12:19
user2992225
226
226
what the practical use of having duplicates of all columns in the output?
– Mikhail Berlyant
Nov 21 '18 at 13:30
i think I can see few such use cases :o) but still curious about what exactly yours - if you can share
– Mikhail Berlyant
Nov 21 '18 at 14:01
My use case is comparing the output (delta/differences) of a bq release (e.g a view) against a golden reference table thats certified from the previous release. Form of regression testing of our views. They are the same table output, same columns, but I want to find the differences, and when I do, output all the cols from both golden and current release.
– user2992225
Nov 21 '18 at 14:04
also the reason for flattening is Tableau (where we report on bq regression results) prefers such data flat unfortunately....Elliotts response below achieves the general need except for flattening.
– user2992225
Nov 21 '18 at 14:06
was building some generic table compare SQL using TO_JSON_STRING, was just trying to flatten the output rather then have a top level record/struct....thanks!
– user2992225
Nov 21 '18 at 14:13
|
show 4 more comments
what the practical use of having duplicates of all columns in the output?
– Mikhail Berlyant
Nov 21 '18 at 13:30
i think I can see few such use cases :o) but still curious about what exactly yours - if you can share
– Mikhail Berlyant
Nov 21 '18 at 14:01
My use case is comparing the output (delta/differences) of a bq release (e.g a view) against a golden reference table thats certified from the previous release. Form of regression testing of our views. They are the same table output, same columns, but I want to find the differences, and when I do, output all the cols from both golden and current release.
– user2992225
Nov 21 '18 at 14:04
also the reason for flattening is Tableau (where we report on bq regression results) prefers such data flat unfortunately....Elliotts response below achieves the general need except for flattening.
– user2992225
Nov 21 '18 at 14:06
was building some generic table compare SQL using TO_JSON_STRING, was just trying to flatten the output rather then have a top level record/struct....thanks!
– user2992225
Nov 21 '18 at 14:13
what the practical use of having duplicates of all columns in the output?
– Mikhail Berlyant
Nov 21 '18 at 13:30
what the practical use of having duplicates of all columns in the output?
– Mikhail Berlyant
Nov 21 '18 at 13:30
i think I can see few such use cases :o) but still curious about what exactly yours - if you can share
– Mikhail Berlyant
Nov 21 '18 at 14:01
i think I can see few such use cases :o) but still curious about what exactly yours - if you can share
– Mikhail Berlyant
Nov 21 '18 at 14:01
My use case is comparing the output (delta/differences) of a bq release (e.g a view) against a golden reference table thats certified from the previous release. Form of regression testing of our views. They are the same table output, same columns, but I want to find the differences, and when I do, output all the cols from both golden and current release.
– user2992225
Nov 21 '18 at 14:04
My use case is comparing the output (delta/differences) of a bq release (e.g a view) against a golden reference table thats certified from the previous release. Form of regression testing of our views. They are the same table output, same columns, but I want to find the differences, and when I do, output all the cols from both golden and current release.
– user2992225
Nov 21 '18 at 14:04
also the reason for flattening is Tableau (where we report on bq regression results) prefers such data flat unfortunately....Elliotts response below achieves the general need except for flattening.
– user2992225
Nov 21 '18 at 14:06
also the reason for flattening is Tableau (where we report on bq regression results) prefers such data flat unfortunately....Elliotts response below achieves the general need except for flattening.
– user2992225
Nov 21 '18 at 14:06
was building some generic table compare SQL using TO_JSON_STRING, was just trying to flatten the output rather then have a top level record/struct....thanks!
– user2992225
Nov 21 '18 at 14:13
was building some generic table compare SQL using TO_JSON_STRING, was just trying to flatten the output rather then have a top level record/struct....thanks!
– user2992225
Nov 21 '18 at 14:13
|
show 4 more comments
2 Answers
2
active
oldest
votes
Below approach is a little controversial as it points back to features in BigQuery Legacy SQL - but why not to use them when it is handy :o)
So, below is for BigQuery Legacy SQL and you can use it to get result that you need. In case if you will need features of Standard SQL to further process that result - you can just simply preserve it in the temp table and than use for further processing
Meantime, below code does not require you to specify columns and result is flattened
#legacySQL
SELECT *
FROM [project:dataset.table] t1
JOIN [project:dataset.table] t2
ON t1.Id = t2.Id
WHERE xxxx
The trick here is that all columns from table aliased as t1 - will be prefixed with t1_ like t1_col1, t1_col2 and so on ; and all column from t2 will be prefixed with t2_ like t2_col1, t2_col2 and so on - so no column name collisions
Cool! I always have the problem of a standardSQL view not being able to call a legacySQL one for such feature gaps, but, for my use case that might not be a problem in this instance....let me give it a go.....thanks and keep up the fantastic community hero work for BQ! :)
– user2992225
Nov 21 '18 at 19:43
add a comment |
Just select t1 and t2 instead:
SELECT t1, t2 FROM `t` t1 JOIN `t` t2 USING(Id) WHERE xxxx
lol - nice simple option :) . I was trying to flatten the whole result set is the only thing, but, maybe this will need to do. Appreciated.
– user2992225
Nov 21 '18 at 13:11
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53411884%2fbigquery-global-alias-of-wildcard-for-self-join%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Below approach is a little controversial as it points back to features in BigQuery Legacy SQL - but why not to use them when it is handy :o)
So, below is for BigQuery Legacy SQL and you can use it to get result that you need. In case if you will need features of Standard SQL to further process that result - you can just simply preserve it in the temp table and than use for further processing
Meantime, below code does not require you to specify columns and result is flattened
#legacySQL
SELECT *
FROM [project:dataset.table] t1
JOIN [project:dataset.table] t2
ON t1.Id = t2.Id
WHERE xxxx
The trick here is that all columns from table aliased as t1 - will be prefixed with t1_ like t1_col1, t1_col2 and so on ; and all column from t2 will be prefixed with t2_ like t2_col1, t2_col2 and so on - so no column name collisions
Cool! I always have the problem of a standardSQL view not being able to call a legacySQL one for such feature gaps, but, for my use case that might not be a problem in this instance....let me give it a go.....thanks and keep up the fantastic community hero work for BQ! :)
– user2992225
Nov 21 '18 at 19:43
add a comment |
Below approach is a little controversial as it points back to features in BigQuery Legacy SQL - but why not to use them when it is handy :o)
So, below is for BigQuery Legacy SQL and you can use it to get result that you need. In case if you will need features of Standard SQL to further process that result - you can just simply preserve it in the temp table and than use for further processing
Meantime, below code does not require you to specify columns and result is flattened
#legacySQL
SELECT *
FROM [project:dataset.table] t1
JOIN [project:dataset.table] t2
ON t1.Id = t2.Id
WHERE xxxx
The trick here is that all columns from table aliased as t1 - will be prefixed with t1_ like t1_col1, t1_col2 and so on ; and all column from t2 will be prefixed with t2_ like t2_col1, t2_col2 and so on - so no column name collisions
Cool! I always have the problem of a standardSQL view not being able to call a legacySQL one for such feature gaps, but, for my use case that might not be a problem in this instance....let me give it a go.....thanks and keep up the fantastic community hero work for BQ! :)
– user2992225
Nov 21 '18 at 19:43
add a comment |
Below approach is a little controversial as it points back to features in BigQuery Legacy SQL - but why not to use them when it is handy :o)
So, below is for BigQuery Legacy SQL and you can use it to get result that you need. In case if you will need features of Standard SQL to further process that result - you can just simply preserve it in the temp table and than use for further processing
Meantime, below code does not require you to specify columns and result is flattened
#legacySQL
SELECT *
FROM [project:dataset.table] t1
JOIN [project:dataset.table] t2
ON t1.Id = t2.Id
WHERE xxxx
The trick here is that all columns from table aliased as t1 - will be prefixed with t1_ like t1_col1, t1_col2 and so on ; and all column from t2 will be prefixed with t2_ like t2_col1, t2_col2 and so on - so no column name collisions
Below approach is a little controversial as it points back to features in BigQuery Legacy SQL - but why not to use them when it is handy :o)
So, below is for BigQuery Legacy SQL and you can use it to get result that you need. In case if you will need features of Standard SQL to further process that result - you can just simply preserve it in the temp table and than use for further processing
Meantime, below code does not require you to specify columns and result is flattened
#legacySQL
SELECT *
FROM [project:dataset.table] t1
JOIN [project:dataset.table] t2
ON t1.Id = t2.Id
WHERE xxxx
The trick here is that all columns from table aliased as t1 - will be prefixed with t1_ like t1_col1, t1_col2 and so on ; and all column from t2 will be prefixed with t2_ like t2_col1, t2_col2 and so on - so no column name collisions
edited Nov 21 '18 at 18:17
answered Nov 21 '18 at 13:22
Mikhail Berlyant
55.9k43368
55.9k43368
Cool! I always have the problem of a standardSQL view not being able to call a legacySQL one for such feature gaps, but, for my use case that might not be a problem in this instance....let me give it a go.....thanks and keep up the fantastic community hero work for BQ! :)
– user2992225
Nov 21 '18 at 19:43
add a comment |
Cool! I always have the problem of a standardSQL view not being able to call a legacySQL one for such feature gaps, but, for my use case that might not be a problem in this instance....let me give it a go.....thanks and keep up the fantastic community hero work for BQ! :)
– user2992225
Nov 21 '18 at 19:43
Cool! I always have the problem of a standardSQL view not being able to call a legacySQL one for such feature gaps, but, for my use case that might not be a problem in this instance....let me give it a go.....thanks and keep up the fantastic community hero work for BQ! :)
– user2992225
Nov 21 '18 at 19:43
Cool! I always have the problem of a standardSQL view not being able to call a legacySQL one for such feature gaps, but, for my use case that might not be a problem in this instance....let me give it a go.....thanks and keep up the fantastic community hero work for BQ! :)
– user2992225
Nov 21 '18 at 19:43
add a comment |
Just select t1 and t2 instead:
SELECT t1, t2 FROM `t` t1 JOIN `t` t2 USING(Id) WHERE xxxx
lol - nice simple option :) . I was trying to flatten the whole result set is the only thing, but, maybe this will need to do. Appreciated.
– user2992225
Nov 21 '18 at 13:11
add a comment |
Just select t1 and t2 instead:
SELECT t1, t2 FROM `t` t1 JOIN `t` t2 USING(Id) WHERE xxxx
lol - nice simple option :) . I was trying to flatten the whole result set is the only thing, but, maybe this will need to do. Appreciated.
– user2992225
Nov 21 '18 at 13:11
add a comment |
Just select t1 and t2 instead:
SELECT t1, t2 FROM `t` t1 JOIN `t` t2 USING(Id) WHERE xxxx
Just select t1 and t2 instead:
SELECT t1, t2 FROM `t` t1 JOIN `t` t2 USING(Id) WHERE xxxx
answered Nov 21 '18 at 12:34
Elliott Brossard
16.2k21035
16.2k21035
lol - nice simple option :) . I was trying to flatten the whole result set is the only thing, but, maybe this will need to do. Appreciated.
– user2992225
Nov 21 '18 at 13:11
add a comment |
lol - nice simple option :) . I was trying to flatten the whole result set is the only thing, but, maybe this will need to do. Appreciated.
– user2992225
Nov 21 '18 at 13:11
lol - nice simple option :) . I was trying to flatten the whole result set is the only thing, but, maybe this will need to do. Appreciated.
– user2992225
Nov 21 '18 at 13:11
lol - nice simple option :) . I was trying to flatten the whole result set is the only thing, but, maybe this will need to do. Appreciated.
– user2992225
Nov 21 '18 at 13:11
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53411884%2fbigquery-global-alias-of-wildcard-for-self-join%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
what the practical use of having duplicates of all columns in the output?
– Mikhail Berlyant
Nov 21 '18 at 13:30
i think I can see few such use cases :o) but still curious about what exactly yours - if you can share
– Mikhail Berlyant
Nov 21 '18 at 14:01
My use case is comparing the output (delta/differences) of a bq release (e.g a view) against a golden reference table thats certified from the previous release. Form of regression testing of our views. They are the same table output, same columns, but I want to find the differences, and when I do, output all the cols from both golden and current release.
– user2992225
Nov 21 '18 at 14:04
also the reason for flattening is Tableau (where we report on bq regression results) prefers such data flat unfortunately....Elliotts response below achieves the general need except for flattening.
– user2992225
Nov 21 '18 at 14:06
was building some generic table compare SQL using TO_JSON_STRING, was just trying to flatten the output rather then have a top level record/struct....thanks!
– user2992225
Nov 21 '18 at 14:13