Moving denormalized data to Reporting DB
Current situation:
We have a SQL Server database containing all of our business data. There are dozens of object types such as customers, tickets and products. Each object is stored across multiple tables, usually between 8 and 12 tables per object, so our database has well over 200 tables right now.
To make reporting easier, we want to create a reporting database where we denormalize those objects, both to make the data more accessible and to reduce the load on our live DB, since all reports currently query the live DB directly.
The problem:
This is where the tricky part starts: we also need to create new columns in the target/reporting DB where we can store calculated data for reporting, for example how long a ticket took from start to finish, or who ordered a product (denormalized as comma-separated text).
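For illustration, derived columns like these can be computed with plain T-SQL. All table and column names below are hypothetical stand-ins for the real schema, and `STRING_AGG` requires SQL Server 2017 or later (older versions would need the `FOR XML PATH` workaround):

```sql
-- Hypothetical schema: Tickets(TicketId, CreatedAt, ClosedAt),
-- Orders(ProductId, CustomerId), Customers(CustomerId, Name).

-- How long a ticket took from start to finish:
SELECT
    t.TicketId,
    DATEDIFF(MINUTE, t.CreatedAt, t.ClosedAt) AS DurationMinutes
FROM dbo.Tickets AS t;

-- Who ordered a product, denormalized as comma-separated text:
SELECT
    o.ProductId,
    STRING_AGG(c.Name, ', ') AS OrderedBy
FROM dbo.Orders AS o
JOIN dbo.Customers AS c ON c.CustomerId = o.CustomerId
GROUP BY o.ProductId;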
The question is: what is the best way to transfer this data to the new database? We had several approaches in mind:
- Using a stored procedure:
A SQL script that transfers all defined columns to the new database, for example using MERGE. The downside is that it will probably be performance-heavy, since all calculations would have to be done by the database itself.
- Creating a client application:
A client application that selects the data from the live DB, calculates all the necessary columns at runtime, and inserts the results into the reporting database.
- Using an SSIS job:
A mix of the two approaches above: an SSIS job that reads the data, calculates all the extended columns, and writes them to the reporting DB.
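As a rough sketch of the stored-procedure option, a MERGE statement can upsert the precomputed columns into a reporting table. All database, table and column names here are illustrative assumptions, not the actual schema:

```sql
-- Upsert denormalized ticket rows into the reporting DB.
-- Live.dbo.Tickets and Reporting.dbo.TicketReport are hypothetical names.
MERGE Reporting.dbo.TicketReport AS tgt
USING (
    SELECT t.TicketId,
           t.Subject,
           DATEDIFF(MINUTE, t.CreatedAt, t.ClosedAt) AS DurationMinutes
    FROM Live.dbo.Tickets AS t
) AS src
    ON tgt.TicketId = src.TicketId
WHEN MATCHED THEN
    UPDATE SET tgt.Subject         = src.Subject,
               tgt.DurationMinutes = src.DurationMinutes
WHEN NOT MATCHED BY TARGET THEN
    INSERT (TicketId, Subject, DurationMinutes)
    VALUES (src.TicketId, src.Subject, src.DurationMinutes);
```

Wrapping a statement like this in a stored procedure and scheduling it via SQL Server Agent is one common pattern; the calculation cost it incurs on the source instance is exactly the concern raised in the question.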
I know this might be a somewhat opinion-based question, but there should be a best-practice solution, since we can't be the only ones with this need. Unfortunately, I can't find any good answer on the web.
Any advice is really appreciated!
sql sql-server database reporting-services
many ETL processes are a combination of all 3 of the approaches you listed. Have you considered first using replication to a staging database? (or use AG) Otherwise, it's likely that the ETL will put load on your source DB (the problem you are trying to avoid!). Related: dba.stackexchange.com/questions/52449/…
– Mitch Wheat
18 hours ago
We have considered using Always On to get a replica, but unfortunately it's too expensive (money-wise). But couldn't the load on the live DB be avoided by using a client application? Those few selects would not cost much performance, and all calculations could be done by the application server, right?
– colosso
17 hours ago
"but unfortunately it's too expensive (money-wise)" - but you need another server anyway, right? ... "But the load on the live DB could be avoided using a client application; those few selects would not cost any performance" - why do you think that? If you are performing transformations, those selects could still take time....
– Mitch Wheat
17 hours ago
Write the output of the query into a table in the reporting server? We have at least 3 different student management systems where I work, all of the servers are linked and there are overnight (or hourly) queries run from one of those systems to store data for reporting in a "reports database"
– JonTout
16 hours ago
The ReportingDB would be on the same instance as the live database. Do you think that would be a problem? It's more a political problem: we won't get any more finances for a new server right now, so we need to work with what we have. I thought about transferring a delta every 15 minutes. A select on the rows changed within the last 15 minutes, even if we target all 200 tables, would not have any impact at all, in my opinion.
– colosso
16 hours ago
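A sketch of such a 15-minute delta load, assuming each source table carries a LastModified datetime column (the actual schema may not have one; rowversion columns, Change Tracking, or CDC are more reliable ways to detect changed rows, since a clock-based filter can miss rows written by in-flight transactions):

```sql
-- Pull only rows touched in the last 15 minutes.
-- Live.dbo.Tickets and its LastModified column are hypothetical.
DECLARE @since datetime2 = DATEADD(MINUTE, -15, SYSUTCDATETIME());

SELECT t.TicketId, t.Subject, t.CreatedAt, t.ClosedAt
FROM Live.dbo.Tickets AS t
WHERE t.LastModified >= @since;
```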
asked 18 hours ago, edited 18 hours ago
colosso