Is it a good way to generate static page for dynamic content in a large website and how to manage the static...












0















I have a website with millions of pages. The page content is stored in a database, but the data does not change very often. To improve the site's performance and reduce deployment costs, I want to generate static pages for the dynamic content and regenerate a page whenever its content changes. However, I am concerned about how to manage such a large number of pages. How should I store them? Could this cause I/O problems when the web server handles many requests? Is there a better solution to this problem? Should I use Varnish for this?






























  • 1





    You may want to look at using a CDN.

    – Ahmad
    Oct 29 '18 at 13:29
















performance nginx architecture varnish














edited Nov 23 '18 at 9:57







yifan

















asked Oct 29 '18 at 13:27









yifan

113




















2 Answers


















1














This looks like a very good use case for Varnish. Basically, you wouldn't generate the full site statically up front, but incrementally: each time a page is requested that Varnish hasn't cached yet, it is fetched from the backend and cached.



EDIT to cover the comments:




  • If all the Varnish nodes are down, you can't get your content, just as when the database or your load balancers are down. For high availability, run two load-balanced Varnish nodes, for example with keepalived.

  • If Varnish is restarted, the cache is cleared, unless you are using Varnish Plus/Enterprise with MSE. This may not be an issue if you don't restart often (configuration changes don't require a restart), since the database still has the data to repopulate the cache.

  • Varnish has plenty of options for invalidating content: purges for a single object, revalidation, bans to target entire sub-domains or sub-trees, and xkeys for tag-based invalidation.
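
To illustrate the single-object purge mentioned above, here is a minimal Python sketch of how an application might ask Varnish to drop one page after it changes in the database. It is only a sketch under assumptions: Varnish does not act on PURGE requests out of the box, so this presumes your VCL has been set up to accept them (normally restricted to trusted addresses). The function names, the address `http://127.0.0.1:6081` and the path `/employees/42` are made up for the example.

```python
import urllib.request


def build_purge_request(varnish_base_url: str, path: str) -> urllib.request.Request:
    """Build an HTTP PURGE request for a single cached page.

    Assumes the Varnish VCL is configured to handle the PURGE method.
    """
    url = varnish_base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, method="PURGE")


def purge_page(varnish_base_url: str, path: str, timeout: float = 5.0) -> int:
    """Send the PURGE request and return the HTTP status code."""
    request = build_purge_request(varnish_base_url, path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical address and path; nothing is sent here, we only
    # show what the request would look like.
    req = build_purge_request("http://127.0.0.1:6081", "/employees/42")
    print(req.get_method(), req.full_url)
```

You would call something like `purge_page(...)` from the code path that updates a page's data, so the next request for that URL is fetched fresh from the backend and cached again.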
































  • But if Varnish is down or needs to restart, the cached content will be gone, right? If so, how should I handle this situation?

    – yifan
    Nov 23 '18 at 6:17











  • Another question: if I want to update or remove some pages, can I clear just the targeted content cached in Varnish?

    – yifan
    Nov 23 '18 at 11:57



















0














Based on the description, your architecture looks like Webpages --> Services --> Database. The pages are generated dynamically from the data in the database.



For example, when you search for an employee's details, the service queries the database, gets the employee's details, and renders them in the UI.
Now, if you generate and store a web page for every employee, this solution will not scale. Also, if an employee's information changes in the database, you will serve stale data unless you regenerate the page.



My recommendation is to add a cache server, so that the new architecture looks like Webpages --> Services --> Cache server --> Database. The services query the database, build the page, and store it in the cache. The cache key should be the page URL and the value should be the page content. When a request for the URL reaches the services, they get the page from the cache rather than going to the database. If the key is not in the cache, the services query the database and fill the cache with the key and value.
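
The lookup described above can be sketched roughly as follows. This is only an illustration, assuming an in-memory dictionary stands in for the cache server and a caller-supplied function stands in for the database query; `PageCache`, `CachedPage` and `load_from_db` are hypothetical names, not part of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Tuple


@dataclass
class CachedPage:
    content: str          # rendered page
    updated_at: datetime  # updated date kept alongside the content


class PageCache:
    """Cache keyed by page URL; falls back to the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], Tuple[str, datetime]]) -> None:
        # load_from_db is a stand-in for "query the database and build the page".
        self._load_from_db = load_from_db
        self._store: Dict[str, CachedPage] = {}

    def get_page(self, url: str) -> str:
        page = self._store.get(url)
        if page is None:
            # Cache miss: build the page from the database and remember it.
            content, updated_at = self._load_from_db(url)
            page = CachedPage(content, updated_at)
            self._store[url] = page
        return page.content


if __name__ == "__main__":
    db_calls = []

    def fake_db(url: str) -> Tuple[str, datetime]:
        db_calls.append(url)
        return f"<html>page for {url}</html>", datetime(2018, 11, 24)

    cache = PageCache(fake_db)
    cache.get_page("/employees/42")
    cache.get_page("/employees/42")
    print(len(db_calls))  # prints 1: the second request is served from the cache
```

In a real deployment the dictionary would be replaced by whatever cache server you choose; the shape of the lookup stays the same.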



"The key is the URL of the page. The value is the content of the page, which includes a hidden updated date."



You can have a back-end job or a separate service refresh the cache when data is updated in the database. The job can compare the updated date in the database with the date stored in the cache value and flush the cache entry if the dates do not match. Because the job runs in the background, it will not affect the user or UI performance.
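
As a rough sketch of such a job, assuming the cache maps each URL to the page content plus its stored updated date, and that the job can obtain the current updated dates from the database as a plain mapping (both structures and the function name are assumptions made for this example):

```python
from datetime import datetime
from typing import Dict, List, Tuple

# url -> (page content, updated date stored with the page)
Cache = Dict[str, Tuple[str, datetime]]


def flush_stale_entries(cache: Cache, db_updated_at: Dict[str, datetime]) -> List[str]:
    """Remove cache entries whose stored date differs from the database's.

    Entries for URLs that no longer exist in the database are flushed too,
    because their lookup returns None, which never equals the cached date.
    Returns the list of flushed URLs.
    """
    stale = [
        url
        for url, (_content, cached_date) in cache.items()
        if db_updated_at.get(url) != cached_date
    ]
    for url in stale:
        del cache[url]
    return stale
```

Flushed pages are simply rebuilt on the next request that misses the cache, so the job itself never has to render anything.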






























  • Thank you for your advice, I will reconsider it.

    – yifan
    Nov 24 '18 at 22:58











edited Nov 24 '18 at 20:41

























answered Nov 22 '18 at 5:54









Guillaume Quintard

1595













answered Nov 24 '18 at 21:36









challenger

1,1481109












