The Blog
Latest news, updates and service changes!
Our first blog post: A look at our webstore infrastructure.
We're glad to announce that we now have our own blog, where we'll publish feature spotlights and interesting stories related to CraftingStore. We have been pushing out a lot of updates, but because we had no way of publicly announcing them to the masses, we've created this blog!
Now let's get started with our first item: how we keep your store online at all times.
At CraftingStore we're really serious about our uptime. When your store is not available, you might miss sales or get a reputation for being offline when players want to donate to your server. We have a lot of countermeasures in place to prevent this from ever happening, so let's talk about them.
Our set-up is designed not to rely on any single provider or server. We use different providers, located all over the globe; we chose them because we have good experience with them and because we know they can offer great service. At this moment, we run servers at AWS, DigitalOcean, OVH, Hetzner, SmartDC and Online.net, and they can all run fully independently of each other. So how do we make sure they don't rely on each other? Well, we push changes to them, instead of having them connect to our main database.
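To make the push idea concrete, here is a minimal Python sketch. It is an illustration only, not our production code: the class names `FrontEnd` and `MainServer`, the `online` flag and the store ID are all hypothetical. It shows that once content has been pushed, a front-end can keep serving a store from its own copy even while the main server is down.

```python
class FrontEnd:
    """Hypothetical front-end server that keeps its own copy of every store."""

    def __init__(self, name):
        self.name = name
        self.stores = {}  # store_id -> content, held locally

    def receive_push(self, store_id, content):
        # Called by the main server whenever a store changes.
        self.stores[store_id] = content

    def render(self, store_id):
        # Served entirely from the local copy; the main server is never contacted.
        return self.stores[store_id]


class MainServer:
    """Hypothetical main server that pushes changes out to all front-ends."""

    def __init__(self, front_ends):
        self.front_ends = front_ends
        self.online = True

    def update_store(self, store_id, content):
        if not self.online:
            raise RuntimeError("main server is offline")
        for front_end in self.front_ends:
            front_end.receive_push(store_id, content)


front_ends = [FrontEnd("eu"), FrontEnd("us")]
main = MainServer(front_ends)
main.update_store("example-store", "VIP rank: 5 EUR")

main.online = False  # simulate the main server going down
pages = [fe.render("example-store") for fe in front_ends]
```

In this sketch every front-end still returns the last pushed content after `main.online` is set to `False`; only new updates are blocked until the main server is back.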
Whenever the content of your store changes, we push the new content and changes directly to the front-end servers, so every front-end server has its own copy of the webstores, and even when our main webserver goes down, your store will keep working. At this moment we run a few different servers, located all over the globe, and they're all running HAProxy instances. Each HAProxy instance first tries to forward the request to its local backend server, and if that one fails, it forwards the request to another HAProxy instance, which follows the same process. Even if the backend process (the webserver, for example) does not respond to any requests, HAProxy will forward them to another instance.
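The forwarding chain described above can be simulated in a few lines of Python. This is a sketch of the behaviour only, not HAProxy configuration and not our real set-up: the `Node` class, the `backend_up` flag and the node names are assumptions made for illustration.

```python
class Node:
    """Hypothetical proxy node: try the local backend, else forward to a peer."""

    def __init__(self, name, backend_up=True, peer=None):
        self.name = name
        self.backend_up = backend_up
        self.peer = peer  # another Node that follows the same process

    def handle(self, request, visited=None):
        # Track visited nodes so a ring of peers cannot loop forever.
        if visited is None:
            visited = set()
        visited.add(self.name)

        if self.backend_up:
            return f"{self.name} served {request}"

        if self.peer is not None and self.peer.name not in visited:
            return self.peer.handle(request, visited)

        raise ConnectionError("no node could serve the request")


# Three nodes arranged in a ring: eu -> us -> asia -> eu
eu = Node("eu", backend_up=False)
us = Node("us", backend_up=False)
asia = Node("asia", backend_up=True)
eu.peer, us.peer, asia.peer = us, asia, eu

response = eu.handle("/store")
```

Here the request enters at `eu`, whose backend is down, gets forwarded to `us`, whose backend is also down, and is finally answered by `asia`. If every backend were down, the sketch raises `ConnectionError` instead of looping.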
But what happens when one of the machines goes down, due to a hardware failure for example? Then our monitoring system kicks in. Every server is health-checked at least once a minute by 3 different services. They check whether the server responds to pings, whether the HTTP server responds within 500 ms, and whether the database backend can perform a count query. If any one of those services flags one of the IPs as offline or unstable, we push out a DNS change, and the affected server won't serve any more traffic until it has been fully stable for at least 15 minutes (this means all criteria are met, and it may not go down within those 15 minutes).
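The rotation rule above can be sketched as a small state machine in Python. This is a simplified illustration under stated assumptions, not our monitoring code: the names `check_passes`, `ServerHealth` and `record` are hypothetical, timestamps are plain seconds, and the DNS change is reduced to an `in_rotation` flag.

```python
STABLE_SECONDS = 15 * 60  # a server must stay healthy this long before returning
HTTP_LIMIT_MS = 500       # HTTP must respond within this many milliseconds


def check_passes(ping_ok, http_ms, db_count_ok):
    """One monitoring service's verdict: all three criteria must be met."""
    return ping_ok and http_ms <= HTTP_LIMIT_MS and db_count_ok


class ServerHealth:
    """Tracks whether one server should receive traffic (be in DNS)."""

    def __init__(self):
        self.in_rotation = True
        self.stable_since = None  # start of the current healthy streak while out

    def record(self, now, results):
        """Record one round of checks; `results` has one boolean per service."""
        if not all(results):
            # Any single service flagging the server takes it out of rotation
            # and resets the 15-minute clock.
            self.in_rotation = False
            self.stable_since = None
        elif not self.in_rotation:
            if self.stable_since is None:
                self.stable_since = now
            if now - self.stable_since >= STABLE_SECONDS:
                self.in_rotation = True
        return self.in_rotation


health = ServerHealth()
ok = [check_passes(True, 120, True)] * 3
slow = [True, check_passes(True, 900, True), True]  # one service sees slow HTTP

history = [
    health.record(0, ok),      # healthy, stays in rotation
    health.record(60, slow),   # flagged by one service, pulled from rotation
    health.record(120, ok),    # healthy again, 15-minute clock starts
    health.record(960, ok),    # only 14 minutes stable, still out
    health.record(1020, ok),   # 15 minutes stable, back in rotation
]
```

A failed check during the waiting period resets `stable_since`, which is how the sketch models "it may not go down within those 15 minutes".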
So your webstore will be online, but we have other resources too, like our main website (the Dashboard), the API for the plugin and the socket servers. How does that work? Well, let me explain.
While the stores don't need anything to link them, the API server does. To solve this for the API, we use a Galera cluster, which syncs all our databases, across all our services, in real time. If one of our servers goes down for any reason, it will be picked up by the same system that's used for the stores, and another server will take over the requests. When it comes back online, it will re-sync with the database servers and start accepting requests again (after the 15 minutes of being stable, of course).
Everything in our system is redundant, to keep service disruption to an absolute minimum.
So in a nutshell, we take our uptime & security very seriously, to offer the best uptime & performance any webstore can have. Good luck with accepting donations on your store!
Posted by Tim_kwakman, in Infrastructure, on 2018-07-08 15:53:40