I am about to start a service which will attract a lot of traffic.
At first guess, the traffic might look like this:
1st month : 60,000 a day
3rd month : 100,000 a day
6th month : 500,000 a day
All visitors will need to be stored in a database when they enter the site. So... For each visitor, an entry will be inserted in a MySQL database, meaning that by the 3rd month I'll have something like 100,000 * 30 days * 3 months = about 9 million entries.
One visitor will be a very easy task for the server to handle. Nothing fancy with flash or anything, not even images, and they're only on the slice for less than 2 seconds at most.
Also, I might only keep the visitor records in the database for one month, yet this will still result in about 3 million entries being added and deleted from it every month.
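To make the row-count arithmetic explicit, here is a minimal sketch. It assumes a flat 30-day month and the 100,000-a-day guess from above; the function names are only illustrative.

```python
# Rough row-count estimate from the traffic guesses in this post.
DAYS_PER_MONTH = 30  # assumption: a flat 30-day month


def rows_per_month(visitors_per_day: int) -> int:
    """Rows inserted in one month (one row per visitor)."""
    return visitors_per_day * DAYS_PER_MONTH


def rows_after(months: int, visitors_per_day: int) -> int:
    """Rows accumulated if nothing is ever deleted."""
    return rows_per_month(visitors_per_day) * months


print(rows_per_month(100_000))  # 3000000
print(rows_after(3, 100_000))   # 9000000
```

With a one-month retention window the table would hover around the first figure instead of growing towards the second.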
My question is: How big a slice will I need to do this? Is one slice enough (if I upgrade the RAM or something)? And the bandwidth?
I have given a multislice solution some thought, but it seems like a difficult task to set up. But I don't know if one slice will be sufficient. Anyone got some "heavy traffic" experience like this who can help me out?
I don't really have the experience you are looking for, but based on my initial thoughts and without knowing the specifics of what you are trying to do...
You will need at least two slices. One dedicated to the database and one for everything else. I would start with each of them as 256mb slices and expand as needed. A well tuned mysql server should be able to do pretty well even with the low memory provided your database is mostly optimized. If you do end up with the traffic you listed, you will probably need to upgrade the mysql slice as the extra memory will help.
Slice 1: MySQL (256 mb in the beginning)
Slice 2: Website services+mail server (256 mb in the beginning)
Let's assume this works fine... for a time, and then a sudden traffic increase happens and my Slice 2 can't handle it any longer. What do I do? Upgrade it, or try to do some load balancing between two "website slices"?
Well... Guess we're going to talk load balancing then. :)
Which is the best way to do it? I've been researching it a bit, and found that many people like nginx. Will it require a lot of changes in my current setup to go through and add a load balancing solution? Thing is that I've already made my final "website slice".
So I need to duplicate my "website slice", add a database slice and a load balancing slice so it looks like this:
Posted By: joek168
umm... am i reading something wrong... or is a 256MB slice $20 and a 512MB $38... so 2 256MB would be $40 instead of $38. or am i missing something?
oops, my mistake. Personally I'd go for another 256MB slice anyway since I'd prefer to have the load split among servers.
Scalability is a fun problem, but generally speaking I think a lot of people put the cart before the horse. It's a hell of a lot harder to actually get 500,000 uniques/day than it is to scale your app to it.
So I guess I would say, worry about scalability when you need to scale. Until then, work on getting the traffic that will force you to look at scaling. :)
Well... Problem is that it's either no traffic or heavy traffic for this service I am releasing. It's not something that will build up gradually. Either it will grow from 0 to 60,000 a day in a month, or it will be like 100 a day. Which means it might happen too quickly to react and scale the slice, so I need a solid setup before I launch it. It can't afford a crash due to an overload of traffic.
I'm not knocking load balancing by any means and it will be required when you get to a certain size, but I don't think load balancing should be your first step when you are exhausting a 256mb slice. First of all, it is much easier to upgrade a 256mb slice to a 512mb slice than it is to introduce load balancing. Second, the OS has a certain amount of overhead that is going to take up a certain amount of RAM. Moving from a 256mb slice to a 512mb slice is going to reduce the percent of RAM being consumed by the OS itself, while having many 256MB slices is going to give you the highest percent of wasted RAM.
I don't think I would introduce load balancing until you are running into constraints (defined as using 75% of RAM or CPU, or slowness from anything else such as IO requests) with a 2048 MB slice. Plan for load balancing when you design your apps, but keep things simple as long as you can.
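As a rough sketch of that rule of thumb, assuming you already collect RAM and CPU usage some other way (the function name, its parameters and the returned strings are made up for illustration):

```python
THRESHOLD = 0.75      # "using 75% of RAM or CPU" from the rule above
MAX_SLICE_MB = 2048   # largest slice size considered in this post


def next_step(slice_mb: int, ram_used_mb: float, cpu_fraction: float) -> str:
    """Suggest what to do for one slice, given its current usage."""
    ram_fraction = ram_used_mb / slice_mb
    constrained = ram_fraction >= THRESHOLD or cpu_fraction >= THRESHOLD
    if not constrained:
        return "stay"
    if slice_mb < MAX_SLICE_MB:
        return "upgrade"       # a bigger slice is the simpler fix
    return "load balance"      # already at the largest slice


print(next_step(256, 230, 0.20))    # upgrade
print(next_step(2048, 1700, 0.20))  # load balance
```

It ignores IO slowness, which would need its own measurement.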
Assuming you're right about getting 500,000 visits a day - why would you even consider running on a VPS structure you do not COMPLETELY own?
Even a 1G Slice would be hard pressed to provide enough bandwidth, and that's assuming that all of those 500,000 visitors per day only load a page with not much more info than Google's homepage (i.e. 15K or so) and, gasp, don't visit the site more than once a day.
Slicehost is a great VPS host, excellent support, good equipment, etc etc, but it's STILL a VPS service.
As to running MySQL with a large database, it can be done (Sabre runs MySQL for their database - around 20 million + records - on a dedicated cluster of 45 (yes 45) high end boxes) - although you'd probably be wise to consider PostgreSQL instead.
Overall, you're either WAY WAY WAY optimistic about your projected traffic, or your traffic needs are real and you're way way way optimistic about your ability to architect a working system by asking people on a forum what their thoughts are.
Thanks for the replies, especially Vonskippy for pointing out the bad sides of it. ;)
It's not ordinary visitors though. The service isn't built for websites. The load for one "visitor" will be less than 5kb, and about 85% will be less than 2kb. The only thing that would increase the load from there is talking to the database, which accounts for almost nothing.
I am not being optimistic. This is what my research tells me, and as I pointed out, the number of visits can be less than my "first guess", but I'd rather prepare for the worst.
So... to conclude on the great comments, I think I'll go for a single 512mb slice and a database slice (upgrading along with necessity), and when the bandwidth (as Vonskippy pointed out) gets pressed, I'll get a third slice and prepare for load balancing. Since Slicehost offers bandwidth pooling, I can have my service running quite solidly (well, just buying some time actually) while working on the load balancing solution.
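For comparison, here is a quick upper-bound estimate of daily transfer at 500,000 visits, next to the 15K-per-page scenario mentioned earlier in the thread. It is only a sketch: it assumes "kb" means kilobytes of 1000 bytes and treats the 2kb and 5kb figures as worst cases.

```python
KB = 1000  # assumption: "kb" means kilobytes, 1 kB = 1000 bytes


def daily_bytes(visitors_per_day, mix):
    """mix is a list of (percent_of_visitors, bytes_per_visit) pairs
    whose percentages add up to 100. Integer math avoids float noise."""
    assert sum(percent for percent, _ in mix) == 100
    return sum(visitors_per_day * percent * size for percent, size in mix) // 100


light = daily_bytes(500_000, [(85, 2 * KB), (15, 5 * KB)])
page = daily_bytes(500_000, [(100, 15 * KB)])

print(light)  # 1225000000  (about 1.2 GB a day)
print(page)   # 7500000000  (7.5 GB a day)
```

Multiply by 30 to compare against a monthly bandwidth allowance.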
I personally feel that a single 256mb Slice could handle it initially (at the very least for the 0-60000 level). Then just watch it to decide whether or not you need to scale to a larger slice or split to multiple slices.
60000 very small hits to a page that inserts a single record into the database isn't much at all. If there are 86400 seconds in a day then it's less than 1 hit per second. A second is a long time when it comes to inserting a record into a DB and showing some text. It would probably do the insert in less than 100ms. Sure, hits will not be evenly distributed across a day unless the service is global, but still, even if there are peaks of 10-20k hits per hour, that's not really that much if the code is simple and optimized, as well as the webserver and database settings.
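Spelled out as a small sketch (the 100ms per request is my guess from above, and the "one worker handling requests one at a time" model is a simplification):

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_HOUR = 3_600


def hits_per_second(hits: int, period_seconds: int) -> float:
    """Average request rate over the given period."""
    return hits / period_seconds


def busy_fraction(rate: float, seconds_per_request: float = 0.1) -> float:
    """Share of time a single serial worker would spend handling requests."""
    return rate * seconds_per_request


average = hits_per_second(60_000, SECONDS_PER_DAY)
peak = hits_per_second(20_000, SECONDS_PER_HOUR)

print(round(average, 2))              # 0.69
print(round(peak, 2))                 # 5.56
print(round(busy_fraction(peak), 2))  # 0.56
```

So even the 20k-per-hour peak would keep one serial worker busy only a bit more than half the time under these assumptions.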
Lighttpd is very light (nginx probably is too, but I haven't personally used it) and would probably be the best to use.
My main webserver (1gb slice) is running Lighttpd (30+ WP, custom DB and a few static sites). I have the DBs on a separate 256mb slice that sits there virtually idle even with 100,000+ queries a day and WP queries are much more complex than a simple DB insert. When it comes to traffic on my main 1GB slice, the number of total web requests (PHP, HTML, images and all other files) is probably 300,000+ requests per day and again, even this server is pretty much idle most of the time (load less than 0.25) . The only time I get higher loads is when search engines come through periodically and when I run a cron for webalizer stats in the early morning.
Perhaps I'm way off here, as I've never done something like what Rune is doing, but I thought I'd toss in my 2¢ anyways. :-)
Thanks a lot! Really helpful information you shared there hyperial. :) Facts and numbers. Then it's just a matter of doing my math right.
As to the search engines, they won't be a problem because, well... they're not allowed to crawl the domain. All the front-end, promotional sites and such, which need to be listed in search engines, will be somewhere else.
You're welcome Rune... Good luck with your service.
Also, after looking at my DB server: I do have one DB with over 3,000,000 records in it that gets re-updated every week via cron. Each record has around 15 fields with information in all of them, and it's only 438mb. The frontend does some pretty complex queries between multiple tables, and with indexes on certain key fields I can return those queries to the end user in less than 1 sec. I have 108 days uptime on that machine now, so it's pretty stable. :-) I know there are a few people who use the DB rather often and there's no noticeable effect on the machine. The indexes were the key thing on my app; without them the queries would lock up MySQL quickly. So if you're planning on running reports/filters on the data (on the server), then look into using indexes.
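To show what an index changes, here is a small self-contained sketch. It uses Python's built-in SQLite rather than MySQL so it runs anywhere, and the table, column and index names plus the sample rows are invented for the example. The CREATE INDEX statement itself has the same form in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (id INTEGER PRIMARY KEY, country TEXT, visited_at TEXT)"
)
# Made-up sample data: 10,000 rows, one in ten from "DK".
conn.executemany(
    "INSERT INTO visits (country, visited_at) VALUES (?, ?)",
    [("DK" if i % 10 == 0 else "US", f"day-{i % 30 + 1:02d}") for i in range(10_000)],
)


def plan(sql: str) -> str:
    """Return SQLite's query plan for a statement as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the description


query = "SELECT * FROM visits WHERE country = 'DK'"

before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_visits_country ON visits (country)")
after = plan(query)   # the plan now names idx_visits_country

print(before)
print(after)
```

In MySQL the equivalent check would be running EXPLAIN on the query before and after adding the index.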