
It seems that I have three options, in This Modern World:
- Virtual server.
Many options; the big ones are Amazon, Digital Ocean, Google. They are probably all about the same. Price is somewhere between $450 and $800/month, maybe?
Pro:
- Everyone does it this way.
- When it is Upgrade Season, spinning up a new instance for rebuild/migration is easy.
- I will never have to think about disk, RAM or power supplies going bad.
Con:
- Expensive.
- The way I would be using it would be to have a single instance. Nobody does it that way, so it probably doesn't work very well.
- I need 2TB of file system storage. Nobody does it that way, so it's expensive.
- Figuring out exactly which of their many options is the configuration that I need is really difficult.
- Whatever IP address they give me is probably already on every email spam blacklist in the world.
- Dedicated server.
I'm seeing numbers anywhere from $100/month to $500/month. It's all over the map, which does not inspire confidence.
Pro:
- It's a real damned computer, with predictable behavior.
- When disk, RAM or power supplies go bad, someone else fixes it.
- I never need to physically visit it.
Con:
- It is hard to tell whether the companies that offer this service will still be in business two years from now.
- It's hard to tell whether they are real companies, or "one flaky guy".
- Spinning up a new instance in Upgrade Season is somewhat more involved, and maybe costs me a couple hundred bucks.
- Though it can be located anywhere, since all of my customers are in San Francisco, it probably should be on the West Coast. That narrows the already narrow field of options.
- People keep recommending companies that are not hosted in the country I live in. This strikes me as extremely foolish for several reasons.
- Bare rack slot, with my own home-built 1U computer in it.
Probably something like $100/month, plus the cost of the computer (say, $1000, will last 4 years).
Pro:
- It's a real damned computer, with predictable behavior.
- Cheap.
Con:
- Hardware failures are my problem.
- Spinning up a new instance in Upgrade Season is a huge pain in the ass.
- The data center has to be local, because I probably need to go physically visit it every year or two.
Figuring this out is such a pain in the butt. I really want to believe that option 1 is the way to go, but I'd need to get the price down (without first needing to completely re-design the way I do absolutely everything, thanks), and it just sounds like it's going to be flaky.
Options 2 and 3 sound flaky in their own ways. Pro: I already understand those ways. Con: one of those ways is why I'm looking to move in the first place.
You likely figured this out from your previous post on the subject, but option one breaks down in two ways:
1. Virtual machines. This is Linode and DigitalOcean, and then a bunch of also-rans and startups. They cater to operators who believe that each machine in their fleet is important, from a fleet size of one to a fleet size of hundreds, and that no data shall be lost other than through the misbehaviour of the hosted OS or the catastrophic failure of the underlying hardware. This whole line of business traces back to people who figured out that it was cheaper to rent out VMs than to rent out physical servers, but who basically serve the same customers as would be served by a physical server.
2. EC2-like compute clouds. This is AWS, Google Cloud, Azure, Heroku, and a bunch of others. They cater to operators who believe that no machine is important, and who are comfortable building data recovery and continuity into their systems from the ground up. You can run single instances, but they're not designed around that, and often the underlying substrate either makes no promises or actively disclaims data resilience in the face of incidental failures. This line of business traces back to people who figured out that if you're going to deal with thousands to millions of computers anyways, you might as well throw a hypervisor into the mix so that you don't have to upgrade all the OSes at the same time.
Knowing your predilections, of the two I would expect a VM-like host to be more your style if you didn't use a physical server. I don't have any suggestions or price quotes you haven't already discussed, but I would strongly recommend avoiding the second category based on your descriptions of your goals. Your uptime is Amazon's negative externality, and they will destroy your EC2-stored instance-local data without hesitation or notice, because they expect you to build your systems the way they build theirs, and that's how they manage their own services.
Huh. From reading Digital Ocean's site, it's not at all clear to me that they are your #1 version and not just another instance of the #2 version.
I know several people have recommended Linode, but many more people have recommended against them with extreme vehemence.
I use Linode and have for years, but some inside baseball on their handling of security incidents would make me extremely reluctant to recommend them. I'm not migrating off because the competition is basically a wash.
A DigitalOcean "droplet" is yet another name for a Xen-hypervised virtual machine, with reasonably long life expectancy unless you intentionally destroy it. Their guarantees about data integrity and resilience are unicode shrug emoji, and like everyone else in that business they charge out the nose for long-term storage (their quote gizmo says $200/mo for 2TB, just for the disk, which is insane if you know what a disk costs), but they do fit the long-lived server mentality pretty well in practice.
#1 (aka "virtual server", aka VPS) was the thing that a lot of places were offering for a while now as a price-conscious compromise between a dedicated server and shared hosting. When Amazon showed up with AWS/EC2 (which is effectively the definition of #2), they made #2 very popular. Lots of places that were offering #1 suddenly found themselves considered the "old and busted" solution, while AWS/EC2 was "the new hotness." Providers of #1 adopted some #2-like features and #2-like nomenclature to cash into Amazon's success, with varying degrees of success.
A bit late to the party, but here's some feedback from using DO.
- Migrated ~30 instances from our own metal to DO (FRA1, AMS3). Although those are Kms, the closeness-to-the-metal thing was a deciding factor: zero change was needed to our habits and tooling. For that reason EC2/EBS/S3 was out of the question.
- Support has been stellar so far
- Convenience of tools and API is through the roof
- Serious about IPv6
- VM snapshots require poweroff
- VM backups (weekly) are live (with all caveats) but give a warm feeling of having an additional safety net
- Backing up to S3 (FRA) saturates the link
- Attached volumes are external to the VM and thus not handled by VM backup. They are hosted in the same DC, directly attached to the VM, and appear as additional SCSI block devices (e.g. /dev/disk/by-id/scsi-0DO_Volume_volume-fra1-02), so you even get to pick the FS. Moving them between machines in the same DC is easy: unmount/unplug/plug/mount.
- Downtime is generally planned with 1 to 2 weeks notice (can be rescheduled via support) and mostly comes in two kinds: network or host operations. The former has really low impact (generally latency spikes for a few seconds); the latter happens when something off is detected on the hypervisor hardware and often requires migrating the machine to a new one, so downtime is needed while throwing bits towards the new hypervisor, proportional to the VM disk size (not attached volumes, which are separate). This is where having smaller VMs matters. Or a failover machine.
Here are some data points about the cloudy things which may or may not help you in making sense of the general performance (CPU, disk):
https://www.webstack.de/blog/e/cloud-hosting-provider-comparison-2017/
s/Kms/VMs/
> but I'd need to get the price down (without first needing to completely re-design the way I do absolutely everything, thanks)
I think "re-design the way you do everything" is the primary problem with option #1 and .. unavoidable. I agree that cloud offers advantages but you gotta adapt your process, or bring lots and lots of cash..
I would like to reiterate my love of the cheap ali express boxes https://blog.codinghorror.com/the-scooter-computer/ -- you could buy three of these for ~$400-$500 each, slap a 2TB 2.5" HDD in every one of them, colocate 'em, and have great redundancy. The only real risk is "meteor hits datacenter".
(I would also suggest a Samsung 128GB SSD as boot drive for each one, and that'll fit in those boxes as well, alongside the 2TB 2.5" HDD)
It's basically the "mac mini colo" plan, but .. cheaper and a lot faster.
Of course, "meteor hits datacenter" will affect option 2, too. And even though it's not supposed to affect option 1, it probably will.
I run a business in San Francisco. If we have a "meteor" here big enough to take down the data center, that business's web site being offline will be the least of my problems.
As a boot drive for linux, you probably don't need more than about 16GB, although it's hard to find SSDs that small these days.
Well, lots of free space on a SSD also means lots of space to reallocate bad cells for redundancy. Also, larger SSDs tend to be a fair bit faster due to more chips / parallelism on the drive. So I don't like to recommend anything under 128GB.
I am loath to get into these discussions with you normally, but since I ~actually know about this stuff~ I guess I'll bite.
The big thing that stuck out to me as "probably not actually a problem" here is the single instance thing. You may only need a single instance, but if you use an autoscale-group-of-one technique, you'll get a bunch of high availability benefits and high end monitoring/management options that you may not need, but can't hurt and will remove the "nobody does it this way" factor. That said, a ton of people DO do it that way without the autoscale group- it's just not generally considered a good idea because most people are running a bunch of stuff and care about high availability.
The gist of autoscale-group-of-one is you build the box, image it, then make autoscale rebuild the box. If it dies for whatever reason, autoscale notices and just builds it again. The downside is that if you make changes to it, you'll have to reimage it and point autoscale to the new image so it comes back up correctly. You can get around that problem by keeping your changes in some sort of config management tool or code repo if you want.
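To make the shape of that concrete, here is a minimal boto3 sketch of an autoscale-group-of-one; the region, AMI ID, key pair, and group names are all hypothetical placeholders, not anything you're required to use:

    # Minimal sketch of an autoscale-group-of-one (all names/IDs are placeholders).
    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-west-2")

    # Launch configuration built from the image you baked of "the box".
    autoscaling.create_launch_configuration(
        LaunchConfigurationName="single-box-v1",   # bump this when you re-image
        ImageId="ami-0123456789abcdef0",           # your baked AMI (placeholder)
        InstanceType="m4.2xlarge",
        KeyName="my-keypair",
    )

    # A group pinned to exactly one instance: if the instance fails its health
    # check, the group terminates it and builds a fresh one from the same image.
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="single-box",
        LaunchConfigurationName="single-box-v1",
        MinSize=1,
        MaxSize=1,
        DesiredCapacity=1,
        AvailabilityZones=["us-west-2a"],
    )

Re-imaging then just means creating a "single-box-v2" launch configuration and pointing the group at it.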
I've built a ton of environments like this in AWS. I haven't tried it in Google or DigitalOcean (although I've poked around DigitalOcean and I like it quite a bit) but I'm sure they have similar capabilities.
Presuming that you stay in AWS, you could also get around the email black hole issue by integrating SNS into your mailing software. I realize you probably won't want to do that because it would tie you to the platform more than you'd prefer, but using some sort of managed service is probably the only way to guarantee the blacklist thing will not be your problem, because it makes it somebody else's problem.
Finally, regarding the cost and the filesystem thing: Does it have to be filesystem storage, and does it have to be fast? Because filesystem storage on EBS is expensive, but S3 is cheap as hell. If speed isn't a huge issue, you can even use s3fs-fuse to mount an S3 bucket to a Linux host.
A pizza box in a local DC is probably your best bet, honestly.
I mean, if cost was the primary motivator I would suggest investigating how you might lower your disk / memory / CPU requirements so it all fits on the super cheap throwaway VMs - I strongly suspect your requirements are much lower than you imagine - but doing so requires time and effort I expect you want nothing to do with. Totally fine.
Hardware is pretty reliable these days and throwing up a Dell R230 with RAID mirrored disks would probably never even require you to visit it for several years. There's got to be a reliable DC in the SF area that the nerds like. So that's my recommendation.
For Option #2 I've had very good results with a dedicated server at M5 Hosting, located at their data center in San Diego. Mine is $125/mo (current offerings look like they're around $250/mo) and has been working for 9 years (since January 3, 2008) with better than 99.95% uptime. On the very rare occasions when something goes wrong they do root cause analysis, FIX IT, and send out a detailed analysis of what happened, why, and what they've done about it.
On the rare occasions when I've needed to contact their Technical Support they've been responsive and clueful. I run FreeBSD but they also offer Linux and perhaps other OSes.
Judging by the prices of cloud/dedicated, it looks like you're looking for some top-level two-socket server. I think it costs more like $10k, not $1k. Otherwise, there are Hetzner (8-core Ryzen for 60 euros, oh wow) and Leaseweb (2×6-core SNB-EP for 99 euros, how can it be), for example.
Reiterating my previous reply:
Amazon / Google / Azure are for short duration project servers. Shorter than 1-2 years. Not what you want, and you also don't want to pay for the flexibility (They cost about twice what you used to pay for a reason).
You also want a stable solution (company still exists in 4 years, or actually product still exists). That leaves out Google / Azure (not their core business). So Amazon (their cloud business is bigger than their retail business, so it's now their core business) is the only cloud solution you can look at. Still, expensive...
That leaves less flexible solutions. For those, go for the bigger (in this context) hosting providers, like Leaseweb (20 yrs old, 200+ employees, 16 datacenters, one of them in SFO). Yes, it's a Dutch company, so what.
I don't think I agree with the 1-2yr upper timebound on cloud instances - AWS even offers a price break for 3 yr terms. (~$50 less than 1yr term rates).
There is no timebound, just a pricebound. AWS is flexible, and you pay for that if you're not.
Amazon's prices for 3+ year servers are about double what you can get elsewhere. And their data transfer out is ludicrously expensive at $0.09 per GB.
The point of Amazon is their flexibility. You can get a server from them for 5 minutes and pay cents.
For AWS, you can get 2TB of "throughput optimized" spinning rust (EBS st1) for $92/month; if you want "generic" spinning rust it's even cheaper at $52/month (EBS sc1) - Oregon DC prices.
I have a big package repo server in AWS: it's a mediocre t2.large instance with 16 GB of 'instance storage' plus four 1 TB 'sc1' volumes and one 50 GB SSD volume; with ZFS-on-Linux I use the 50 GB as a cache for the pool. This works well.
Since it's been many years since I've had to worry about low-level nonsense like disk latency, I'm not entirely clear on what speed I actually require from these virtual disks. How "slow" is sc1 in practice? If I have a 33GB .git repo on there, will I be hating life?
Through what mechanism?
You can change the volume's storage class and IOPS after making it, so pick the slow one and then make it faster if you want.
As to the caching: ZFS pools support this (zpool add tank cache /dev/diskh).
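If you'd rather script it than click around, the storage-class change is a single call in boto3 (the volume ID here is a made-up placeholder):

    # Hedged sketch: bump an existing EBS volume from sc1 to st1 after the fact.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-west-2")
    ec2.modify_volume(VolumeId="vol-0123456789abcdef0", VolumeType="st1")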
@Eric re: s3fs-fuse Have you ever actually tried that fuse module? I tried, and it was incredibly unstable (and there's Elastic File System now anyway, if shared, slowish NFS floats your boat), but maybe there have been improvements.
My employer is all-in with AWS, but I use both AWS and DO (tiny) instances for personal use; I also only treat EC2 as a virtual server, and stay away from the Amazon-specific junk.
Based upon your current specs, I think this would be the most similar AWS configuration (per month):
1-year reserved m4.2xlarge (32GB, 8 cores of an Intel Xeon E5-2676 v3) = $181.04
2TB st1 EBS * $0.045/GB = $90.00
2000GB AWS out * $0.09/GB = $180.00
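(Just to spell out the arithmetic on those line items, using the per-GB rates above:)

    # Rough monthly total for the configuration above, from the stated rates.
    instance = 181.04          # 1-year reserved m4.2xlarge, per month
    storage = 2000 * 0.045     # 2TB of st1 EBS at $0.045/GB  -> $90
    egress = 2000 * 0.09       # 2000GB outbound at $0.09/GB  -> $180
    print(instance + storage + egress)   # ~451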
$450 or so is certainly more than your current provider, for roughly the same setup.
I can say that I've had a single instance EC2 server go down on prod... perhaps the whole host server tanked? I was able to restart it, and it was back within 15 minutes, with no data loss. I'm not sure that I would consider that a good deal if it was my own server, but I'm happy to spend my employer's money on it, and not have to call up Rackspace or whoever and wait an hour-plus for a reboot (never mind if something is actually wrong with the hardware, which is considerably less likely in EC2, since the instance will just restart on another host).
I also once worked the night shift in a "dedicated" server datacenter, and perhaps things are better now than 2005-2010, but I doubt it (on the low-end, anyway); the cheapest servers were basically desktop computers, with awful, used/refurb hard drives (no RAID, of course) and completely inadequate cooling. That was generally anything less than $200/month (although there were real 1U server options, too)... at which point, since you are already considering colo, you might as well buy a real 1U server. (the dedicated hardware itself might not be your problem, but it certainly is a problem when it fails).
Colo sounds like the most appropriate option if the AWS cost is too much, or the specs too constraining. Considering your current budget is $300 and you're considering $100/month colo (I guess that price is current? I'm out of the loop on 1U colo pricing), does 2U + 2 servers make sense? That way you can backup to a different server and have a failover scenario (doesn't sound like serious redundancy is essential here, so at least this way you could just manually change DNS and so-on if/when shit hits the fan). When upgrade season comes, just swap the backup server with the new primary, then repeat for the other two servers, at your leisure.
$1000 per server, however, does sound a little low for something with 8+ cores, 32GB RAM, and a few HDs... maybe as low as $1200, but $1600 sounds more reasonable for new hardware, especially if you improve on the specs of your current server. But that is for a "real" server... If the datacenter allows you to fit 2 or more of Jeff's scooter PCs, that will certainly bring the cost down (I doubt any of those machines currently support more than 16GB RAM or multiple magnetic disks, however).
That pricing (or a little less if he goes with 3-yr term reserved m4.2xlarge) also gets him a CDN to help manage the server load, allowing him to potentially downgrade to a smaller server in the future, which may get him down to under $400... that said, he probably won't get down to his $300 current budget given that $270 of it is eaten by just disk and bandwidth.
Putting a CDN in front of the server may not help too much if it is many sparsely-hit sites, and the primary users are all on the West Coast. I guess Cloudfront US bandwidth is slightly cheaper than EC2 out (.09 vs .085), but hideously expensive for other regions. There's also his sensitivity to breaking links, as well as CDN latency, CDN header/URL manipulation, etc, etc, nevermind the effort of CDN configuration for many domains.
Amazon's DDoS protection, however, does seem very legit, but I can't speak from experience; it hasn't been a problem. They've got their free "AWS Shield" on the whole network, and application-level for extra. Smaller hosts are likely to just null route the victim IP, not work to stop the source of the attack (but maybe some are no longer stuck in the 90s?)
@jwz, I'm still not sure how you've gotten to the conclusion of "it sounds less good when each of those virtual servers is an additional $250/month"... would each of them need to be so large if things were split up onto a few smaller servers? Maybe I'm just used to nginx's memory footprint, and haven't seen a staggeringly large Apache instance in a while. Smaller EC2 instances do cost a little more than the equivalent CPU/RAM on a larger instance, and means a few more "beasts" to care for, but you can still realize most of the advantages of modularization without completely buying into the "Amazon Way" (also, the ability to extend/upgrade without it being all-or-nothing).
If you don't care about webspeed to other regions, Cloudfront can be limited to only the cheap regions (and thus the lowest cost). A CDN will happily proxy whatever links you want, so breaking links isn't an issue. Figuring out the caching setup is likely to be the most problematic, but shouldn't be too difficult - cache based on session cookie headers or whatever login mechanism is being used. And none of it has to be done from the start - it can be phased in as he feels like it.
On the email blacklisting point: you can register an IP with AWS as a source of email. They will then whitelist it for higher quantities of outbound mail than whatever threshold might otherwise apply, and also engage with some cross-section of blacklist operators to ensure your IP isn't flagged. I've been running two servers like this for a few years, one of which hosts a number of mailing lists.
(Full disclosure, this particular sausage factory employs me, and I'm not speaking in an official capacity - just as a customer)
Can confirm, I've had luck with outbound SMTP from an EC2 instance. The RDNS/SMTP permission slip is about the clunkiest AWS experience I've had, but it certainly doesn't come up often.
Some SMTP receivers might consider an EC2 IP inherently "spammier", but it is less and less of an issue, especially with proper SPF records.
Also, if you plan to mail from AWS, you would first set up an Elastic IP address before assigning it to an EC2 instance; you can check that randomly assigned IP against an RBL like http://www.anti-abuse.org/multi-rbl-check/, and throw it back and take another one, all within the web interface and without having to contact support.
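If you'd rather script the throw-it-back shuffle than click through the console, something in this spirit works (stdlib DNS lookup; the blocklist zone is just one example, and all IDs are whatever AWS hands you):

    # Hedged sketch: allocate an Elastic IP, check it against a DNS blocklist,
    # and release it if it's listed. Listing shows up as the reversed IP
    # resolving inside the blocklist zone; NXDOMAIN means "not listed".
    import socket

    import boto3

    ec2 = boto3.client("ec2", region_name="us-west-2")

    def is_listed(ip, zone="zen.spamhaus.org"):
        reversed_ip = ".".join(reversed(ip.split(".")))
        try:
            socket.gethostbyname(f"{reversed_ip}.{zone}")
            return True       # got an answer: the IP is on the list
        except socket.gaierror:
            return False      # NXDOMAIN: not listed

    addr = ec2.allocate_address(Domain="vpc")
    if is_listed(addr["PublicIp"]):
        # Throw it back and try again (loop in practice until you get a clean one).
        ec2.release_address(AllocationId=addr["AllocationId"])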
Your summary is correct. I would say that point "Whatever IP address they give me is probably already on every email spam blacklist in the world." is true for all three options, however.
My recommendation would be a combination of option #1 and #2. Rent a server for cheap from somewhere like OVH as your main machine, but keep a "warm standby" in AWS at the smallest instance size you can use and still rsync between them hourly. If the worst happens to #2, bump the AWS instance size to something larger and flick DNS over to the AWS node until things stabilise. And don't use us-east-1.
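For what it's worth, the "bump and flip" part is scriptable; here's a rough boto3 sketch, where the standby's instance ID, the Route 53 zone ID, and the hostname are placeholders (any DNS provider with an API would do the same job):

    # Hedged sketch: grow the warm standby and point DNS at it.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-west-2")
    r53 = boto3.client("route53")
    STANDBY = "i-0123456789abcdef0"   # the warm-standby instance (placeholder)

    # Resize: the instance has to be stopped to change its type.
    ec2.stop_instances(InstanceIds=[STANDBY])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[STANDBY])
    ec2.modify_instance_attribute(InstanceId=STANDBY, InstanceType={"Value": "m4.2xlarge"})
    ec2.start_instances(InstanceIds=[STANDBY])
    ec2.get_waiter("instance_running").wait(InstanceIds=[STANDBY])

    # Flip DNS over to the standby's address.
    ip = ec2.describe_instances(InstanceIds=[STANDBY])["Reservations"][0]["Instances"][0]["PublicIpAddress"]
    r53.change_resource_record_sets(
        HostedZoneId="Z0000000EXAMPLE",
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "www.example.com.",
                "Type": "A",
                "TTL": 60,
                "ResourceRecords": [{"Value": ip}],
            },
        }]},
    )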
I'd also recommend looking at somebody like Sendgrid to handle email whitelisting & so forth. Unless you're sending to a lot of people, their rates aren't that high and you can just forward to their servers via sendmail/postfix.
Also - worth noting that the AWS disk prices you mentioned in your previous post are also partly because the more disk you rent, the more IOPS you rent - e.g., we rent 1TB of disk for our production servers not because we need the space, but because we need the IOPS.
There are other, slower, disk types you can use in AWS that will be significantly cheaper if what you're after is cold storage.
what in the flipping fuck is IOPS. PSHCYOPS IOPS WE NEED MORE IOPS GROUP. OVH. OTOHDC. SOV. RTD. ALM. OH. what fucking planet do you live on dude
A planet where you can google terms you don't know.
Well, perhaps not OVH, since they are a French company and don't have much of a presence on the U.S. West Coast (https://www.ovh.com/us/about-us/datacenters.xml). Otherwise I would have recommended them too.
Since I'm here writing, I'd like to reiterate what was pointed out above by Owen Jacobson, which is that option 1) can be subdivided between "VPS" and "cloud". From what I understand, a "VPS" is pretty much the same as a dedicated server, in the sense that you get root, a conventional filesystem, and a reasonable guarantee that the server won't vanish into thin air... only the server is virtualized, and is sharing physical resources with other virtual servers inside a big machine.
The "cloud" proper, OTOH, is Amazon, Azure, etc., with all their weirdness, "buckets" instead of proper disks and so on.
Lastly, have you been following @Pinboard on Twitter? He has been doing some hardware upgrades on these last couple of days, and he has Opinions about the Cloud and the hype around it.
Ah - I assumed (wrongly) that as OVH now had a real US DC (rather than one in Canada, which we used for our US production hosting for a while) that they'd offer a west coast option.
Personally, I treat VPS as "worst of both worlds".
If you are not going to redesign your whole pile of things, options 2 or 3 probably make more sense. Option 1 is probably viable with some small-ish changes but will still cost more. If you do redesign the whole thing it is quite possible option 1 would end up costing something between options 2 and 3 afterwards with less^Wdifferent things to worry about.
I run a service that brings half my income on Heroku, which sits atop AWS - very cost effective and robust (less than $100 / month at my level of usage).
It has been perfect for my application.
Could you keep your 2TB of data on AWS S3 (videos) and use heroku to redirect people to that? And I think they have a specific service for streaming videos etc...
I run ruby/rails but I know they support a bunch of other languages these days
+1
If any portion of that 2TB is static assets (secured or unsecured), stop doing that: Amazon/Azure can do that faster/cheaper/easier
If any portion of the 2TB is transformed/transcoded on the fly prior to being shoved out the door, stop doing that: either use a service that supports on-the-fly transformation, or find another strategy.
Either way: the real problem is offloading your big chunky data to a better place, once you do that, everything else becomes easier.
Dear everybody, please stop replying to this until you have read at least my comments on the previous post.
I would prefer not to have to explain yet again why I'm not interested in "merely" changing literally everything about the way I do literally everything.
I have both of these options: dedicated server I bought that hosts my personal email & web, as well as a business built on AWS. Given your familiarity with the single server model and existing software base, options 2 or 3 are all you should consider.
At that point it's a pricing exercise. Want to get the price down? Buy your own server. More money than time? Rent a dedicated server someone else bought.
If your time didn't matter and you wanted to save the most money, you'd rewrite your software to run off S3 with a tiny EC2 host doing some of the logic. You could probably get costs down below $50/month. But that's not you. So keep being true to yourself.
"People keep recommending companies that are not hosted in the country I live in. This strikes me as extremely foolish for several reasons."
Working theory: AWS, et al have uniquely corroded the minds of the domestic technorati, so the advances in bare metal hosting have only matured in foreign countries where political and cost pressures make people saner.
If you get really interesting you'll take a while to realize you're not reaching the foreign servers anyway.
Maybe this should be on the other post, but anyways...
Virtual servers charge a lot more for disk because sharing disk IOPs is challenging. So, they want to charge for IOPs, but can't, so they charge a lot for small amounts of disk as the alternative. Especially if they can use SSDs to provide the disk, which has better properties for sharing. Also see on GCE, where the IOPs for persistent disks scale with the amount of disk you reserve.
For that reason, I haven't retired my 1U while I've been trying GCE and Linode for years.
Also, jfyi, Google's product, GCE, blocks outbound port 25, so probably not on email blacklists, but also useless for sending email.
I'm a proponent and practitioner of option 3. My servers live a good few hours away from me, and I never, ever need to visit them. A good tip that can save you lots of time in case of disasters is to tuck a bootable SD card inside the case with a read-only rescue partition and make the boot manager fail over to it if the main operating system isn't bootable. You can then fix any irritating software failures without the time and expense of going to the site and pulling the server for a day. Generally speaking the on site techs in these places are willing to go power cycle a server for you, or even plug a monitor in and hit some buttons on the boot menu if you really need it.
Hardware basically lasts forever in a decent climate controlled rack. As long as you have at least a 2 disk RAID array, you'd have to be extraordinarily unlucky to experience a failure that necessitates more downtime than the time it takes to rebuild the RAID.
Well, my experience has always been that once you find yourself needing to do a major-version upgrade of kernel or libc -- always necessitated now now now by some catastrophic security bug -- the only safe way to do it is on a brand new drive. Otherwise you risk a week of downtime while chasing down all the failures from cascading dependencies. Like, oh, you wanted a new kernel and surprise! Now you have to have a new apache and php, and nobody gives two shits about backward compatibility, so go "improve" a dozen config files. CADT.
I suppose if you were feeling really enthusiastic, there's nothing stopping you from virtualising your own server and having both a live instance and dev instance on the same box. You could have your dist-upgrade related disasters on the redundant copy, and just swap them over once the drama is resolved.
I suppose it all depends on just how much effort you're willing to put in to being sysadmin guy. If you're saving 100 bucks a month, that's about 2 hours you can probably "pay yourself" to do it, if it comes to more than that then maybe the virtual server route is better.
What if you need to upgrade the VMs' host?
The pesky dependencies don't live there. Known-working OS image + VM software combo is an easy enough target to hit.
I read some of your comments on the previous post and would still recommend you go with something like S3 if you do go ahead and decide on option #1. This would save a lot of money on the 2TB storage (to ~$50/month, before any possible optimizations).
The issues you mentioned with regards to URLs, HTTPS, etc. are a solved problem (I implemented this a few times before myself).
Solved how? Got a link?
Just to make sure I don't repeat anything written before, I went back and looked at the previous thread, but let me know if I missed anything.
Here's what I suggest:
1. Use aws-cli's s3 sync feature to 'rsync' your server's static content (i.e. anything needing no server processing, so not Perl scripts) to an S3 bucket you created named static.jwz.org (for example); a rough sketch of steps 1 and 2 follows the links below. Your directory structure, for all intents and purposes, is preserved. It isn't, but that doesn't matter.
2. Mark the bucket for static website hosting. At this point you can now access your content via the same URLs, with a different domain.
3. Create a redirect on your www server (Apache/nginx/...) for all media requests on www to 301 to static.jwz.org instead (thus preserving urls).
4. Create or use an existing SSL certificate and create a CloudFront distribution (AWS's CDN) over that bucket and add your SSL certificate to it.
5. Point your static.jwz.org domain to the CloudFront distribution you created.
After that there are some optimizations you can do to lower your cost, but that's the initial plan.
[1] http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
[2] http://docs.aws.amazon.com/AmazonS3/latest/dev/website-hosting-custom-domain-walkthrough.html#root-domain-walkthrough-configure-bucket-aswebsite
[4] https://aws.amazon.com/cloudfront/custom-ssl-domains/
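Here's that rough sketch of steps 1 and 2, as a hedged boto3 illustration (the bucket name is the example one from above, the local path is a placeholder, and since boto3 has no built-in sync, the upload loop just stands in for "aws s3 sync"):

    # Rough sketch: push static files to S3 and turn on static-website hosting.
    # Bucket name and local path are placeholders for illustration.
    import mimetypes
    import os

    import boto3

    s3 = boto3.client("s3")
    BUCKET = "static.jwz.org"
    ROOT = "/var/www/htdocs"

    # Step 1: crude stand-in for "aws s3 sync" -- upload everything, keeping paths.
    for dirpath, _, filenames in os.walk(ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            key = os.path.relpath(path, ROOT)
            ctype = mimetypes.guess_type(path)[0] or "application/octet-stream"
            s3.upload_file(path, BUCKET, key, ExtraArgs={"ContentType": ctype})

    # Step 2: mark the bucket for static website hosting.
    s3.put_bucket_website(
        Bucket=BUCKET,
        WebsiteConfiguration={"IndexDocument": {"Suffix": "index.html"}},
    )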
That's not the same url, then.
You do realize this is point #2 in my list, which continues to #5 and was prefixed by "At this point", right?
Most web servers have some sort of "resolve urls with prefix /x/y/z/ by proxying to prefix https://s3.example.com/a/b/c/" feature, so the original URLs don't have to change, but there's no need for a crapton of block storage for the compute host to serve them. It does hit the compute host twice with bandwidth, though, but maybe bandwidth is cheap and plentiful.
There's usually a caching feature with the proxy feature, so if the compute host has a little block storage it can serve popular URLs out of that.
This would only be necessary if you need millisecond improvements in latency; otherwise I'd skip that optimization.
If S3 is the backend, and it's in the same place as the frontend, the cache probably adds latency. On the other hand, if everyone's downloading the same 100MB video file today, the cache will save some download bandwidth costs.
Also, the backend doesn't have to be S3. It could be Backblaze B2 or some cheap startup living in Europe. This would trade cost against latency and the cache can get some of that back.
I think you're referring to proxying S3 through an EC2 instance, while I'm referring to 301ing from an EC2 instance to an S3 url, meaning the instance doesn't even serve the content. Add to that my points #4 and #5, which mean you have a CDN, so S3 isn't even hit most of the time for more frequently accessed content.
If you do it with 301s, there are now two URLs for everything (www.jwz.org and static.jwz.org), and the second URL is not only visible to the user, but the user also assumes the latency hit of making a second request from whatever crappy uplink they have to the new URL. The latency between user and EC2 instance is probably much larger than the latency between EC2 instance and S3, so the requests will be slower.
I'm assuming jwz doesn't want the second URLs visible to users who might bookmark or publish them, which means the host at 'www.jwz.org' would necessarily serve everything. Nothing says lock-in like having a million links in cyberspace to your content hosted at someone else's domain name...that can't ever go away.
New links go to static, old links remain alive. I think you over-estimate the impact to serving in this way. Moving away from this can simply involve having static CNAME www.
Everyone keeps saying "use S3 and CloudFront to reduce your costs", but I don't see it. It looks to me like 2TB of EC2 outbound bandwidth is $183, whereas 2TB of outbound S3/CF bandwidth to the US region is $174. That's a lot of work to save $9/month.
It isn't to reduce bandwidth costs, but to reduce storage costs (but HDD EBS is pretty cheap). Moreover, it offloads work from your server, allowing a smaller server. Also, S3 is more reliable than EC2, so if you can get the html onto S3 or CF then the static part of the website continues working during a reboot, but that sounds like a lot more work than just moving images.
Well S3 looks to be basically the same price as EBS sc1. $0.0245 versus $0.025. That's $1/month difference. (Assuming EBS sc1 is suitable for serving as a non-CDN http root, which I'm guessing it is, but I'm not sure.)
Yeah, that was what I meant by my parenthetical. I think people advocating S3 are forgetting HDD EBS (though I don't know if it is actually a good idea). But reducing the size of the server also saves money. Even if it's not worth it for you, it's useful to understand why the normal advice involves such small servers.
Is it 2TB of outbound bandwidth? I was under the impression that it was 2TB of storage space.
We need to make the distinction of S3 storage ($46/mon) and EC2 storage (EBS at $400/mon), both at 2TB vs. S3+CF outbound traffic ($20/mon + $170/mon) vs. EC2 outbound traffic ($0), both at 2TB.
So S3+CF=$236, EC2=$400, but that's based on the 2TB/2TB numbers.
If you decide to drop the requirement for CF for static content (if you drop SSL support or can use Amazon's own domain) you can drop the price of outbound traffic pretty dramatically[*].
[*] Needs measuring before we can say this for certain.
References:
https://aws.amazon.com/ebs/pricing/
https://aws.amazon.com/s3/pricing/
https://aws.amazon.com/cloudfront/pricing/
My intuition is that anything other than 3 will just end up aggravating you. I feel you should embrace your unfrozen caveman nature, buy a few 1U supermicro servers off ebay and some spare parts, and enjoy being the last man on earth to own his very own cloud.
Ok though this did make me laugh out loud, I don't think "enjoy" is quite the right word here.
Greetings!
I went ahead and contacted sales at HE to get prices for you. You can get 7U of space, with power, and a 100 megabit drop, in their Fremont data center for $150. If you would like, I can forward you their quote. Your unfrozen caveman dreams of administering your very own servers until the EMP which returns us to the stone age can continue uninterrupted :D
For the whole redundancy aspect, the resource requirements for your redundancy box will be far less than those of your primary site.
The redundancy box only needs enough RAM/CPU/disk/bandwidth for MySQL log replication and the occasional rsync job for any media uploads. So, if your primary is costing $400/mo, you could spin-up a perfectly decent failover instance for under $40/mo.
At the risk of being reprimanded by someone I respect, and now that option #3 is on the table, you might want to look into Sonic for colo. I don't use them for that (cringing, awaiting the forked tongue) but as an ISP they're hard to beat. Very technically oriented; the one time I needed support the guy was smarter than me, and I'm a middle-aged nerd with lots of IT and software experience. They spoke my language. Plus, their Bay Area colo has showers and something called a 'mantrap'. I don't know what that is but that's certainly something I'd want in my colo, if, ah, I didn't host my mail and stupid web pages in AWS. Derp.
https://www.sonic.com/business/colocation
I have been a Sonic DSL customer for many years, both at home and work, but my googling and the resultant 404s led me to believe that 1U hosting is something that they used to do and do no longer...
Maybe so, but at the very least their 'let us send you a quote' form has 1U as an option.
I was a Sonic customer for 17 years, 3 of which included colo (started with 1U, ended with half cab before). Please allow me to reminisce in a nostalgic yet useless manner.
Their Santa Rosa datacenter is second only to the SLAC facility in my list of favorite datacenters. Pleasant, safe, reliable, capable, and human-sized. (Unlike certain other football-field-sized datacenters with all the humanity of a former furniture factory.)
The man-trap is amazing - weight sensors, sonar mapping, and biometrics (hand print) all conspiring to reject unauthorized attempts at entry. I tried to get a server in with me one time, and in the process pissed off the man-trap so badly that it refused to let me exit. It was about 3AM, and so I tried and re-tried for about 15 minutes before giving up and pulling the emergency release lever, prompting the loud security alarm to go off, prompting an immediate call from the CEO. For someone who had just been awoken by a security alarm at 3AM by a low-rent customer doing something dumb, he was remarkably diplomatic.
I ultimately moved my gear out because of much more attractive pricing in San Francisco facilities. Network pricing was the biggest issue - they simply couldn't approach the pricing that SF facilities could offer, especially for small fry like me. Plus, freeway traffic between Santa Rosa and SF is often maddeningly slow.
TL;DR: The Sonic datacenter is cool. But probably too spendy and distant.
The way to do email on Amazon EC2 is to use Amazon SES for SMTP rather than running your own.
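In practice that means handing outbound mail to SES's SMTP interface with credentials generated in the SES console; here's a minimal Python illustration, where the region endpoint, credentials, and addresses are all placeholders (a Postfix relayhost pointed at the same endpoint accomplishes the same thing):

    # Hedged sketch: hand outbound mail to Amazon SES over SMTP.
    # Endpoint region, credentials, and addresses are placeholders.
    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "lists@example.com"
    msg["To"] = "someone@example.net"
    msg["Subject"] = "test via SES"
    msg.set_content("hello from EC2, relayed through SES")

    with smtplib.SMTP("email-smtp.us-west-2.amazonaws.com", 587) as smtp:
        smtp.starttls()
        smtp.login("SES_SMTP_USERNAME", "SES_SMTP_PASSWORD")
        smtp.send_message(msg)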
IBM Bluemix (formerly known as IBM Softlayer) will sell you a bare metal server for $697/month with:
- 8 core 2GHz
- 32 GB RAM
- 2x 2 TB disk
- Outbound bandwidth: 5 TBytes / month
They have a data center in San Jose. It took me a little clicking to find a chassis available in that data center. The most expensive component is the bandwidth -- if 1 TB/month will do, it's only $468/month.
From personal experience as a consumer of their stuff, it's relatively easy to have them add more RAM or disk later.
Disclaimer: I work for IBM.
So IBM will rent him what he has now, which costs him $300/month, for merely $700/month?
I see why IBM has been in trouble for decades.
C.
Looks like you can just point your EC2's Postfix at Amazon SES as a relayhost. Is there any benefit to doing that other than, "your outbound mail is coming from Amazon's SMTP IP space instead of from Amazon's hosting IP space"?
The benefit is much better email deliverability, because AWS maintains that part of their infrastructure with deliverability as a high priority. You won't need to (and can't realistically) check their IPs against email blocklists, because AWS handles that for you. Your usage might fall within their free tier.
I have no idea what you mean by "deliverability" in the context of "they are my postfix relayhost".
"Deliverability" as in likelihood of actually showing up in the recipient's Inbox, rather than being rejected by the receiving MTA or flagged as spam and filtered away by the email provider.
And why do you think that adding a delivery hop through SES improves that over my current (working) setup which has the requisite SPF, DKIM and DMARC buzzword compliance?
The only thing that would change would be the source IP address changing from what it is now to an EC2 IP. Or, paying more money to have the last hop be an SES IP for some reason.
Your current setup works now, but as you describe it's pending a big change. It may prove to be difficult to get reputable new IPs. Even if the new IPs don't show up on publicly-available blocklists, by changing sending IPs after so long, you may also trigger some externally-opaque logic that bitbuckets or spamfolders your email. Of course, it may also work out just fine - only time will tell.
SES may make more sense because it provides greater certainty - potentially for free, depending on your email volume.
I have used ARP Networks for five years, and they've been around since 1999. I don't participate in cloud-land, so I just have one machine that does everything I need. You can even get on IRC and complain to the owner if something goes wrong.
They offer all three of these options. Their new semi-dedicated option offers data redundancy so if a physical machine fails it can be brought up elsewhere without intervention on your part. They also offer traditional dedicated servers and colocation.
https://arpnetworks.com/
Regardless of where you go, I think #2 or #3 is your best option. You're probably not going to find a #1 that's (a) cheap and (b) reliable. It took me about 10 different hosts until I settled on ARP though. Probably best to go on the wayback machine and see who's been around long enough.
This may well be. I keep trying to talk myself into #1, though, because I am very attracted to the idea of: "It is Upgrade Season, click a button to have a freshly-installed computer to port everything to."
And also because #1 is what everyone else in the world does, so maybe doing anything else is just intentionally making life hard on myself.
Getting in a car and visiting a data center with a new hard drive in hand to replace the one that failed just sounds medieval, so I really want to avoid #3.
I do #3, for what it's worth, and have for the past 8 or so years. On getting into a car and visiting a datacenter:
Having someone who lives near the datacenter helps a lot, but if you're willing to spend some money to avoid driving out for each incident, any competent colo facility will have remote hands that can handle the vast majority of cases. ("I am an idiot and did not build an IPMI card into this machine; can you install the one that's about to arrive in a box" is the kind of thing that they do at a price; "can you swap out hard drive #3 for the one that's about to arrive in a box" is the kind of thing they will do for free; but "can you swap out the power supply for this one that's arriving in the mail that is physically incompatible and will require you to solder wires together because Asus are total fucking morons" is the kind of thing that you actually do need something more than a meat-drone there for.)
All that said, I still have my machine in a datacenter in Chicago (Steadfast, for whom the best I can really say is 'well, they are kind of cheap'), even though I no longer live anywhere near there. I know someone who lives there; that helps. But even still, I don't feel the urge to move it anywhere near here, to be sure. The latency just isn't that bad.
At some point (in fact, when that machine's power supply died), I did the numbers on migrating Into The Cloud. I, as well, found that it was more than I was willing to spend -- especially given as that the capital cost of the now-8-year-old-machine has long since been amortized away.
#1 and #2 both give you this in practice. If you are just going to run one monolithic server then the significant difference between #1 and #2 is the time it takes for the button click to take effect and how much it costs per click. #2 might take a few hours and cost you a new host setup fee (which many places will waive if you sign up for a year or more).
I've been using a pair of #1 (secondary) and #2 (primary) hosts for almost ten years now. #3 now seems like an unimaginable hassle to me.
"And also because #1 is what everyone else in the world does, so maybe doing anything else is just intentionally making life hard on myself."
Beware lemming logic. Just because everyone else is jumping off the cliff does not mean it's a good idea.
With the same caveats as listed by the knowledgeable folks above: Hurricane Electric has done a stand-up job with conventional & virtual hosting for me for nigh 16 years now. They're right across the water in Fremont, by all appearances seem to have a solid grasp on network infrastructure*, and they're super responsive to requests to fix things on the rare occasions they needed fixing. It's also worth noting that in those years of service, by my count they've been down a total of twice.
* I'm aware that any statement about someone having a solid grasp on technology is a lightning rod and fully expect at least one person to author a discursive explanation about How That Is Just Dead Wrong Because Rant.
Oh, and I suppose I shoulda included a link: he.net
They were mentioned before, but their price list is "call us on the phone and let's haggle" so I had ignored them...
We have used HE.NET since the mid 90s to host our web site and a bunch of different databases. We have had almost no problems, and their support has been very very good (almost instantaneous). I guess I won't mention what we pay, but it is quite affordable for a tiny company, less than many of the options that you are currently considering. You might want to take another look at them (or talk to them on the phone, I'm pretty sure they speak Old School).
And no, I don't work for them. I just feel like they have treated us well and want to give back with a little positive review.
I would like to suggest OVH.ca as a hosting provider. 2 TB would run 89 CAD (75 USD) a month. But it's on the East Coast and in the wrong country.
Dumb idea: Get a bunch of cheap VPS instances on various hosts, f around with them for a couple months, decide later what your one true host is.
Yeah, you spend a few hundred. You also do research for yourself.
I think Pinhead (the Cenobite) famously had it right when he said, "Your suffering will be legendary, even in Hell." Heck, even in San Francisco!
I'd add my voice to those above recommending buying generic 1Us (Dell R330s, etc), stocking them with suitable kit, doing a few days of burn-in, then driving them to the local colo with your choice of price/service/reputation. Your basic rack servers these days typically come with redundant power supplies, RAID support, dual-port GB NICs, etc. Properly appointed and configured, you could afford to lose almost one of anything (short of the motherboard) and keep on trucking. If you're getting used equipment, you might be able to do it cheaply enough to afford another entire (identical) system for patch testing, rollovers, etc.
Besides, I think we all look forward to hearing
(NO CARRIER)
jokes in perpetuity... Rock on, Caveman.
Several people said this, but it got drowned out by people saying stupid things. Raw S3 requires changing URLs, but S3+Cloudfront does not. Configure Cloudfront to get images from S3 and to proxy other content from your server.
I was in a similar situation two years ago.
I had a dedicated server for my websites forever. Every few years I had to move because of hardware failure or upgrades.
Then I moved to a virtual server, which made it a little bit better as I could at least do some upgrades through the interface without contacting support.
Finally I moved to AWS, just because I wanted to be as far away as possible from the hardware. It is a bit more expensive, but the price falls every year (something the providers above never do) and it has definitely slowed the greying of my hair.
The API is not especially nice, but allows you to automate at least some routine tasks.
So what's wrong with something like this:
https://www.nocix.net/cart/?id=240
- 3.2GHz / 3.6GHz turbo, 4 cores / 8 threads
- 32GB DDR3
- 4TB SATA
- 33TB monthly transfer
- 5 usable IPv4 addresses
- /64 IPv6 address block
It's like 1/10th of the price of every other comparable option, so what's the catch?
Probably really shitty support.
And on what's probably pretty old hardware - that's a 2012-vintage processor.
The catch is that NOCIX is an unmanaged provider. Their techs respond to tickets in under 10 minutes but don't expect any help unless the issue is hardware or network related. With software or OS issues they are completely hands off, and your only option, if you're really screwed, is to reload the system from scratch, as opposed to some place like Softlayer that will help you out if you bork your OS.