Archive for October, 2007

Mr Kirkland

Facebook Apps on EC2 update

I wrote an overview of using EC2 for hosting Facebook apps a few months back. I’ve been poking around a little more with EC2 lately and have a couple of items to report back.

Facebook ‘hello world’ Public EC2 image

I spotted this public AMI for getting started with Facebook. It ships with:

1. Facebook Client Libraries
2. HelloWorld Facebook Application that lists the objects in your Amazon S3 bucket
3. Footprints Facebook Application that is shipped with Facebook Client Libraries

New Large and Extra Large Instance Types

I previously mentioned setting up a cluster of servers to deal with the potentially high traffic for Facebook apps. However, Amazon have since released some beefier instance types that could absorb a lot more traffic before showing the strain.
They now have 3 types of instance: small, large and extra large.

Small Instance (default)

1.7 GB memory, 32-bit platform
1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
160 GB instance storage (150 GB plus 10 GB root partition)
Instance Type name: m1.small (used in EC2 APIs)
Price: $0.10 per instance hour

Large Instance

7.5 GB memory, 64-bit platform
4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
850 GB instance storage (2 x 420 GB plus 10 GB root partition)
Instance Type name: m1.large (used in EC2 APIs)
Price: $0.40 per instance hour

Extra Large Instance

15 GB memory, 64-bit platform
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage (4 x 420 GB plus 10 GB root partition)
Instance Type name: m1.xlarge (used in EC2 APIs)
Price: $0.80 per instance hour

So potentially you could quickly shift from a small instance to a large one, then to extra large, as required.
NB: There are still great benefits to going down the cluster route rather than relying on a single server, particularly if you’re doing anything with a database (of course you are!). In that case you’ll really want replication onto a separate machine to minimise data loss in the event of failure.
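To put the hourly prices listed above into monthly terms, here is a rough sketch. The 30-day month and the function names are my own assumptions, and the figures cover instance hours only, not bandwidth or S3 storage.

```python
# Hourly prices and compute units as listed in this post.
INSTANCE_TYPES = {
    "m1.small": {"price_per_hour": 0.10, "compute_units": 1},
    "m1.large": {"price_per_hour": 0.40, "compute_units": 4},
    "m1.xlarge": {"price_per_hour": 0.80, "compute_units": 8},
}

HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month, running non-stop


def monthly_cost(instance_type, count=1, hours=HOURS_PER_MONTH):
    """Instance-hour cost in dollars (excludes bandwidth and storage)."""
    price = INSTANCE_TYPES[instance_type]["price_per_hour"]
    return round(price * hours * count, 2)


def price_per_compute_unit(instance_type):
    """Hourly price divided by the number of EC2 Compute Units."""
    spec = INSTANCE_TYPES[instance_type]
    return round(spec["price_per_hour"] / spec["compute_units"], 4)


if __name__ == "__main__":
    for name in INSTANCE_TYPES:
        print(name, monthly_cost(name), price_per_compute_unit(name))
```

Under these assumptions a small instance comes to about $72 a month, and all three types work out at the same $0.10 per compute unit per hour, so the bigger instances buy convenience and memory rather than cheaper compute.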

Amazon EC2 as webhosting replacement

So I’m currently researching hosting options for our infrastructure expansion on www.theartistsweb.co.uk. We already make heavy use of S3, so I wanted to look into taking advantage of EC2 as well. I’m not a complete newbie to EC2: I used it to scale a popular Facebook application, which worked well.

However, more traditional website hosting has a few specific needs that EC2 doesn’t cover as standard. Read on to see the results of my research so far.

Disadvantages

Let’s start with the bad news :)

No Persistent Storage

Simply put, if you shut down an EC2 instance all of its data is lost. Reboots are okay, but it is still possible for an instance to fail (the program is still in beta, by the way), in which case you’ll lose your data. I actually consider this an advantage, as it forces you to plan for failure and so create a more robust setup. Ideally you’d use S3 to provide your persistent storage whilst using EC2 for quickly replaceable nodes to serve the data/application.

And in reality cheapo dedicated servers have hardware failures too. In my own experience with uk2.net I went through 4 dedicated servers in a month, as each had faulty RAM - I couldn’t believe it was a hardware fault in every machine until the 4th one just worked.

No Static IP Address (yet)

This is potentially a big disadvantage in my opinion. At the moment you can’t reserve an IP address, so each time you create a new instance you get a new IP. As you should generally expect failure at some point, this means you have to assume an (at least occasional) change of IP address. Whilst you could get by with a changing IP address by using a low TTL and/or a third-party dynamic DNS service/API, the following cases could be show stoppers if not handled properly:

  • DNS caching: theoretically DNS caches should obey your TTL, but there may be some links in the chain to your website that don’t (proxies, client applications etc.)
  • Reverse IP, PTR (mail reverse IP lookups): this is a big deal for any mail host in my opinion. Sending mail involves jumping through tighter and tighter hoops nowadays, and reverse DNS (PTR) records for your mail server are a must. On top of this, many large mail providers (Yahoo etc.) associate reputation information with IP addresses, so a change in IP address can directly affect your mail deliverability. I’m sure this can largely be dealt with through configuration and careful planning, but it’s a non-standard setup with potential for problems - poor mail deliverability means poor/lost business. See the AWS EC2 forum for plenty of discussion on this problem.
  • General pain in the arse: static IP addresses are probably what you (and your customers) have been used to for the last x years, so apart from the 2 key points above there are bound to be some fixed-IP skeletons in the closet that will come out if you are migrating an existing infrastructure.

You can see on the EC2 forum that quite a few people want optional static IPs, and this is apparently going to be a new feature ‘soon’.
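To illustrate the reverse DNS point in the list above, here is a minimal sketch of the kind of check a receiving mail server might run, often called forward-confirmed reverse DNS: look up the PTR name for the sending IP, then confirm that name resolves back to the same IP. The function name and the injectable resolver arguments are my own, added so the logic can be tried without touching real DNS. Real mail servers apply further checks that are not shown here.

```python
import socket


def forward_confirmed_rdns(ip, reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return True if ip has a PTR name that resolves back to ip.

    reverse and forward default to the standard library resolvers;
    both return (hostname, aliases, addresses) tuples.
    """
    try:
        hostname, _aliases, _addresses = reverse(ip)
        _name, _aliases, addresses = forward(hostname)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses:
        # a missing PTR or an unresolvable name counts as a failed check.
        return False
    return ip in addresses


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address, used here as a placeholder.
    print(forward_confirmed_rdns("192.0.2.10"))
```

A freshly launched instance with a new IP would fail this sort of check until both the PTR record and the forward record have been brought back into line, which is exactly the window in which mail can start bouncing.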

Some other forum posts.

Lack of Support

You’re really on your own here. Amazon is providing a platform rather than a controlled environment of dedicated servers, so you can hardly expect Amazon tech support to hold your hand through your software issues. If you need built-in software support, stick with a standard supported plan from a regular ISP.

Cost

The bandwidth prices can rack up: see the cost calculator and compare with some other offers.

Advantages

Cost!

The cost of running an instance is comparable to regular dedicated servers and increases linearly with the number of instances you deploy, so there are no nasty surprises as your needs grow - see the cost calculator. You could also use an instance only periodically for testing, compute tasks etc., as you are billed by the hour, not the month. However, compared with other offerings the bandwidth is where the expense lies, so it really depends on your needs/usage.

Quick Set Up

You can fire up an instance in a few minutes and choose from a wide variety of public images to get started. Once you’ve fully tweaked and set up your instance, you can then make your own AMI to launch more instances from. This is a big time saver over the traditional setup and tuning of a dedicated server: how long would it take you to drop another server into your pool?

Scale

The real beauty is the on-demand aspect: with proper planning you can quickly fire up a few extra instances and absorb huge spikes. If your infrastructure is 1 or 2 servers (and will be for the foreseeable future) then there’s no appeal here. However, if you need to plan a path for growth then EC2 might still be worth considering (as I still am!).

Wahoo Comment Spam

In my experience, comment spam on an individual blog/CMS (like a WordPress-based site) is fairly easy to spot: typically it’ll be riddled with obvious keyword links to viagra, loans, porn etc. Even without a self-learning spam detection system, simply blocking comments with more than a couple of links can filter out a lot of this stuff. And of course, making all comments go into a moderation queue, so that they are manually checked by a human, will stop all such comments.
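As a rough sketch of the "more than a couple of links" rule, something like the following would do. The threshold of 2 and the function names are my own assumptions, and counting `http://` or `https://` occurrences is a deliberately crude stand-in for proper link parsing.

```python
import re

# Counts URL schemes rather than <a> tags, so a linked URL is counted once.
URL_PATTERN = re.compile(r"https?://", re.IGNORECASE)

MAX_LINKS = 2  # assumption: "a couple" means two


def count_links(comment_body):
    """Number of http(s) URLs appearing in the comment text."""
    return len(URL_PATTERN.findall(comment_body))


def should_block(comment_body, max_links=MAX_LINKS):
    """True if the comment contains more links than we tolerate."""
    return count_links(comment_body) > max_links


if __name__ == "__main__":
    spam = "Cheap loans http://a.example http://b.example http://c.example"
    print(should_block(spam))
    print(should_block("Thanks, that's really useful"))
```

Note that a link-free comment like the second one sails straight through, which is the weakness the next section is about.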

Smarter Spammers

Smarter spam will not be so blatant and will attempt to mimic a genuine comment. This is actually quite easy to do - a generic post along the lines of “Thanks, that’s really useful” would probably sound genuine on the majority of informative posts, and if the post doesn’t contain any links then why suspect such a comment? The giveaway is the commenter’s URL. Assuming your comment system allows commenters to add their URL (and in particular without rel="nofollow" on the link), arguably the spammer’s goal has been accomplished - a cheap link back to their site. Even without specific keyword link text, a link on a page relevant to the spammer’s desired keywords will be a good catch.

How To Stop It

If you are maintaining a single isolated blog, then you may not have a means of picking up on comments like this. Visiting the websites of the commenters can certainly help weed out the obvious offenders, though arguably this is a form of referrer spam - getting you to visit the site!

If you maintain a number of sites, then you can spot this subtle breed of comment spam. For example, just today I received 2 identical comments on 2 completely unrelated blogs, both linking to wahoo.com. Furthermore, the blogs weren’t English-language sites either, so an English comment is a dead giveaway.

One action you can take is to make sure you’re using a third-party comment spam tool, such as Akismet, and mark the comments as spam.
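If you do have the comments from several sites in one place, spotting identical comments can be sketched along these lines. The `(site, body)` tuple format and the function names are assumptions for illustration, and the normalisation only lowercases and collapses whitespace.

```python
from collections import defaultdict


def normalise(body):
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(body.lower().split())


def find_cross_site_duplicates(comments):
    """Map each comment body seen on more than one site to those sites.

    comments is an iterable of (site, body) tuples.
    """
    sites_by_body = defaultdict(set)
    for site, body in comments:
        sites_by_body[normalise(body)].add(site)
    return {
        body: sorted(sites)
        for body, sites in sites_by_body.items()
        if len(sites) > 1
    }


if __name__ == "__main__":
    sample = [
        ("blog-a.example", "Thanks, that's really useful"),
        ("blog-b.example", "thanks,  that's really useful"),
        ("blog-a.example", "Which theme is this?"),
    ]
    print(find_cross_site_duplicates(sample))
```

Anything this turns up is only a candidate for review, since short genuine comments can coincide too.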

I’ve chosen to go a step further and mention that wahoo.com has comment spammed several of the sites I maintain - surprisingly a seemingly legitimate enterprise, not some cheap viagra selling outfit! I wonder if this is the work of a dodgy SEO outfit…


About

You are currently browsing the Mr Kirkland 2.0 weblog archives for the month October, 2007.
