machetEC2, the Infochimps Amazon Machine Image (AMI) designed for data processing, analysis, and visualization, has been released!

Amazon’s Cloud Computing services give you transformatively cheap and scalable computing power, and their Public Data Sets (AWS/PDS) collection (which Infochimps is contributing to) is helping to put the world of free, open data at your fingertips. machetEC2 lets you summon a “batteries included” computer, or a hundred computers, from the cloud. As soon as it loads, you’re ready to start crunching, transforming, and visualizing data, whether from AWS/PDS, infochimps.org, or your own pool.

When you SSH into an instance of machetEC2 (brief instructions after the jump), check the README files: they describe what’s installed, how to deal with volumes and Amazon Public Datasets, and how to use X11-based applications. You can also visit the machetEC2 GitHub page to see the full list of packages installed, the list of gems, and the list of programs installed from source.

This machete is only as sharp as it is complete. If there’s software that you find indispensable, we encourage you to suggest it here or, even better, to help add it to the toolkit (instructions are within).

To launch an instance of machetEC2, log into the AWS Console, click “AMIs”, search for “machetEC2” or ami-29ef0840, and click “Launch”. If you’re on the command line, simply run:

$ ec2-run-instances ami-29ef0840 -k [your-keypair-name]

By the time you’ve grabbed some coffee, you’ll be able to access an EC2 instance with all the tools you need for working with data already installed, configured, and ready to hack.
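If you expect to launch more than once, it can help to keep the AMI ID and keypair name in one place. The wrapper below is only a sketch of that idea: the function name launch_machetec2, the KEYPAIR and DRY_RUN variables, and the placeholder keypair name are all our own inventions for illustration, and it assumes the EC2 command-line tools are already installed and configured. By default it only prints the command it would run.

```shell
#!/bin/sh
# Sketch: wrap the launch command from this post so the AMI ID and keypair
# are set in one place. Nothing is launched unless DRY_RUN is set to 0.

AMI_ID="ami-29ef0840"             # AMI ID given in this post
KEYPAIR="${KEYPAIR:-my-keypair}"  # hypothetical keypair name; replace with yours
DRY_RUN="${DRY_RUN:-1}"           # 1 = print the command only, 0 = really launch

launch_machetec2() {
    # $1: name of the EC2 keypair to attach to the instance
    if [ "$DRY_RUN" = "1" ]; then
        # Show what would be executed without contacting AWS.
        echo "ec2-run-instances $AMI_ID -k $1"
    else
        # Same command as shown above, with the keypair filled in.
        ec2-run-instances "$AMI_ID" -k "$1"
    fi
}

launch_machetec2 "$KEYPAIR"
```

Saved as a script (the file name is up to you), running it with DRY_RUN=0 and KEYPAIR set to your own keypair name performs the real launch. Once the instance is up, ec2-describe-instances from the same toolset lists its state and public hostname.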

You can obtain a copy of the machetEC2 build scripts at the Infochimps machetEC2 GitHub page. If you improve them, send us a pull request on GitHub and we’ll include your contributions in the next build of machetEC2!

This is our first build of machetEC2 and we’re very excited to have the community’s input on what’s missing and what needs to be improved. We’ve incorporated many of the suggestions from our RFC post, but not all of them made it into this initial release, for reasons of either time or disk space. We’re tracking things to add, though, so either post below (comments on this post will become entries in a wiki soon to be hosted at machetEC2.org) or, as we said, send along a pull request.