Mike Conigliaro

Troubleshooting Headless Tests on a Remote CI Server

I just ran into a problem that was causing the Jasmine tests on our Jenkins CI box to hang forever, and I figured I should document this handy little troubleshooting tip in case someone else might find it helpful.

If you hop onto your CI box while your headless browser tests are running, you should see an Xvfb process that looks something like this:

# ps -efww | grep Xvfb
jenkins  18833     1  1 21:18 ?        00:00:00 /usr/bin/Xvfb :99 -screen 0 1280x1024x24 -ac

Since Xvfb is using display :99, you’ll want to run x11vnc accordingly:

$ x11vnc -display :99
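
If you would rather script this step than read the display number by eye, here is a minimal sketch in Python. The function names are made up for illustration, and it assumes the Xvfb command line looks like the ps output above (the display is the first argument after the binary name).

```python
import re


def xvfb_display(ps_line):
    """Return the X display (e.g. ':99') from an Xvfb command line, or None."""
    # Assumes the display is the first argument after "Xvfb",
    # as in the ps output shown above.
    match = re.search(r"\bXvfb\s+(:\d+)", ps_line)
    return match.group(1) if match else None


def x11vnc_command(ps_line):
    """Build the x11vnc argument list for the display found in ps_line."""
    display = xvfb_display(ps_line)
    if display is None:
        raise ValueError("no Xvfb display found in: %r" % ps_line)
    return ["x11vnc", "-display", display]
```

The returned list can be handed to something like subprocess.run, which avoids any shell quoting issues.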

Now you should have a VNC server listening on port 5900. Just fire up your VNC viewer and connect as usual:

$ vncviewer host.example.com:5900

But what if you’re accessing your CI server over an insecure network? You can use SSH local port forwarding to create a secure tunnel:

$ ssh -L 5900:localhost:5900 host.example.com x11vnc -display :99

Now you can connect to the VNC server over the secure tunnel:

$ vncviewer localhost:5900

From WordPress to nanoc

I finally decided to start blogging like a hacker, which is ironic, considering that I’ve actually come full-circle now. Back in the late nineties (before I knew about databases, and before the term “blog” even existed), I spent a lot of time working on a Perl-based blogging engine that worked much like a lot of today’s static site generators (though with far less sophistication). Instead of working with sane formats like YAML or Markdown (which didn’t exist back then), I ended up managing everything in my own pseudo-XML format. As ugly and hackish as this system was, the resulting output was pretty nice, and the project was definitely a good learning experience for me as a teenager. But now that static site generators are all the rage, there are a lot of much better options out there, which gives us “hackers” a good opportunity to migrate away from WordPress without a ton of effort.

After reading this thread on Hacker News about yet another new static site generator, I decided to give nanoc a spin (mostly because it seemed to be the most popular option among the people commenting on that thread). I skimmed through the nanoc Getting Started guide and (for the first time in several years) set out building a new layout for my new site. Since I didn’t want to repeat the pain of the old days, I made sure to get Compass integration working right away.

Once the layout was done, I spent about a week slowly reimplementing features and moving data over from my old WordPress site. The static pages were basically just a copy & paste (with some manual converting to Markdown), but there was no way I was going to repeat that process for 100+ blog posts. There are a few example WordPress-to-nanoc scripts floating around, but they all left a lot to be desired, so I ultimately ended up writing my own. The result of that effort can be found in wp2nanoc.rb. Besides the addition of some nice command line option parsing, my script also does some basic conversion from HTML to Markdown and from WP-Syntax to SyntaxHighlighter.
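
To give a feel for what “basic conversion from HTML to Markdown” can mean, here is a minimal sketch in Python. It is not the code from wp2nanoc.rb (which is a Ruby script); the rule list and function name are assumptions for illustration, and it only handles a handful of simple, non-nested tags.

```python
import re

# (pattern, replacement) pairs for a few common tags. Anything not
# listed here is passed through unchanged.
RULES = [
    (re.compile(r"<(strong|b)>(.*?)</\1>", re.S | re.I), r"**\2**"),
    (re.compile(r"<(em|i)>(.*?)</\1>", re.S | re.I), r"*\2*"),
    (re.compile(r'<a\s+href="([^"]*)"[^>]*>(.*?)</a>', re.S | re.I), r"[\2](\1)"),
    (re.compile(r"<h([1-6])>(.*?)</h\1>", re.S | re.I),
     lambda m: "#" * int(m.group(1)) + " " + m.group(2)),
]


def html_to_markdown(html):
    """Apply each rewrite rule in order and return the converted text."""
    text = html
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

Regular expressions are fine for the predictable markup WordPress tends to emit, but anything deeply nested really calls for a proper HTML parser.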

The last thing to bring over was my comments, and Disqus handled most of that for me. I basically just installed the Disqus WordPress plugin and ran through the automatic import process. Then to make the comments show up on my new site, I set disqus_shortname and disqus_url to the appropriate values in my embed code. Note: If you find the developer documentation as confusing as I did, just know that these are the only values that need to be set. I originally tried using disqus_identifier, but that didn’t work, because the plugin uses unconfigurable, WordPress-specific values for this option which obviously won’t be available in nanoc.

So what did I gain from my migration to nanoc?

  • I can now keep my entire site (data and all) in git
  • Static files mean blazing speed
  • No database means I can host my site in Amazon S3 for pennies
  • No more worrying about PHP/WordPress security issues
  • I got to experiment with new (to me) technologies like Haml, Sass/SCSS, Compass and Blueprint

PHP Considered Harmful

I know what you’re thinking. “Not another anti-PHP blog post!” But instead of complaining about specific deficiencies in the language, or how I think PHP encourages you to be a bad programmer or whatever, I want to talk about a fairly eye-opening conversation I just had with a friend. We were talking about possible optimizations for some of our Project Euler solutions. Since I don’t have enough of a background in number theory to come up with many of those kinds of tricks, I suggested that threading might be a simple way to make a big difference for some problems.

Friend: “I don’t know what you’re talking about.”

Well, for example, instead of iterating through a huge range of numbers with one giant loop, you could break the range up into several smaller ranges (according to the number of cores on your machine), then run your algorithm several times in parallel and just sum up your results at the end.
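
The range-splitting idea above can be sketched in a few lines of Python. This is only an illustration, not code from the conversation: the function names are made up, and the per-chunk work (summing multiples of 3 or 5, as in the first Project Euler problem) is a stand-in for whatever your algorithm does.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_range(start, stop, parts):
    """Split [start, stop) into at most `parts` contiguous sub-ranges."""
    size, extra = divmod(stop - start, parts)
    ranges = []
    lo = start
    for i in range(parts):
        # The first `extra` chunks get one additional element.
        hi = lo + size + (1 if i < extra else 0)
        if hi > lo:
            ranges.append((lo, hi))
        lo = hi
    return ranges


def sum_multiples(bounds):
    """Stand-in workload: sum the multiples of 3 or 5 in [lo, hi)."""
    lo, hi = bounds
    return sum(n for n in range(lo, hi) if n % 3 == 0 or n % 5 == 0)


def parallel_sum(start, stop, workers=None):
    """Run the workload on each sub-range and add up the partial results."""
    workers = workers or os.cpu_count() or 1
    chunks = split_range(start, stop, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_multiples, chunks))
```

One caveat: in CPython the global interpreter lock means threads take turns executing Python bytecode, so for CPU-bound work like this you would normally swap ThreadPoolExecutor for ProcessPoolExecutor (same interface) to actually use several cores.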

Friend: “Haha, I still have no idea what you’re talking about.”

And then it dawned on me; my friend doesn’t have a formal computer science or engineering background, and he’s been working with PHP (a language with no concept of concurrent programming) almost exclusively for about a decade now. He’s a smart guy, and not what I would describe as a stereotypical “bad PHP programmer,” but the fact that he hasn’t spent much time with any other languages means he’s never been exposed to this very fundamental concept in computer science. Now, I’m fully aware that most web developers don’t normally have to deal with concurrent programming at all, but I still think it’s something every programmer should have some rudimentary knowledge of. And I’m willing to bet that if my friend had been working with Python or Ruby all these years, he would have at least seen some mention of threads in library documentation or someone else’s code.

But maybe the point isn’t just that PHP is harmful. Maybe it’s that spending all your time working with any one language is harmful. This conversation made me wonder what kinds of things I might be missing out on by spending all of my time working with Ruby and Python!

Project Euler

I’ve been working on Project Euler problems in my spare time for the last few weeks. I don’t really know what a “good” score is, but here’s my ranking so far:

I think one of the reasons these problems are so much fun to solve is that no single algorithm is necessarily more “correct” than any other (assuming you get the correct answer at the end). Consequently, there are widely varying philosophies when it comes to how they should be solved. Some people go for the most efficient algorithms, using all kinds of arcane bit-shifting tricks in low-level languages like C and assembly. Other people are only concerned with programmer productivity, opting for simple brute-force solutions that may take several hours to finish.

My personal goal has been to come up with elegant and reasonably efficient algorithms in as few lines as possible. Nice looking code almost always wins out over efficiency for me, but if my solution takes more than 30 seconds to finish, then I know I have some serious optimization to do. For a non-math guy, I think I’ve been doing pretty well so far, considering that most of my solutions finish in a fraction of a second, and the average across all my solutions is currently less than 3 seconds.

Which brings me to my next point. As a person who went through school absolutely dreading my next math class, I find it kind of amazing that I’ve been willing to spend hours solving these problems in my spare time. This tells me that there’s something seriously wrong with the way math is currently being taught. I suspect Bret Victor is on to something when he says “math needs a new interface.”

Initial Thoughts on Migrating from Amazon EC2 to Rackspace Cloud

Block Devices are Tied to the Instance Type

On EC2, each instance type has a predefined CPU and memory size, but thanks to “Elastic Block Store” (which is managed independently from the actual instance), you can make your block devices as large or as small as you want. You can also attach additional block devices as needed. This gives you a lot of flexibility to provision the appropriate resources for your specific application and to grow things as you need to. RackSpace Cloud has no EBS equivalent, so the size of your disk seems to be static and tied to the instance type. This means if you start to run out of space, you apparently have no choice but to upgrade to the next instance size, regardless of whether you actually need the additional CPU/memory. Based on a conversation I had with support, I’m guessing this has to do with the fact that all block devices are created locally on the physical VM host, rather than on a SAN. So I can definitely see how this architecture would make it difficult (or even impossible) for RackSpace to implement any of the features made possible by Amazon’s EBS.

See The ability to choose amount of RAM and HD space separately on the RackSpace Cloud feedback forum.

Password Logins by Default

On EC2, one of the first things you do is set up an SSH keypair for your account. This saves you from having to set a root password for new instances. You just select the appropriate keypair when creating the new instance and log in with your SSH key. As far as I know, there is no such feature in the RackSpace Cloud. After you request a new instance, you have to wait for a randomized root password to be emailed to you. Let me repeat that in case you missed it. Your root password is emailed to you in plain text over the Internet. Hmm…

See Do not send root password by email on the RackSpace Cloud feedback forum.

Unable to Stop Instances

Yup, it’s just like being in the old days of EC2 before root EBS volumes. Once an instance is started, you can reboot or terminate it, but you can’t actually stop it to save money. At my previous company, part of our continuous deployment process was to automatically spin up a staging environment to test new code before actually deploying it into production. We also had a dedicated testing environment which we would spin up on demand for testing various things. Traditionally, it was very expensive to run duplicate (or triplicate) environments for testing, but EC2 makes this sort of thing trivially inexpensive, since the instances don’t actually have to be running most of the time. I don’t think something like this would be feasible in the RackSpace Cloud, because constantly terminating and rebuilding every instance in every environment would make things a lot slower and more difficult to manage in general. I realize the process could be sped up a bit by creating a bunch of VM images, but I don’t even want to get started on why I hate that idea. Configuration management has made images obsolete as far as I’m concerned.

See Need option to suspend servers to save money on the RackSpace Cloud feedback forum.

No Concept of Security Groups

I guess I just got used to the peace and security of EC2 security groups, because I took it for granted that RackSpace Cloud would have something similar. So boy was I surprised when I discovered that my first new instances were essentially sitting wide open on the Internet! Now if you’re using a configuration management system, it’s not a huge deal to set up a local firewall on all your instances. But it can definitely be scary, because the lack of real console access in the cloud means there’s a very real possibility that you could accidentally lock yourself out of an instance while testing new firewall rules.

See Create EC2-like security groups, so you don’t have to configure iptables for each instance on the RackSpace Cloud feedback forum.

DNS Annoyances

One of the nice things about the way DNS is configured on EC2 is that when you resolve a public hostname from an instance, you’ll actually get the internal IP address. This means you can use your public hostnames everywhere, and everything will continue to work just fine. Since DNS doesn’t work this way in RackSpace, things just get a bit more complicated, but again, this is mostly just an annoyance to me right now.

Unable to Change Filesystem

The default filesystem on the Ubuntu image is ext3. Want to convert to ext4 in order to (for example) run MongoDB according to 10gen’s official recommendations? Oops, too bad.

Cloud Load Balancers do not Support SSL Termination

In EC2, it’s possible to upload your SSL certificates to an Elastic Load Balancer (ELB) and have your SSL connections terminate right there (i.e. to accept and decrypt SSL traffic on the ELB and forward it in plain text to the back end).

             |         (HTTPS)
       +-----+-----+
       | Amazon ELB|
       +-----+-----+
             |         (HTTP)
      +------+------+
      |             |
+-----+-----+ +-----+-----+
|   app01   | |   app02   |
+-----------+ +-----------+

It’s nice to be able to offload some work to the ELB, but it’s (almost) necessary if you have something like HAProxy or Varnish in front of your application servers (HAProxy and Varnish will not be able to read your SSL encrypted traffic, and therefore, will not be able to make decisions based on the requested URL, headers, etc.). This means you’ll have to stick something like stunnel between the RackSpace load balancer and HAProxy/Varnish/Whatever to handle the SSL decryption.

See Support SSL termination on Cloud Load Balancers on the RackSpace Cloud feedback forum.

Cloud Load Balancers do not Support X-Forwarded-For, X-Forwarded-Port or X-Forwarded-Proto Headers

These are pretty important (especially X-Forwarded-For) if you want to know anything about the clients connecting to your servers. Not having them means all your HTTP requests will appear to come from your load balancer, which is essentially useless. RackSpace support told me X-Forwarded-For would be available in Q3 of this year, and that X-Cluster-Client-Ip can be used in the meantime (though it appears that X-Cluster-Client-Ip still isn’t sent with HTTPS requests!), but there are apparently no plans to support X-Forwarded-Port or X-Forwarded-Proto.
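
On the application side, the interim workaround amounts to checking both headers and falling back to the socket address. Here is a minimal sketch in Python; the function name and the plain-dict view of the request headers are assumptions, not part of any RackSpace API.

```python
def client_ip(headers, remote_addr):
    """Best-guess client address for a request that came through a load balancer.

    `headers` is a mapping of request header names to values and
    `remote_addr` is the peer address of the TCP connection (which will
    be the load balancer itself when one sits in front of the app).
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    # X-Forwarded-For may hold a comma-separated chain of addresses;
    # the left-most entry is the original client.
    forwarded = lowered.get("x-forwarded-for", "").strip()
    if forwarded:
        return forwarded.split(",")[0].strip()

    # Fall back to the header suggested as a stopgap.
    cluster = lowered.get("x-cluster-client-ip", "").strip()
    if cluster:
        return cluster

    # Neither header present (e.g. HTTPS traffic): all we know is the peer.
    return remote_addr
```

Both headers are trivially spoofable, so only trust them when the request really did arrive from your load balancer.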

See add the x-forwarded-for header to traffic from your cloud load balancer. on the RackSpace Cloud feedback forum.

HTTPS Health Checks on Cloud Load Balancers Occur in Plain Text

How on Earth did this get past QA? Basically what this means is if you set up an HTTPS load balancer (e.g. listening on port 443 and forwarding to 443 on the backend), and you set up an HTTPS health check from the load balancer (i.e. to check the HTTPS version of your site at https://host.example.com/health), you’ll discover that the load balancer essentially makes requests for http://host.example.com:443/health, which will obviously never work, and will result in the load balancer removing all of your instances from rotation. The only workaround is to use the CONNECT health check method, which can only ensure that a port is listening.
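
To make it clear how little a connect-only check tells you, here is roughly what one boils down to, sketched in Python. The function name is made up, and this is not RackSpace’s actual implementation, just the general technique of opening a TCP connection and closing it again.

```python
import socket


def port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # Nothing is sent or read: no TLS handshake, no HTTP request.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

A web server that is accepting connections but returning errors for every request would still pass this check, which is why it is a poor substitute for a real HTTPS health check.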

Update: This should be fixed as of October 4th, 2011.

Conclusion

Based on what I’ve seen so far, I don’t think RackSpace’s Cloud offering even comes close to Amazon’s right now in terms of features and flexibility. EC2 feels to me like something that was designed from the ground up to be “programmable infrastructure,” whereas RackSpace Cloud feels like a thin wrapper around a Xen or VMware cluster. That said, I fully admit that I’ve only been using it for a couple of weeks at this point, so I could be totally missing things, in which case I would love to get some feedback on the issues I’ve raised above.

One thing I think RackSpace does have over Amazon is the ability to mix virtual instances with physical servers. I could definitely see the value in, for example, running some application servers in the cloud for flexibility and running your database on physical hardware for performance (I think the problems with EBS’s IO are pretty well known at this point).