<p>Michael Conigliaro (Michael Paul Thomas Conigliaro), <a href="http://conigliaro.org/">http://conigliaro.org/</a></p>
<h3><a href="/2012/01/10/troubleshooting-headless-tests-on-a-remote-ci-server/">Troubleshooting Headless Tests on a Remote CI Server</a></h3>
<p>Published 2012-01-10</p>
<p>I just ran into a problem that was causing the <a href="http://pivotal.github.com/jasmine/">Jasmine</a>
tests on our <a href="http://jenkins-ci.org/">Jenkins CI</a> box to hang forever, and I
figured I should document this handy little troubleshooting tip in case someone
else might find it helpful.</p>
<p>If you hop onto your CI box while your headless browser tests are running,
you should see an <strong>Xvfb</strong> process that looks something like this:</p>
<pre class="brush: bash">
# ps -efww | grep Xvfb
jenkins 18833 1 1 21:18 ? 00:00:00 /usr/bin/Xvfb :99 -screen 0 1280x1024x24 -ac
</pre>
<p>Since Xvfb is using display <code>:99</code>, you’ll want to run <strong>x11vnc</strong> accordingly:</p>
<pre class="brush: bash">
$ x11vnc -display :99
</pre>
<p>Now you should have a VNC server listening on port 5900. Just fire up your VNC
viewer and connect as usual:</p>
<pre class="brush: bash">
$ vncviewer host.example.com:5900
</pre>
<p>But what if you’re accessing your CI server over an insecure network? You can
use <a href="https://help.ubuntu.com/community/SSH/OpenSSH/PortForwarding#Local_Port_Forwarding">SSH local port forwarding</a>
to create a secure tunnel:</p>
<pre class="brush: bash">
$ ssh -L 5900:localhost:5900 host.example.com x11vnc -display :99
</pre>
<p>Now you can connect to the VNC server over the secure tunnel:</p>
<pre class="brush: bash">
$ vncviewer localhost:5900
</pre>
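<p>The steps above can be strung together in a small script. This is only a sketch: <code>xvfb_display</code> is a hypothetical helper name, and the <code>-localhost</code> flag (which asks x11vnc to accept connections from the local machine only) is my addition rather than part of the original commands. It seems like a sensible companion to the SSH tunnel, but check it against your x11vnc version.</p>
<pre class="brush: bash">
#!/usr/bin/env bash

# Read ps output on stdin and print the display of the first Xvfb
# process found (for example ":99").
xvfb_display() {
  grep -oE 'Xvfb[[:space:]]+:[0-9]+' | grep -oE ':[0-9]+' | head -n 1
}

# Demo using the ps line shown earlier in this post.
sample='jenkins 18833 1 1 21:18 ? 00:00:00 /usr/bin/Xvfb :99 -screen 0 1280x1024x24 -ac'
echo "$sample" | xvfb_display   # prints :99

# On a real box (host.example.com is the placeholder used above):
#   display=$(ssh host.example.com ps -efww | xvfb_display)
#   ssh -L 5900:localhost:5900 host.example.com x11vnc -localhost -display "$display"
</pre>
<p>The <code>grep Xvfb</code> process itself never matches, because the pattern requires a display number right after the program name.</p>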
<h3><a href="/2011/09/08/from-wordpress-to-nanoc/">From WordPress to nanoc</a></h3>
<p>Published 2011-09-08</p>
<p>I finally decided to start <a href="http://tom.preston-werner.com/2008/11/17/blogging-like-a-hacker.html">blogging like a hacker</a>, which is ironic, considering that I’ve actually come full circle now. Back in the late nineties (before I knew about databases, and before the term “blog” even existed), I spent a lot of time working on a Perl-based blogging engine that worked much like a lot of today’s static site generators (though with far less sophistication). Instead of working with sane formats like <a href="http://yaml.org/">YAML</a> or <a href="http://daringfireball.net/projects/markdown/">Markdown</a> (which didn’t exist back then), I ended up managing everything in my own pseudo-XML format. As ugly and hackish as this system was, the resulting output was pretty nice, and the project was definitely a good learning experience for me as a teenager. But now that static site generators are all the rage, <a href="http://iwantmyname.com/blog/2011/02/list-static-website-generators.html">there are a lot of much better options out there</a>, which gives us “hackers” a good opportunity to migrate away from WordPress without a ton of effort.</p>
<p>After reading <a href="http://news.ycombinator.com/item?id=2928919">this thread on Hacker News about yet another new static site generator</a>, I decided to give <a href="http://nanoc.stoneship.org/">nanoc</a> a spin (mostly because it seemed to be the most popular option among the people commenting on that thread). I skimmed through the <a href="http://nanoc.stoneship.org/docs/3-getting-started/">nanoc Getting Started guide</a> and (for the first time in several years) set out to build a new layout for my new site. Since I didn’t want to repeat <a href="/2008/02/23/i-hate-web-design/">the pain of the old days</a>, I made sure to get <a href="https://github.com/chriseppstein/compass/wiki/nanoc-Integration">Compass integration</a> working right away.</p>
<p>Once the layout was done, I spent about a week slowly reimplementing features and moving data over from my old WordPress site. The static pages were basically just a copy & paste (with some manual converting to Markdown), but there was no way I was going to repeat that process for 100+ blog posts. There are a few <a href="http://nanoc.stoneship.org/wiki-old/Tips_ConvertingFromWordPress/">example WordPress-to-nanoc scripts</a> floating around, but they all left a lot to be desired, so I ultimately ended up writing my own. The result of that effort can be found in <a href="https://github.com/mconigliaro/blog/blob/master/bin/wp2nanoc.rb">wp2nanoc.rb</a>. Besides the addition of some nice command line option parsing, my script also does some basic conversion from HTML to Markdown and from <a href="http://wordpress.org/extend/plugins/wp-syntax/">WP-Syntax</a> to <a href="http://alexgorbatchev.com/SyntaxHighlighter/">SyntaxHighlighter</a>.</p>
<p>The last thing to bring over was my comments, and <a href="http://disqus.com/">Disqus</a> handled most of that for me. I basically just installed the <a href="http://wordpress.org/extend/plugins/disqus-comment-system/">Disqus WordPress plugin</a> and ran through the <a href="http://docs.disqus.com/help/24/">automatic import process</a>. Then, to make the comments show up on my new site, I set <strong>disqus_shortname</strong> and <strong>disqus_url</strong> to the appropriate values in my <a href="http://docs.disqus.com/developers/universal/">embed code</a>. <strong>Note:</strong> If you find the <a href="http://docs.disqus.com/help/2/">developer documentation</a> as confusing as I did, just know that these are the only values that need to be set. I originally tried using <strong>disqus_identifier</strong>, but that didn’t work, because the plugin uses unconfigurable, WordPress-specific values for this option, which obviously won’t be available in nanoc.</p>
<p>So what did I gain from my migration to nanoc?</p>
<ul>
<li>I can now keep my entire site (data and all) <a href="https://github.com/mconigliaro/blog">in git</a></li>
<li>Static files mean blazing speed</li>
<li>No database means I can host my site in <a href="http://aws.amazon.com/s3/">Amazon S3</a> for pennies</li>
<li>No more worrying about PHP/WordPress security issues</li>
<li>I got to experiment with new (to me) technologies like <a href="http://haml-lang.com/">Haml</a>, <a href="http://sass-lang.com/">Sass/SCSS</a>, <a href="http://compass-style.org/">Compass</a> and <a href="http://www.blueprintcss.org/">Blueprint</a></li>
</ul>
<h3><a href="/2011/07/22/php-considered-harmful/">PHP Considered Harmful</a></h3>
<p>Published 2011-07-22</p>
<p>I know what you’re thinking. “Not another anti-PHP blog post!” But instead of complaining about specific deficiencies in the language, or how I think PHP encourages you to be a bad programmer or whatever, I want to talk about a fairly eye-opening conversation I just had with a friend. We were talking about possible optimizations for some of our <a href="/2011/07/21/project-euler/">Project Euler solutions</a>. Since I don’t have enough of a background in number theory to come up with many of <em>those</em> kinds of tricks, I suggested that threading might be a simple way to make a big difference for some problems.</p>
<p>Friend: “I don’t know what you’re talking about.”</p>
<p>Well, for example, instead of iterating through a huge range of numbers with one giant loop, you could break the range up into several smaller ranges (according to the number of cores on your machine), then run your algorithm several times in parallel and just sum up your results at the end.</p>
<p>Friend: “Haha, I still have no idea what you’re talking about.”</p>
<p>And then it dawned on me: my friend doesn’t have a formal computer science or engineering background, and he’s been working with PHP (a language with no built-in threading) almost exclusively for about a decade now. He’s a smart guy, and not what I would describe as a stereotypical “bad PHP programmer,” but the fact that he hasn’t spent much time with any other languages means he’s never been exposed to this very fundamental concept in computer science. Now, I’m fully aware that most web developers don’t normally have to deal with concurrent programming at all, but I still think it’s something every programmer should have some rudimentary knowledge of. And I’m willing to bet that if my friend had been working with Python or Ruby all these years, he would have at least seen some mention of threads in library documentation or someone else’s code.</p>
<p>But maybe the point isn’t <em>just</em> that PHP is harmful. Maybe it’s that spending all your time working with <em>any</em> one language is harmful. This conversation made me wonder what kinds of things I might be missing out on by spending all of my time working with Ruby and Python!</p>
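<p>For the curious, the range-splitting idea I described to my friend can be sketched in a few lines of shell. Treat this as an illustration only. It uses background processes rather than threads, the function names are made up for this post, and the workload (summing the multiples of 3 or 5 below a limit, as in Project Euler problem 1) is simply a convenient stand-in for a more expensive algorithm.</p>
<pre class="brush: bash">
#!/usr/bin/env bash

# Sum the multiples of 3 or 5 in the half-open range [start, end).
sum_chunk() {
  local n=$1 end=$2 total=0
  while [ "$n" -lt "$end" ]; do
    if [ $((n % 3)) -eq 0 ] || [ $((n % 5)) -eq 0 ]; then
      total=$((total + n))
    fi
    n=$((n + 1))
  done
  echo "$total"
}

# Split [0, limit) into one chunk per worker, run the chunks as
# background processes, then add up the partial sums.
parallel_sum() {
  local limit=$1 workers=$2
  local chunk=$(( (limit + workers - 1) / workers ))
  local start=0 end
  {
    while [ "$start" -lt "$limit" ]; do
      end=$((start + chunk))
      if [ "$end" -gt "$limit" ]; then end=$limit; fi
      sum_chunk "$start" "$end" &
      start=$end
    done
    wait
  } | awk '{ total += $1 } END { print total }'
}

parallel_sum 1000 4   # prints 233168
</pre>
<p>In practice you would pick the worker count from the number of cores on the machine instead of hard-coding 4.</p>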
<h3><a href="/2011/07/21/project-euler/">Project Euler</a></h3>
<p>Published 2011-07-21</p>
<p><a href="https://github.com/mconigliaro/project-euler">I’ve been working on Project Euler problems in my spare time</a> for the last few weeks. I don’t really know what a “good” score is, but here’s my ranking so far:</p>
<p><a href="http://projecteuler.net/"><img src="http://projecteuler.net/profile/mconigliaro.png" alt="" class="center" /></a></p>
<p>I think one of the reasons these problems are so much fun to solve is that no single algorithm is necessarily more “correct” than any other (assuming you get the correct answer at the end). Consequently, there are widely varying philosophies when it comes to how they should be solved. Some people go for the most efficient algorithms, using all kinds of arcane bit-shifting tricks in low-level languages like C and assembly. Other people are only concerned with programmer productivity, opting for simple brute-force solutions that may take several hours to finish.</p>
<p>My personal goal has been to come up with elegant and reasonably efficient algorithms in as few lines as possible. Nice looking code almost always wins out over efficiency for me, but if my solution takes more than 30 seconds to finish, then I know I have some serious optimization to do. For a non-math guy, I think I’ve been doing pretty well so far, considering that most of my solutions finish in a fraction of a second, and the average across all my solutions is currently less than 3 seconds.</p>
<p>Which brings me to my next point. As a person who went through school absolutely dreading my next math class, I find it kind of amazing that I’ve been willing to spend hours solving these problems in my spare time. This tells me that there’s something seriously wrong with the way math is currently being taught. I suspect Bret Victor is on to something when he says “<a href="http://worrydream.com/KillMath/">math needs a new interface</a>.”</p>
<h3><a href="/2011/07/20/initial-thoughts-on-migrating-from-amazon-ec2-to-rackspace-cloud/">Initial Thoughts on Migrating from Amazon EC2 to Rackspace Cloud</a></h3>
<p>Published 2011-07-20, updated 2011-10-06</p>
<h4 id="block-devices-are-tied-to-the-instance-type">Block Devices are Tied to the Instance Type</h4>
<p>On EC2, each instance type has a predefined CPU and memory size, but thanks to “Elastic Block Store” (which is managed independently from the actual instance), you can make your block devices as large or as small as you want. You can also attach additional block devices as needed. This gives you a lot of flexibility to provision the appropriate resources for your specific application and to grow things as you need to. RackSpace Cloud has no EBS equivalent, so the size of your disk seems to be static and tied to the instance type. This means if you start to run out of space, you apparently have no choice but to upgrade to the next instance size, regardless of whether you actually need the additional CPU/memory. Based on a conversation I had with support, I’m guessing this has to do with the fact that all block devices are created locally on the physical VM host, rather than on a SAN. So I can definitely see how this architecture would make it difficult (or even impossible) for RackSpace to implement any of the features made possible by Amazon’s EBS.</p>
<p>See <a href="http://feedback.rackspacecloud.com/forums/71021-product-feedback/suggestions/997213-the-ability-to-choose-amount-of-ram-and-hd-space-s">The ability to choose amount of RAM and HD space separately</a> on the RackSpace Cloud feedback forum.</p>
<h4 id="password-logins-by-default">Password Logins by Default</h4>
<p>On EC2, one of the first things you do is set up an SSH keypair for your account. This saves you from having to set a root password for new instances. You just select the appropriate keypair when creating the new instance and log in with your SSH key. As far as I know, there is no such feature in the RackSpace Cloud. After you request a new instance, you have to wait for a randomized root password to be emailed to you. Let me repeat that in case you missed it. Your root password is emailed to you in plain text over the Internet. Hmm…</p>
<p>See <a href="http://feedback.rackspacecloud.com/forums/71021-product-feedback/suggestions/1154383-do-not-send-root-password-by-email-">Do not send root password by email</a> on the RackSpace Cloud feedback forum.</p>
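<p>One way to limit the exposure is to install your SSH key right after the first login and then turn password logins off. The sketch below is my own workaround, not a RackSpace feature: <code>disable_password_auth</code> is a hypothetical helper that edits whichever <code>sshd_config</code> you point it at, so try it on a copy first and keep an existing session open while you test.</p>
<pre class="brush: bash">
#!/usr/bin/env bash

# Set "PasswordAuthentication no" in the given sshd_config file.
# An existing (possibly commented-out) setting is rewritten in place and
# the previous file is kept as a .bak copy; otherwise the setting is appended.
disable_password_auth() {
  local config=$1
  if grep -Eq '^#?[[:space:]]*PasswordAuthentication[[:space:]]' "$config"; then
    sed -i.bak -E 's/^#?[[:space:]]*PasswordAuthentication[[:space:]].*/PasswordAuthentication no/' "$config"
  else
    echo 'PasswordAuthentication no' >> "$config"
  fi
}

# Typical sequence (host.example.com is a placeholder):
#   ssh-copy-id root@host.example.com      # install your public key
#   disable_password_auth /etc/ssh/sshd_config   # run on the instance as root
#   then reload sshd and confirm that key-based login still works
</pre>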
<h4 id="unable-to-stop-instances">Unable to Stop Instances</h4>
<p>Yup, it’s just like being in the old days of EC2 before root EBS volumes. Once an instance is started, you can reboot or terminate it, but you can’t actually stop it to save money. At my previous company, part of our continuous deployment process was to automatically spin up a staging environment to test new code before actually deploying it into production. We also had a dedicated testing environment which we would spin up on demand for testing various things. Traditionally, it was very expensive to run duplicate (or triplicate) environments for testing, but EC2 makes this sort of thing trivially inexpensive, since the instances don’t actually have to be running most of the time. I don’t think something like this would be feasible in the RackSpace Cloud, because constantly terminating and rebuilding every instance in every environment would make things a lot slower and more difficult to manage in general. I realize the process could be sped up a bit by creating a bunch of VM images, but I don’t even want to get started on why I hate that idea. Configuration management has made images obsolete as far as I’m concerned.</p>
<p>See <a href="http://feedback.rackspacecloud.com/forums/71021-product-feedback/suggestions/996973-need-option-to-suspend-servers-to-save-money">Need option to suspend servers to save money</a> on the RackSpace Cloud feedback forum.</p>
<h4 id="no-concept-of-security-groups">No Concept of Security Groups</h4>
<p>I guess I just got used to the peace and security of EC2 security groups, because I took it for granted that RackSpace Cloud would have something similar. So boy was I surprised when I discovered that my first new instances were essentially sitting wide open on the Internet! Now if you’re using a configuration management system, it’s not a huge deal to set up a local firewall on all your instances. But it can definitely be scary, because the lack of real console access in the cloud means there’s a very real possibility that you could accidentally lock yourself out of an instance while testing new firewall rules.</p>
<p>See <a href="http://feedback.rackspacecloud.com/forums/71021-product-feedback/suggestions/997159-create-ec2-like-security-groups-so-you-don-t-have">Create EC2-like security groups, so you don’t have to configure iptables for each instance</a> on the RackSpace Cloud feedback forum.</p>
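<p>As a rough idea of what a per-instance firewall might look like, here is a sketch that prints a minimal ruleset in <code>iptables-restore</code> format. The function name and the port list are assumptions for illustration. It only covers inbound TCP (no ICMP, no IPv6), so treat it as a starting point rather than a finished policy.</p>
<pre class="brush: bash">
#!/usr/bin/env bash

# Print an iptables-restore ruleset that drops inbound traffic except
# loopback, established connections, and the TCP ports given as arguments.
firewall_rules() {
  local port
  echo '*filter'
  echo ':INPUT DROP [0:0]'
  echo ':FORWARD DROP [0:0]'
  echo ':OUTPUT ACCEPT [0:0]'
  echo '-A INPUT -i lo -j ACCEPT'
  echo '-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT'
  for port in "$@"; do
    echo "-A INPUT -p tcp --dport $port -j ACCEPT"
  done
  echo 'COMMIT'
}

# Example: allow SSH, HTTP and HTTPS.
firewall_rules 22 80 443
</pre>
<p>To reduce the lock-out risk, save the output to a file and load it with something that rolls back automatically. On Debian and Ubuntu, the iptables package usually includes an <code>iptables-apply</code> helper that reverts the rules unless you confirm that you can still connect, but check that it exists on your image before relying on it.</p>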
<h4 id="dns-annoyances">DNS Annoyances</h4>
<p>One of the nice things about the way DNS is configured on EC2 is that when you resolve a public hostname from an instance, you’ll actually get the internal IP address. This means you can use your public hostnames everywhere, and everything will continue to work just fine. Since DNS doesn’t work this way in RackSpace, things just get a bit more complicated, but again, this is mostly just an annoyance to me right now.</p>
<h4 id="unable-to-change-filesystem">Unable to Change Filesystem</h4>
<p>The default filesystem on Ubuntu is EXT3. Want to convert to EXT4 in order to (for example) run MongoDB <a href="http://www.mongodb.org/display/DOCS/Production+Notes#ProductionNotes-LinuxFileSystems">according to 10gen’s official recommendations</a>? Oops, too bad.</p>
<h4 id="cloud-load-balancers-do-not-support-ssl-termination">Cloud Load Balancers Do Not Support SSL Termination</h4>
<p>In EC2, it’s possible to upload your SSL certificates to an Elastic Load Balancer (ELB) and have your SSL connections terminate right there (i.e. to accept and decrypt SSL traffic on the ELB and forward it in plain text to the back end).</p>
<pre class="brush: plain">
             | (HTTPS)
       +-----+-----+
       | Amazon ELB|
       +-----+-----+
             | (HTTP)
      +------+------+
      |             |
+-----+-----+ +-----+-----+
|   app01   | |   app02   |
+-----------+ +-----------+
</pre>
<p>It’s nice to be able to offload some work to the ELB, but it’s (almost) necessary if you have something like HAProxy or Varnish in front of your application servers (HAProxy and Varnish will not be able to read your SSL encrypted traffic, and therefore, will not be able to make decisions based on the requested URL, headers, etc.). This means you’ll have to stick something like <a href="http://www.stunnel.org/">stunnel</a> between the RackSpace load balancer and HAProxy/Varnish/Whatever to handle the SSL decryption.</p>
<p>See <a href="http://feedback.rackspacecloud.com/forums/71021-product-feedback/suggestions/1980143-support-ssl-termination-on-cloud-load-balancers">Support SSL termination on Cloud Load Balancers</a> on the RackSpace Cloud feedback forum.</p>
<h4 id="cloud-load-balancers-do-not-support-x-forwarded-for-x-forwarded-port-or-x-forwarded-proto-headers">Cloud Load Balancers Do Not Support X-Forwarded-For, X-Forwarded-Port or X-Forwarded-Proto Headers</h4>
<p>These are pretty important (especially X-Forwarded-For) if you want to know anything about the clients connecting to your servers. Not having them means all your HTTP requests will appear to come from your load balancer, which is essentially useless. RackSpace support told me X-Forwarded-For would be available in Q3 of this year, and that X-Cluster-Client-Ip can be used in the meantime (though it appears that X-Cluster-Client-Ip still isn’t sent with HTTPS requests!), but there are apparently no plans to support X-Forwarded-Port or X-Forwarded-Proto.</p>
<p>See <a href="http://feedback.rackspacecloud.com/forums/71021-product-feedback/suggestions/1807051-add-the-x-forwarded-for-header-to-traffic-from-you">add the x-forwarded-for header to traffic from your cloud load balancer.</a> on the RackSpace Cloud feedback forum.</p>
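<p>Until X-Forwarded-For arrives, a back end has to fall back through whatever headers it does receive. The sketch below shows that fallback order. The <code>client_ip</code> function is hypothetical, and its three arguments are assumed to mirror the CGI-style variables <code>HTTP_X_FORWARDED_FOR</code>, <code>HTTP_X_CLUSTER_CLIENT_IP</code> and <code>REMOTE_ADDR</code>. Only trust these headers when the request really did come through your load balancer, since clients can set them too.</p>
<pre class="brush: bash">
#!/usr/bin/env bash

# Print the best guess at the real client address for a request that
# arrived through a load balancer.
client_ip() {
  local forwarded_for=$1 cluster_client_ip=$2 remote_addr=$3
  if [ -n "$forwarded_for" ]; then
    # X-Forwarded-For may hold "client, proxy1, proxy2"; keep the first entry.
    echo "${forwarded_for%%,*}" | tr -d ' '
  elif [ -n "$cluster_client_ip" ]; then
    echo "$cluster_client_ip"
  else
    # No forwarding headers at all: this is just the load balancer's address.
    echo "$remote_addr"
  fi
}

client_ip "203.0.113.7, 10.0.0.5" "" "10.0.0.5"   # prints 203.0.113.7
client_ip "" "203.0.113.7" "10.0.0.5"             # prints 203.0.113.7
client_ip "" "" "10.0.0.5"                        # prints 10.0.0.5
</pre>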
<h4 id="https-health-checks-on-cloud-load-balancers-occur-in-plain-text">HTTPS Health Checks on Cloud Load Balancers Occur in Plain Text</h4>
<p>How on Earth did this get past QA? Basically what this means is if you set up an HTTPS load balancer (e.g. listening on port 443 and forwarding to 443 on the backend), and you set up an HTTPS health check from the load balancer (i.e. to check the HTTPS version of your site at <strong>https://</strong>host.example.com/health), you’ll discover that the load balancer essentially makes requests for <strong>http://</strong>host.example.com:443/health, which will obviously never work, and will result in the load balancer removing <strong>all</strong> of your instances from rotation. The only workaround is to use the CONNECT health check method, which can only ensure that a port is listening.</p>
<p><strong>Update:</strong> This should be fixed as of October 4th, 2011.</p>
<h4 id="conclusion">Conclusion</h4>
<p>Based on what I’ve seen so far, I don’t think RackSpace’s Cloud offering even comes close to Amazon’s right now in terms of features and flexibility. EC2 feels to me like something that was designed from the ground up to be essentially “programmable infrastructure,” whereas RackSpace Cloud feels essentially like a thin wrapper around a Xen or VMware cluster. That said, I fully admit that I’ve only been using it for a couple of weeks at this point, so I could be totally missing things. If so, I would love to get some feedback on the issues I’ve raised above.</p>
<p>One thing I think RackSpace does have over Amazon is the ability to mix virtual instances with physical servers. I could definitely see the value in, for example, running some application servers in the cloud for flexibility and running your database on physical hardware for performance (I think the problems with EBS’s IO are pretty well known at this point).</p>