An Adventure in DevOps

A couple of weeks ago, I decided it was finally time to dip my toes into the murky depths of server management. Up until then, pretty much all of my software development experience had been in front-end work (an umbrella term I'm using to cover everything from static websites to games made with Node.js and Socket.io). But one can only do the same thing for so long before getting bored, so I started paying closer attention when my friend Noah would talk about The Dark Arts - er, I mean servers.

Service-Oriented Architecture

In particular, Noah explained to me the idea of breaking an application into specific pieces, or services, and dedicating different clusters of machines to providing those individual services. The technical term for this is Service-Oriented Architecture (SOA), but at the time it just sounded like techno-babble mumbo-jumbo that made very little sense. However, after many hours of Google-fu and Wikipedia research, the concept began to make at least a little sense.

According to Noah (and seemingly a lot of the internet), the best way to create an application that adheres to SOA is to use a platform called Apache Mesos. If documentation had a TL;DR, Mesos' would say that it essentially lets you treat many computers as one, easily pooling together a vast amount of resources. The benefit of such a configuration is that scaling up an application as it becomes popular is very easy: you simply add another computer running Mesos. The way Mesos ties into the whole idea of SOA is that some machines run the Mesos master software, which coordinates work across the slave machines that actually provide the different services.
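To make that a bit more concrete, here's a rough sketch of what a bare-bones two-machine Mesos setup looks like on the command line. The IP address and work directory are just placeholders, not anything we actually used:

    # on the master machine
    mesos-master --ip=10.0.0.1 --work_dir=/var/lib/mesos

    # on each slave machine, pointing back at the master
    mesos-slave --master=10.0.0.1:5050 --work_dir=/var/lib/mesos

From there, frameworks running on the master hand out tasks to whichever slaves have spare resources.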

In addition to Mesos, there is also Google's Kubernetes, which from what I understood lets users group containers into pods that can easily communicate with each other and share resources. Tying back to SOA and Mesos, a pod would encompass the containers needed to make up a specific service, and pods could run on each Mesos slave. Together, Mesos and Kubernetes would allow for simple application scaling and easy communication between all the different parts.
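As an example, a pod is described with a small manifest and handed to Kubernetes. Here's a minimal sketch of what grouping a blog and its proxy into one pod might look like; the names and images are hypothetical, not the setup I ended up with:

    cat <<EOF | kubectl create -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: blog
    spec:
      containers:
      - name: ghost        # the blog itself
        image: ghost
      - name: nginx        # reverse proxy living in the same pod
        image: nginx
    EOF

Because both containers live in the same pod, they can talk to each other over localhost without any extra wiring.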

While all that sounded fine and dandy, it quickly became apparent to both Noah and me that installing and setting up Mesos and Kubernetes is a pain in the ass. Granted, trying to build an SOA application without Mesos or Kubernetes would be a major pain in the ass too, so it was a hassle either way. While Noah forged onward attempting to find an easy solution, I reevaluated whether I actually needed Mesos or Kubernetes or any of that whole SOA business. Noah wanted to familiarize himself with those tools in order to use them in real applications later on; I just wanted a blog.

Docker

Although at first it seemed like I had wasted a good amount of time attempting to get Mesos to work just to give up on it, it should be noted that 1) learning new things is never really a waste of time, and 2) Noah did most of the work anyway, so I didn't really lose that much. Not to mention that in the process of trying to install Mesos and Kubernetes, I ended up with a Digital Ocean droplet running Ubuntu 14.04 that was secure, had ZSH, and insulted me whenever I messed up my sudo password. Silver linings, right?

During my prior research about Service-Oriented Architecture, I learned about an absolutely magical thing called Docker. Docker basically allows applications to run inside isolated containers. While at first it sounds like another VM solution like Vagrant, the beauty of Docker is how simple it makes adding dependencies to those containers, and how quick it is to get an application running. And unlike Vagrant, Docker (at least on Linux) doesn't require an entire virtual machine to be running; containers share the host's kernel and run in their own isolated userspace. All the overhead you save can go toward running even more containers! (And I must add, it is quite refreshing to just spin up a container for the hell of it.)
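And spinning one up really is that quick. Assuming Docker is already installed, a single command gets you a throwaway Ubuntu shell that cleans up after itself when you exit:

    # start an interactive container and remove it once the shell exits
    docker run --rm -it ubuntu:14.04 /bin/bash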

After some looking around and sampling of various software, I decided which Docker containers I wanted to run on my server. That first round included the blog you're reading now, which runs on the easy-to-use Ghost blogging platform. I also included Jupyter Notebook, a web interface to manage my containers, and an nginx reverse proxy. I thought it would be as simple as just running their respective containers, but life is never that easy.
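In theory, "just running their respective containers" would have looked something like the following; the image names, container names, and ports here are illustrative rather than my exact commands:

    docker run -d --name blog -p 2368:2368 ghost
    docker run -d --name notebook -p 8888:8888 jupyter/notebook
    docker run -d --name proxy -p 80:80 nginx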

Mounted Volumes

My first major issue was due to my misunderstanding of how to properly use Docker containers. What I failed to realize was that the entire point of containerizing an application is so that one does not have to constantly enter the container to make changes. However, that is exactly what I did when I wanted to modify the theme I was using for my blog. Although I had mounted a folder into the Ghost container, I still foolishly entered the container itself to install the theme, and that was where I attempted to make my changes. As expected, that just didn't work. None of the changes I made persisted when I "saved", because the container was still reading from the mounted folder, which did not have those same changes. Only after a few hours of Googling and asking people on the internet did I realize my mistake.
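For anyone about to make the same mistake: the theme should live in the folder on the host, and the container should only ever read it through the mount. A sketch of the right approach (the host path and mount point here are placeholders; check the image's documentation for the real content directory):

    # mount the host's content folder into the container
    docker run -d --name blog \
      -v /home/me/ghost-content:/var/lib/ghost \
      -p 2368:2368 ghost

    # then edit the theme on the host, never inside the container
    vim /home/me/ghost-content/themes/my-theme/index.hbs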

WebSockets

The next problem I encountered was not so easy to fix, as it was not simply due to my being stupid. I should probably mention now that I am using Cloudflare (a great service you can read more about here). One thing you should know about Cloudflare - something I was not aware of at the time - is that Cloudflare does not yet fully support WebSockets. While that seems like a minor detail, Jupyter Notebook relies on WebSocket connections for code execution, and since the ability to run different languages in your browser is the entire point of Jupyter, I was pretty annoyed when it wasn't working as expected. It took a little over 4 days before I became aware that Cloudflare was the problem, and yet solving it was as simple as clicking a small icon in the Cloudflare DNS panel. (I won't include the whole debugging saga here, but if you'd like, you can follow it through my GitHub issues here and here.)

Finally, everything seemed to be in order! I had a working blog that I now knew how to properly configure, my own instance of Jupyter Notebook that I could use for saving code snippets and taking notes, and a simple interface to manage the containers remotely. After all that hard work, I figured it'd be stupid of me not to back it up. So, I powered off the droplet, made a snapshot from the Digital Ocean control panel, and spun it back up. And then everything was broken.

"Uncomplicated" Firewall

I mentioned earlier that my Digital Ocean droplet was secure. What I meant by that was that I had followed most of the tips provided in this article. The relevant one involves installing and configuring Uncomplicated Firewall (UFW). While it's true that its setup was fairly uncomplicated, using it with Docker was anything but.
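For reference, the UFW portion of that setup boils down to a few commands along these lines; the exact ports you allow depend on what you're running:

    sudo ufw default deny incoming   # drop anything not explicitly allowed
    sudo ufw allow ssh               # don't lock yourself out
    sudo ufw allow 80/tcp            # web traffic for the reverse proxy
    sudo ufw enable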

Prior to creating the snapshot, around the time I was figuring out how to edit my Ghost theme, I realized that anyone could access any of my Docker containers directly by using the droplet's IP and the port that the container was serving on. The issue was that I wanted all traffic to be routed through the reverse proxy, which ensured that requests to the web interface (which could stop/start/kill any of the containers) required authentication. Accessing the interface through the IP and port circumvented that, which was a major security flaw. While researching how to fix it, I found this article, which nicely summed up the fact that Docker defaults to messing with the server's iptables, overriding the rules set by UFW. While the fix mentioned in the article worked at the time, it missed another crucial setting that for some reason wasn't needed until I restarted my droplet.
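If I remember right, the gist of that fix is telling the Docker daemon to leave iptables alone. On Ubuntu 14.04 that meant editing /etc/default/docker and restarting the daemon (a sketch; double-check against the article itself):

    # in /etc/default/docker
    DOCKER_OPTS="--iptables=false"

    # then restart the Docker daemon
    sudo service docker restart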

After creating the snapshot, attempting to access any of my applications resulted in a 503 error. I immediately assumed it was an nginx issue, but after my experience getting Jupyter working, I considered that it could be a deeper problem. After many hours of attempting to find the solution, I remembered that article from Viktor's Ramblings and figured it might be another UFW issue. Sure enough, Google led me to this post on Ask Ubuntu. Changing UFW's DEFAULT_FORWARD_POLICY to "ACCEPT" fixed the issue! However, I never really figured out the security implications of that setting...
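For the record, that change is a one-line edit followed by a reload:

    # in /etc/default/ufw
    DEFAULT_FORWARD_POLICY="ACCEPT"

    # then reload the firewall
    sudo ufw reload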

Nonetheless, it seemed that all was well again. Let's fast-forward a week.

External Network Requests

Now that everything had been working and seemed stable, I figured it was time to make my first blog post. While I was in the Ghost admin interface, I thought, "Why not add a friend as a guest author?" There was no reason not to, right? Except for the part where Ghost was unable to send the invitation email. Checking the logs revealed that the Ghost container wasn't able to connect to the mail server I was using, and I couldn't just let that be. After some asking around in the Ghost Slack channel, I was told to look into external network requests from Docker containers, which led me to a few different resources, including this GitHub issue.

As it turns out, while preventing Docker from messing with the server's iptables allowed UFW to function properly, it also prevented my containers from making requests to external networks, like a mail server for example. After another quick Google session, I found another thread on Ask Ubuntu that detailed how to enable IP forwarding, which would let my containers reach the outside world while still being protected by UFW.
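The IP forwarding change is similarly small. A sketch of it:

    # in /etc/sysctl.conf, uncomment (or add) this line
    net.ipv4.ip_forward=1

    # apply it without rebooting
    sudo sysctl -p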

Finally, things were in order.


It only took about two weeks, but I believe it's now safe to say everything works the way it should. It'd be stupid of me not to take a snapshot now and save all my hard work...