Building a homelab – a walk through history and investing in new hardware

This is the first post in a series of my experiences while Building a Homelab. The second post focuses on setting up a local DNS server and can be found here.

I’ve had a particular interest in home computers and servers for a long time now. One of my experiences was wiring my childhood home up with CAT-5 ethernet to the rooms with TVs or computers and having them all connected to a 24 port 100 Mbps switch in the crawlspace. This was part of a master plan to provide different computers in the house with internet connection (when WiFi wasn’t as good as it is today), TVs with smart media boxes (think Apple TV, Roku, and the like but 10 years ago), and to tie it all together a home server for serving media storing files.

The magazine Maximum PC was a major source for this inspiration as they had a number of captivating DIY articles for running your own home server, media streaming devices, and home networking. The memory is a bit rough around the edges, but these projects happened around the same time and on my own dollar – all for the satisfaction of having a bleeding edge entertainment system.

Around this time Windows had a product out for a year called Windows Home Server. It was a OS which catered towards consumers and their home needs. Some of the features it had was network file shares for storing files, computer backup and restore, media sharing, and a number of extensions available from the community. I built a $400 box to run this OS and store two hard drives. The network switch in the crawlspace was a perfect place to put this headless server. Over many years this server was successfully used for computer backups, file storage, network bandwidth monitoring, and media serving to a number of PCs and media streaming boxes attached to TVs.

Two of the TVs in the house had these Western Digital TV Live boxes for playing media off of the network. These devices were quite basic at the time where only Youtube, Flickr, and a handful of other services were available – lacking Netflix and the other now popular Internet streaming services. Instead, they were primarily built for streaming media off of the local network – in this case off of the home server file share. My family and I were able to watch movies and TV shows from the comfort of our couch, and on-demand. This was crazy cool at the time as most people were still using physical media (DVD/Blu-ray) and streaming media had not taken off yet. I also vaguely remember hacking one of the boxes to put on a community-built firmware.

Windows Home Server was great at the time since it offered all of this functionality out of the box with simple configuration. I remember playing with BSD-based FreeNAS on old computers and being overwhelmed at all of the extra configuration needed to achieve something that you get out of the box with Windows Home Server. Additionally, the overhead of having to administer FreeNAS while only having a vague knowledge of Linux and BSD at the time wasn’t a selling point.

Now back to current times. I’m in the profession of software development, have been using various Linux distros for personal use on laptops and servers, and would now consider myself a sysadmin enthusiast. Living in my own place, I’ve been using my own Ubuntu-based laptop to run a Plex media server and stream content to my Roku Streaming Stick+ attached to my TV. The laptop’s 1 TB hard drive was filling up. It was also inconvenient to have this laptop constantly on for serving content.

Browsing Reddit, I came across r/homelab, a community of people interested in owning and managing servers for their own fun. Everything from datacenter server hardware to Raspberry PIs, networking, virtualization, operating systems, and applications. This subreddit gave me the idea of purchasing some decommissioned server hardware from eBay. I sat on the idea for a few months. Covid-19 eventually happened and with all my spare time I gave in to buying some hardware.

After a bunch of research on r/homelab about which servers are quiet, energy efficient, extendable, and will last a number of years, I settled on a Dell R520 with 2 x 6 cores at 2.4 Ghz, 48 GB DDR3 RAM, 2 x 1 Gbit NICs and 8 x 3.5″ hard drive bays. I bought a 1 TB SSD as the boot drive and a refurbished 10 TB hard drive for storing data.

The front of the Dell R520, showing the 8 3.5″ drive bays and some of the internals.

Since I intended on running the ZFS filesystem on the data drive, many people gave the heads up that the Host Bus Adaptor (HBA) card (a piece of hardware which connects the SAS/SATA hard drives and SSDs to the motherboard) comes with the default Dell firmware. This default firmware caters towards always running some sort of hardware-based RAID setup, thus hiding the SMART status of all drives. With ZFS, accessing the SMART data for each drive is paramount for data integrity. To get around this limitation with the included HBA card, the homelab community has some unofficial firmware for it which exposes IT mode, basically a way to pass through each drive to the OS – completely bypassing any hardware RAID functionality. Some breath holding later and the HBA card now had the new firmware.

I bought a separate HBA card with the knowledge at the time that the one that comes with the Dell R520 didn’t have any IT mode firmware from the community. I ended up being wrong after a whole lot of investigation. Thankfully I should be able to flash new firmware on this card as well and sell it back on eBay.

A Dell Perc H310 Mini Mono HBA (Host Bus Adaptor) used in Dell servers for interfacing between the motherboard and SAS/SATA drives.

As the hardware was all being figured out, I was also researching and playing with different hypervisors – an operating system made for running multiple operating systems on the same hardware. The homelab community often refers to VMware ESXi, Proxmox VE, and even Unraid. I sampled out the first two, as Unraid didn’t have an ISO available to test with and wasn’t free.

Going through the pain of making a USB stick bootable for an afternoon, I eventually got ESXi installing on the system. Poking around, it was interesting to see that VM storage was handled by having a physical disk formatted to a VMware format specific to storing multiple VMs – vmfs. With the goal of having one of the VMs have full control over a drive formatted with the ZFS filesystem, ESXi provides a feature called hardware passthrough which bypasses virtualization of the physical hardware. One big blocker for myself was the restriction on the free version which limits VMs to a maximum of 8 vCPUs – a waste of resources when having 12 CPUs and not enough VMs to utilize them.

Next, I took a look at Proxmox by loading it up as a VM on ESXi. It was Debian based, which was a plus as I’m comfortable with systemd and Ubuntu systems already. The Proxmox UI appeared like it had quite a few useful features, but didn’t feel like what I needed. I was much more comfortable with the terminal, and these graphical interfaces to manage things felt more like a limitation than a benefit. I could always SSH into Proxmox and manage things there, but there’s always the aspect of learning the intricacies of how this turnkey system was setup. Who knows what was default Debian configured and what was modified by Proxmox. Not to mention, what if Docker or other software was out of date and couldn’t be upgraded? This would be an unnecessary limitation I could avoid if rolling my own.

Lastly, I went back to my roots – Ubuntu Server. I spun up a VM of it on ESXi. Since I’m quite used to the way Ubuntu works it was comfortable knowing what I could do. There were no 8 vCPU limitations with Ubuntu Server as the host OS – I can utilize all of the server’s resources. After some thinking I realized I didn’t have any need to run any VMs at the moment. In the past I’ve managed a number of VMs using QEMU using Ubuntu Server, therefore if the need arises again I can pull it off. The reason why I’m not using any VMs is because I’m using Docker for all of my application needs. I already have a few apps running in Docker containers on my laptop that I’ll eventually transfer over to the server. Next up, ZFS on Linux has been available for a while now in Ubuntu, giving me the confidence that the data drive will be formatted with ZFS without a problem.

The internals of the Dell R520 with the thermal cover removed. Note the row of six fans across the width of the case to keep things cool.

In the end I scrapped the idea of running a hypervisor such as EXSi and running multiple VMs on top of it because my workloads all live in Docker containers instead. Ubuntu Server is more suitable since I am able to configure everything from a SSH console. If I may conjecture why the r/homelab community loves their VMs, it may be because many of the hobbyists are used to using them for their day-jobs. There were a handful of folks who did run their own GUI-less, no-VM setups, but it was the minority.

In the end, Ubuntu Server 20.04 LTS was installed on a 1 TB SSD boot drive. A 10 TB HDD was formatted with ZFS in a single drive configuration. Docker daemon was installed from its official Apt repo, and a number of other non-root processes were installed from Nix and Nixpkgs.

Conclusion

There’s a few more things I want to discuss regarding the home server. Some of those include using Nix and Nixpkgs in a server environment and some of the difficulties, setting up a local DNS server to provide domain name resolution for devices on the network and in Docker containers, a reverse proxy for the webapps running in Docker containers using the Caddy webserver, and some DataDog monitoring.

In the future I have plans to expand the amount of storage while at the same time introducing some redundancy with ZFS RAIDz1, diving into being able to remotely access the local network via VPN or some other secure method, and better monitoring for uptime, ZFS notifications, OS notifications, and the like.

Better Cilk Development With Docker

I’m taking a course that focuses on parallel and distributed computing. We use a compiler extension for GCC called Cilk to develop parallel programs in C/C++. Cilk offers developers a simple method for developing parallel code, and as a plus it now comes included in GCC since version 4.9.

The unjust thing with this course is that the professor provides a hefty 4GB Ubuntu virtual machine just for running the GNU compiler with Cilk. No sane person would download an entire virtual machine image just to run a compiler.

Docker comes to the rescue. It couldn’t be more space effective and convenient to use Cilk from a Docker container. I’ve created a simple Dockerfile containing the latest GNU compiler for Ubuntu 16.04. Here are some Gists showing how to build and run a Dockerfile which contain the dependencies needed to build and run Cilk programs.

Push-button Deployment of a Docker Compose Project

I was recently working on figuring out how to automate the deployment of a simple docker compose project. This is a non mission-critical project that consisted of a redis container and a Docker image of Hubot that we’ve built. Here’s the gist of the docker-compose.yml file:

Whenever a new version of the zdirect/zbot image is updated and published to the registry a deploy script can be run. For example, the script used to automatically deploy a new version of a Docker Compose project is shown here:

Yup, that’s all. Its that simple. Behind the curtains, this command pulls the latest version of the image. Since the docker-compose.yml file doesn’t specify a tag it defaults to latest. The old container is then removed and a new one started. Any data specified in volumes are safe since its mounted on the host and not inside the container. Obviously a more complicated project would have a more involved deployment, but simpler is better!

Integrating this deployment script into Rundeck, Jenkins or your build tool of choice is a piece of cake and isn’t covered here, but might be in a future post. This automation allows you to bridge the gap between building your code and running it on your servers, aka the last-mile problem of continuous delivery.

Acquisition, Docker and Rundeck

travelclick-logoZDirect, the wonderful company I work for has been acquired by TravelClick, a billion dollar hospitality solutions company.

First of all: Woohoo! I can’t be more excited to be around at this time to jump-start my career.

One of the changes to occur as soon as possible is the consolidation of our datacentre into TravelClick’s. One of our devs recently found out about Docker and got interested about its power (Hallelujah! Finally it’s not just me who’s interested). Later I bring up Rundeck, a solution to organizing our ad-hoc and scheduled jobs that will assist in the move to Docker.

Docker

docker-largeHis plan is to Dockerize everything we have running in the datacentre to make it easier for our applications to be run/deployed/tested/you-name-it. The bosses are fine with that and are backing him up. I’m excited since this is a fun project right up my alley.

Since I’m working my ass off trying to finish my degree, I’m only in one day of the week to wash bottles and offer some Docker expertise. Last Friday I had a good chat with the dev working on the Docker stuff. We chatted about Kubernetes, Swarm, load balancing, storage volumes, registries, cron and the state of our datacentre. It was quite productive since we bounced ideas off of each other. He’s always busy, juggling a hundred things at once so I offered to give him a hand setting up a Docker registry.

By the end of the day I had a secure Docker registry running on one of our servers with Jenkins building an example project (ZBot, our Hubot based chatroom robot), and pushing the image to the registry after it is built. An almost complete continuous delivery pipeline. What would make this better is a way to easily deploy the newly created Docker image to a host.

Rundeck

rundeck-logoRundeck is a job scheduler and runbook automation tool. Aka it makes it easy to define and run tasks on any of the servers in your datacentre from a nice web UI.

Currently, we have a lot of cron jobs across many servers scheduled to run periodically for integration, backup and maintenance. We also ssh into servers to run various commands for support and operations purposes.

Here’s the next piece to our puzzle. Rundeck can fit into many of our use-cases. A few of them are as follows:

  • Deployment automation (bridge the gap between Jenkins and the servers)
  • Run and monitor scheduled jobs
  • Logging and accountability for ad-hoc commands
  • Integrate with our chatroom, for all the ChatOps
  • Automate more of production

As we move towards Dockerizing all of our apps, we have to deal with the issue of what we’re going to do with the cron jobs and scheduled tasks that we run. Since we’re ultimately going to move datacentres it makes the most sense to take the cron jobs and turn them into scheduled jobs in Rundeck. That way we can easily manage them from a centralized view. Need to run this scheduled job on another machine? No problem, change where the job runs. Need to rerun the scheduled job again? Just click a button and the scheduled job runs again.

The developers wear many hats. Dev and ops are two of them. Because we’re jumping from mindset to mindset it makes sense to save time by automating the tedious while trying not to get in the way of others. Rundeck provides the automation and visibility to achieve this speed.

With the movement of putting all our apps and services into Docker containers, Rundeck will allow us to manage that change while being able to move fast.


If you’re interested in joining us on this action packed journey, we’re hiring.

Introducing ChatOps to my Workplace: Hubot

Hubot? You may not have heard of it, but its pretty much the workhorse of ChatOps. Hubot is a scriptable chatroom robot. It can integrate with many chat services and comes with a huge community of plugins and extensions.

In the previous post I talk about ChatOps, Slack, and how I plan on introducing it to my workplace.

Hubot is an IRC/campfire bot designed to give some character to your team’s channel. It has various commands for inserting photos in your chat, fetching stuff, and, indeed, running pre-configured commands.

There exists many other chatbots, but Hubot is the most popular. You can get scripts for anything: showing images, interacting with Jenkins, ship it squirrel, pager-duty, and hubot-plusplus, where the points don’t matter.

All of the plugins are written in coffeescript and follow a simple input/output design using regular expressions. Persistence is included as well, using Redis as the datastore.

Writing Hubot scripts can automate tasks while presenting a simple interface to interact with. Writing custom scripts adds useful insights and actions into your business and software.

So far I’ve written a custom Docker image that contains everything needed to run Hubot, bundled with a bunch of scripts. Everything is kept in a Git repository on our SCM server.

In the future, I plan on making the Docker images more extensible by separating the configuration from the code, then publishing it to my GitHub.

I also want to use Docker Compose to define a Redis container as a dependency of the Hubot container. This primarily allows for the Hubot container to be destroyed and rebuilt while the data stays safe in its own container.