Building a homelab – local DNS

This is the second post in a series on my experiences while Building a Homelab. The first post, focusing on the history, hardware, and OS, can be found here.

Having a number of networked devices at home presents some management overhead. You may find yourself asking “what was the IP address of that one laptop?” or just getting plain old tired of looking at IP addresses. One method people often use to manage their network is to assign DNS names to their devices. Instead of constantly typing in 192.168.1.1 you could assign it the domain name router.home. Entering router.home into your browser then transparently brings you to the same webpage as 192.168.1.1. This doesn’t only work for browsing the web: services such as SSH and FTP, and most other places where an IP address would normally be used, can likely use the friendlier domain name instead.

So how can this be done? It’s actually quite simple given you have an always-on computer on the same network as the rest of your devices, a router with DNS serving capabilities, or even a DNS provider such as Cloudflare. This article will focus on the DIY solution of running a DNS server on an always-on computer.

Before we get to how to set this up, let’s first explain what DNS is and how it works. Feel free to skip over this section if you’re already knowledgeable.

What is DNS?

DNS is a technology used to translate human-friendly domain names to IP addresses. For example, we can ask a DNS server “what is the IP address for the domain google.com?” The DNS server would then respond with an IP address for google.com, such as 172.217.1.174. DNS is used for almost every request your computer, phone, smart lightbulbs, and other devices make when they communicate with the internet.

Anyone who runs a website is using DNS whether they know it or not. The basic premise is that each domain name (e.g. mysite.com) will have a DNS record which points to an IP address. The IP address identifies the actual computer on the internet which traffic for mysite.com will be sent to.

An example of DNS being used can be for jonsimpson.ca. This site is hosted on a server that I pay for at DigitalOcean. That server has an IP address of 1.2.3.4 (a fictitious example). I use Cloudflare as the DNS provider for jonsimpson.ca. Anytime a user’s browser wants to go to jonsimpson.ca, it uses DNS to figure out that jonsimpson.ca is located at 1.2.3.4, then the user’s browser opens up a connection with the server at 1.2.3.4 to load this site.

This is quite a simplified definition of DNS as the system is distributed across the world, hierarchical, and involves hundreds of thousands, if not millions, of different entities. Cloudflare provides a more detailed explanation as to how DNS works, and Wikipedia has comprehensive coverage of multiple concerns relating to DNS. But what was explained earlier will provide enough context for this article.

Running a local DNS server

If there’s an always-on computer – whether that’s a spare computer or Raspberry Pi – a DNS server can run on it and provide DNS capabilities for the local network. Dnsmasq is a lightweight but powerful DNS server that has been around for a long time. Many hobbyists use Dnsmasq for their home environments since it’s quite simple to configure and get going. One minimal text file is all that’s needed for configuring a functional DNS service.

I chose to run Dnsmasq on my always-on server in a Docker container. When configuring Dnsmasq, for each device that I wanted to provide a domain name for, I added a line in the configuration mapping its IP address to the name I wanted to give it. For example, my router which lives at 192.168.1.1 was assigned router.home.mysite.com, and my server which lives at 192.168.1.2 was assigned server.home.mysite.com.

I then configured my router’s DHCP to tell all clients to use the DNS provided by the server (contact 192.168.1.2 for DNS), and configured some manually networked devices to explicitly use the DNS provided by the server. Now on all of my devices I can type in server.home.mysite.com anywhere I would type 192.168.1.2 – so much nicer compared to having to type in an entire IP address.

The configuration

I use Docker as a way to simplify the configuration and running of different services. Specifically, I use docker-compose to define the Dnsmasq Docker image to use, which ports should be opened, and where to find its configuration. Here’s the docker-compose.yml file I use:

The docker-compose file defines one dns service that uses the base image of strm/dnsmasq, as it’s one of the more popular Dnsmasq images available on hub.docker.com. The volumes option specifies that we map a config file located alongside the docker-compose.yml file at config/dnsmasq.conf into the container’s filesystem at /etc/dnsmasq.conf. This is done to allow the container to be recreated at any time while keeping the same configuration. Networking-wise, TCP and UDP port 53 are exposed (yes, DNS operates over TCP sometimes). The network_mode is set to the host’s network (Dnsmasq just doesn’t work without this). The NET_ADMIN capability is also added; it grants the container network administration privileges, although strictly speaking binding to privileged ports below 1024 is governed by the separate NET_BIND_SERVICE capability. The last option, restart (one of my favourite features of docker-compose), keeps the container running even when the host reboots or the container dies.

All of these docker-compose.yml options can be understood in more detail in Docker’s reference docs.

More importantly, here’s the dnsmasq.conf file I use to actually configure Dnsmasq’s DNS capabilities:

A lot of these settings were based off of the following blog post. Many of these options can be looked up online in the official documentation, therefore I will focus on the ones relevant to this article.

I have my Ubiquiti router handle providing DHCP for my network, therefore no-dhcp-interface=eno1 is set here to not provide any DHCP services to the local network, as eno1 is the interface my server uses to connect to the network.

When Dnsmasq needs to find the DNS record for something that it doesn’t know, it performs a request to an upstream DNS server. The server option is used for this and can be specified multiple times to provide redundancy in case one of these DNS servers is down. I’ve specified both the Google and Cloudflare DNS servers. In addition to this, the all-servers option results in all defined server entries being queried simultaneously. This has the benefit that one DNS server may respond quicker than the others, resulting in a net-faster response to the DNS query.

The most important part of this dnsmasq.conf configuration file is the set of lines at the end of the file that start with address=. This is Dnsmasq’s way to declare DNS mappings. For example, any device on my network performing a request for server.home.mysite.com will have 192.168.1.2 returned.
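As a rough illustration of what those address= lines look like, here’s a small Ruby sketch that prints them for the two devices mentioned earlier. The address=/name/ip syntax is Dnsmasq’s; the HOME_DEVICES hash and the address_lines helper are made up for this example and aren’t part of Dnsmasq.

```ruby
# Hypothetical device list; the names and addresses mirror the examples above.
HOME_DEVICES = {
  "router.home.mysite.com" => "192.168.1.1",
  "server.home.mysite.com" => "192.168.1.2",
}.freeze

# Builds one `address=/<name>/<ip>` line per device, which is Dnsmasq's
# syntax for mapping a domain name to a fixed IP address.
def address_lines(devices)
  devices.map { |name, ip| "address=/#{name}/#{ip}" }
end

puts address_lines(HOME_DEVICES)
# address=/router.home.mysite.com/192.168.1.1
# address=/server.home.mysite.com/192.168.1.2
```

Generating the lines from one list is only a convenience; writing them by hand in dnsmasq.conf works just as well for a handful of devices.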

The really cool thing with Dnsmasq’s address= option is that subdomains of any of these records return the same IP, unless declared explicitly otherwise. For example, blog.apps.site.jonsimpson.ca doesn’t exist in the configuration file, but performing a DNS request for it will return 192.168.1.2. This has the effect that “multiple services” can each have their own domain name, but all be served by the same IP address.
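To make the subdomain behaviour concrete, here’s a simplified Ruby model of the matching rule: an entry applies to its own domain and to any name ending in a dot followed by that domain, with the most specific entry winning. This is a sketch of the idea, not Dnsmasq’s actual implementation, and the LOCAL_ADDRESSES table and resolve_local helper are hypothetical names.

```ruby
# Hypothetical table standing in for the address= lines in dnsmasq.conf.
LOCAL_ADDRESSES = {
  "router.home.mysite.com" => "192.168.1.1",
  "server.home.mysite.com" => "192.168.1.2",
}.freeze

# Returns the IP for `name`, or nil when no entry covers it.
def resolve_local(name, addresses = LOCAL_ADDRESSES)
  # A domain matches itself and anything below it (note the leading dot,
  # so "notserver.home.mysite.com" does not match "server.home.mysite.com").
  matches = addresses.keys.select do |domain|
    name == domain || name.end_with?(".#{domain}")
  end
  # Prefer the longest, i.e. most specific, matching domain.
  addresses[matches.max_by(&:length)]
end

puts resolve_local("server.home.mysite.com")           # 192.168.1.2
puts resolve_local("blog.apps.server.home.mysite.com") # 192.168.1.2
p resolve_local("notserver.home.mysite.com")           # nil
```

Names that return nil here are the ones a real Dnsmasq instance would forward to its upstream servers.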

Conclusion

Hopefully this article gives some background about what DNS is, how it can be useful in a home environment, and how to set up and operate a Dnsmasq DNS server. A future post will build on top of the DNS functionality that has been set up here to provide multiple HTTP services running on separate domain names, all served by the same server, for the home network to use.

Twenty Six

2020 is the year I turn 26. October 1st is that day! As I have every year for the past seven years now, I reflect back on the past year by sharing some accomplishments and progress with life.

Travel

Given Covid-19, things have been different but manageable. Before the craziness started I travelled to Nashville in November for RubyConf (a conference for developers) with a number of teammates. We had a great time exploring the city and sampling the different foods over the few days we were there. I never realized how much of a party city Nashville is. The live music and bar scene makes me want to go back again with friends. I’ll have to grab a cowboy hat next time I’m there.

PEI lighthouse

In the new year there was a work trip to Montreal for a couple days. Great times were spent getting to know new colleagues from our greater team, and having fun with existing ones. The city never gets boring. This trip is my best celebrity claim to fame: at a speakeasy in old Montreal, Harrison Ford (of Blade Runner, Star Wars) showed up with a few people and had a discreet time. They didn’t want any attention, therefore the group I was with and I weren’t able to actually meet him. Oh well, at least he walked around a bit so that we could try to remember him a whole lot better.

Right before things started shutting down in Ottawa at the end of February, my mom and sister gave me a surprise visit. The highlights were hitting up the town with my friends and going to some new restaurants. Recommendations are for the Duelling Pianos event at the Sens House Saturday nights, and breakfast at the Manx on Elgin.

Isolating in Nova Scotia

Part way through the pandemic, I had the opportunity to travel out east to Nova Scotia with a few friends to stay at their family’s beach house. We had to quarantine for two weeks while out there but that wasn’t too difficult when the weather was hot, the beach was there, and light beer was plentiful. It was so great the first time that we decided to go back a second time (including a second quarantine) for an entire month. During these two trips I attended one of my best buddies’ wedding and his bachelor party, made apple cider, experienced the east coast cultural norms, and rekindled my love of rocks. Home base was Merigomish, Nova Scotia, but we also stayed in Fredericton, Moncton, Charlottetown, and French River, PEI. It was surreal to experience the relaxed Covid restrictions out in the eastern bubble. I’m thankful that I was able to work remotely while out there with no impact on my work. I’m glad to have been able to travel during the pandemic.

Fitness

To stay fit throughout the winter I invested in a smart trainer for my bike. Paired with the virtual cycling app Zwift, it had me crushing seasons of Brooklyn Nine-Nine while keeping my fitness up.

Fridays being days off during the summer months at Shopify gave my friends and me a lot of great summer days to get up to trouble. Many of the days were spent cycling around Ottawa and going to different beaches in the area. We frequented Aylmer beach on the Quebec side. It was a scenic hour-long cycle to a great sandy beach, perfect for very hot days. One time we went out to Sablon Beach to hang out at the large beach and camp over for a night. With all this travel and sun, it’s been satisfying to work on getting a nice, even tan.

Many cycling trips into Gatineau Park

A few friends and I signed up to run the Ottawa Race Weekend 10k. The last time I did this was in 2017! The official race was cancelled, but it could still be run any time over the summer, and anywhere, to still get ranked. On the last day to submit the results I was running a 5k and decided to see how much further I could go. Pacing myself and keeping good enough form allowed me to run the 10k successfully, and in a fair time, albeit with some breaks.

Work

At work my team and I have transitioned to working in a different problem space. It’s quite a refreshing feeling being immersed in a little-known area as it keeps me on my toes. There was a several month period where I was leading two teams of developers on two different projects – one being the old team and project that is wrapping up, the other being the new team and project. This was challenging as my time was split between the two teams. Prioritizing, delegating, and providing the right nudges to influence the projects were critical throughout this period, especially when one team was coming up to a big launch, and the other team was trying to get off the ground. At the end of the day, the launch was wildly successful, and the new team is just about to ship its first version of the service we’re building.

Earlier in 2020, our greater team moved from the Support org into the Trust org. We’re still solving problems for Support, but our scope has expanded to accelerate the rest of the business. This is a great opportunity for us that hasn’t fully come to fruition just yet. A lot of our services can be leveraged by other teams. The mindset we have now is building services that provide a whole lot of leverage and speed to other teams. Every year things change a whole lot on my team for the better. Three years in now and it’s still a wild ride.

Numbers

As always, here are some numbers on what I’ve accomplished over the year:

  • 110 km of running, 11 hours total
  • 1,039 km of cycling, 47 hours total
  • 1,010 GitHub contributions from work and personal projects
  • 3 books read, 5 on the go
  • 6 posts published on this blog, 5 unpublished
  • 2,373,358 steps, 1,642.99 km, 868,256 calories recorded via my Fitbit

Looking back, my running and cycling rival my 2017 numbers. It’s been a great time outside this year!

🍻 to another year!

Building a homelab – a walk through history and investing in new hardware

This is the first post in a series on my experiences while Building a Homelab. The second post focuses on setting up a local DNS server and can be found here.

I’ve had a particular interest in home computers and servers for a long time now. One of my experiences was wiring my childhood home up with CAT-5 ethernet to the rooms with TVs or computers and having them all connected to a 24 port 100 Mbps switch in the crawlspace. This was part of a master plan to provide different computers in the house with an internet connection (when WiFi wasn’t as good as it is today), TVs with smart media boxes (think Apple TV, Roku, and the like but 10 years ago), and to tie it all together, a home server for serving media and storing files.

The magazine Maximum PC was a major source for this inspiration as they had a number of captivating DIY articles for running your own home server, media streaming devices, and home networking. The memory is a bit rough around the edges, but these projects happened around the same time and on my own dollar – all for the satisfaction of having a bleeding edge entertainment system.

Around this time Microsoft had a product that had been out for a year called Windows Home Server. It was an OS which catered towards consumers and their home needs. Some of the features it had were network file shares for storing files, computer backup and restore, media sharing, and a number of extensions available from the community. I built a $400 box to run this OS and house two hard drives. The network switch in the crawlspace was a perfect place to put this headless server. Over many years this server was successfully used for computer backups, file storage, network bandwidth monitoring, and media serving to a number of PCs and media streaming boxes attached to TVs.

Two of the TVs in the house had Western Digital TV Live boxes for playing media off of the network. These devices were quite basic at the time: only YouTube, Flickr, and a handful of other services were available, and they lacked Netflix and the other now popular Internet streaming services. Instead, they were primarily built for streaming media off of the local network – in this case off of the home server file share. My family and I were able to watch movies and TV shows on demand from the comfort of our couch. This was crazy cool at the time as most people were still using physical media (DVD/Blu-ray) and streaming media had not taken off yet. I also vaguely remember hacking one of the boxes to put on a community-built firmware.

Windows Home Server was great at the time since it offered all of this functionality out of the box with simple configuration. I remember playing with BSD-based FreeNAS on old computers and being overwhelmed at all of the extra configuration needed to achieve something that you get out of the box with Windows Home Server. Additionally, the overhead of having to administer FreeNAS while only having a vague knowledge of Linux and BSD at the time wasn’t a selling point.

Now back to current times. I’m in the profession of software development, have been using various Linux distros for personal use on laptops and servers, and would now consider myself a sysadmin enthusiast. Living in my own place, I’ve been using my own Ubuntu-based laptop to run a Plex media server and stream content to my Roku Streaming Stick+ attached to my TV. The laptop’s 1 TB hard drive was filling up. It was also inconvenient to have this laptop constantly on for serving content.

Browsing Reddit, I came across r/homelab, a community of people interested in owning and managing servers for their own fun. It covers everything from datacenter server hardware to Raspberry Pis, networking, virtualization, operating systems, and applications. This subreddit gave me the idea of purchasing some decommissioned server hardware from eBay. I sat on the idea for a few months. Covid-19 eventually happened and with all my spare time I gave in to buying some hardware.

After a bunch of research on r/homelab about which servers are quiet, energy efficient, extendable, and will last a number of years, I settled on a Dell R520 with 2 x 6 cores at 2.4 GHz, 48 GB DDR3 RAM, 2 x 1 Gbit NICs and 8 x 3.5″ hard drive bays. I bought a 1 TB SSD as the boot drive and a refurbished 10 TB hard drive for storing data.

The front of the Dell R520, showing the 8 3.5″ drive bays and some of the internals.

Since I intended on running the ZFS filesystem on the data drive, many people gave the heads up that the Host Bus Adaptor (HBA) card (a piece of hardware which connects the SAS/SATA hard drives and SSDs to the motherboard) comes with the default Dell firmware. This default firmware caters towards always running some sort of hardware-based RAID setup, thus hiding the SMART status of all drives. With ZFS, accessing the SMART data for each drive is paramount for data integrity. To get around this limitation with the included HBA card, the homelab community has some unofficial firmware for it which exposes IT mode, basically a way to pass through each drive to the OS – completely bypassing any hardware RAID functionality. Some breath holding later and the HBA card now had the new firmware.

I bought a separate HBA card with the knowledge at the time that the one that comes with the Dell R520 didn’t have any IT mode firmware from the community. I ended up being wrong after a whole lot of investigation. Thankfully I should be able to flash new firmware on this card as well and sell it back on eBay.

A Dell Perc H310 Mini Mono HBA (Host Bus Adaptor) used in Dell servers for interfacing between the motherboard and SAS/SATA drives.

As the hardware was all being figured out, I was also researching and playing with different hypervisors – operating systems made for running multiple operating systems on the same hardware. The homelab community often refers to VMware ESXi, Proxmox VE, and even Unraid. I tried out the first two, as Unraid didn’t have an ISO available to test with and wasn’t free.

After going through the pain of making a USB stick bootable for an afternoon, I eventually got ESXi installing on the system. Poking around, it was interesting to see that VM storage was handled by having a physical disk formatted to a VMware format specific to storing multiple VMs – VMFS. For my goal of having one of the VMs take full control over a drive formatted with the ZFS filesystem, ESXi provides a feature called hardware passthrough which bypasses virtualization of the physical hardware. One big blocker for me was the restriction on the free version which limits VMs to a maximum of 8 vCPUs – a waste of resources when having 12 cores and not enough VMs to utilize them.

Next, I took a look at Proxmox by loading it up as a VM on ESXi. It was Debian based, which was a plus as I’m comfortable with systemd and Ubuntu systems already. The Proxmox UI appeared to have quite a few useful features, but didn’t feel like what I needed. I was much more comfortable with the terminal, and these graphical interfaces to manage things felt more like a limitation than a benefit. I could always SSH into Proxmox and manage things there, but there’s always the aspect of learning the intricacies of how this turnkey system was set up. Who knows what was default Debian configuration and what was modified by Proxmox. Not to mention, what if Docker or other software was out of date and couldn’t be upgraded? This would be an unnecessary limitation I could avoid if rolling my own.

Lastly, I went back to my roots – Ubuntu Server. I spun up a VM of it on ESXi. Since I’m quite used to the way Ubuntu works it was comfortable knowing what I could do. There were no 8 vCPU limitations with Ubuntu Server as the host OS – I can utilize all of the server’s resources. After some thinking I realized I didn’t have any need to run any VMs at the moment. In the past I’ve managed a number of VMs using QEMU on Ubuntu Server, so if the need arises again I can pull it off. The reason why I’m not using any VMs is that I’m using Docker for all of my application needs. I already have a few apps running in Docker containers on my laptop that I’ll eventually transfer over to the server. Next up, ZFS on Linux has been available for a while now in Ubuntu, giving me the confidence that the data drive can be formatted with ZFS without a problem.

The internals of the Dell R520 with the thermal cover removed. Note the row of six fans across the width of the case to keep things cool.

In the end I scrapped the idea of running a hypervisor such as ESXi with multiple VMs on top of it because my workloads all live in Docker containers instead. Ubuntu Server is more suitable since I am able to configure everything from an SSH console. If I may conjecture why the r/homelab community loves their VMs, it may be because many of the hobbyists are used to using them for their day jobs. There were a handful of folks who did run their own GUI-less, no-VM setups, but they were the minority.

In the end, Ubuntu Server 20.04 LTS was installed on a 1 TB SSD boot drive. A 10 TB HDD was formatted with ZFS in a single drive configuration. Docker daemon was installed from its official Apt repo, and a number of other non-root processes were installed from Nix and Nixpkgs.

Conclusion

There are a few more things I want to discuss regarding the home server. Some of those include using Nix and Nixpkgs in a server environment and some of the difficulties, setting up a local DNS server to provide domain name resolution for devices on the network and in Docker containers, a reverse proxy for the webapps running in Docker containers using the Caddy webserver, and some Datadog monitoring.

In the future I have plans to expand the amount of storage while at the same time introducing some redundancy with ZFS RAID-Z1, to dive into remotely accessing the local network via VPN or some other secure method, and to add better monitoring for uptime, ZFS notifications, OS notifications, and the like.

RubyConf 2019 Talks – Day 2

Here’s a continuation of the previous post covering day 1, this one instead on the talks I attended for Day 2 of RubyConf 2019! Headings are linked to a video of the talk.

Injecting Dependencies For Fun and Profit

Chris Hoffman discussed the basics and the benefits of dependency injection, mentioning that it’s an alternative to mocking when testing. The benefit of dependency injection is that all of the classes using it list their dependencies explicitly in the initializer. This benefits people who go to read that code later, especially new devs on the team, since all of a class’s dependencies are centralized in one location. It also makes testing classes in isolation easier since test doubles of the dependencies can be passed into the class’s initializer, compared to the implicit method of mocking objects which can lead to dependencies being forgotten deep in the class.
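Here’s a minimal sketch of what that looks like in Ruby. It isn’t code from the talk; SignupService and FakeMailer are names I made up to show a class declaring its dependencies in the initializer and a test passing in a double instead of mocking.

```ruby
class SignupService
  # Every collaborator is listed here, so readers see the dependencies at a glance.
  def initialize(mailer:, clock: Time)
    @mailer = mailer
    @clock = clock
  end

  def call(email)
    @mailer.deliver(to: email, subject: "Welcome!", sent_at: @clock.now)
    :ok
  end
end

# A hand-rolled test double that records what it was asked to deliver.
class FakeMailer
  attr_reader :deliveries

  def initialize
    @deliveries = []
  end

  def deliver(**message)
    @deliveries << message
  end
end

mailer = FakeMailer.new
SignupService.new(mailer: mailer).call("ada@example.com")
puts mailer.deliveries.first[:to] # ada@example.com
```

In production the same class would simply be handed a real mailer, so nothing about SignupService changes between test and live code.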

One of the interesting patterns that Chris’ company adopted (and that I don’t necessarily agree with) to manage dependencies with dependency injection in their codebase is to have a dependency god object. This object is initialized at the start of the program and contains a reference to each dependency in their system. This dependency object is then passed by reference into each class’s initializer. When a class needs a dependency it refers to the dependency in the dependency god object. This appears to be a purely functional way of using dependency injection compared to the more popular solution of using globally accessible dependency objects. dry-rb’s auto_inject is a common dependency injection library which uses the globally accessible dependency pattern.
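As I understood the pattern, it could be sketched roughly like this. The AppDependencies struct and ReportGenerator class are hypothetical and only illustrate the shape of the idea, not the code Chris showed.

```ruby
# One object built at startup that holds a reference to every dependency.
AppDependencies = Struct.new(:logger, :clock, keyword_init: true)

class ReportGenerator
  # The whole dependency object is passed in; the class picks what it needs.
  def initialize(deps)
    @deps = deps
  end

  def call(title)
    @deps.logger.call("generating #{title}")
    "#{title} (#{@deps.clock.call.year})"
  end
end

log_lines = []
deps = AppDependencies.new(
  logger: ->(line) { log_lines << line },
  clock: -> { Time.new(2019, 11, 19) }
)

puts ReportGenerator.new(deps).call("RubyConf notes") # RubyConf notes (2019)
```

The trade-off is visible even in this small example: the initializer no longer says which dependencies ReportGenerator actually uses, which is part of why I’m not fully sold on it.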

Overall, dependency injection is a great pattern for scaling medium to large codebases and making testing components simpler.

The Fewer the Concepts, the Better the Code

David Copeland presented the idea of programming with fewer concepts for better code comprehension between many developers. This talk was a bit of a shock since it goes against many of my ideals, but I fully enjoyed challenging my beliefs on the subject. For context, David’s team was just himself and a lead with a non-computer-science background at a small company. When David’s code was reviewed by his lead, the feedback was to make it simpler and easier to understand. Over time David figured out that what his lead was pushing him towards was using more generic programming language concepts, such as for, if, and return, that are common to most procedural programming languages.

The talk then went into an example of some code which a Rubyist would have written with each, map, implicit returns, etc. That code contains many more concepts that a developer would have to know about compared to the same code written with far fewer concepts. One benefit of writing code with these generic programming language concepts is that learning new programming languages can be much simpler since they all generally share the same concepts. Onboarding new developers onto the team can also be much faster if the dev only has to understand a small subset of programming language features. The Go programming language was held up as a comparison, as it has a smaller number of concepts than other programming languages.
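To give a feel for the difference, here’s my own small example (not the one from the talk) of the same calculation written both ways. The method names and the orders data are invented for illustration.

```ruby
# Idiomatic Ruby: blocks, Enumerable#select and #sum, and an implicit return.
def total_paid_idiomatic(orders)
  orders.select { |order| order[:paid] }.sum { |order| order[:amount] }
end

# Fewer concepts: a loop, a conditional, a running total, an explicit return.
def total_paid_plain(orders)
  total = 0
  for order in orders
    if order[:paid]
      total = total + order[:amount]
    end
  end
  return total
end

orders = [
  { amount: 10, paid: true },
  { amount: 5,  paid: false },
  { amount: 7,  paid: true },
]

puts total_paid_idiomatic(orders) # 17
puts total_paid_plain(orders)     # 17
```

The second version is longer, but someone coming from almost any procedural language could read it without knowing anything Ruby-specific.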

At the end of the talk I asked whether the benefits of this style of programming might be outweighed by making it easier to introduce bugs. Using functional programming features such as the Enumerable collection of methods in Ruby can make code much easier to reason about. David agreed that more bugs are definitely a possibility, but he didn’t have any anecdotal evidence of it from his team.

Disk is Fast, Memory is Slow. Forget all you Think you Know

Another controversial talk I wanted to challenge my beliefs with, this time questioning the principle that memory is always faster than disk. Daniel Magliola presented this conundrum in the form of an improvement he was attempting to make. The improvement was making metrics available for his cluster of forking Unicorn processes. When using Prometheus to collect metrics from apps, it queries each app at a specific endpoint to read in the metrics and their values. The problem with forking web servers is that when the request comes in to return all of the metrics, the request is dispatched to one of the Unicorn processes, which returns only that process’s metrics rather than those of the whole group of forked Unicorn processes as it should.

Daniel went down the rabbit hole on this GitHub issue looking for performant ways to support metrics collection for forking webservers. With the goal of keeping the recording of metrics as close to 1 microsecond as possible, the solutions he investigated included storing metrics in Redis, using the Ruby class PStore which transactionally stores a hash to a file, and using the tenderlove/mmap library to share a memory mapped hash with each process. Unfortunately none of the potential solutions could beat 1 microsecond.

The solution Daniel discovered, and expertly discussed throughout his talk, was using plain old files and file locks. This solution ended up only taking ~6 microseconds per metric write and was much more reliable and simpler than dealing with mmap’ed memory or running more infrastructure. The title of the talk was misleading, which Daniel touched on near the end: the file-based solution benefitted from operating system optimizations that cache writes in main memory and disk caches. As far as the program is concerned, the file was updated successfully on disk, with proper locking to prevent multiple writers tripping over each other, but this was only possible thanks to the performant abstraction our modern operating systems provide us with.
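The general technique of a file plus an exclusive lock can be sketched in a few lines of Ruby. This is not Daniel’s implementation or the Prometheus client’s storage format; increment_metric and the one-number-per-file layout are assumptions I made to keep the example small.

```ruby
require "tmpdir"

# Hypothetical helper: adds `by` to a numeric counter stored in the file at `path`.
def increment_metric(path, by = 1.0)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |file|
    file.flock(File::LOCK_EX)      # block until we are the only writer
    current = file.read.to_f       # an empty file reads as 0.0
    updated = current + by
    file.rewind
    file.write(updated.to_s)
    file.flush
    file.truncate(file.pos)        # drop leftover bytes from a longer old value
    updated
  end                              # closing the file releases the lock
end

path = File.join(Dir.mktmpdir, "requests_total.metric")
3.times { increment_metric(path) }
puts File.read(path) # 3.0
```

Each forked process could call the same helper against the same file, and whichever process handles the metrics request only has to read the files back.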

Digging up Code Graves in Ruby

Noah Matisoff went into how code graves, aka dead code, come about. Oftentimes developers modify existing parts of code and stop calling other pieces of code, either in the same file or a different one. That code may still have tests, so test code coverage metrics can’t really help here. Feature flags where 100% of users are going through one code path and not the other are also prime candidates for code that doesn’t need to exist.

Code coverage tools can be run in production or in development, and help give a good idea of what parts of code are never reached. Static code analysis tools can also help determine if code isn’t referenced anywhere, but it is a hard problem to solve with Ruby since the language isn’t statically typed and is quite dynamic. Another solution to help keep dead code out of codebases was to add todos to the codebase. Todos can be set up to remind developers to remove bits of code from the codebase or perform other actions. Some automations were shown to make todos more actionable.
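Ruby ships with a Coverage module in its standard library, which is enough to try the coverage idea on a toy file. The sketch below is my own and not from the talk; greeter.rb and its two methods are invented, and a real setup would collect this data from a running app rather than a temp file.

```ruby
require "coverage"
require "tmpdir"

# A made-up source file with one method we call and one we never do.
source = <<~RUBY
  def greet
    ["hello", "world"].join(" ")
  end

  def legacy_greet
    ["good", "day"].join(" ")
  end
RUBY

path = File.join(Dir.mktmpdir, "greeter.rb")
File.write(path, source)

Coverage.start   # must be called before the file is loaded
require path
greet            # only one of the two methods is ever called

# Coverage.result maps each file to per-line execution counts
# (nil for lines that hold no executable code).
counts = Coverage.result.find { |file, _| file.end_with?("greeter.rb") }.last
unreached = counts.each_index.select { |i| counts[i] == 0 }.map { |i| i + 1 }

puts "lines never executed: #{unreached.inspect}"
```

Lines that stay at zero after a long enough period of real traffic are good candidates to investigate, though a zero count on its own doesn’t prove the code is dead.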

RubyConf 2019 Talks – Day 1

I attended RubyConf this year in Nashville, Tennessee with a few of my teammates from Shopify. What a great city and a great first time attending RubyConf!

I took notes on many of the talks I attended and here are the summaries for the first of the three days. Day 2 is available here. Headings that have links go to a video of the talk.

Matz Keynote – Ruby Progress Report

Matz started off the conference with his talk on the upcoming Ruby 3, covering some of its upcoming features and the timeline. Ruby 3 will absolutely be released at the end of 2020, removing half-baked features if necessary to keep it on track. This probably also means that if the 3×3 performance goals aren’t fully met, then it’ll still be shipped. He spent some time talking about being a Rubyist, as the majority of attendees were new to RubyConf, encouraging people to have discussions and contribute to the future of Ruby.

Matz went into some of the new features going into Ruby 2.7 and Ruby 3, and some of the features or experiments being removed. Some of the biggest hype was around the addition of pattern matching, the just in time compiler (JIT), emojis (though Matz didn’t think so), type checking, static analysis, and an improved concurrency model via guilds (think JavaScript workers) and fibers. Some features or experiments that were removed were the .: operator (shorthand for Object#method) and the pipeline operator, and the automatic conversion of a hash to keyword arguments is being deprecated. Some attendees were vocal about getting more rationale about removing these features, and Matz was more than accommodating to explain in more detail.
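For a taste of the pattern matching syntax, here’s a small example of my own (not from the keynote). It uses the case/in form that arrived as an experimental feature in Ruby 2.7, so that version prints a warning when it runs; describe_response and the response hashes are made up.

```ruby
# Matches on the shape of a hash and binds parts of it to local variables.
def describe_response(response)
  case response
  in { status: 200, body: String => body }
    "ok: #{body}"
  in { status: Integer => code } if code >= 400
    "error #{code}"
  else
    "unknown"
  end
end

puts describe_response({ status: 200, body: "hello" }) # ok: hello
puts describe_response({ status: 404 })                # error 404
puts describe_response({ status: 302 })                # unknown
```

Hash patterns only require the listed keys to be present, which is why the extra or missing keys in these responses don’t get in the way.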

No Return: Beyond Transactions in Code and Life

Avdi Grimm’s talk focused on the unlifelike constraints that are imposed on users when doing things online. For example, filling out a survey or form online may result in the user losing their progress if they exit their browser. In real life this doesn’t happen, so why should we constrain these transactions so much? Avdi recommends that when building out these processes, these transactions, we should instead think of them as a narrative, one stream of information sharing that only requires the user to complete a step when it’s really necessary. Avdi related this to our code by suggesting a few concepts that can make our programs more narrative-like, such as embracing state and history of data by utilizing event-sourcing/storming and temporal modelling, failing forwards in code by treating exceptions as data and expecting failures, and interdependence in code by using back pressure and circuit breakers.

Investigative Metaprogramming

Betsy Haibel talked about an effective way of figuring out a bug during a potentially painful upgrade of their Rails app to 6.0. Through the use of metaprogramming, she was able to fix a frozen hash modification bug that would have otherwise been quite difficult to debug. She accomplished this feat by monkey patching the Hash#freeze method to save a backtrace whenever it is called. Then, in the Hash#[]= method, she rescued any runtime exceptions that occurred and started a debugger session. This helped her narrow down exactly where the hash had been frozen earlier on in the code.
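To make the idea concrete, here is a minimal sketch of that kind of patch. It is not the code from the talk: the FreezeTracker module and the frozen_at accessor are names made up for illustration, and instead of opening a debugger it re-raises the error with the recorded location attached.

```ruby
# Hypothetical helper, not the code shown in the talk.
module FreezeTracker
  # Remember where this hash was frozen, then freeze it as usual.
  def freeze
    @frozen_at ||= caller unless frozen?
    super
  end

  # Backtrace recorded by the first #freeze call (nil if never frozen this way).
  def frozen_at
    @frozen_at
  end

  def []=(key, value)
    super
  rescue FrozenError => e
    # A real debugging session might start a debugger here instead.
    raise e.class, "#{e.message}\n  frozen at: #{frozen_at&.first}"
  end
end

Hash.prepend(FreezeTracker)

config = { env: "test" }
config.freeze # the line we want to find again later

begin
  config[:env] = "production"
rescue FrozenError => e
  puts e.message # the message now includes where the hash was frozen
end
```

Patching a core class like this is only something to do temporarily while hunting a bug, since every hash in the process pays for the extra bookkeeping.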

Betsy then went into detail on what metaprogramming is, and how it differs from language to language. Java, for example, has distinct load-time and runtime phases when the application is starting up. Ruby, on the other hand, loads classes and executes code at the same time, since it all happens together during runtime.

Lastly, the talk provided a pattern for using metaprogramming to investigate bugs or other problems in code. Through reflection, recording, and reviewing, the same pattern can be applied to help debug even the most complex code. The reflection step consists of determining which part of the code early on leads to the program failing. The moment that it occurs can be found by inspecting the backtrace at that point in time. Next is the recording step, where we patch the code identified in the reflection step so that it saves the backtrace. This can be done by saving the caller to an instance variable or class variable, or by logging it. To get a foothold into the code, the patching can be accomplished by using Module#prepend or even the TracePoint library. Lastly, reviewing is the step in which we observe an event in the system (eg. an Exception) and either pause the world or log some info for further reading. An example of this would be to put in a breakpoint or debugger statement, optionally making it a conditional breakpoint to help filter through the many occurrences.
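As a rough illustration of the record and review steps using TracePoint rather than a monkey patch, consider the sketch below. The Order class, the CANCELLED_AT hash, and the RECORDER constant are all invented for this example; the review step simply logs instead of pausing the program.

```ruby
# Hypothetical class under investigation.
class Order
  attr_reader :status

  def initialize
    @status = :open
  end

  def cancel!
    @status = :cancelled
  end

  def ship!
    raise "cannot ship a cancelled order" if @status == :cancelled
    @status = :shipped
  end
end

# Record: remember how each order got cancelled, keyed by object id.
CANCELLED_AT = {}

RECORDER = TracePoint.new(:call) do |tp|
  if tp.defined_class == Order && tp.method_id == :cancel!
    CANCELLED_AT[tp.self.object_id] = caller
  end
end

order = Order.new

RECORDER.enable do
  order.cancel!
  begin
    order.ship!
  rescue RuntimeError => e
    # Review: log the failure together with the recorded backtrace.
    puts "#{e.message}; cancel! was reached via:"
    puts CANCELLED_AT.fetch(order.object_id).first(3)
  end
end
```

The tracing is only active inside the enable block, which keeps the overhead away from the rest of the program.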

Ruby Ate My DSL!

Daniel Azuma presented on what DSLs (Domain Specific Languages) are, their benefits, and how they work. One of the biggest takeaways from this talk was that DSLs are more like Domain Specific Ruby: we’re not building our own language, so users of these DSLs should fully expect to be able to use Ruby while using them.

Daniel also went on to explain how to build your own DSL, mentioning a few gotchas as he went. One of those was that since instance_eval is used throughout DSL implementations, we should be aware of users clobbering existing instance variables and methods. One solution is to have a naming convention for the DSL’s internal instance variables and private methods (eg. prefixing with underscore characters). Another way of preventing this clobbering is to separate the DSL objects from the implementation which operates on those objects. This gives the user of the DSL the minimum surface area needed to set the DSL up, removing the possibility of overwriting instance variables or methods that the DSL needs internally to run.
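Both suggestions can be combined in a small sketch. The Pipeline and Builder classes below are hypothetical and only meant to show the shape of the idea: the block is evaluated against a tiny builder object with underscore-prefixed internals, while the object that does the real work is never exposed to instance_eval.

```ruby
# Hypothetical example; not code from the talk.
class Pipeline
  # The DSL surface: only what the user needs to describe a pipeline.
  class Builder
    def initialize
      @_steps = [] # underscore prefix to avoid clashing with user ivars
    end

    def step(&block)
      @_steps << block
      self
    end

    def _steps
      @_steps.dup
    end
  end

  attr_reader :steps

  def self.define(&block)
    builder = Builder.new
    builder.instance_eval(&block) # user code only ever touches the builder
    new(builder._steps)
  end

  def initialize(steps)
    @steps = steps
  end

  # The implementation: runs each step, feeding the result into the next.
  def run(input)
    steps.reduce(input) { |value, step| step.call(value) }
  end
end

pipeline = Pipeline.define do
  @steps = "oops" # lands on the Builder, so Pipeline's own @steps is safe
  step { |n| n + 1 }
  step { |n| n * 2 }
end

puts pipeline.run(3) # prints 8
```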

Another recommendation was to design DSLs which look and behave like classes. Specifically, whenever blocks are used, have them map to an instance of a class. RSpec is a great example of this, where describe and it calls are blocks which create instances of classes. The it call creates instances that belong to the describe instance. Where things get more interesting and lifelike is if helper methods and instance variables defined higher up in a DSL can be used further down in the DSL. This is the concept of lexical scoping.
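A toy version of this idea is sketched below. It is not how RSpec is actually implemented; MiniGroup and its methods are invented names. Each describe block becomes a class, nested groups subclass their parent, and each it block runs against a fresh instance, so helpers defined higher up remain available further down.

```ruby
# Hypothetical miniature of a describe/it style DSL.
class MiniGroup
  class << self
    def examples
      @examples ||= {}
    end

    def children
      @children ||= []
    end

    # Each describe block becomes a subclass of the enclosing group,
    # so methods defined in outer blocks are inherited by inner ones.
    def describe(_name, &body)
      group = Class.new(self, &body)
      children << group
      group
    end

    def it(name, &body)
      examples[name] = body
    end

    # Runs every example against a fresh instance of its group.
    def run
      results = examples.to_h { |name, body| [name, new.instance_exec(&body)] }
      children.each { |child| results.merge!(child.run) }
      results
    end
  end
end

calculator = MiniGroup.describe "Calculator" do
  def two
    2
  end

  it("adds") { two + two == 4 }

  describe "nested group" do
    it("still sees the helper") { two * 3 == 6 }
  end
end

calculator.run.each { |name, passed| puts "#{name}: #{passed}" }
```

Both examples report true here, including the nested one that relies on the two helper from the outer block.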

Lastly, constants are a pain to work with in Ruby. They don’t behave as expected when using blocks and evals. Some DSLs provide alternatives to constants, for example RSpec’s let.

mruby/c: Running on Less Than 64KB RAM Microcontroller

Hitoshi HASUMI presented mruby/c, an mruby implementation focused on very resource constrained devices. Where mruby focuses on devices with 400k of memory, mruby/c is for devices with 40k of memory. Devices with this small amount of memory can be microcontrollers which are cheap to run and offer many benefits over devices which run operating systems. Some benefits are instantaneous startup and being more secure.

Hitoshi focused his talk on the work he performed building out IoT devices to monitor temperatures of ingredients at a sake brewery in Japan. These devices had a way for workers to measure temperatures, display the reading, and send that reading back to a server for further processing. Hitoshi made it clear that there are many different things that could go wrong in the intense environment of a brewery: high temperatures, hardware failure, resource constraints, etc.

The latter half of the talk was focused on how mruby/c works and how to use it. mruby/c uses the same bytecode as mruby, but removes a few features that regular Ruby developers are used to having, namely modules and the stdlib. mruby/c compiles down to C files and provides its own realtime operating system. Hitoshi finished the talk by plugging a number of libraries and tools that he’s developed to help with debugging, testing, and generating code: mrubyc-debugger, mrubyc-test, and mrubyc-utils, respectively.

Statistically Optimal API Timeouts

Daniel Ackerman discussed the widespread use of APIs and how timeouts for those remote requests are not being configured efficiently. He introduced the problem by arguing that timeouts should be optimized for the best user experience, meaning the fastest response. Given a slow-responding API request, we should time out once we have high confidence that the request is taking too long. He prefaced the rest of his talk by explaining that setting the timeout to the 95th percentile of response times is a quick but reasonably accurate estimate.
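The 95th percentile rule of thumb is easy to try out. The sketch below uses the nearest-rank definition of a percentile, and both the percentile helper and the sample response times are made up for illustration; it is only the quick estimate, not the statistically optimal method from the talk.

```ruby
# Nearest-rank percentile of a list of numbers (hypothetical helper).
def percentile(samples, pct)
  raise ArgumentError, "samples must not be empty" if samples.empty?

  sorted = samples.sort
  rank = (pct * sorted.size / 100.0).ceil # 1-based rank
  sorted[[rank, 1].max - 1]
end

# Made-up response times in seconds for a single API endpoint.
response_times = [
  0.12, 0.15, 0.11, 0.35, 0.14, 0.13, 0.90, 0.16, 0.12, 0.18,
  0.13, 0.17, 0.14, 0.22, 0.15, 0.11, 0.19, 0.13, 0.16, 2.50
]

timeout = percentile(response_times, 95)
puts "Suggested timeout: #{timeout}s" # prints "Suggested timeout: 0.9s"
```

The resulting value could then be used wherever the HTTP client accepts a timeout, for example Net::HTTP’s read_timeout.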

Since APIs are all different, Daniel presents a mathematical proof of determining statistically optimal API request timeouts. By analyzing a histogram of the API response times, we can determine the optimal timeout that balances user experience with timing out requests. Slow API requests often mean that the service is under heavy load or not responding.

The Ultimate Guide to Ruby Timeouts was mentioned as a go-to source for configuring timeouts and knowing which exceptions are raised for many commonly used libraries. Definitely a useful resource. Daniel finished his talk with a plug for his gem rb_maxima, a library which makes it easy to use the Maxima computer algebra system from Ruby.

Collective Problem Solving in Software

Jessica Kerr talked about the idea of cameratas – the concept of a group of people who discuss and influence the trends of a certain area. More formally, the term camerata comes from the Florentine Camerata, a group of Renaissance musicians and artists gathered in Florence, Italy who helped develop the genre of opera. Their work was revolutionary at the time.

Jessica then related it to the great ideas that have come out of ThoughtWorks, a London-based consulting company. Their incredible contributions over the years, which include the concepts of Agile, CI, CD, and DevOps to name a few, have influenced the entire software industry and set the bar higher.

In general, great teams make great people. Software teams are special in that they consist of the connections between the people in the team as well as the tools that the team uses. Jessica relates this to a big socio-technical system, introducing the term symmathesy to capture the idea that teams and their tools learn from each other. No one person has a full understanding of the systems they work on, therefore the more symmathesy going on in the team, the better the team and system are. This is similar to the way senior developers are able to understand the bigger picture when it comes to teams, tools, and people, whereas new developers are usually concerned with their own small bit of code.

The talk closed by encouraging dev teams to incentivize putting the team ahead of the individual, and to grow teams by increasing the flow of information sharing and the back and forth with their tools. Lastly, great developers are symmathesized.


Summaries of Day 2’s talks are available here