Dec 8, 2019

RubyConf 2019 Talks - Day 2

Here’s a continuation of the previous post covering day 1, this one instead on the talks I attended for Day 2 of RubyConf 2019! Headings are linked to a video of the talk.

Injecting Dependencies For Fun and Profit

Chris Hoffman discussed the basics and the benefits of dependency injection, mentioning that it’s an alternative to mocking when testing. The benefit of dependency injection is that all of the classes using dependency injection list their dependencies explicitly in the initializer. This benefits people who go to read that code later, especially new devs to the team, since all of a classes dependencies are centralized in one location. It also makes testing classes in isolation easier since test doubles of the dependencies can be passed into the classes initializer, compared to the implicit method of mocking objects which can lead to dependencies being forgotten deep in the class.

One of the interesting patterns that Chris’ company adopted (and that I don’t necessarily agree with) to manage dependencies with dependency injection in their codebase is to have a dependency god object. This object is initialized at the start of the program and contains a reference to each dependency in their system. This dependency object is then passed by reference into each classes initializer. When a class needs a dependency it refers to the dependency in the dependency god object. This appears to be a purely functional way of using dependency injection compared to the more popular solution of using globally accessible dependency objects. dry-rb‘s auto_inject is a common dependency injection library which uses the globally accessible dependency pattern.

Overall, dependency injection is a great pattern for scaling medium to large codebases and making testing components simpler.

The Fewer the Concepts, the Better the Code

David Copeland presented the idea of programming with fewer concepts for better code comprehension between many developers. This talk was a bit of a shock since it goes against many of my ideals, but I fully enjoyed challenging my beliefs on the subject. For context, David’s team was just himself and a lead with a non-computer-science background at a small company. When David’s code was reviewed by his lead, the code was critiqued to be simpler and easier to understand. Over time David figured out that using more generic programming language concepts, such as for, if, return, etc. common to most procedural programming languages was what his lead was pushing him towards.

The talk then went into an example of some code which a Rubyist would have written with each, map, implicit returns, etc. contains many more concepts that a developer would have to know about compared to the same code written with much fewer concepts. An example of the benefit of writing code to use these generic programming language concepts is that learning to use new programming languages can be much simpler since they all generally have the same shared concepts. Onboarding new developers onto the team can be much faster if the dev only has to understand a small subset of programming language features. The Go programming language was compared to this practice as it has a smaller number of concepts than other programming languages.

At the end of the talk I asked the question about whether this style of programming may outweigh benefits by making it easier to introduce more bugs. Using functional programming language features such as the <a href="https://ruby-doc.org/core/Enumerable.html">Enumerable</a> collection of functions in Ruby can make code much easier to reason about. David agreed that more bugs are definitely a possibility, but he doesn’t have anecdotal evidence from his team.

Disk is Fast, Memory is Slow. Forget all you Think you Know

Another controversial talk I wanted to challenge my beliefs with, this time challenging the principle of memory not always being faster than disk. Daniel Magliola presented this conundrum in the form of a improvement he was attempting to make. The improvement was making metrics available for his cluster of forking Unicorn processes. When using Prometheus to collect metrics from apps, it queries each app at a specific endpoint to read in the metrics and their values. The problem with forking web servers is that when the request comes in to return all of the metrics, the request is dispatched to one of the Unicorn processes, only returning that process’ metrics, not the group of forked Unicorn processes as it should.

Daniel went down the rabbit hole on this GitHub issue looking for performant ways to support metrics collection for forking webservers. With the goal of keeping the recording of metrics as close to 1 microsecond as possible, some solutions that were investigated involved storing metrics in Redis, the Ruby class PStore which transactionally stores a hash to a file, and tenderlove/mmap library to share a memory mapped hash to each process. Unfortunately none of the potential solutions could beat 1 microsecond.

The solution Daniel discovered, and expertly discussed throughout his talk was using plain old files and file locks. This solution ended up only taking ~6 microseconds per metric write and was much more reliable and simpler than dealing with mmap’ed memory, or more running infrastructure. The title of the talk was misleading, and was touched on near the end of the talk as this file-based solution benefitted from operating system optimizations made to cache writes in main memory and disk caches. According to the program the file was updated successfully to disk, with proper locking to prevent multiple writers tripping over each other, but this was all possible by the performant abstraction our modern operating systems provide us with.

Digging up Code Graves in Ruby

Noah Matisoff went into how code graves, aka dead code comes about. Oftentimes developers can be modifying existing parts of code and stop calling other pieces of code, either in the same or different file. That code may still have tests, so test code coverage metrics can’t really help here. Feature flags, where 100% of users are going through one code path and not the other are also prime candidates for code that doesn’t need to exist.

Code coverage tools can be run in production, or in development and help give a good idea of what parts of code are never reached. Static code analysis tools can also help determine if code isn’t referenced anywhere, but it is a hard problem to solve with Ruby since the language isn’t typed and is quite dynamic. Another solution to help keep dead code out of codebases was to add todos to the codebase. Todos can be setup to remind developers to remove bits of code from the codebase or perform other actions. Some automations were shown to make todos more actionable.