Why “&” doesn’t actually break your HTML URLs

Writing tests for some code which generated HTML ended up surfacing one peculiarity with how HTML encodes URLs. The valid URL https://example.com?a=b&c=d would always get modified when inserted into HTML like so: <a href="https://example.com?a=b&amp;c=d">foo</a>.

One of my teammates commented on this during a code review – why the & character is converted to &amp; in the resulting HTML. That URL didn’t look right since the &amp; would break the URL query string.

Even more confusing was that the URL still worked, since Google Chrome and other browsers converted the &amp; in the HTML back to &. Were the browsers just being helpful by handling these developer mistakes, much like they already do when closing unclosed HTML elements?

The fake bug hunt

Over two hair-pulling days of reading GitHub issues, StackOverflow, HTML standards, source code, and more, a divide in understanding became clear. One group of people saw this as a bug in their library of choice, and another group understood that it wasn’t a bug.

I was definitely in the former group until I finally found a helpful blog post clearing up the confusion. A StackOverflow answer also concisely summed up why this is, in a few quick sentences.

Simply stated, lone & characters in HTML are invalid and must be escaped to &amp;.

HTML and XML share a common ancestor in SGML, and they share this rule: a & character by itself is invalid since & is the escape character that begins a character entity reference (e.g. &amp;, &lt;).

The confusion arises when people don’t know that this rule exists. Many, including myself, were blaming their HTML parsing libraries, such as Nokogiri and libxml2. Others blamed their web app of choice for sending invalid HTML or XML that their HTML parser didn’t know how to deal with.

Conclusion

Another way of understanding the same problem is that a URL has its own rules about which characters must be encoded, and HTML has different encoding rules of its own. So when a URL is embedded in HTML, it may look invalid, but that is only because HTML’s escaping rules have been applied on top of the URL. This can lead to funky-looking URLs, but rest assured that an HTML parsing library or a browser will properly encode and decode any data stored within HTML.

This explains why our browsers see &amp; in the raw HTML and know to convert it back to &. This also confirms that it is completely fine seeing &amp; characters in tests comparing HTML.
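The round trip can be seen from Ruby. The following is a minimal sketch that assumes only the escaping helpers from Ruby’s standard library, not the code that was under test in the story above: escaping the URL for HTML produces the &amp; form, and unescaping gives back the original URL.

```ruby
require "cgi/escape"

url = "https://example.com?a=b&c=d"

# Escape the URL the way it must appear inside an HTML attribute.
escaped = CGI.escapeHTML(url)
puts %(<a href="#{escaped}">foo</a>)

# A browser or HTML parser reverses the escaping when it reads the attribute.
unescaped = CGI.unescapeHTML(escaped)
puts unescaped == url
```

A test that compares generated HTML should therefore expect the escaped form, while a test that compares the parsed attribute value should expect the plain URL.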

Twenty-four!

It’s hard to come up with the content for this post while fending off a sickness, but it’s a yearly ceremony for me to look back and reflect on the prior year. As always, what better time to do this than on my birthday!

One word can really describe my primary focus over the past year: Career. This time last year I was just about to pass the 90 day mark at Shopify.

Whether it’s been building close and trustworthy friendships with teammates and other colleagues, levelling up new hires through mentoring, or continually delivering impactful work – this year has been nothing short of exemplary.

Let’s get right into things! Since this time last year I moved to downtown Ottawa and am now living without roommates. Crazy to think that it only happened 10 months ago since it feels like forever, but I am enjoying all the perks of having no roommates and downtown life.

This summer was one of my most active to date. There was always something going on during the week or weekend from July straight through August. I had the opportunity to travel with a friend to his home town of Fredericton, New Brunswick. Since it was my first time on the east coast, I was expecting an east-coast accent out of everyone, but the place seemed more like Ontario than not. It was a great time hanging out with his friends and attending a party at the local hotel.

I had a great time with friends at two music festivals: Bluesfest and Escapade. Escapade was especially fun since there were a number of great acts: Alesso, Tchami, Zedd, and Kaskade. One private festival I went to was about 15 people camping at a buddy’s lakeside property in Quebec. A DJ booth was set up and the trance music was going on late into the night.

Being so close to downtown has its perks – I was within walking distance to both festivals.

Another bunch of cool moments were centred around exploring other Shopify offices. Barrel Hall in Waterloo was the coolest looking since it was once a distillery. It still has all of the characteristic aging barrels and wooden structure. Montreal’s office has the best artwork, and Montreal looks like the most liveable city. Toronto’s offices and the city in general were a grand party since it’s mostly new to me, but I have friends, family, and colleagues everywhere.

Barrel Hall, Shopify’s Waterloo Office, definitely had the most character.

Even though I didn’t do as much cycling as last year, this year I took advantage of Ottawa’s Sunday Bikedays. On Sundays throughout the summer certain parkways were closed for the morning. This allowed for coasting down some long stretches of smooth roadway. The midpoint of some of these outings was spent taking a break at a local brewpub. Some friends joined me every once in a while too!

One of the many destinations – where the Rideau Canal meets the Ottawa River.

Investing and personal finance have become a hobby of mine, and ever more important as I get older. It’s better to learn the do’s and don’ts of personal finance earlier rather than later. A year of listening to related podcasts, plenty of reading, and managing my savings has enabled me to go from zero to pretty competent. I’m lucky to have a representative group of people to bounce ideas and plans off of.

I went to my first conference, BSides Ottawa, which was quite fun. I met a number of colleagues and played in my first capture the flag event. I found out that I can defend, but am not too good at attacking. I’ll try again to attend this year!

December was when a few others and I started a rewrite of Shopify’s Help Centre. Unknown to us, there was quite a lot of feature creep – everything from “that one little feature that’s existed forever” to adding multiple language support. This resulted in the project taking seven months, but we’re glad to have done it. Throughout the process we started and built our own kick-ass team. When the rewrite shipped it went off without a hitch! 🎇 Now all of our current projects build on the benefits that this rewrite brought.

Some of the team which traveled to Montreal to launch the Help Centre.

I attended a few training sessions that should benefit my career – Visualizing Software Architecture with the C4 Model, as well as Agile Scrum training. The latter has definitely transformed my team for the better.

There were a lot of work events – planned or unplanned, official or unofficial – which I’m pretty grateful to have experienced with friends and colleagues. Alas, there are too many to mention.

For example: that one time we had a marching band…

Here’s to another year of learning, growth, exploration, and good times.

Hunting for segmentation faults in Ruby programs

I was working on building a content management engine for Shopify’s next generation Help Centre. Code named Brodie, it was equivalent to the Jekyll static site generator added to Ruby on Rails, but instead of rendering all the pages up front at compile time, each page is generated when it is requested by the client.

Brodie used a Ruby Gem called Redcarpet for the Markdown rendering. Redcarpet worked wonderfully, but Brodie ended up having a severe bug due to the extensive usage of it. The way Redcarpet was being used in Brodie resulted in periodic segmentation faults (segfaults) while rendering Markdown. These segfaults were causing many 502 and 503 errors when some unknown pages were being visited. It was such an issue that all the web servers in the cluster would go down for some time until they restarted automatically.

How do I Redcarpet?

To better explain the issue and its resolution, it is best to have an understanding of how Redcarpet, and really any other text renderer, works. Here is a simple example:

In the above example, the code defines the Markdown that is to be rendered to HTML, sets up the Redcarpet::Markdown configuration object, and then finally parses and renders the Markdown to HTML.

But wait! There’s more. Jekyll and Brodie both use the Liquid language (made by Shopify!) to make it easier to write and manage content. Liquid provides control flow structures, variables, and functions. One useful function allows including the contents of other files into the current file (the equivalent of partials in Rails). Here is an example that uses the Liquid include function:

As we can see in the example above, the code renders the Liquid and Markdown to HTML. This is achieved by rendering the Liquid first, then passing the result of that into the Markdown renderer. Additionally, the Liquid include function injected the contents of _included.liquid exactly where the include function was called in main.md.

Now that the basics of Markdown and Liquid rendering have been explained, it is now possible to understand this segfault issue.

“Where is this segfault coming from???”

When my team and I were close to launching the new Help Centre that used Brodie, the custom-built Liquid and Markdown rendering engine, the app would crash due to segmentation faults. When the servers were put under load with many requests coming in, the segfaults and resulting downtime were magnified. It was clear from load testing that even a small amount of traffic would bring down the entire site and keep it down.

The segfaulting would lead to servers becoming unavailable until Kubernetes, the cluster manager, detected that those servers were unhealthy and restarted them. It took 30-60 seconds for a pod to come back online. With the system being under load, it was only a couple of minutes before all the servers in the cluster were down. When this happened, the app returned HTTP 502 and 503 errors to any client requesting a page – never a good sign.

The only message that was present in the logs before the app died was the following:

Assertion failed: (md->work_bufs[BUFFER_BLOCK].size == 0), function sd_markdown_render, file markdown.c, line 2544.

Apparently, Ruby crashed in a random Redcarpet C function call. No stacktrace or helpful logging followed this message. The logs did not even include which page the client requested, as the usual Rails request log line is only written after the HTTP request finishes. This Assertion failed message was a lead, but didn’t help much since it does not reference what caused it.

I have dealt with other Redcarpet issues in the past, where methods that have been extended in Redcarpet to add custom behaviour have thrown exceptions. Sometimes these exceptions have caused the request to fail and a stacktrace of the issue to show up. Other times it has resulted in a segfault with a similar Redcarpet C function in the message. Ultimately, writing better code fixed this earlier situation.

My intuition told me that an error was being thrown while rendering the page, causing this segfault to occur. I attempted an experiment where I added some rescue blocks to the Redcarpet methods that we extended. This would keep any exception raised by buggy code in those methods from escaping into Redcarpet, hopefully resulting in no segfaults. If that fix succeeded, I could safely assume that fixing the code which raised the error would be the end of the story.

I shipped this experiment to production. Things went well until the next day. Sometime overnight the page that caused the segfaults was hit, and the operational dashboards recorded the cluster going down and rebooting. At least this confirmed that the Redcarpet extensions were not at fault.

Getting lucky

While playing around with things, a page was found, out of sheer luck, that could cause the app to segfault repeatedly. Visiting this page once did not cause the server to crash or return a 500 response, but refreshing the page multiple times did cause the server to crash. Since this app was running multiple threads in the local development and production environments to answer requests in parallel, it was possible that a shared Redcarpet data structure was getting clobbered by multiple threads writing to it at the same time. This is actually a recurring issue according to the community.

Recursive rendering

While I was discussing the issue further with my larger team of developers, the idea came up of removing any sort of cross-thread sharing of Redcarpet’s configuration object. One of the other developers shipped a PR which gave each thread its own Redcarpet configuration object, but this did not end up fixing the problem.

A tree, showing the order in which nodes are traversed using the depth-first search algorithm. (CC 3.0)

Building on top of this developer’s work, I knew that it was possible for the Redcarpet renderer to be called recursively based on the nature of the Liquid and Markdown content files. As described earlier, it is possible for one content file to include another content file. As we saw in the examples earlier in this article, when a content file is being rendered, the rendering pauses on that content file and descends into the included file to render it, then returns to where it left off in the original content file. This behaviour is exactly like the depth-first search algorithm from graph theory.

After making this breakthrough it was simple to understand what to try next: each time Redcarpet is called to render some Markdown, always create a new Redcarpet configuration object. This should solve the issue of multiple thread writes, as well as the recursive writes. Even though there is extra overhead with creating a new Redcarpet configuration object each time a content file is rendered, it is a reliable workaround that bypasses Redcarpet’s single-thread, single-writer limitation.

I coded and shipped this fix, and it worked!
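The difference between a shared renderer and a fresh renderer per call can be illustrated without Redcarpet at all. The sketch below is a toy model, not Redcarpet’s implementation: ToyRenderer, PAGES, render_shared, and render_fresh are all hypothetical names, and the “work buffer must be empty” check only mimics the assertion seen in the crash log.

```ruby
# Toy stand-in for a renderer that keeps scratch state on the instance.
# This is NOT Redcarpet; it only mimics the "work buffer must be empty
# when rendering starts" assertion from the crash log.
class ToyRenderer
  class BusyError < StandardError; end

  def initialize
    @work_buffer = []
  end

  # Lines of the form "include:NAME" are handed to the block, the way a
  # Liquid include pulls another content file into the current one.
  def render(text)
    raise BusyError, "work buffer not empty" unless @work_buffer.empty?

    text.each_line(chomp: true) do |line|
      if line.start_with?("include:")
        @work_buffer << yield(line.delete_prefix("include:"))
      else
        @work_buffer << "<p>#{line}</p>"
      end
    end

    html = @work_buffer.join
    @work_buffer.clear
    html
  end
end

# Hypothetical content files: "main" includes "footer".
PAGES = {
  "main" => "Welcome\ninclude:footer",
  "footer" => "Contact us"
}.freeze

SHARED = ToyRenderer.new

# Re-enters the same renderer while its buffer is still in use.
def render_shared(name)
  SHARED.render(PAGES.fetch(name)) { |included| render_shared(included) }
end

# Creates a new renderer for every content file, including nested ones.
def render_fresh(name)
  ToyRenderer.new.render(PAGES.fetch(name)) { |included| render_fresh(included) }
end

puts render_fresh("main")

begin
  render_shared("main")
rescue ToyRenderer::BusyError => error
  puts "shared renderer failed: #{error.message}"
end
```

In this toy the failure is a Ruby exception. In a C extension, the same kind of re-entry can surface as a failed assertion or a segfault instead.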

Refreshing that problematic page multiple times, no matter how many times, never crashed the app. The production servers were back to handling their original capacity and one developer was feeling very relieved.

Takeaways

I learned a considerable number of things from this debugging experience. Even when using battle-tested software (like Redcarpet), there may be use cases that are not exactly supported, yet are not documented as unsupported either. Additionally, the Redcarpet library is now rarely maintained. Knowing the limitations up front can save time and frustration. One of the main reasons why this article was written was that there was no other writing about this issue and the workarounds. Hopefully it will help save time for developers in the future who run into similar issues.

It was valuable to bounce ideas off of other team members. If I had not put out my ideas and had these discussions, I would not have understood the problem as well as I did. Even the potential fix that a teammate of mine shipped, which did not end up working, helped me understand the problem better.

Drawing out parts of the control flow on paper to really understand how the app renders content files builds a better mental model of what actually goes on inside the app. It is one thing to have a high level overview of how different components interact with each other, but it is an entirely different level of understanding to factually know what exactly happens. This can be extended to the intricacies of the software libraries being used. In this situation, knowing the internals and behaviour of Ruby on Rails, Liquid, and Redcarpet made it a lot easier to understand what was going on.

Lastly, you always feel like a boss when you fix big, complicated problems such as this one.

Twilio’s TaskRouter Quickstart

My team and I are exploring different services and technologies in the area of contact centres. We develop and maintain the tools for over 1000 support agents, with the number rapidly rising. Making smart, long-term business and technology decisions is paramount. One of the technologies we looked into was Twilio and its ecosystem – specifically TaskRouter.

Twilio’s TaskRouter provides a clean interface for building contact centres. Its goal is to take the tedious infrastructure and plumbing work out of building a custom contact centre, exposing the right APIs to implement domain logic. TaskRouter is a high-level service since it orchestrates voice, SMS, and other communication channels with the ability to assign incoming interactions across a workforce of agents ready to take those interactions.

Twilio-Ruby

To get a head start at understanding how TaskRouter works, I spent a day looking at Twilio’s Ruby quickstart guide for TaskRouter. Wow, was I in for a frustrating time.

The quickstart guide takes the reader through a number of steps, both inside of the Twilio Console as well as building a small Ruby Sinatra app. After completing the quickstart the reader should have a fully functioning call centre with an interactive voice response (IVR) to greet and queue any user that calls in.

One of the things that made the quickstart harder to complete was that the Ruby code examples included throughout used an older version of the twilio-ruby gem. Because of this, the code examples didn’t work with the latest version. This was both a bad and a good thing. Bad in that the existing code examples wouldn’t work out of the box, but good in that I had to put some extra effort into learning where the docs and other sources of help exist, and gained a deeper understanding of how the Twilio API works.

I compiled a list of resources that would assist anyone going through the same or a similar situation. It certainly helped me complete the TaskRouter quickstart.

  • The README for the twilio-ruby gem provided a great overview of what functionality it provides and how the gem is to be used
  • The v4 to v5 upgrade guide for the twilio-ruby gem showed that there was some sense to this chaos by providing the rationale and examples for updating old versions of the twilio-ruby code to the latest (v5). This was where I had my moment of understanding for the quickstart code examples.
  • Using JWT tokens was part of the last section of the quickstart. Since twilio-ruby changed the way it uses tokens, its code examples had to be updated too. The main Twilio docs on JWT go into the intricacies of building policies contained within JWT tokens.
  • My lead/manager was quite happy when I mentioned to him that the twilio-ruby gem no longer uses title case in situations where camel case or snake case would have better suited Ruby styling. TwiML was affected by this for a number of gem versions up until v5. Since TwiML is used frequently throughout the quickstart, the docs for using TwiML in twilio-ruby helped during those times.
  • Lastly, if all else fails, feel free to reference my resulting code from the TaskRouter quickstart. It’s available here on GitHub.

How Do Symmetric and Public Key Encryption Work?

With the release of Rails 5.2 and the changes with how secrets are securely stored, I thought it would be timely to write about the benefits and downsides of secrets management in Rails. It would be valuable to compare how Rails handles secrets, how Shopify handles secrets, and a few other methods from the open source community. On my journey to write about this I got caught up in explaining how symmetric and public key encryption work, so the comparison of different Rails secret management gems will have to wait until another post.

Managing secrets is now more challenging

A majority of applications created these days integrate with other applications – whether it’s for communicating with other business-critical systems, or purely operational such as log aggregation. Secrets such as usernames, passwords, and API keys are used by these apps in production to communicate with other systems securely.

The Configuration Management movement, and later the DevOps movement, rallied around and popularized a wide array of methodologies and tools for managing secrets in production. Moving from a small, artisanal, hand-crafted set of long-running servers to the modern short-lifetime cloud instance paradigm now requires the discipline to manage secrets securely and repeatably, with the agility to revoke and update credentials in a matter of hours if not minutes.

While there are many ways to handle secrets while developing, testing, and deploying Rails applications, it’s important to bring up the benefits and downsides of the different methods, particularly around production. Different levels of security, usability, and adoption exist with different technologies. Public/private key encryption, of which RSA is a well-known example, is one of these technologies. Symmetric key encryption is another common one.

There exist many ways to handle secrets within Rails and webapps in general. It’s important to understand the underlying concepts before settling on one method or another because making the wrong decision may result in secrets being insecure, or the security being too hard to use.

Let’s first discuss the different types of encryption that are characteristic of the majority of secret management libraries and products out there.

Symmetric Key Encryption

Symmetric key encryption may be the simplest form of encryption to understand, but don’t let that trick you into thinking that it’s not secure. Symmetric key encryption involves one key used to both encrypt and decrypt data. This key will have to be kept secret and only be shared with trusted people and systems. Once secrets are encrypted with the key, that encrypted data can be readily shared and transferred without worry of the unencrypted data being read.

A simple example helps to explain symmetric key encryption. The most straightforward method utilizes the binary XOR function. (This example is not representative of state-of-the-art symmetric key encryption algorithms in use, but it does get the point across.) The binary XOR function means “one or the other, but not both”. Here is an example that shows the complete set of inputs and outputs for one binary digit:

1 XOR 1 = 0
1 XOR 0 = 1
0 XOR 1 = 1
0 XOR 0 = 0

A more complicated example would be:

10101010 XOR 01010101 = 11111111
11111111 XOR 11111111 = 00000000
11111111 XOR 01010101 = 10101010

Note that lines 1 and 3 are related. The output of line 1 is the first input of line 3, and the second input of line 1 is also the second input of line 3. Notice that the output of line 3 is the same as the first input of line 1. As demonstrated here, XOR-ing a result a second time with the same value returns the original input. A further example will show this property.

Given the property that any higher form of data representation can be broken down to binary, we can then show the example of hexadecimal digits being XOR’ed with another parameter.

12345678 XOR deadbeef = cc99e897

Given that the key is the hexadecimal value deadbeef and the data to be encrypted is 12345678, the result of the XOR is the incomprehensible cc99e897. Guess what? This cc99e897 is encrypted. It can be saved and passed around freely. The only way to get the secret input (i.e. 12345678) back is to XOR the encrypted value again with the key deadbeef. Let’s see this happen!

cc99e897 XOR deadbeef = 12345678

Fact check it yourself if you don’t believe me, but we just decrypted the data! This is the simplest example of course, so there’s a lot more that goes into symmetric key encryption to keep it secure. Things like block-based and stream-based algorithms, and larger key sizes, augment the simple XOR algorithm to make it more secure. It may be simple for someone who wants to break the encryption to guess the key in this example, but it becomes much harder the longer the key is.

This is what makes symmetric key encryption so powerful – the ability to encrypt and decrypt data with a single key. With this property comes the need to keep this single key secret and separate from the data. When symmetric key encryption is used in practice, the fewer people and systems that have the key, the better. Humans can easily lose the key, leave jobs, or worse: share the key with people of malicious intent.
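One way to fact check the XOR examples is a few lines of Ruby, where ^ is the integer XOR operator. This is only a check of the arithmetic above, not a usable encryption scheme.

```ruby
# XOR-ing with the same key twice returns the original value.
key  = 0xdeadbeef
data = 0x12345678

encrypted = data ^ key
decrypted = encrypted ^ key

puts encrypted.to_s(16) # cc99e897
puts decrypted.to_s(16) # 12345678

# The 8-bit example from earlier works the same way.
byte_result = 0b10101010 ^ 0b01010101
puts byte_result.to_s(2) # 11111111
```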

Public Key Encryption

Quite opposite to how symmetric key encryption works, public key encryption (also called asymmetric key encryption; RSA is a well-known example) uses two distinct keys. In its simplest form the public key is used for encryption and the private key is used for decryption. This means the user who encrypts the data does not need the ability to decrypt it. Put plainly, it allows for anyone to encrypt data with the public key while the owner of the private key is the only one able to decrypt the data. The public key can be shared with anyone without compromising the security of the encrypted data.

One tradeoff between symmetric and public key encryption is that the private key (the key used to decrypt data) is never shared with other parties, whereas in symmetric key encryption the same key is shared by everyone who encrypts or decrypts. A downside of public key encryption is that there are multiple keys to manage, which brings a higher level of overhead compared to symmetric key encryption.

Let’s dig into a simple example. Given a public key (n=55, e=3) and a private key (n=55, d=27) we can show the math behind public key encryption. (These numbers were fetched from here).

Encrypting

To encrypt data the function is:

c = m^e mod n

Where m is the data to encrypt, e is part of the public key, mod is the modulus function, n is the shared modulus, and c is the encrypted data.

For the number 42 to be encrypted we can plug it into the formula quite simply:

c = 42^3 mod 55
c = 3

c = 3 is our encrypted data.

Decrypting

Decrypting takes a similar route. For this a similar formula is used:

m = c^d mod n

Where c is the encrypted data, d is part of the private key, mod is the modulus function, n is the shared modulus, and m is the decrypted data. Let’s decrypt the encrypted data c = 3:

m = 3^27 mod 55
m = 42

And there we have it, our decrypted data is back!

As we can see, a separate key is used for encryption and decryption. It’s worth restating that this example is very simplified. Many more mathematical optimizations and much larger key sizes are used to make public key encryption secure.
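The same arithmetic can be checked in Ruby. This is a sketch of the toy numbers only (n = 55, e = 3, d = 27), relying on Integer#pow, which performs modular exponentiation when given a second argument.

```ruby
# Toy RSA key pair from the example; far too small for real use.
n = 55 # shared modulus
e = 3  # public exponent
d = 27 # private exponent

message = 42

# Encrypt with the public key: c = m^e mod n
ciphertext = message.pow(e, n)

# Decrypt with the private key: m = c^d mod n
recovered = ciphertext.pow(d, n)

puts ciphertext # 3
puts recovered  # 42
```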

Signing – a freebie with public key encryption

Another benefit to using RSA public and private keys is that given the private key is only held by one user, that user can sign a piece of data to verify that it was them who actually sent it. Anyone who has the matching public key can verify that the data was signed by the private key and that the data was not tampered with during transit.

When Bob needs to receive data from Alice and Bob needs to be sure it was sent by Alice, as well as not tampered with while being sent, Alice can hash the data and then encrypt that hash with her private key. This encrypted hash is then sent along with the data to Bob. Bob can then use Alice’s public key to decrypt the hash and compare it to a hash of the data that he computes himself. If both of the hashes match, Bob knows that the data was truly from Alice and was not tampered with while being sent to him.
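The Alice and Bob exchange can be sketched with Ruby’s bundled OpenSSL bindings. The 2048-bit key size, the SHA-256 digest, and the message text are illustrative assumptions, and the library handles the hashing and padding details internally rather than exposing the “encrypt the hash” step directly.

```ruby
require "openssl"

# Alice generates a key pair and shares only the public half.
alice_key = OpenSSL::PKey::RSA.new(2048)
alice_public_key = alice_key.public_key

digest = OpenSSL::Digest.new("SHA256")
data = "Transfer 10 dollars to Bob"

# Alice hashes the data and signs the hash with her private key.
signature = alice_key.sign(digest, data)

# Bob needs only the public key, the data, and the signature.
valid = alice_public_key.verify(digest, signature, data)

# Any change to the data makes verification fail.
tampered = alice_public_key.verify(digest, signature, "Transfer 1000 dollars to Bob")

puts valid
puts tampered
```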

Wrapping up

To pick one method of encryption as the general winner at this abstract level is nonsensical. It makes sense to have a use case and pick the best encryption method for it by finding the best fit at the abstract level first, then finding a library which offers that method of encryption.

A following post will go into the tradeoffs between different encryption methods in relation to keeping secrets in Ruby on Rails applications. It will take a practical approach, explaining some of the benefits of one encryption method over another, and then give some examples of well-known libraries for each category.

Parallel GraphQL Resolvers with Futures

My team and I are building a GraphQL service that wraps multiple RESTful JSON services. The GraphQL server connects to backend services such as Zendesk, Salesforce, and even Shopify itself.

Our use case involves returning results from these backend services all from the same GraphQL query. When the GraphQL server goes out to query all of these backend services, each backend service can take multiple seconds to respond. This is a terrible experience if queries take many seconds to complete.

Since we’re running the GraphQL server in Ruby, we don’t get the nice asynchronous IO that would come with the Node.js version of GraphQL. Because of this, the GraphQL resolvers run serially instead of in parallel – thus a GraphQL query to five backend services which take one second each to fetch data from will result in the query taking five seconds to run.

For our use case, having a GraphQL query that takes five seconds is a bad experience. What we would prefer is 2 seconds or less. This means performing some optimizations when GraphQL goes to do the HTTP requests to the backend services. Our idea is to parallelize those HTTP requests.

First Approaches

To parallelize those HTTP requests we took a look at non-blocking HTTP libraries, different GraphQL resolvers, and Ruby concurrency primitives.

Typhoeus

Knowing that running the HTTP requests in parallel is the direction to explore, we first took a look at the Ruby library Typhoeus. Typhoeus offers a simple abstraction for performing parallel HTTP requests by wrapping the C library libcurl. Below is one of the many possible ways to use Typhoeus.

After playing around with Typhoeus, we quickly found out that it wasn’t going to work without extending the GraphQL Ruby library. It became clear that it was nontrivial to wrap a GraphQL resolver’s life cycle with a Hydra from Typhoeus. A Hydra is essentially a Future that runs multiple HTTP requests in parallel and returns when all requests are complete.

Lazy Execution

We also took a look at GraphQL Ruby’s lazy execution features. We had hoped that lazy execution would automatically optimize by running resolvers in parallel. It didn’t. Oh well.

We also tried a perverted version of lazy execution. I can’t remember why or how we came up with this method, but it was obviously overcomplicated for no good reason and didn’t work 😆

Threads and Futures

We looked back and understood the shortcomings of the earlier methods – namely, we had to find a concurrency method that would allow us to do the HTTP requests in the background without blocking the main thread until it needed the data. Based on this understanding we took a look at some Ruby concurrency primitives – both Futures (from the Concurrent Ruby library), and Threads.

I highly recommend using higher-order concurrency primitives such as Futures because of their well-defined and simple APIs, but for hastily hacking something together to see if it would work, I experimented with Threads.

My teammate ended up figuring out a working example of Futures faster than I could hack my threads example together. (I’m glad they did, since we’ll see why next.) Here is a simple use of Futures in GraphQL:

It’s not clear at first, but according to the GraphQL Ruby docs, any GraphQL resolver can return the data or can return something that can then return the data. In the code example above, we use the latter by returning a Concurrent::Future in each resolver, and having the lazy_resolve(Concurrent::Future, :value!) in the GraphQL schema. This means that when a resolver returns a Concurrent::Future, the lazy_resolve part tells GraphQL Ruby to call :value! on the future when it really needs the data.

What does all of this mean? When GraphQL goes to fulfill a query, all the resolvers involved with the query quickly spawn Futures that start executing in the background. GraphQL then moves to the phase where it builds the result. Since it now needs the data from the Futures, it calls the potentially blocking operation value! on each Future.

The beautiful thing here is that we don’t have to worry about whether the Futures have finished fetching their data yet. This is because of the powerful contract we get with using Futures – the call to value! (or even just value) will block until the data is available.
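To make that contract concrete, here is a minimal sketch that uses nothing but Ruby’s built-in Thread. ToyFuture and fetch_slowly are hypothetical names made up for this illustration, and the sleep stands in for a slow HTTP request. This is not how Concurrent Ruby or GraphQL Ruby are implemented, just the same two-phase shape: spawn everything first, then block on value! only when the data is needed.

```ruby
# A toy stand-in for Concurrent::Future, built only on Ruby's Thread.
# Hypothetical class for illustration; not the Concurrent Ruby implementation.
class ToyFuture
  def self.execute(&block)
    new(&block)
  end

  def initialize(&block)
    # The work starts immediately on a background thread.
    @thread = Thread.new(&block)
  end

  # Blocks until the background work has finished, then returns its result.
  # Thread#value re-raises any exception raised inside the thread.
  def value!
    @thread.value
  end
end

# Hypothetical "resolver": returns a future right away instead of the data.
def fetch_slowly(name)
  ToyFuture.execute do
    sleep 0.2 # stands in for a slow HTTP request
    "#{name} data"
  end
end

# Phase 1: every resolver is called and hands back a future immediately.
pending = {
  products: fetch_slowly("products"),
  orders: fetch_slowly("orders"),
  customers: fetch_slowly("customers")
}

# Phase 2: the result is built by calling value! on each future,
# blocking only for the ones that have not finished yet.
result = pending.transform_values(&:value!)
```

Because all three futures start before the first value! call, the total wait should be close to the slowest single request rather than the sum of all three.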

Conclusion

We ended up settling on the last design – utilizing Futures to allow the main thread to push as much asynchronous work as possible into the background.

As seen through our thought process, all that we needed was to find a way to start execution of a long-running HTTP request, and give back control to the main thread as fast as possible. It was quite clear throughout the early ideas of utilizing concurrent HTTP request libraries (Typhoeus) that we were on the right track, but weren’t understanding the problem perfectly.

Part of that was not understanding the GraphQL Ruby library. Part of it was also being fuzzy on our concurrency primitives and libraries. Once we had taken a look at GraphQL Ruby’s lazy loading features, it became clear to us that we needed to kick off the HTTP request and immediately give back control to the GraphQL Ruby library. Once we understood this, the solution became clear, and we became confident in it after building some prototypes that used Futures.

I enjoyed the problem solving process we went through, as well as the writing that resulted from it. The process ended up teaching both of us some valuable engineering lessons about collaborative, up-front prototyping and design, since neither of us could have achieved this outcome alone. Additionally, writing about this success can help others facing the same problem, and introduce them to the different techniques that we met along the way.

Zero to One Hundred in Six Months: Notes on Learning Ruby and Rails

When they say you’ll like learning and programming in Ruby, they really mean it. In my experience, learning and professionally using Ruby and Ruby on Rails day-to-day has been quite straightforward and friendly. The rate of learning Ruby and Rails is limited only by how fast you’re able to obtain and use that knowledge from resources online, in a book, or from other people.

It’s common for people who join Shopify to not know how to program in Ruby, yet be required to. Ruby’s community has produced a great many beginner-to-intermediate guides that help newcomers quickly get up to speed with programming in Ruby. At Shopify, since the feedback is so fast, you’re able to get into an intense virtuous cycle of learning. Since you’re able to code and ship fast, you’re able to learn faster.

From personal experience I found it quite useful to do a deep-dive into Rails before starting to learn the full Ruby language. I focused on reading the entire Agile Web Development with Rails 5 book, which consists of a short primer on Ruby, then the bulk of the book on how to develop an online store using Rails, and lastly an in-depth look into each Rails module. I completed this book over the two weeks before starting at Shopify to give myself a head start at learning.

Roughly the first two months spent at Shopify were a split of working on small tasks by myself, pair-programming with others, and reading a number of Ruby and Rails articles. At the end of two months I found myself able to take pieces of work from our weekly sprints and complete them to the end without feeling like I was slow, and without requiring a team member to guide me through the entire change.

Code reviews over GitHub on the changes that I and others made provided a strong signal on how well my Ruby and Rails knowledge and style had progressed. Code reviews for my code at the start consisted of a lot of comments on style and better methods to use. As more and more code reviews were performed over time my intuition and knowledge increased, resulting in better code and fewer review comments. The bite-sized improvements gained in each code review slowly built up my knowledge and helped guide me towards areas of further learning.

Mastery of Ruby and Rails is gained over months to years of constant use. This is where the lesser-known and obscure language features are understood and put to use, or explicitly not put to use (I’m looking at you, metaprogramming!). Some examples are the unary & operator, bindings, and even Ruby internals such as object allocation and how Ruby programs are interpreted.
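As a small taste of one of those features, here is a rough sketch of what the unary & operator does. The names (names, exclaim, twice) are made up for this example.

```ruby
names = ["ruby", "rails"]

# &:upcase converts the Symbol to a Proc (via Symbol#to_proc)
# and passes that Proc to map as its block.
shouted = names.map(&:upcase)

# The same operator turns any Proc or lambda into a block argument.
exclaim = ->(word) { "#{word}!" }
excited = names.map(&exclaim)

# In a parameter list, & goes the other way: it captures the block
# given to the method as a Proc object.
def twice(&block)
  [block.call(1), block.call(2)]
end

doubled = twice { |n| n * 2 }
```

Here shouted ends up as ["RUBY", "RAILS"], excited as ["ruby!", "rails!"], and doubled as [2, 4].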

Coming from the statically-typed Java world, Ruby and Rails are INSANE with the things you can do since Ruby is a dynamic language. My favourite dynamic-language features so far are Module#prepend for inheritance, and the ability to live-reload code.
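For anyone who hasn’t met Module#prepend yet, a minimal sketch looks something like this. Greeter and Shouting are hypothetical names chosen for the example.

```ruby
# Hypothetical class and module names, for illustration only.
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

# A prepended module sits before the class in the method lookup chain,
# so its greet runs first and can reach the original through super.
module Shouting
  def greet(name)
    super.upcase + "!"
  end
end

Greeter.prepend(Shouting)

greeting = Greeter.new.greet("Ruby")
lookup_order = Greeter.ancestors.first(2)
```

After the prepend, greeting is "HELLO, RUBY!" and lookup_order is [Shouting, Greeter], which shows the module being consulted ahead of the class it wraps.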

After a sufficient amount of time gathering knowledge and experience, you gain the ability to help others along their path of learning. This not only benefits their understanding, but it also reinforces your knowledge of the subject.

Some of the things I look forward to in the future are learning about optimizing Rails apps, demystifying metaprogramming, reading Ruby Under a Microscope, and digging into the video back-catalogue of Destroy All Software. I hope you have a good journey too!

What the hack? – or how my first capture the flag went

The 2017 BSides Security Conference, just outside of Ottawa, was a two-day event from October 5th to 6th. It was packed with talks, lock picking, and a capture the flag (CTF) competition. Pretty great for a free conference.

On the second day of the conference I decided to join one of the Shopify CTF teams since it looked like a ton of fun. Actually, I think the deep house playing 24/7 was what lured me into the dim and crowded CTF room of the conference centre. I subbed in for one of my friends on Shopify’s Red Team, which was suitably named for Shopify’s second CTF team. Shopify’s first team was named the Blue Team.

I thought I knew what CTFs were all about – hacking challenges, they say. But I was completely unprepared. My so-called 10+ years of listening to the Security Now podcast didn’t exactly prepare me for the hands-on experience required for CTFs. It was quite the learning experience since most of the flags remaining on day 2 were difficult to capture for a newbie.

Having some background in security and hacking helps, though it doesn’t bridge the gap to the hands-on hacking experience and intuition required to solve CTF challenges. These challenges require experience and practice in thinking like an attacker.

For example, it’s one thing to understand that data can be hidden in images via steganography, but it’s another thing completely to actually extract the hidden data from an image.

Instead of wasting time on finding unknown flags, I focused on the topics I have experience with. Most of the flags I focused on were WEP and WPA cracking with aircrack-ng, and its associated collection tools. I was not able to inject packets with my setup, but luckily some other competitors did the hard work for me. After a few hours of unsuccessful attempts to crack the Wi-Fi networks I conceded that my attempts weren’t working.

I moved on to a new flag that involved breaking into an old, exploitable version of Joomla. After asking for some help from a teammate we found a script on exploit-db that would raise privileges to admin for any user. After running the exploit it took me a bit to figure out that it had run successfully, since the flag was inside a Page that was locked for editing by someone else. The ‘locked for editing’ state didn’t allow reading the Page, but once I figured out that the Page had a context menu to unlock it, I was able to view the flag. That made me facepalm both at Joomla’s UI and my inability to figure that out sooner.

After a day filled with a couple of muffins, a few slices of pizza, and countless teas the CTF concluded around 5pm. Winners were announced and thankfully our team didn’t fail too hard. I came out of the competition having met a bunch of colleagues from different parts of the company, and with a sense of what to expect in future CTFs. I’ll definitely be attending another CTF.

My team, the Red Team, placed somewhere around 5th or 6th. Not too bad for having a handicap on it like myself. I’ve got to hand it to Shopify, they have some seriously talented Security folks! No wonder Shopify’s Blue Team came in first!

Twenty-Three!

I’m Twenty-Three dude! The first of October fell on a Sunday this year. The personal milestones and events that have occurred certainly make this a year for me to remember. Here are a few of the main ones to note:

  • Completed my Bachelor of Computer Science – five years of hard work has finally(?) paid off
  • Started a new job at Shopify – one of the hottest, fastest-growing, and most prestigious unicorn companies
  • Ran my first 10k race during the Ottawa Race Weekend
  • Travelled along the California Coast, around the heart of New York, and all over Montreal

The celebrations started on Saturday morning. A bunch of friends and colleagues of mine grabbed breakfast to bid a colleague farewell and wish them the best back at school.

The crowd storms the field as Carleton wins at Sportsball!

Just like last year, a larger group of us met up at the TD Place for the annual university football Panda Game between Carleton and Ottawa U (Carleton won, of course :P). A lot of time was spent laughing at our once younger selves acting foolish in the crowd. After the win we spent a few hours pigging out at an all-you-can-eat sushi restaurant.

To finish the day off we visited the pool and hot tub at a friend’s apartment, and played the card game What do you Meme? Sunday, my actual birthday, has been spent relaxing and writing this post. I’m sure my colleagues will have more festivities planned for next week.

Additionally, this year I got more into road biking – putting just over a thousand kilometres on a new road bike, and visiting a few new locations in the Ottawa-Gatineau area – Cafe British in Aylmer, Quebec is worth the trek on a nice sunny day.

Since I’ve been interested in craft beers, this year I received a homebrewing kit as a gift. The process is laborious for a day or two but is quite rewarding at the end. I’ve completed two large batches so far – a wheat beer and an IPA. They’re pretty drinkable, but the quality isn’t high enough for the ingredients I used. I’ll definitely have to make a higher quality batch of beer, or even delve into making wine.

At the end of October last year I travelled with my Aunt and a few of her friends to Manhattan in New York. What a city! The sheer size and bustle of everything going on at all hours of the day is intoxicating. The restaurant scene was tasty, even when eating within a reasonable budget. The attractions and Central Park were fascinating. I would recommend that any first-timers get a City-Pass for access to the major attractions. It was definitely a memorable week.

Me shaking Leo’s hand at the TWiT studios.

For a graduation trip in June I flew over to California with my mom. We spent a solid 10 days RVing and camping from north of San Francisco all the way down to San Diego. The climate, scenery, and attractions made it an amazing experience. We stopped by the TWiT studios to see Leo Laporte, a mentor of mine for 11 years. Leo’s podcasts have taught me a staggering amount about technology news and computer security. It was great to have been able to see him in person and watch a live recording of Security Now with Steve Gibson.

A San Diego sunset.

During the trip we had to stop by the Computer History Museum. Holy cow! That museum is big. I could have spent a few days there reading everything. One particular part of the museum I found memorable was showing my mom a diagram of the history of programming languages, pointing to the ones I knew, and the ones I would soon learn at Shopify. Seeing some of the first Google servers was pretty nostalgic in a way, given how scrappily they were built to get a lot of computing power for cheap.

Some of the rough goals I have this year are to advance my career, improve on my social skills, become friends with more people, and build up some muscle. I have surprised myself already with the speed and progress that has been made across these goals so far.

Since starting work at Shopify the thought of moving closer to work has really been bugging me. A 20 minute bike ride is great in the summertime, but Winter is Coming – and Ottawa doesn’t fall short with its winters. The social scene with colleagues and all the activities to be had downtown are further reasons for moving closer. I’m really liking the idea of moving into a 1-bedroom apartment – gaining more privacy and being able to focus more on myself compared to living with my entertaining and sometimes distracting roommates (in a good way).

I’m not sure how I can top the major events from this past year. Maybe moving into my very own place, attending a conference or two, giving my first conference talk, or even some travelling with friends? I’m down for all of those.

Installing Go the Right Way

100% Derpy

It’s a pain to get the latest version of the Go programming language installed on a Debian or Ubuntu system.

Oftentimes, if your operating system is not the latest, there is only a slim chance that an up-to-date version of Go will be available. You may get lucky and find newer versions of Go in some random person’s PPA, where they’ve backported newer versions to older operating systems. This works, but newly released versions are reliant on the package maintainer to update and push them out.

Other options of installing the latest version of Go may involve building the package from source. This method is often tedious and can be error prone with the number of steps involved. Not exactly for the faint of heart.

Command line tools have been built for certain programming languages to streamline the installation of new versions. For Go, GVM is the Go Version Manager. Inspired by RVM, the Ruby Version Manager, GVM makes it quite simple to install multiple versions of Go, and to switch between them with one simple command.

The only downside that GVM has is that it’s not installed via a system package (eg. a deb file). Don’t let that worry you too much though! Installation is as simple as running the following curl-bash, and then using the GVM command to start installing different versions of Go. Here’s the installation guide/readme.

bash < <(curl -s -S -L https://raw.githubusercontent.com/moovweb/gvm/master/binscripts/gvm-installer)

One confusing point: using GVM to install the latest version of Go resulted in a failed installation. This made no sense. Eventually RTFM’ing led to the understanding that you first have to install an earlier version of Go to “bootstrap” the installation of any version of Go later than 1.5. Explained here in more detail.

After following their instructions to install Go 1.4 it was now possible to install the latest version of Go and get on with coding!