Brodie: Building Shopify’s new Help Centre

One of the primary projects which has defined the existence of my team at Shopify was a complete rebuild of the Help Centre’s platform. The prior Help Centre utilized Jekyll (the static site generator) with a number of features added over the past five years to provide documentation to our merchants, partners, and prospective customers.

The rebuild took about six months, and successfully launched with multiple languages in July 2018.

Deacon Brodie

This post will first discuss the limitations we encountered with using Jekyll for a number of years on a Help Centre which has grown to 15 technical writers and 1600 pages. Next, a number of upcoming features are outlined which the new platform should easily accommodate for. Following that, a high level overview of Brodie, the library we built to replace Jekyll. Next, Brodie’s internals are explained with details on how it integrates with Ruby on Rails. This post then ends with links to related code discussed throughout this post.

Jekyll’s Limitations

As of February 2018, Shopify’s Help Centre consisted of 1600 pages, 3000 images, and 300 partials/includes. This amount of content can really slow down Jekyll’s build time. A clean build takes 80 seconds, while changing a single character on a page requires 15 seconds for a partial rebuild. This really slows down the workflow for our technical writers, as well as developers who maintain the heavy Javascript-based Support page.

Static sites, where a server serves up html files, can only get you so far. Features considered dynamic must be implemented using client-side Javascript. This has proven to be difficult and even restrictive to the features that could be added to the site, especially when it comes to features which require running on a server and not in the user’s browser. Things such as authenticating Shopify merchants before they contact Support is more difficult considering that all of the functionality lives in Javascript, or another app is relied upon.

The original Deacon Brodie’s Tavern in Edinburgh

Even other companies have blogged about the hoops they’ve jumped through to scale Jekyll too.

Upcoming Features

Allowing users to login to the Help Centre with their Shopify credentials can provide a more personalized experience. Based on the shops the Merchant has access to, the pages in the Help Centre can be tailored to their Country, the features that they use, and the growth stage of their business. The API documentation can be extended to provide the logged in user the ability to query their shop’s API.

Enabling the ability for merchants to login to the Help Centre can simplify the process of talking with Support. Once logged in, users would be able to bypass verifying who they are to a Support representative, since they’ve already proven who they are by logging into the Help Centre. This saves time on both ends of the conversation and keeps the user focused on their problem.

A short history of Deacon Brodie’s life

Features could also be added to enhance the workflow of our technical writers. As a logged in technical writer a few features could be enabled such as showing all pages regardless of being hidden or being an early-release feature, a link to view the page on GitHub, or even a link to view the current page in Google Analytics. Improvements such as these make it much quicker to access to relevant data.

Being able to correlate the Help Centre pages visited by a user before they contact Support can help infer how successful pages are at helping answer the user’s question. Pages which do poorly can be updated, and pages which are successful can be studied for trends. Resources can be better focused on areas of the Help Centre pages which need it. Additionally, combining the specific pages visited to Support interactions opens the opportunity to perform A/B tests. A Help Centre page can have two or more versions, and the version which results in the least amount of Support interactions could be considered the winning version. Currently there is no way to definitively correlate the two.

Many Support organizations gauge the effectiveness of their Help Centre content (self-help) by comparing potential Support interactions solved by Help Centre pages to the number of actual Support interactions. A so called deflection ratio, where the higher the self-help-to-support-interaction ratio the better. This ratio can be more accurately calculated by better tracking of the user’s journey through these various Shopify properties before they contact Support.

Lastly, Internationalization (aka I18n) and Localization means translating pages into different languages and cultural norms. I18n would enable the Help Centre to be used by people other than those who know English, or prefer reading in a language they understand better. I18n support can be hacked into Jekyll, but as was discussed earlier with 1600 pages already slowing down the build times, Jekyll will absolutely cripple when there exists multiple localized versions of each page. Therefore, having an app that can scale to a much larger number of pages is required for I18n and localization to even be considered.

The Solution

To enable our Help Centre to scale way past 1600 pages, and support complex server-side features, a scrappy team was formed to rebuild the Help Centre platform in Ruby on Rails.

Rewriting any of the content pages or partials wouldn’t be feasible for the time or resources we had – therefore maintaining compatibility with the existing content files was paramount.

Allowing the number of pages in the Help Centre to keep growing, but to dramatically reduce the 80 second clean build time, and the 15 second page rebuild time requires an architectural shift. Moving away from Jekyll’s model of pre-rendering all pages at build time to the model of rendering only what’s needed at request time. Instead of performing all computational work up-front, performing smaller batches of work at request time spreads out the cost.

The Deacon Brodies Pub in Ottawa, steps away from Shopify HQ

Ruby on Rails was chosen as the new technology stack for the Help Centre for a few reasons. The limits were being reached with Jekyll, therefore we technically couldn’t continue using it. Shopify’s internal tooling and production systems  heavily integrate with Rails applications, therefore building on Rails to utilize these would save a lot of developer time. Shopify also employs the largest base of Rails developers, so tapping into that workforce and knowledge base is very beneficial for future development.

Ruby on Rails brings a number of complementary features such as a solid MVC framework, simple caching abstractions for application code and views, as well as a strong and healthy community of libraries and users. These benefits make Rails a great selling point for building new features faster and easier than the prior Jekyll system.

One of the things that has been working quite well over the past few years has been the workflow for our technical writers. It consists of using a text editor (such as Atom) to edit Markdown and Liquid code, then using Git and GitHub to open a Pull Request for peer review of the changes. Automated tests check for broken links, missing images, incorrectly formed HTML and Javascript.  Once the changes are approved and all tests have passed, the Pull Request is merged and shipped to production.

Since there isn’t a good reason to change the technical writer’s current workflow we’re more than happy to design the new documentation site with the existing workflow in mind.

One of the main features of the platform my team built was the flexible content rendering engine. It’s equivalent to Jekyll on steroids. Here I’ll discuss the heart of the system, Brodie, the ERB-Liquid-Markdown rendering engine.

Brodie

Brodie is the library we’ve purpose-built for Shopify’s new Help Centre. It renders any file that contains ERB, Liquid, and Markdown, or a combination of the three into HTML.

Brodie is named after Deacon Brodies, an Ottawa pub which is itself named after Deacon William Brodie, an 18th-century city councillor in Edinburgh who moonlighted as a burglar and gambler.

Deacon Brodie’s double life inspired the Robert Louis Stevenson story Strange Case of Dr Jekyll and Mr Hyde.

Brodie, and the custom extensions built on-top of it, enable a smooth transition from Jekyll to Rails. Shopify’s 1600 pages, 3000 images, and 300 partials/includes can be rendered by Brodie without modification. Additionally, the workflow of the technical writers is not disturbed. They continue to use their favourite text editor to modify content files, Git and GitHub to perform reviews, and to utilize the existing Continuous Delivery pipeline for fast validation and shipping.

Views in Rails are rendered using templates. A template is a file that consists of code that defines what the user will see. In a Rails app the template file will usually consist of ERB mixed into HTML. A template file like this would belong in the app/views/ directory and would have a descriptive name such as homepage.html.erb.

The magic in Rails treats templates differently based on its filename. Let’s break it down. homepage represents the template’s filename. Rails knows to look for this template based on this name. The html part represents what the format the template should output to. Lastly, erb is the portion which specifies what language the template file is written in. This naming convention enables Rails to dynamically render views just by looking at the filename.

Rails provides template handlers to render ERB to HTML, as well as JSON and a few others. Rails offers the ability to extend its rendering system by plugging in new template handlers. This is where Brodie integrates with Rails applications. Brodie provides its own template handler to take content files and convert the ERB, Liquid, and Markdown to HTML.

Rails exposes this via the ActionView::Template.register_template_handler(:md, Content) where :md is the file extension to act on, and Content is the Class to use as the template rendering engine (template handler). Next we’ll go over how a template handler works.

Rendering Templates

The only interface a template handler is required to respond to is call with one parameter being the template to render. This method should return a string of code that will render the view. This string will be eval‘ed by the template later. Returning a string of code is a Rails optimization which inlines much of the code required to render the template. This reduces the number of methods needing to be called, speeding up the already time consuming rendering process.

When Rails needs to render a view it takes the specified template and calls the proper template handler on itself. The handler returns a string that contains the code that renders the template. The Template class combines the code with other code, then evals the stringified code.

For example, the ERB-Liquid-Markdown renderer has a call method like the following:

def call(template)
  compiled_source = erb_handler.call(template)
  "Brodie::Handlers::Content.render(begin;#{compiled_source};end, local_assigns)"
 end

Brodie first renders the ERB present in the template’s content with the existing ERB handler that comes with Rails. Brodie then returns a string of code which calls the “render” method on itself. That render method is shown next:

def render(source, local_assigns = {})
  markdown.call(
    liquid.call(source, local_assigns)
  ).html_safe
end

Here is where the actual rendering of the Liquid and Markdown occur. When this code is eval‘ed the parameter local_assigns is included for passing in variables when rendering a view. This is how variables are magically passed from Rails controllers into views.

Left: The old Jekyll site. Right: The new Rails site. The Help Centre rebuild looks the same but has a completely new backend

It’s as straightforward as that for rendering ERB, Liquid, and Markdown together. The early days of Brodie were spent understanding the ins-and-outs of ActiveView enough to validate that this approach was a sane practice and not breaking in edge cases.

Further Reading

The current documentation is really limited when it comes to Templates and Template Handlers. I would suggest building a small template handler, setting breakpoints and walk through the source. Here’s a great example of a template handler for Markdown.

Additionally, looking over the source code and comments is the best way to get an understanding of the ActiveView internals. The main entry point into ActiveView is the render method from TemplateRenderer. Template would be best to check out next as it concerns itself with actually rendering templates. Lastly, Handlers would be good to check out to see how Rails can register and fetch Template Handlers.

Parallel GraphQL Resolvers with Futures

My team and I are building a GraphQL service that wraps multiple RESTful JSON services. The GraphQL server connects to backend services such as Zendesk, Salesforce, and even Shopify itself.

Our use case involves returning results from these backend services all from the same GraphQL query. When the GraphQL server goes out to query all of these backend services, each backend service can take multiple seconds to respond. This is a terrible experience if queries take many seconds to complete.

Since we’re running the GraphQL server in Ruby, we don’t get provided the nice asynchronous IO that would come with the NodeJS version of GraphQL. Because of this, the GraphQL resolvers run serially instead of in parallel – thus a GraphQL query to five backend services which take one second each to fetch data from will result in the query taking five seconds to run.

For our use case, having a GraphQL query that takes five seconds is a bad experience. What we would prefer is 2 seconds or less. This means performing some optimizations when GraphQL goes to do the HTTP requests to the backend services. Our idea is to parallelize those HTTP requests.

First Approaches

To parallelize those HTTP requests we took a look at non-blocking HTTP libraries, different GraphQL resolvers, and Ruby concurrency primitives.

Typhoeus

Knowing that running the HTTP requests in parallel is the direction to explore, we first took a look at the Ruby library Typhoeus. Typhoeus offers a simple abstraction for performing parallel HTTP requests by wrapping the C library libcurl. Below is one of the many possible ways to use Typhoeus.

After playing around with Typheous, we quickly found out that it wasn’t going to work without extending the GraphQL Ruby library. It became clear that it was nontrivial to wrap a GraphQL resolver’s life cycle with a Hydra from Typhoeus. A Hydra basically being a Future that runs multiple HTTP requests in parallel and returns when all requests are complete.

Lazy Execution

We also took a look at the GraphQL Ruby’s lazy execution features. We had a hope that the lazy execution would automatically optimize by running resolvers in parallel. It didn’t. Oh well.

We also tried a perverted version of lazy execution. I can’t remember why or how we came up with this method, but it was obviously overcomplicated for no good reason and didn’t work 😆

Threads and Futures

We looked back and understood the shortcomings of the earlier methods – namely, we had to find a concurrency method that would allow us to do the HTTP requests in the background without blocking the main thread until it needed the data. Based on this understanding we took a look at some Ruby concurrency primitives – both Futures (from the Concurrent Ruby library), and Threads.

I highly recommend using higher-order concurrency primitives such as Futures, and the like because of their well-defined and simple APIs, but for hastily hacking something together to see if it would work I experimented with Threads.

My teammate ended up figuring out a working example of Futures faster than I could hack my threads example together. (I’m glad they did, since we’ll see why next.) Here is a simple use of Futures in GraphQL:

It’s not clear at first, but according to the GraphQL Ruby docs, any GraphQL resolver can return the data or can return something that can then return the data. In the code example above, we use the latter by returning a Concurrent::Future in each resolver, and having the lazy_resolve(Concurrent::Future, :value!) in the GraphQL schema. This means that when a resolver returns a Concurrent::Future, the lazy_resolve part tells GraphQL Ruby to call :value! on the future when it really needs the data.

What does all of this mean? When GraphQL goes to fulfill a query, all the resolvers involved with the query quickly spawn Futures that start executing in the background. GraphQL then moves to the phase where it builds the result. Since it now needs the data from the Futures, it calls the potentially blocking operation value! on each Future.

The beautiful thing here is that we don’t have to worry about whether the Futures have finished fetching their data yet. This is because of the powerful contract we get with using Futures – the call to value! (or even just value) will block until the data is available.

Conclusion

We ended up settling on the last design – utilizing Futures to allow the main thread to put as much asynchronous work into background.

As seen through our thought process, all that we needed was to find a way to start execution of a long-running HTTP request, and give back control to the main thread as fast as possible. It was quite clear throughout the early ideas of utilizing concurrent HTTP request libraries (Typhoeus) that we were on the right track, but weren’t understanding the problem perfectly.

Part of that was not understanding the GraphQL Ruby library. Part of it was also being fuzzy on our concurrent primitives and libraries. Once we had taken a look at GraphQL Ruby’s lazy loading features, it became clear to us that we needed to kick-off the HTTP request and immediately give back control to the GraphQL Ruby library. Once we understood this, the solution became clear and we became confident after some prototypes that used Futures.

I enjoyed the problem solving process we went through, as well as this writing that resulted from it. The problem solving process ended up teaching the both of us some valuable engineering lessons about collaborative, up-front prototyping and design since we couldn’t have achieved this outcome on our own. Additionally, writing about this success can help others with our direct problem, not to mention learning about the different techniques that we met along the way.

A few Gotchas with Shopify API Development

I had a fun weekend with my roommate hacking on the Shopify API and learning the Ruby on Rails framework. Shopify makes it super easy to begin building Shopify Apps for the Shopify App Store – essentially the Apple App Store equivalent for Shopify store owners to add features to their customer facing and backend admin interfaces. Shopify provides two handy Ruby gems to speed up development: shopify_app and shopify_api. An overview of the two gems are given and then their weaknesses are explained.

Shopify provides a handy gem called shopify_app which makes it simple to start developing an app for the Shopify App Store. The gem provides Rails generators to create controllers, add webhooks, configure the basic models and add the required OAuth authentication –  just enough to get started.

The shopify_api gem is a thin wrapper of the Shopify API. shopify_app integrates it into the controllers automatically, making requests for a store’s data very simple.

Frustrations With the API

The process of getting a developer account and developer store created takes no time at all. The API documentation is clear for the most part. Though attempting to develop using the Plus APIs can be frustrating when using the APIs for the first time. For example, querying the Discount API, Gift Card API, Multipass API, or User API results in unhelpful 404 errors.  The development store’s admin interface is misleading as a discounts section can be accessed where discounts may be added and removed.

By default, anyone who signs up to become a developer only has access to the standard API endpoints, leaving no access to the Plus endpoints. These Plus endpoints are only available to stores which pay for Shopify Plus, and after digging into many Shopify discussion boards it was explained by a Shopify employee that developers need to work with a store who pays for Shopify Plus to get access to those Plus endpoints. The 404 error when accessing the API didn’t explain this and only added confusion to the situation.

One area that could be improved is that there is little mention of tiered developer accounts. The API should at least give a useful error message in the response’s body explaining what is needed to gain access to it.

Webhooks Could be Easier to Work With

The shopify_app gem provides a simple way to define any webhooks that should be registered with the Shopify API for the app to function. The defined webhooks are registered only once after the app is added to a store. During development you may add and remove many webhooks for your app. Since defined webhooks are only registered when the app is added to a store the most straightforward way to refresh the webhooks is to remove the app from the store and then add it again.

This can become pretty tedious which is why I did some digging around in the shopify_app code and created the following code sample to synchronize the required webhooks with the Shopify API. Simply hit this controller or call the containing code somewhere in the codebase.

If there’s a better solution to this problem please let me know.

Lastly, to keep track of your sanity the httplog gem is useful to track the http calls that shopify_app, shopify_api and any other gem makes.

Wrapping Up

The developer experience on the Shopify API and app store is quite pleasing. It has been around long enough to build up a flourishing community of people asking questions and sharing code. I believe the issues outlined above can be easily solved and will make Shopify a more pleasing platform.