Private Docker Repositories with Artifactory

A while ago I was looking into what it takes to set up a private Docker Registry. The simplest approach involves running the vanilla Docker Registry image with a small amount of configuration (vanilla is used to distinguish the official Docker Registry from the Artifactory Docker Registry offering). The vanilla Docker Registry is great for proofs of concept or for people who want to design a custom solution, but in organizations where multiple environments (QA, staging, prod) are wired together in a Continuous Delivery pipeline, JFrog Artifactory is well suited for the task.

Artifactory, the fantastic artifact repository for storing your Jars, Gems, and other valuables, has an extension for hosting Docker Registries, storing and managing Docker images as first-class citizens of Artifactory.

Features

Here are a few compelling features that make Artifactory worthwhile over the vanilla Docker Registry.

Role-based access control

The Docker Registry image doesn’t come with any fine-grained access control. The best that can be done is to allow or deny access to all operations on the registry by way of a .htpasswd file. In the best scenario, each user of the registry has their own username and password, but every user still has the same all-or-nothing access.
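For reference, the vanilla approach looks roughly like the following configuration, using the registry image's documented htpasswd settings; the user, password, and paths are placeholders, and older registry:2 images ship an htpasswd binary (otherwise generate the file with apache2-utils).

```shell
# Generate a bcrypt htpasswd entry using the registry image itself
docker run --rm --entrypoint htpasswd registry:2 -Bbn alice s3cret > auth/htpasswd

# Run the registry with basic auth enabled; every listed user gets
# the same all-or-nothing access to the whole registry
docker run -d -p 5000:5000 \
  -v "$PWD/auth:/auth" \
  -e REGISTRY_AUTH=htpasswd \
  -e "REGISTRY_AUTH_HTPASSWD_REALM=Registry Realm" \
  -e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
  registry:2
```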

Artifactory uses its own fine-grained access control mechanisms to secure the registry – enabling users and groups to be assigned permissions to read, write, deploy, and modify properties. Access can be configured through the Artifactory web UI, REST API, or AD/LDAP.

Transport Layer Security

If enabled, the same TLS encryption that protects the rest of Artifactory also secures its Docker Registries. Unlike a vanilla Docker Registry, there is no need to set up a reverse proxy to tunnel insecure HTTP connections over HTTPS. The web UI offers a screen to copy and paste authentication information for connecting to the secured Artifactory Registry.

Data Retention

Artifactory has the option to automatically purge old Docker images once the number of unique tags grows past a certain size. This keeps the number of available images, and therefore the storage space, within reason. Failing to purge old images can lead to running out of disk space or, for cloud users, expensive object storage bills.
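To make the policy concrete, here is a small sketch of the selection logic, assuming a simple newest-first ordering by modification time; this models the policy's effect, not Artifactory's internal implementation.

```python
def tags_to_purge(tags, max_unique_tags):
    """Return the tags that would be purged once the unique-tag cap is hit.

    `tags` is a list of (tag, last_modified) pairs. The newest
    `max_unique_tags` tags are kept; anything older is a purge candidate.
    """
    ordered = sorted(tags, key=lambda t: t[1], reverse=True)  # newest first
    return [tag for tag, _ in ordered[max_unique_tags:]]
```

With a cap of 2, a repository holding four tags would keep the two newest and purge the rest.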

Image Promotion

Continuous Delivery defines the concept of pipelines. These pipelines represent the flow of commits from the moment a developer checks code into the SCM, all the way through CI, and eventually into production. Organizations that use multiple environments to validate their software changes “promote” a version of the software from one environment to the next. A version is only promoted if it passes the validation requirements for its current environment.

For example, the promotion of version 123 would first go through the QA environment, then the Staging environment, then the Production environment.

Artifactory includes Docker image promotion as a first-class feature, setting it apart from the vanilla Docker Registry. What would otherwise be a series of manual steps, or a script to run, is now a single API endpoint that promotes a Docker image from one registry to another.
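As a sketch of what that single call looks like, the following builds the request for Artifactory's documented Docker promotion endpoint; the base URL, repository names, and image name are placeholder assumptions, so check the endpoint shape against your Artifactory version.

```python
import json

def build_promotion_request(base_url, source_repo, target_repo, image, tag):
    """Build the URL and JSON body for Artifactory's Docker promotion API.

    Follows the documented POST /api/docker/<repoKey>/v2/promote call.
    """
    url = f"{base_url}/api/docker/{source_repo}/v2/promote"
    body = {
        "targetRepo": target_repo,   # registry to promote into
        "dockerRepository": image,   # image name, e.g. "myapp"
        "tag": tag,                  # tag to promote, e.g. "123"
        "copy": True,                # copy instead of move
    }
    return url, json.dumps(body)
```

A CD pipeline would POST this body with admin credentials (via curl or an HTTP library) to move, say, version 123 from the QA registry into staging.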

Browsing for Images

The Artifactory UI already has the ability to browse the artifacts contained in Maven, NPM, and other types of repositories. It was only natural to offer the same service for Docker Registries. All images in a repository can be listed and searched, and each image can be inspected further to show the tags and layers that compose it.

The current vanilla Docker Registry doesn’t have a GUI. Only through third-party projects can a GUI be added to offer the same functionality as Artifactory.

Remote Registries

Artifactory has the ability to provide a caching layer for registries. Performance is gained when images and metadata are fetched from the cached Artifactory instance, preventing the time and latency incurred from going to the original registry. Resiliency is also gained since the Artifactory instance can continue serving cached images and metadata to satisfy client requests even when the remote registry has become unavailable. (S3 outage anyone?)

Virtual Registries

Besides hosting local registries and caching remote ones, Artifactory offers virtual registries, a combination of the two. Virtual registries unite images from a number of local and remote registries, enabling Docker clients to conveniently use just a single registry. Administrators can then change the backing registries when needed, requiring no change on the client’s side.

This is most useful for humans who need ad hoc access to multiple registries that correspond to multiple environments. For example, the QA, Staging, Production, and Docker Hub registries can be combined together, making it seem like one registry to the user instead of four different instances. Machines running in the Production environment, for example, could only have access to the Production Docker Registry, thereby preventing any accidental usage of unverified images.
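The convenience the client sees can be sketched as an ordered lookup across the backing registries; this is a conceptual model only, not Artifactory's actual resolution algorithm, and the registry contents are invented.

```python
def resolve_image(backing_registries, image, tag):
    """Resolve an image pull through an ordered list of backing registries.

    Each registry is modeled as a dict mapping "image:tag" to a manifest
    digest. The first registry holding the requested image wins.
    """
    key = f"{image}:{tag}"
    for registry in backing_registries:
        if key in registry:
            return registry[key]
    return None  # not found in any backing registry
```

The client only ever talks to the one virtual registry; swapping or reordering the backing list is purely an administrator-side change.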

Conclusion

Artifactory is a feature-rich artifact tool for Maven, NPM, and many other repository types. The addition of Docker Registries gives Artifactory a simple solution that caters to organizations implementing Continuous Delivery practices.

If you’re outgrowing an existing vanilla Docker Registry, or are entirely new to the Docker game, give Artifactory a try for your organization; it won’t disappoint.

Practicing User Safety at GitHub

GitHub explains a few of the guidelines they follow to prevent harassment and abuse when developing new features. Interesting points in the article include a list of privacy-oriented questions to ask yourself when developing a new feature, providing useful audit logs for retrospectives, and minimizing abuse from newly created accounts by restricting access to the service’s capabilities. Taken together, these considerations make abuse harder to carry out, making the service a better environment for its users.

See the original article.

A few Gotchas with Shopify API Development

I had a fun weekend with my roommate hacking on the Shopify API and learning the Ruby on Rails framework. Shopify makes it super easy to begin building Shopify Apps for the Shopify App Store – essentially the Apple App Store equivalent for Shopify store owners to add features to their customer-facing and backend admin interfaces. Shopify provides two handy Ruby gems to speed up development: shopify_app and shopify_api. An overview of the two gems is given below, followed by their weaknesses.

Shopify provides a handy gem called shopify_app which makes it simple to start developing an app for the Shopify App Store. The gem provides Rails generators to create controllers, add webhooks, configure the basic models, and add the required OAuth authentication – just enough to get started.

The shopify_api gem is a thin wrapper around the Shopify API. shopify_app integrates it into the controllers automatically, making requests for a store’s data very simple.

Frustrations With the API

The process of getting a developer account and a development store takes no time at all, and the API documentation is clear for the most part. However, developing against the Plus APIs for the first time can be frustrating. For example, querying the Discount API, Gift Card API, Multipass API, or User API results in unhelpful 404 errors. The development store’s admin interface is misleading, since it includes a discounts section where discounts can be added and removed.

By default, anyone who signs up as a developer only has access to the standard API endpoints, with no access to the Plus endpoints. The Plus endpoints are only available to stores that pay for Shopify Plus, and after digging through many Shopify discussion boards I found a Shopify employee explaining that developers need to work with a store that pays for Shopify Plus to get access to them. The 404 error returned by the API didn’t explain this and only added confusion.

There is little mention of these tiered developer accounts in the documentation. At the very least, the API should return a useful error message in the response body explaining what is needed to gain access.

Webhooks Could be Easier to Work With

The shopify_app gem provides a simple way to define any webhooks that should be registered with the Shopify API for the app to function. The defined webhooks are registered only once, after the app is added to a store. During development you may add and remove many webhooks for your app. Since defined webhooks are only registered when the app is added to a store, the most straightforward way to refresh them is to remove the app from the store and then add it again.

This can become pretty tedious, which is why I did some digging around in the shopify_app code and created the following code sample to synchronize the required webhooks with the Shopify API. Simply hit the controller or call the containing code somewhere in the codebase.
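A minimal sketch of that synchronization, assuming shopify_app's `config.webhooks` list of `{ topic:, address: }` hashes; the `webhook_changes` helper name is mine, and the commented lines show where a real controller, inside an authenticated shop session, would apply the changes via the shopify_api resources.

```ruby
# Diff the webhooks the app needs against the ones already registered.
# The diffing below is plain Ruby; ShopifyAPI calls are shown as comments.
def webhook_changes(registered, required)
  # registered/required: arrays of { topic:, address: } hashes
  to_create = required - registered
  to_remove = registered - required
  [to_create, to_remove]
end

# Inside an authenticated shop session, something like:
#   registered = ShopifyAPI::Webhook.all.map { |w| { topic: w.topic, address: w.address } }
#   to_create, to_remove = webhook_changes(registered, ShopifyApp.configuration.webhooks)
#   to_create.each { |w| ShopifyAPI::Webhook.create(topic: w[:topic], address: w[:address], format: "json") }
```

Calling this from a development-only controller action refreshes the webhooks without removing and re-adding the app.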

If there’s a better solution to this problem please let me know.

Lastly, to preserve your sanity, the httplog gem is useful for tracing the HTTP calls that shopify_app, shopify_api, and any other gem make.

Wrapping Up

The developer experience with the Shopify API and App Store is quite pleasant. The platform has been around long enough to build up a flourishing community of people asking questions and sharing code. I believe the issues outlined above can be solved easily, making Shopify an even better platform.

The Software Engineering Daily Podcast is Highly Addictive

Over the past several months, the Software Engineering Daily podcast has entered my regular listening rotation. I can’t remember where I discovered it, but I was amazed at the frequency of new episodes and the breadth of topics. Since episodes come out every weekday, there’s always more than enough content to listen to. I’ve updated My Top Tech, Software and Comedy Podcast List to include Software Engineering Daily. Here are a few episodes that have stood out:

Scheduling with Adrian Cockcroft was quite timely, as my final paper for my undergraduate degree focused on the breadth of topics in scheduling. Adrian discussed many of the principles of scheduling and related them to how they were applied at Netflix and his earlier companies. Scheduling is essential knowledge for software developers, as it occurs at every layer of the software and hardware stack.

Developer Roles with Dave Curry and Fred George was very entertaining and informative, presenting the idea of “Developer Anarchy”, a different structure for running (or not running) development teams. Instead of hiring Project Managers, Quality Assurance, or DBAs to fill specific niches on a development team, you mainly hire programmers and leave them to perform all of those tasks as they deem necessary.

Infrastructure with Datanauts’ Chris Wahl and Ethan Banks entertained as much as it informed. This episode had a more casual feel, with the hosts telling stories and bringing years of experience to bear on the current and future direction of infrastructure at all layers of the stack. Comparing the current success of Kubernetes with the not-so-promising OpenStack was quite informative: the many supporting organizations behind OpenStack drove the project toward differing priorities and visions, whereas Google, as the single organization driving Kubernetes, maintains one unified vision.


EDIT 2017-02-26 – Add Datanauts episode

Better Cilk Development With Docker

I’m taking a course that focuses on parallel and distributed computing. We use a compiler extension for GCC called Cilk to develop parallel programs in C/C++. Cilk offers developers a simple method for writing parallel code and, as a plus, has been included in GCC since version 4.9.

The unfortunate thing about this course is that the professor provides a hefty 4GB Ubuntu virtual machine just for running the GNU compiler with Cilk support. No sane person should have to download an entire virtual machine image just to run a compiler.

Docker comes to the rescue: it couldn’t be more space-efficient or convenient to use Cilk from a Docker container. I’ve created a simple Dockerfile containing the latest GNU compiler for Ubuntu 16.04, along with Gists showing how to build the image and use it to compile and run Cilk programs.
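A minimal sketch of such a Dockerfile, assuming Ubuntu 16.04's default GCC (5.x), which ships with Cilk Plus support:

```dockerfile
# Ubuntu 16.04's default GCC (5.x) includes Cilk Plus support
FROM ubuntu:16.04
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential && \
    rm -rf /var/lib/apt/lists/*
WORKDIR /src
```

Build the image once with `docker build -t cilk .`, then compile and run a program straight from the host directory, e.g. `docker run --rm -v "$PWD":/src cilk sh -c 'gcc -fcilkplus fib.c -lcilkrts -o fib && ./fib'` (the image and file names here are placeholders).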