The State of GitHub's Code Review

14 Oct 2013

In a word: Stagnant.

Code review is an essential part of a team's shipping process. It helps disseminate knowledge across teams, catches bugs (or more likely poor architectural decisions) before they bite you, and provide opportunities to educate junior engineers. GitHub, the current de facto standard for this, is letting us down.

Perhaps they've hit some sort of local maximum for this activity, or maybe they're focused on providing GitHub functionality for verticals like GIS or 3D modelers, but not much has changed in the world of reviewing code in many years. Herein is my 5 item wish list for code review.

1. Side-by-side diffs (since fixed)

Side by side diffing is a long requested feature of GitHub. The benefit of this is it allows you to see both files in context of each other when comparing changes that are in close proximity. This also seems like the best possible case of a responsive design issue. These diffs can be wide. For smaller screens, showing the unified diff makes sense. When I'm on my 30" monitor, it'd be great to have more functionality.

2. Keyboard shortcuts for review

Scrolling through a few thousand lines of un-minified javascript or South migrations is frustrating. It's easy to scroll a little too far and miss a small change sandwiched in between larger diffs. Providing just two, simple keyboard shortcuts would make this dramatically better: file & comment navigation (e.g. next and previous). These would allow me to skip those files I've either already seen or want to leave to another reviewer. As an author, the navigation around comments will let me see what remains to be addressed.

3. JS Library detection

How many copies of jQuery are there in GitHub? A few thousand at least. I'm sure this is repeated for other major javascript libraries like d3, backbone, etc. Instead of showing me the full 14,725 lines of jquery-ui, just say "This is the canonical version of jquery-ui v1.9.0". In the event that this version of jQuery was modified, just diff relative to that canonical version.

(and yes, I know about bower.)

4. Paying attention to in-progress reviews (since fixed)

If I review an in-progress pull request and subsequent changes are necessary, I have three options.

First, I can review the entire pull request again from scratch. Reviews take a great deal of time and having to start from scratch, reviewing everything again is a pain. In absence of better options, I often do this, limiting my number of reviews because it's so time consuming.

The second option is to review the commits individually as they've happened since your previous review. From here, you can make commit comments (which have an odd way of showing up in the UI, but they work) on the individual commits. The downside to this is you're subjected to the stream of consciousness of the developer committing code. If they're the sort of person who makes lots of smaller commits, you can get lost in the small changes in a localized area of the code without getting a sense of how it fits in the larger change.

The last option is url hacking. Looking at the commit SHA where you made your review and the current SHA, you can diff them with a url like the following: https://github.com/django/django/compare/62b74ea...31e29a2 where the two SHAs in the URL are the range you want to compare (exclusive of the origin). This is pretty helpful in condensing the issues with the individual commit review option, but unfortunately it takes you out of reviewing mode. This means that you may see an issue in the code, but you can't comment on it here. You have to either go to the commit in which it was introduced or find it in the full diff of the main PR.

5. Provide me context in review

There is a great deal of knowledge and context that GitHub could bring to bear on the topic of code review.

First, has this method been tested? What does the coverage look like? Link me to the tests so I can validate that it's testing the things I think it should. Bonus points for keeping the code in question viewable.

Second, does this adhere to our style guide? Can you comment on the lines where it doesn't with the message the tool fails on? You can hook this up through your CI build and interact with the commit-status API, but that doesn't quite serve the same purpose. Reviewers are going to comment on the fact that your indentation is off or you didn't run your code through gofmt. If comments are already in place, they can skip those and check for larger code problems.

Where in the project is this code used? I'm going to look at code that's used in our billing backend (correctness around number manipulation) with a slightly different eye that I might if it's used in our login and authorization (sanitizing user inputs) or even DB queries (efficiency, batching of requests).

I'm not suggesting GitHub build all of this. I understand that they want to foster an ecosystem, but what they can provide is cleaner APIs for accomplishing this. In an effort to work through a style guide (PEP8), I wrote a small bot which attempted to comment on PRs with PEP8 violations. Unfortunately, the API for commenting on a single line of a pull request is made difficult by requiring you to offset of line number in the diff they're generating to create a comment. A saner approach would be the one taken by Gerrit, which allows you to specify the file name and the line number of the file you want to comment on.

So where does that leave us?

GitHub is a great tool. It's done a great thing for the open source community. It's good enough that it covers all the major bases you'd want, but is a bit disappointing when you realize all of the cool stuff they could be doing. As I often tell my friends, it's a tool that's good enough that you don't want to rewrite it but bad enough that you do.

In order to address some of the issues above, my team is investigating switching to Gerrit. It doesn't provide interesting context in reviews, nor does it offer the pie-in-the-sky JS library detection, but it scores pretty high marks on everything else. It looks like this.

If you're using a different tool which you've found helpful, please let me know on twitter or via email.