We work on a fairly large project that resembles a web application for our customer. Because the customer is part of a larger organization, the project was also needed for a second, rather independent customer within the organization. Now we had two customers with distinct requirements. We forked the code base and developed both branches independently. But often, there is a bug fix or a new feature that is needed in both branches. And while both customers have different requirements, it’s still the same application at its core. Technically speaking, both branches are part of a product family. We use atomic commits and cherry-picks to keep the code bases of the branches in sync if needed.
Another customer has custom hardware with individual control software written by us. The hardware was built several times, with the same software running on all instances. After a while, one hardware instance got an additional module that was only needed there. We coped by introducing an optional software module that can control the real hardware on this special instance or act as an empty placeholder for the other instances. Soon, we had to introduce another module. The software is now heavily modularized. Then the hardware defects began. The customer replaced every failing hardware component with a new type of hardware, using its new capabilities to improve the software features, too. But every hardware instance was replaced differently and there is no plan to consolidate the hardware platforms again. Essentially, this left us with a specific version of the software for each hardware instance. Currently, we see no possibility to unify the different hardware platforms with one general interface. What we did was to fork the code base and develop on each branch independently. But often, there is a bug fix or a new feature that is needed in several branches. Technically speaking, all branches are part of a product family. We use atomic commits and cherry-picks to keep the code bases of the branches in sync if needed.
In both cases, we needed a list that helped us keep track of which commits were already cherry-picked, which never need to be picked and which have not been reviewed in that regard yet. Our version control system of choice, git, supports this requirement by providing globally unique commit IDs. Maintaining this list manually is a cumbersome task, so we developed a little tool that helps us with it.
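One detail worth keeping in mind: a cherry-pick creates a new commit with its own ID on the target branch, so the link back to the source commit has to be recorded somewhere. The following Python sketch shows one possible way to recover that link, assuming the picks were made with `git cherry-pick -x`, which appends a "(cherry picked from commit ...)" line to the commit message. Whether the diffibrillator works this way is not stated here; the function name `picked_from` and the sample message are made up for illustration.

```python
import re

# Line that `git cherry-pick -x` appends to the commit message.
# 40 hex digits for SHA-1 repositories, 64 for SHA-256 ones.
CHERRY_PICK_TRAILER = re.compile(r"\(cherry picked from commit ([0-9a-f]{40,64})\)")


def picked_from(message):
    """Return the ID of the source commit, or None if the message has no trailer."""
    match = CHERRY_PICK_TRAILER.search(message)
    return match.group(1) if match else None


# Hypothetical commit message, for illustration only.
message = (
    "Fix rounding error in invoice total\n"
    "\n"
    "(cherry picked from commit 0123456789abcdef0123456789abcdef01234567)\n"
)

print(picked_from(message))
print(picked_from("Add customer-specific logo"))
```

With such a link in hand, a tool can tell which commits on one branch already have a counterpart on another branch without anybody maintaining the list by hand.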
Meet the diffibrillator
The first thing we always do for our projects is to come up with a witty name. In this case, because it really is a “diff tracker”, we came up with the name “diffibrillator”. The diffibrillator is a diff tracker on the granularity level of commits. For each new commit in any repository of a product family, somebody has to review the commit and decide on its category:
- Undecided: This is the initial category for each commit. It means that there is no decision made yet whether to cherry-pick the commit to one or several other branches or to define it as “unique” to this branch.
- Unported: If a reviewer chooses this category for a commit, there is no need to port the content of the commit to other branches. The commit is regarded as part of the unique differences between this branch and all other ones in the product family.
- Ported: If there are other branches in the product family that require the same changes as are made in the commit, the reviewer has to do two things: cherry-pick the commit to the required branches (port the functionality) and mark the commit and the new cherry-pick commits as “ported”. This takes the commits out of the pending list and indicates that the changes in the commit are included in several branches.
In short, the diffibrillator helps us keep track of every commit made on every branch in the product family and shows us where we forgot to port a piece of functionality (like a bugfix) to the other members of the family.
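The three categories can be sketched as a small data model. This is a minimal illustration in Python and not the actual implementation of the diffibrillator: the class `Tracker`, its method names, the branch names and the abbreviated commit IDs are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    UNDECIDED = "undecided"  # initial state, nobody has decided yet
    UNPORTED = "unported"    # unique to its branch, never needs porting
    PORTED = "ported"        # the changes exist in several branches


@dataclass
class Tracker:
    # Maps (branch, commit_id) to the category chosen by a reviewer.
    categories: dict = field(default_factory=dict)

    def register(self, branch, commit_id):
        """Record a new commit; it starts out as undecided."""
        self.categories.setdefault((branch, commit_id), Category.UNDECIDED)

    def mark_unported(self, branch, commit_id):
        """Declare a commit to be a unique difference of its branch."""
        self.categories[(branch, commit_id)] = Category.UNPORTED

    def mark_ported(self, source, picks):
        """Mark the source commit and all of its cherry-pick commits as ported."""
        for key in [source, *picks]:
            self.categories[key] = Category.PORTED

    def pending(self):
        """Return all commits that still await a decision."""
        return sorted(k for k, c in self.categories.items() if c is Category.UNDECIDED)


tracker = Tracker()
tracker.register("customer-a", "1f3a9c0")
tracker.register("customer-a", "77be210")
tracker.register("customer-b", "c04d5e8")
tracker.register("customer-b", "9e1b2a7")  # cherry-pick of 1f3a9c0

tracker.mark_unported("customer-a", "77be210")
tracker.mark_ported(("customer-a", "1f3a9c0"), [("customer-b", "9e1b2a7")])

print(tracker.pending())  # [('customer-b', 'c04d5e8')]
```

Only the commit that nobody has reviewed yet remains in the pending list, which is the list a reviewer works through.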
Here is a typical screenshot of the desktop GUI. Some information is blurred to keep things ambiguous and to protect the innocent.
You see a (very long) table with several columns. The first column denotes the commit date of the commit in each row. The commits of all projects are sorted together in reverse chronological order, but each commit is inserted into the column of its own project. In this screenshot, you can see that the third project hasn’t been changed for quite some time. Some commits are categorized, but the latest commits need some work in this regard.
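The layout of this table can be sketched in a few lines of Python. The function `build_rows`, the project names, the commit IDs and the dates are invented sample data, so treat this as an illustration of the idea rather than the code behind the GUI.

```python
from datetime import date


def build_rows(commits_by_project):
    """Merge the commits of all projects into rows, newest commit first.

    Each row holds the commit date followed by one cell per project;
    only the cell of the project that owns the commit is filled.
    """
    projects = sorted(commits_by_project)
    merged = [
        (commit_date, project, commit_id)
        for project, commits in commits_by_project.items()
        for commit_date, commit_id in commits
    ]
    merged.sort(key=lambda entry: entry[0], reverse=True)

    rows = []
    for commit_date, project, commit_id in merged:
        cells = [commit_id if p == project else "" for p in projects]
        rows.append([commit_date.isoformat(), *cells])
    return projects, rows


# Invented sample data: project-3 has not been changed for a while.
commits = {
    "project-1": [(date(2013, 5, 2), "a1"), (date(2013, 4, 28), "a0")],
    "project-2": [(date(2013, 5, 1), "b1")],
    "project-3": [(date(2012, 11, 15), "c0")],
}

projects, rows = build_rows(commits)
for row in rows:
    print(row)
```

A project without recent changes shows up as a column that stays empty for many rows at the top of the table.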
Foundation for the diffibrillator
The diffibrillator in its current state relies heavily on the atomic nature of our commits. As soon as two pieces of functionality are included in one commit, both the cherry-pick and the categorization would lose precision. Luckily, all of our developers adhere to the commit-early-commit-often principle. We had plans for a diff tracker with the granularity of individual changes, but an analysis of our real requirements revealed that we wouldn’t benefit from the higher change resolution and would lose the trackability on the commit level. And that is the level we want to think and act upon.
Technicalities of the diffibrillator
The biggest problem was to design the REST API orthogonally enough to make sense, but with enough pragmatism to keep it fast. This led to a query that, strictly speaking, should return only the commits’ IDs, but returns all information about them to avoid several thousand subsequent HTTP requests for the commits’ data. As a result, this query’s answer grew very big, leading to timeout errors on low-bandwidth connections. To counter this problem, we had to introduce result paging, where the client can specify the start index and result length of its query.
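The paging scheme can be sketched without any HTTP machinery. In the following Python example, `query_commits` stands in for the server-side query and `fetch_all` for a client that walks through the pages. The parameter names `start` and `length`, the `total` field and the shape of the answer are assumptions made for this sketch, not the actual REST API.

```python
from functools import partial


def query_commits(all_commits, start=0, length=100):
    """Stand-in for the server: return one page with full commit data."""
    page = all_commits[start:start + length]
    return {"start": start, "total": len(all_commits), "commits": page}


def fetch_all(query, length=100):
    """Stand-in for the client: request page after page until done."""
    start = 0
    result = []
    while True:
        answer = query(start=start, length=length)
        result.extend(answer["commits"])
        start += len(answer["commits"])
        # Stop on an empty page or once every commit has been received.
        if not answer["commits"] or start >= answer["total"]:
            return result


# Made-up commit records; a real answer would carry more fields.
commits = [{"id": f"c{i:04d}", "category": "undecided"} for i in range(250)]

fetched = fetch_all(partial(query_commits, commits), length=100)
print(len(fetched))  # 250
```

Each answer stays small enough for a slow connection, while a single page still carries the full data of its commits, so the client needs three requests here instead of several thousand.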
Why should you care?
We are certain that the task of keeping several members of a product family in sync isn’t all that rare. And while there are many different possible solutions to this problem, the two most prominent approaches seem to be “modularization” and “diff tracking”. We chose diff tracking as the approach with lower costs for us, but lacked tool support. The diffibrillator is a tool to keep track of all your product family’s commits and to categorize them. It relies on atomic commits, but is otherwise relatively low-tech and easy to understand.
If you happen to have the same problem of a product family consisting of several independent projects, drop us a line. We’d love to hear from you about your experience and solutions. And if you think that the diffibrillator can help you with that task, let us know! We are not holding anything back.