Commit it! » andreas.heigl.org

I have to fix a bug and finally find the code where everything goes south. And immediately I ask myself: “Why is that written in this way? I mean: It *has* to break!”

Sounds familiar?

Luckily I am not editing files via FTP directly on a server but am in a project that uses version control, which allows me to check who last changed this line and when. That gives me some information on whom to ask about why the line was written in that specific way.

And guess what? That line was last changed 5 years ago by a then-coworker who has since left.

But at least we have the commit-message that was left by that coworker.

And, yeah! You guessed it: It’s not helpful: “Fixed bug”…

If you are lucky, it contains a link to an issue. But: Yeah! 3 years ago the team changed their issue-tracker and they didn’t bother to move over closed tickets…

So let me talk a bit about what it takes to

Write a good Commit Message

For me a good commit message tells future me why I changed things. It does not need to tell me what changed, as I can see that from the commit’s diff. But explaining the reasoning behind a change is important for future me to understand what I have to consider apart from the issue I am trying to fix right now.

Short Summary

To help me with that I usually follow the “Beams Rule” and start with a short summary of the commit. While Chris’ 50 characters are an arbitrary number (that has some reasoning behind it) and some people argue that one shouldn’t be limited by such arbitrary numbers, I actually find it good to keep it that short. If you have problems fitting all you did into a summary of 50 characters, you probably did too much in one commit. Split up the commit into several smaller ones. Check out atomic commits for histories that are easier to understand.

To help me with that short summary I am using a commit-message template on my machine that starts with

# when this commit is applied it will

This reminds me, every time I commit something, that my first line should summarize what I did in this commit.

Then I add a blank line

And then I write my

actual commit message

There is no limit to this message! It can be as long as one likes.

I usually start with explaining what the problem was that this commit is trying to fix. So a summary of the issue/ticket that you were working on and how it is related to this commit.

Yes! I am partially duplicating the content of the issue-tracker. But only the parts relevant for this commit. I will later also link to the issue, but I have encountered it more than once that the issue-tracker has been changed and there is no trace of the issue any more the moment you need that information the most. So duplicating the relevant info into the commit allows me to see what the problem was that this commit is trying to solve.

After that I tell my future self why I chose the implementation I did. I explain what different things I tried that didn’t work (and why), that this is the fastest solution and that the edge-cases x, y, and z are not considered to be an issue, or that we talked about this in the dev-meeting and the consensus was to use this implementation because the other implementations were considered to be too time-consuming etc.

Everything that will help your future self understand why you did what you did.

I try not to explain what I changed. I can see that from the diff.

This is clearly the most complex part of the whole process. What to include and what not to include. And in case you fixed a typo you might actually not really need it as it is obvious! But are you sure?

Yes! I also omit this in more or less self-explanatory commits. When I fix a typo I don’t need to explain why I did it. When I change some code-styling I don’t need to explain why I did it. Apart from perhaps the reasoning why I suddenly introduced this specific coding style.

But in general I write rather long commit messages. In exceptional cases I refrain from doing so. But in general I do!

And after that long commit message, I usually add

Some Meta-Data

There are things I do not want to lose and that belong to the commit, but that are less for the human reader than for automated processes.

Those should still be included but they can be added at the end of the commit message. Things like

issue-references,
time spent on this commit,
whether it is a fix or part of a new feature,
whatever else I might want to add

I put them right at the end, each on its own line. That way they do not take up valuable space at the beginning, but they are there and the scripts that you are running can still fetch them from the message.
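
To illustrate how a script could fetch that meta-data, here is a minimal sketch in Python. It assumes that the meta-data forms the last paragraph of the message and that every line there has the shape "Key: value". The keys in the example (Issue, Time-Spent, Type) are made up for illustration and are not a standard. Git itself has a similar concept called trailers, with the `git interpret-trailers` command to read them.

```python
import re

# A meta-data line looks like "Key: value"; keys are words joined by hyphens.
TRAILER_RE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s*(.+)$")

# Hypothetical commit message; the keys at the end are only examples.
EXAMPLE = """Reject empty usernames on signup

The form accepted whitespace-only names.

Issue: #123
Time-Spent: 2h
Type: fix
"""


def parse_metadata(message: str) -> dict[str, list[str]]:
    """Collect "Key: value" lines from the last paragraph of a commit message."""
    # Paragraphs are separated by blank lines; the meta-data is the last one.
    paragraphs = [p for p in re.split(r"\n\s*\n", message.strip()) if p.strip()]
    if len(paragraphs) < 2:
        return {}  # only a summary line, so there is no meta-data block
    metadata: dict[str, list[str]] = {}
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line.strip())
        if match is None:
            return {}  # the last paragraph is prose, not meta-data
        key, value = match.groups()
        metadata.setdefault(key, []).append(value.strip())
    return metadata
```

Calling `parse_metadata(EXAMPLE)` returns the three keys with one value each, while a bare "Fixed bug" returns an empty dictionary.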

Make it a habit

Making this a habit helped me a lot. Especially as, at some point, it becomes second nature.

Writing a commit message was no longer an obnoxious task at the end of the process that kept me from finishing the task. It became a valuable part of the coding process once I started doing that.

To help with that I have – as previously mentioned – created a commit-message template that reminds me of the different things.

In some projects I annoy my coworkers by using commit-hooks that check whether the commit message complies with certain standards, like the 50 character limit of the first line, a blank second line, some more information, an issue-number or commit-type set at the end etc. It will make some people angry at the beginning because they suddenly can’t “just finish the job”, but after some time they will start to understand and it will become natural to them.
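
Such a check could look roughly like the following sketch. It is not the hook I actually use, just one way to express the rules above in Python. The keys `Issue:` and `Type:` are hypothetical, and the sketch accepts them anywhere in the body rather than strictly at the end. What is certain is that Git calls a `commit-msg` hook with the path of the message file as its first argument and aborts the commit when the hook exits with a non-zero status.

```python
import re

MAX_SUMMARY_LENGTH = 50
# Hypothetical meta-data keys; a real project would agree on its own set.
REQUIRED_KEY_RE = re.compile(r"^(Issue|Type):\s*\S", re.MULTILINE)


def check_message(raw: str) -> list[str]:
    """Return the problems found in a commit message (empty list means fine)."""
    # Drop comment lines such as the ones coming from a commit template.
    lines = [line for line in raw.splitlines() if not line.startswith("#")]
    # Ignore blank lines at the end of the message.
    while lines and not lines[-1].strip():
        lines.pop()
    if not lines or not lines[0].strip():
        return ["the summary line is missing"]
    problems = []
    if len(lines[0]) > MAX_SUMMARY_LENGTH:
        problems.append(
            f"the summary has {len(lines[0])} characters, "
            f"the limit is {MAX_SUMMARY_LENGTH}"
        )
    if len(lines) > 1 and lines[1].strip():
        problems.append("the second line must be blank")
    body = "\n".join(lines[2:])
    if not body.strip():
        problems.append("the message has no body explaining why")
    # Simplification: the key may appear anywhere in the body.
    if not REQUIRED_KEY_RE.search(body):
        problems.append("no 'Issue:' or 'Type:' line found")
    return problems


def main(message_file: str) -> int:
    """Check the message file Git hands to the hook; 0 accepts, 1 rejects."""
    with open(message_file, encoding="utf-8") as handle:
        problems = check_message(handle.read())
    for problem in problems:
        print(f"commit message rejected: {problem}")
    return 1 if problems else 0
```

To use it as a hook, the script would be saved as an executable `.git/hooks/commit-msg` that ends by calling `sys.exit(main(sys.argv[1]))`. The infamous “Fixed bug” fails twice here: no body and no meta-data.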

But what about conventional commits?

Well… Let me ask a different question: What is the

Target Audience of a commit message

I write commit messages for those trying to understand why a certain part of an application was written in a specific way. They let future me make an informed decision, when having to change something, about whether that change will break something else. They tell me why the bug that I am trying to fix actually is a feature that was introduced some years ago, or that I have now finally encountered the edge-case that past me thought might happen but where the team decided to cross that bridge when we reach it (we have now reached it).

So the target audience is fellow developers that are working on the code base. Whether internal or external is not relevant here, it is people getting their hands dirty in code.

What does Conventional Commits try to solve? According to their website it is

A specification for adding human and machine readable meaning to commit messages

The main thing it specifies is adding a type and an optional scope to the summary of the commit.

But what is that for? Is that relevant when I want to know why something was changed? I doubt it. When I try to figure out why a change was made, it is irrelevant to me whether the change was a bug-fix or due to a feature or whether it is a breaking change.

It does become relevant when I want to use some automated process to generate something else from the commit message. Something like a Changelog or a Release-Message. But those are for a different target audience.

And when I want to use an automated process, why do I not move that information into the Meta-Data section of the commit message? The automated process can still pick it up. And my hooks (or some other process) can make sure that it is not forgotten. That also makes sure that the precious 50 characters are not lost to something for machine-readability that doesn’t add real information about what happened in that commit.

Keep a Changelog

And, to be honest, creating a Changelog or a Release-Message from commit-logs is perhaps not the best of ideas. They are for a completely different target audience and therefore require completely different content. Putting the burden of creating a meaningful Changelog- or Release-Message-entry onto the shoulders of those writing the code is perhaps not the best of ideas. That’s a bit like expecting developers to write End-User documentation.

Why do I think so? Because commits are atomic. And not every commit is relevant for a Changelog or a Release-Message. If you are working with merge-commits, then you could use those to add something for the Changelog or the Release-Message, as a merge usually is a self-contained thing that one wants in a Changelog or a Release-Message.

Another option would be to use something like keepachangelog to keep the wording of the changelog completely separate from the commit messages. It would even allow one to have someone else write the changelog entry than the person creating the code-commits. And your CI then makes sure that merging is not possible without an entry in the changelog-file.
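
A CI check like that can be quite small. The following is only a sketch under assumptions: the changelog is a file called `CHANGELOG.md` in the repository root and the target branch is `origin/main`. Both names are placeholders, not something keepachangelog prescribes for your CI.

```python
import subprocess

CHANGELOG_PATH = "CHANGELOG.md"  # assumed file name


def changed_files(base: str = "origin/main") -> list[str]:
    """List the files changed on HEAD since it branched off from `base`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def changelog_updated(files: list[str], changelog: str = CHANGELOG_PATH) -> bool:
    """Tell whether the changelog file is among the changed files."""
    return changelog in files
```

The CI job would then fail the build whenever `changelog_updated(changed_files())` is false.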

Why should I care, we use Squash-Merge

My condolences! And prepare for a squash-merge surcharge should you want me to work on your code.

In my opinion squashing everything into one commit in an automated way is indeed destroying a lot of information about the reasoning behind decisions. So far no one has been able to explain to me why squash merging is better than just merging. The amount of storage-space saved is usually not really an issue. Or – when it is – it is an issue due to a huge monorepo which in itself might be problematic but that is a completely different topic!

I achieve a “clean” history by rebasing everything onto the main branch before merging. I use merge-commits (without fast-forward merges!) to make it clear which parts were developed together. That also allows me to revert a specific – perhaps broken – merge but also to revert one specific commit.

Yes, when I squash commits, all the commit-messages are kept, so no information is lost. Apart from the information which part of the commit message actually belongs to which line of code of the commit. Which might be vital. When squashing I can only guess and might indeed miss something.

So I would never want to use squash-commits. Especially to get a nice and orderly commit-history. The commit-history is not for showing off how nice and orderly a project is, but for what actually happened.

But that is just my thought on that.

How are you handling commits?