07 Jan 2020 - The Dark Side

Using Git Bisect For Regression Identification

tl;dr: git bisect can save you eons when you are digging through your git history to find the exact revision that introduced a problem. This is a quick tutorial.


Heavily inspired by "Automated git bisect will make your day". I am writing this article because it did make my day!

The example project

Let's do a bit of work together.

mkdir demo-git-bisect
cd demo-git-bisect
git init

echo 'Yeah, we created a new git repository!' > notes.md
git add notes.md
git commit -m 'First commit'

echo '' >> notes.md
echo 'A cool title' >> notes.md
echo '======' >> notes.md
git commit -a -m 'Add a cool title'

Now we have a repository with a couple of commits.

echo 'my_add = lambda x, y: x + y' > addScript.py
git add addScript.py
git commit -m 'Add an addition script'

echo 'from addScript import my_add' > addScript_test.py
echo '' >> addScript_test.py
echo '' >> addScript_test.py
echo 'assert(my_add(1, 1) == 2)' >> addScript_test.py
git add addScript_test.py
git commit -m 'Add test for our addition script'

We now have a working my_add addition function, and we can test if it works with python addScript_test.py.

All is fine in the world... The project goes on.

echo '' >> notes.md
echo 'We now have an addition function.' >> notes.md
git commit -a -m 'Add some documentation'

echo 'Just call my_add() to add things' >> notes.md
git commit -a -m 'Add some moar documentation'

echo 'my_add = lambda x, y: x * y' > addScript.py
git commit -a -m 'Update my_add function'

echo '' >> notes.md
echo '' >> notes.md
echo 'To be implemented: my_div and my_mult' >> notes.md
git commit -a -m 'Add some ambition'

Now if we call python addScript_test.py we will see that the test is failing.
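What matters for the rest of this tutorial is less the traceback than the exit status of that command: a failed assert makes the Python interpreter exit with a non-zero status, while a passing test exits with 0. Here is a minimal sketch of that behaviour. The helper name exit_status_of_test and the temporary directory are illustrative assumptions; they are not part of the demo repository.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Same content as the addScript_test.py created earlier in this tutorial.
TEST_SOURCE = "from addScript import my_add\n\n\nassert(my_add(1, 1) == 2)\n"


def exit_status_of_test(add_script_source: str) -> int:
    """Run addScript_test.py against the given addScript.py, return the exit status."""
    with tempfile.TemporaryDirectory() as workdir:
        Path(workdir, "addScript.py").write_text(add_script_source)
        Path(workdir, "addScript_test.py").write_text(TEST_SOURCE)
        completed = subprocess.run(
            [sys.executable, "addScript_test.py"],
            cwd=workdir,
            stderr=subprocess.DEVNULL,  # hide the AssertionError traceback
        )
        return completed.returncode


working = exit_status_of_test("my_add = lambda x, y: x + y\n")  # original version
broken = exit_status_of_test("my_add = lambda x, y: x * y\n")   # 'Update my_add function'
print("working version exit status:", working)
print("broken version exit status:", broken)
```

The first status is zero and the second is not, and that difference is all an automated search needs to tell a good revision from a bad one.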

Debugging

Here we only have a handful of commits, but a real project can have many more. Finding the last working version narrows down the changes that caused the problem, which can be a real help.
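To put a rough number on that help: a binary search over N candidate commits needs about log2(N) test runs instead of N. The sketch below is back-of-the-envelope arithmetic only. The commit counts are made up, the function name is hypothetical, and the number of steps git really takes can differ a little, for instance on a history with many merges.

```python
def approx_bisect_steps(candidate_commits: int) -> int:
    """Rough number of test runs a binary search needs over N candidate commits.

    (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding.
    """
    if candidate_commits < 1:
        raise ValueError("need at least one candidate commit")
    return (candidate_commits - 1).bit_length()


for n in (4, 1_000, 100_000):
    print(f"{n} candidate commits -> about {approx_bisect_steps(n)} test runs")
```

The 4 corresponds to the demo repository, where four commits come after the last known good one.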

Fortunately we already have a test here; otherwise we could simply add one.

First we are going to create a script that tests whether the current version works. With that script:

  • We can install dependencies if they change between versions.
  • We can restore some specific version of our testing script (for example if we had to extend it to highlight our bug).
  • We must make sure that the last command of the script returns 0 if everything is fine and something else otherwise.

echo '#!/bin/bash' > test_current.sh
echo '' >> test_current.sh
echo 'echo "* (optional) Installing dependencies"' >> test_current.sh
echo 'echo "* (optional) cp-ing some old test script"' >> test_current.sh
echo 'echo "* Calling the script"' >> test_current.sh
echo 'python3 addScript_test.py' >> test_current.sh
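The "0 if everything is fine, something else otherwise" rule is slightly more fine-grained when the script is driven automatically by git bisect: according to the git bisect documentation, 0 means good, 125 means "this revision cannot be tested, skip it", any other code from 1 to 127 means bad, and anything else aborts the search. The function below only restates that contract as a sketch; bisect_run_verdict is a name made up for this article, not something git provides.

```python
def bisect_run_verdict(exit_code: int) -> str:
    """Restate how an automated bisect interprets the test script's exit code."""
    if exit_code == 0:
        return "good"
    if exit_code == 125:
        return "skip"  # revision cannot be tested (for example, it does not build)
    if 1 <= exit_code <= 127:
        return "bad"
    return "abort"  # the bisect session stops instead of guessing


for code in (0, 1, 125, 128):
    print(code, "->", bisect_run_verdict(code))
```

A failing Python assert falls in the "bad" range, which is why our plain test script is good enough as it is.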

git bisect runs a binary search over the commit history in order to identify the culprit revision. More information on git bisect.
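The idea of that binary search can be sketched in a few lines of Python. This is a simplification under stated assumptions: the history is treated as a straight line ordered from oldest to newest, the first entry is known good, the last one is known bad, and every commit after the breaking one is bad too. git itself works on the commit graph and may pick its midpoints differently. The names first_bad_commit and is_good are hypothetical, and the list reuses the commit messages from the demo repository.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit between a good and a bad bound."""
    good, bad = 0, len(commits) - 1  # indexes of the known-good / known-bad commits
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_good(commits[middle]):
            good = middle  # the breakage happened after the middle commit
        else:
            bad = middle   # the breakage is the middle commit or an earlier one
    return commits[bad]


# Commit messages from the demo repository, from the known-good commit to HEAD.
history = [
    "Add test for our addition script",
    "Add some documentation",
    "Add some moar documentation",
    "Update my_add function",
    "Add some ambition",
]
broken_from = history.index("Update my_add function")
tested = []  # records which commits the search actually had to test


def is_good(commit: str) -> bool:
    """Stand-in for running the test script on a checked-out commit."""
    tested.append(commit)
    return history.index(commit) < broken_from


culprit = first_bad_commit(history, is_good)
print("first bad commit:", culprit)
print("commits tested:", tested)
```

Only the commits in the middle of the remaining range get tested, which is where the time savings come from.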

So, one last thing. After we start, we need to mark good and bad commits so git bisect knows between which commits it should look.

git bisect start
# current revision does not work
git bisect bad
# Four revisions ago is when we added the test (HEAD~4), we know it worked
git bisect good HEAD~4

Then we can just tell git bisect to use our script to find the breaking changes.

git bisect run sh test_current.sh

And tadddaaaaaaaahhhhh, we are now at the breaking revision, which git bisect reports as the first bad commit. We can visualize the changes which are very likely to have caused the problem:

git diff HEAD^ HEAD

We can now finish our errand:

git bisect reset

Fräntz Miccoli

This blog collects my notes about software engineering and computer science. My main interests are architecture, quality and machine learning, but the content of this blog may diverge from time to time.

For some time now, I have been the happy cofounder, COO & CTO of Nexvia.

Ideas are expressed here to be challenged.

