?
This document uses PHP Chinese website manual Release
“git bisect”使軟件用戶和開發(fā)人員能夠輕松找到引入回歸的提交。我們將說明為什么有很好的工具來對抗回歸很重要。我們描述了“git bisect”如何從外部工作以及內部使用的算法。然后我們解釋如何利用“git bisect”來改進當前的實踐。我們將討論“git bisect”在將來如何改進。
Git 是由 Linus Torvalds 創(chuàng)建并由 Junio Hamano 維護的分布式版本控制系統(tǒng)(DVCS)。
在 Git 中,就像許多其他版本控制系統(tǒng)(VCS)一樣,系統(tǒng)管理的數(shù)據(jù)的不同狀態(tài)稱為提交。而且,由于VCS主要用于管理軟件源代碼,因此在某些提交中引入了軟件中行為的“有趣”更改。
實際上,人們對引入“壞”行為的提交特別感興趣,稱為 bug 或回歸。他們對這些提交感興趣,因為提交(希望)包含一小部分源代碼更改。當你只需要檢查一小部分變化時,比當你不知道放在哪里時更容易理解和正確地解決問題。
因此,為了幫助人們找到引入“不良”行為的提交,發(fā)明了“git bisect”命令集。當然,在“git bisect”的說法中,承認有趣行為出現(xiàn)的地方稱為“壞”承諾,而其他承諾則稱為“良好”承諾。而引入我們感興趣的行為的提交被稱為“第一次錯誤提交”。請注意,在我們正在搜索的提交空間中可能會有多個“第一個錯誤提交”。
所以“git bisect”旨在幫助找到“第一個錯誤的提交”。為了盡可能高效,它會嘗試執(zhí)行二分查找。
回歸是軟件行業(yè)的一個大問題。但是很難在這個說法背后加入一些真實的數(shù)字。
一般來說,有一些關于錯誤的數(shù)字,比如2002年的 NIST 研究[1]說:
根據(jù)美國商務部國家標準學會委托的一項新發(fā)布的研究顯示,軟件錯誤或錯誤非常普遍,因此對美國經(jīng)濟造成的損失每年約為595億美元,約占國內生產總值的0.6%和技術(NIST)。在國家一級,超過一半的費用由軟件用戶承擔,其余由軟件開發(fā)商/供應商承擔。該研究還發(fā)現(xiàn),盡管所有錯誤都無法消除,但可以通過改進的測試基礎設施消除超過三分之一的這些成本或估計的22.2億美元成本,從而能夠更早,更有效地識別和消除軟件缺陷。這些節(jié)省與找到更接近引入它們的開發(fā)階段的錯誤百分比(但不是100%)相關。目前,在開發(fā)過程或售后軟件使用期間,直到“下游”才發(fā)現(xiàn)超過一半的錯誤。
接著:
軟件開發(fā)人員已經(jīng)花費了大約80%的開發(fā)成本來識別和糾正缺陷,而除軟件之外的其他任何類型的產品都很少出現(xiàn)如此高的錯誤。
最終得出的結論是:
通向更高軟件質量的途徑是顯著改進軟件測試。
還有其他估計說,軟件相關成本的80%是關于維護[2]。
盡管如此,根據(jù)維基百科[3]:
對維護的一種普遍看法是,它只是修復錯誤。然而,多年來的研究和調查表明,大部分超過80%的維護工作都用于非糾正措施(Pigosky,1997)。用戶提交的問題報告實際上是對系統(tǒng)功能的增強,這種看法延續(xù)下去。
但是我們可以猜測,對現(xiàn)有軟件的改進是非常昂貴的,因為你必須留意回歸。至少這會使上述研究相互一致。
當然,某種軟件是開發(fā)出來的,然后在一段時間內使用而沒有多少改進,然后最終被扔掉。當然,在這種情況下,回歸可能不是一個大問題。但另一方面,很多人在數(shù)年甚至數(shù)十年不斷開發(fā)和維護大量軟件。而且由于經(jīng)常有很多人依賴(有時是批判地)這樣的軟件,所以回歸是一個非常大的問題。
一個這樣的軟件是 Linux 內核。如果我們看看 Linux 內核,我們可以看到花了很多時間和精力來應對回歸。發(fā)布周期以2周的合并窗口開始。然后標記第一個版本候選版本(rc)版本。之后,在最終版本發(fā)布之前,大約會有7或8個 rc 版本,每個版本之間會出現(xiàn)大約一周的時間。
第一次 rc 發(fā)布和最終發(fā)布之間的時間應該被用來測試 rc 版本和防止bug,尤其是回歸。這一次超過發(fā)布周期的80%。但這還沒有結束,因為它在發(fā)布之后當然會繼續(xù)。
然后這就是 Ingo Molnar(一位知名的Linux內核開發(fā)人員)對他使用 git bisect 的說法:
我在合并窗口期間最積極地使用它(當很多樹在上游合并并且 bug 流入量最高時) - 是的,我曾經(jīng)有過多次使用它的情況。我的平均一天大概是一次。
因此,開發(fā)人員一直在進行回歸測試,事實上,眾所周知應盡快修復錯誤,以便盡快找到漏洞。這就是為什么有很好的工具來達到這個目的。
那么用什么工具來對抗回歸呢?它們幾乎與用于防止常規(guī)錯誤的那些相同。唯一特定的工具是與“git bisect”類似的測試套件和工具。
測試套件非常好。但是,當它們單獨使用時,應該使用它們,以便在每次提交后檢查所有測試。這意味著它們效率不高,因為許多測試運行時沒有有趣的結果,并且它們遭受了組合式爆炸。
事實上,問題在于大型軟件通常有許多不同的配置選項,并且每次提交之后,每個測試用例都應該通過每個配置。因此,如果您對每個版本都有:N 配置,M 提交和 T 測試用例,則應執(zhí)行:
N * M * T tests
其中 N,M 和 T 都隨著軟件的大小而增長。
所以很快就不可能完全測試一切。
如果一些錯誤通過你的測試套件,那么你可以添加一個測試到你的測試套件。但是如果你想使用你的新改進的測試套件來發(fā)現(xiàn)錯誤在哪里滑入,那么你將不得不模擬一個二分過程,否則你可能會直截了當?shù)貜拿總€提交的“壞”提交開始測試每個提交,這可能是非常浪費。
要使用的第一個“git bisect”子命令是“git bisect start”來開始搜索。然后必須設置邊界來限制提交空間。這通常是通過給予一個“不好的”和至少一個“好的”提交來完成的。他們可以在初次調用時傳遞給“git bisect start”,如下所示:
$ git bisect start [BAD [GOOD...]]
或者可以使用以下設置:
$ git bisect bad [COMMIT]
和:
$ git bisect good [COMMIT...]
BAD,GOOD 和 COMMIT 都是可以解析為提交的名稱。
然后,“git bisect”將檢出其選擇的提交并要求用戶對其進行測試,如下所示:
$ git bisect start v2.6.27 v2.6.25Bisecting: 10928 revisions left to test after this (roughly 14 steps)[2ec65f8b89ea003c27ff7723525a2ee335a2b393] x86: clean up using max_low_pfn on 32-bit
請注意,我們將使用的示例實際上就是一個玩具示例,我們將查找具有“2.6.26-something”版本的第一個提交,即提交的“SUBLEVEL = 26”行頂級 Makefile。這是一個玩具示例,因為使用 Git 比使用“git bisect”(例如“git blame”或“git log -S <string>”)更好地找到此提交。
在這一點上,基本上有兩種方法來驅動搜索。它可以由用戶手動驅動,也可以由腳本或命令自動驅動。
如果用戶正在驅動它,那么在搜索的每一步中,用戶將不得不測試當前提交,并使用“git bisect good”或“git bisect bad”命令來判斷它是“好”還是“壞”分別如上所述。例如:
$ git bisect bad Bisecting: 5480 revisions left to test after this (roughly 13 steps)[66c0b394f08fd89236515c1c84485ea712a157be] KVM: kill file->f_count abuse in kvm
在經(jīng)過幾步之后,“git bisect”最終會發(fā)現(xiàn)第一個錯誤的提交:
$ git bisect bad 2ddcca36c8bcfa251724fe342c8327451988be0d is the first bad commit commit 2ddcca36c8bcfa251724fe342c8327451988be0d Author: Linus Torvalds <torvalds@linux-foundation.org>Date: Sat May 3 11:59:44 2008 -0700 Linux 2.6.26-rc1:100644 100644 5cf82581... 4492984e... M Makefile
在這一點上,我們可以看到提交的內容,檢查出它(如果它沒有被檢出)或修改它,例如:
$ git show HEAD commit 2ddcca36c8bcfa251724fe342c8327451988be0d Author: Linus Torvalds <torvalds@linux-foundation.org>Date: Sat May 3 11:59:44 2008 -0700 Linux 2.6.26-rc1 diff --git a/Makefile b/Makefile index 5cf8258..4492984 100644--- a/Makefile+++ b/Makefile @@ -1,7 +1,7 @@ VERSION = 2 PATCHLEVEL = 6-SUBLEVEL = 25-EXTRAVERSION =+SUBLEVEL = 26+EXTRAVERSION = -rc1 NAME = Funky Weasel is Jiggy wit it # *DOCUMENTATION*
當我們完成后,我們可以使用“git bisect reset”重新開始我們在開始平分前的分支:
$ git bisect reset Checking out files: 100% (21549/21549), done.Previous HEAD position was 2ddcca3... Linux 2.6.26-rc1 Switched to branch 'master'
驅動二等分過程的另一種方式是告訴“git bisect”在每個二等分步驟中啟動腳本或命令,以確定當前提交是“好”還是“壞”。為此,我們使用“git bisect run”命令。例如:
$ git bisect start v2.6.27 v2.6.25Bisecting: 10928 revisions left to test after this (roughly 14 steps)[2ec65f8b89ea003c27ff7723525a2ee335a2b393] x86: clean up using max_low_pfn on 32-bit $ $ git bisect run grep '^SUBLEVEL = 25' Makefile running grep ^SUBLEVEL = 25 Makefile Bisecting: 5480 revisions left to test after this (roughly 13 steps)[66c0b394f08fd89236515c1c84485ea712a157be] KVM: kill file->f_count abuse in kvm running grep ^SUBLEVEL = 25 Makefile SUBLEVEL = 25Bisecting: 2740 revisions left to test after this (roughly 12 steps)[671294719628f1671faefd4882764886f8ad08cb] V4L/DVB(7879): Adding cx18 Support for mxl5005s......running grep ^SUBLEVEL = 25 Makefile Bisecting: 0 revisions left to test after this (roughly 0 steps)[2ddcca36c8bcfa251724fe342c8327451988be0d] Linux 2.6.26-rc1 running grep ^SUBLEVEL = 25 Makefile 2ddcca36c8bcfa251724fe342c8327451988be0d is the first bad commit commit 2ddcca36c8bcfa251724fe342c8327451988be0d Author: Linus Torvalds <torvalds@linux-foundation.org>Date: Sat May 3 11:59:44 2008 -0700 Linux 2.6.26-rc1:100644 100644 5cf82581... 4492984e... M Makefile bisect run success
在這個例子中,我們將“grep ^SUBLEVEL = 25
Makefile”作為參數(shù)傳遞給“git bisect run”。這意味著在每一步中,我們通過的 grep 命令將會啟動。如果代碼0(表示成功)退出,那么git bisect會將當前狀態(tài)標記為“良好”。如果它以代碼1退出(或包含1到127之間的任何代碼,除特殊代碼125外),則當前狀態(tài)將被標記為“不良”。
退出128到255之間的代碼對于“git bisect run”是特別的。他們立即停止平分過程。例如,如果傳遞的命令需要很長的時間才能完成,那么這很有用,因為你可以用一個信號殺停它并停止平分過程。
如果檢測到一些非常異常的情況,它也可以用于傳遞給“git bisect run”到“exit 255”的腳本。
有時會發(fā)生,當前狀態(tài)不能被測試,例如,如果它不編譯,因為當時有一個阻止它的錯誤。這是特殊退出 碼125 的用途。它告訴“git bisect run”,當前提交應該被標記為不可測試,并且另一個應該被選中并簽出。
如果平分過程是手動驅動的,則可以使用“git bisect skip”來做同樣的事情。(事實上,特殊的退出碼125使得“git bisect run”在后臺使用“git bisect skip”。)
或者如果你想要更多的控制,你可以使用例如“git bisect visualize”來檢查當前狀態(tài)。它會啟動 gitk(或者如果DISPLAY
沒有設置環(huán)境變量,則使用“git log” )來幫助您找到更好的對分點。
無論哪種方式,如果您有一串不可測試的提交,則可能發(fā)生您正在查找的回歸已由其中一個不可測試的提交引入。在這種情況下,無法確定哪個提交引入了回歸。
所以如果你使用“git bisect skip”(或者用特殊代 代碼125 退出運行腳本),你可能會得到如下結果:
There are only 'skip'ped commits left to test.The first bad commit could be any of:15722f2fa328eaba97022898a305ffc8172db6b1 78e86cf3e850bd755bb71831f42e200626fbd1e0 e15b73ad3db9b48d7d1ade32f8cd23a751fe0ace 070eab2303024706f2924822bfec8b9847e4ac1b We cannot bisect more!
如果你想向其他人展示你的平分過程,你可以使用例如:
$ git bisect log > bisect_log.txt
并且可以使用以下方法進行重放:
$ git bisect replay bisect_log.txt
隨著 Git 提交形成一個有向無環(huán)圖(DAG),在每一步中找到最好的二等分提交并不是那么簡單。無論如何, Linus 發(fā)現(xiàn)并實施了一個“真正的愚蠢”算法,后來由 Junio Hamano 改進,效果很好。
因此,當沒有跳過提交時,“git bisect”使用的算法可以找到最佳的二等分提交,如下所示:
1)只保留提交:
a)是“壞”提交的祖先(包括“壞”提交本身),b)不是“好”提交的祖先(不包括“好”提交)。
這意味著我們擺脫了在 DAG 中無趣的提交。
例如,如果我們從這樣的圖表開始:
G-Y-G-W-W-W-X-X-X-X \ / W-W-B /Y---G-W---W \ / \ Y-Y X-X-X-X-> time goes this way ->
其中 B 是“壞”提交,“G”是“好”提交,W,X 和 Y 是其他提交,我們將在第一步之后得到以下圖表:
W-W-W \ W-W-B /W---W
所以只有 W 和 B 的提交將被保留。因為提交 X 和 Y 將分別被規(guī)則 a)和 b)刪除,并且由于提交 G 也被規(guī)則b)刪除。
注意 Git 用戶,它相當于只保留以下提供的提交:
git rev-list BAD --not GOOD1 GOOD2...
還要注意,我們并不要求那些被認為是“良好”提交的后代的提交。所以在下面的例子中,提交 W 和 Z 將被保留:
G-W-W-W-B /Z-Z
2)從圖的“好”端開始,將每個提交的祖先的數(shù)量加上一個
例如下面的圖表,其中 H 是“壞”提交,A 和 D 是一些“好”提交的雙親項:
A-B-C \ F-G-H /D---E
這會給:
1 2 3A-B-C \6 7 8 F-G-H1 2/D---E
3)與每個提交相關聯(lián):min(X,N - X)
其中 X 是與步驟2中的提交相關聯(lián)的值),N 是圖中提交的總數(shù)。
在上面的例子中,我們有 N = 8,所以這會給出:
1 2 3A-B-C \2 1 0 F-G-H1 2/D---E
4)最好的二等分點是具有最高關聯(lián)號碼的提交
所以在上面的例子中,最好的對分點是提交 C.
5)請注意,為了加快算法,實施了一些快捷方式
因為我們從一開始就知道 N,我們知道 min(X,N - X)不能大 于N / 2。因此,在步驟2)和3)中,如果我們將 N / 2與提交相關聯(lián),那么我們知道這是最好的對分點。所以在這種情況下,我們可以停止處理任何其他提交并返回當前提交。
對于任何提交圖,您可以使用“git rev-list --bisect-all”查看與每次提交相關的編號。
例如,對于上面的圖,一個命令如:
$ git rev-list --bisect-all BAD --not GOOD1 GOOD2
會輸出如下內容:
e15b73ad3db9b48d7d1ade32f8cd23a751fe0ace (dist=3)15722f2fa328eaba97022898a305ffc8172db6b1 (dist=2)78e86cf3e850bd755bb71831f42e200626fbd1e0 (dist=2)a1939d9a142de972094af4dde9a544e577ddef0e (dist=2)070eab2303024706f2924822bfec8b9847e4ac1b (dist=1)a3864d4f32a3bf5ed177ddef598490a08760b70d (dist=1)a41baa717dd74f1180abf55e9341bc7a0bb9d556 (dist=1)9e622a6dad403b71c40979743bb9d5be17b16bd6 (dist=0)
首先讓我們定義“最佳平分點”。我們會說如果知道它的狀態(tài)(“好”或“壞”)提供了盡可能多的信息,那么提交X是最好的平分點還是最好的平分提交,無論提交的狀態(tài)是“好”還是“壞”。
這意味著最好的二等分提交是以下函數(shù)最大的提交:
f(X) = min(information_if_good(X), information_if_bad(X))
其中 information_if_good(X)是我們得到的信息,如果X 很好,而 information_if_bad(X)是我們在X不好的情況下得到的信息。
現(xiàn)在我們假設只有一個“第一次錯誤提交”。這意味著它的所有后代都是“壞的”,其他所有的提交都是“好的”。我們假設所有的提交都有好的或壞的概率,或者是第一個壞提交的概率,所以知道 c 提交的狀態(tài)總是提供相同數(shù)量的信息,無論這些 c 提交在圖上,以及任何 c 是。(所以我們假設這些提交例如在一個分支上或接近一個好的或不好的提交時不會提供更多或更少的信息)。
我們還假設在上面的二分算法中,我們有一個像步驟1)之后的清理圖。這意味著我們可以通過我們可以從圖中刪除的提交次數(shù)來衡量我們得到的信息。
讓我們在圖中提交一個 X 提交。
如果 X 被發(fā)現(xiàn)是“好的”,那么我們知道它的祖先都是“好的”,所以我們想說:
information_if_good(X) = number_of_ancestors(X) (TRUE)
這是真實的,因為在步驟1)b)我們刪除了“良好”提交的祖先。
如果發(fā)現(xiàn) X 是“壞的”,那么我們知道它的后代都是“壞的”,所以我們想說:
information_if_bad(X) = number_of_descendants(X) (WRONG)
但是這是錯誤的,因為在步驟1)a)我們只保留壞提交的祖先。所以當一個提交被標記為“不好”時,我們會得到更多的信息,因為我們也知道,先前的“壞”提交的祖先不是新的“壞”提交的祖先,它不是第一個錯誤提交。我們不知道它們是好還是壞,但我們知道它們不是第一個不好的提交,因為它們不是新的“壞”提交的祖先。
所以當一個提交被標記為“壞”時,我們知道我們可以刪除圖中的所有提交,除了那些是新的“壞”提交的祖先的提交。這意味著:
information_if_bad(X) = N - number_of_ancestors(X) (TRUE)
其中 N 是(清理)圖中的提交數(shù)。
所以最終這意味著要找到最佳的二等分提交,我們應該最大化該函數(shù):
f(X) = min(number_of_ancestors(X), N - number_of_ancestors(X))
這很好,因為在步驟2)我們計算 number_of_ancestors(X),所以在步驟3)我們計算 f(X)。
我們以下面的圖為例:
G-H-I-J / \ A-B-C-D-E-F O \ / K-L-M-N
如果我們計算下面的非最優(yōu)函數(shù):
g(X) = min(number_of_ancestors(X), number_of_descendants(X))
我們得到:
4 3 2 1 G-H-I-J1 2 3 4 5 6/ \0A-B-C-D-E-F O \ / K-L-M-N 4 3 2 1
但用 git bisect 使用的算法,我們得到:
7 7 6 5 G-H-I-J1 2 3 4 5 6/ \0A-B-C-D-E-F O \ / K-L-M-N 7 7 6 5
所以我們選擇 G,H,K 或 L 作為最好的二等分點,這比 F 好。因為如果例如L不好,那么我們不僅會知道 L,M 和 N 是壞的,而且 G,H ,I 和 J 不是第一個錯誤提交(因為我們假設只有一個第一個錯誤提交,它必須是 L 的祖先)。
所以目前的算法似乎是我們最初設想的最好的。
當一些提交被跳過時(使用“git bisect skip”),那么對于步驟1)到3),二分算法是相同的。但是,我們大致使用以下步驟:
6)通過減少關聯(lián)值來排序提交
7)如果第一次提交沒有被跳過,我們可以返回并停在這里
8)否則過濾掉排序列表中的所有跳過的提交
9)使用偽隨機數(shù)發(fā)生器(PRNG)產生一個介于0和1之間的隨機數(shù)
10)將此隨機數(shù)乘以其平方根以將其偏向0
11)將結果乘以過濾列表中的提交數(shù),以獲得該列表中的索引
12)在計算出的索引處返回提交
在步驟7)之后(在跳過算法中),我們可以檢查第二個提交是否被跳過,如果不是這樣,則返回它。實際上,這是我們在 Git 1.5.4版(2008年2月1日發(fā)布)中開發(fā)“git bisect skip”之前使用的算法,直到 Git 1.6.4版(2009年7月29日發(fā)布)為止。
但 Ingo Molnar和H. Peter Anvin(另一位著名的Linux內核開發(fā)人員)都抱怨說,有時候最好的二等分點都發(fā)生在所有提交都無法測試的區(qū)域。在這種情況下,用戶被要求測試許多不可測試的提交,這可能是非常低效的。
實際上,不可測試的提交通常是不可測試的,因為一次引入了破壞,并且只有在引入了其他許多提交之后,破壞才得以解決。
這種破壞當然大部分時間與我們試圖在提交圖中找到的破壞無關。但它阻止我們知道這個有趣的“不良行為”是否存在。
因此,接近無法測試的提交很有可能無法自行測試。最好的二等分提交也經(jīng)常在一起發(fā)現(xiàn)(由于二分算法)。
這就是為什么當?shù)谝粋€被跳過的時候選擇下一個最好的沒有加載的分叉提交是一個壞主意。
我們發(fā)現(xiàn)圖表上的大部分提交可能會在測試時提供相當多的信息。那些平均不會提供大量信息的提交是接近好的和不好的提交的提交。
因此,使用帶有偏見的 PRNG 承諾從好的和不好的提交看起來是一個不錯的選擇。
對這種算法的一個明顯的改進是在使用 PRNG 之前查找一個提交值,該提交值具有與最好的二等分提交之一相關的值,并且位于另一個分支上。因為如果存在這樣的提交,那么它不太可能也是不可測試的,所以它可能比幾乎隨機選擇的提供更多的信息。
在二分法算法中還有一個調整,在上面的“二分算法”中沒有描述過。
We supposed in the previous examples that the "good" commits were ancestors of the "bad" commit. But this is not a requirement of "git bisect".
Of course the "bad" commit cannot be an ancestor of a "good" commit, because the ancestors of the good commits are supposed to be "good". And all the "good" commits must be related to the bad commit. They cannot be on a branch that has no link with the branch of the "bad" commit. But it is possible for a good commit to be related to a bad commit and yet not be neither one of its ancestor nor one of its descendants.
For example, there can be a "main" branch, and a "dev" branch that was forked of the main branch at a commit named "D" like this:
A-B-C-D-E-F-G <--main \ H-I-J <--dev
The commit "D" is called a "merge base" for branch "main" and "dev" because it’s the best common ancestor for these branches for a merge.
Now let’s suppose that commit J is bad and commit G is good and that we apply the bisection algorithm like it has been previously described.
As described in step 1) b) of the bisection algorithm, we remove all the ancestors of the good commits because they are supposed to be good too.
So we would be left with only:
H-I-J
But what happens if the first bad commit is "B" and if it has been fixed in the "main" branch by commit "F"?
The result of such a bisection would be that we would find that H is the first bad commit, when in fact it’s B. So that would be wrong!
And yes it can happen in practice that people working on one branch are not aware that people working on another branch fixed a bug! It could also happen that F fixed more than one bug or that it is a revert of some big development effort that was not ready to be released.
In fact development teams often maintain both a development branch and a maintenance branch, and it would be quite easy for them if "git bisect" just worked when they want to bisect a regression on the development branch that is not on the maintenance branch. They should be able to start bisecting using:
$ git bisect start dev main
To enable that additional nice feature, when a bisection is started and when some good commits are not ancestors of the bad commit, we first compute the merge bases between the bad and the good commits and we chose these merge bases as the first commits that will be checked out and tested.
If it happens that one merge base is bad, then the bisection process is stopped with a message like:
The merge base BBBBBB is bad.This means the bug has been fixed between BBBBBB and [GGGGGG,...].
where BBBBBB is the sha1 hash of the bad merge base and GGGGGG,… is a comma separated list of the sha1 of the good commits.
If some of the merge bases are skipped, then the bisection process continues, but the following message is printed for each skipped merge base:
Warning: the merge base between BBBBBB and [GGGGGG,...] must be skipped.So we cannot be sure the first bad commit is between MMMMMM and BBBBBB.We continue anyway.
where BBBBBB is the sha1 hash of the bad commit, MMMMMM is the sha1 hash of the merge base that is skipped and GGGGGG,… is a comma separated list of the sha1 of the good commits.
So if there is no bad merge base, the bisection process continues as usual after this step.
If you both have a test suite and use git bisect, then it becomes less important to check that all tests pass after each commit. Though of course it is probably a good idea to have some checks to avoid breaking too many things because it could make bisecting other bugs more difficult.
You can focus your efforts to check at a few points (for example rc and beta releases) that all the T test cases pass for all the N configurations. And when some tests don’t pass you can use "git bisect" (or better "git bisect run"). So you should perform roughly:
c * N * T + b * M * log2(M) tests
where c is the number of rounds of test (so a small constant) and b is the ratio of bug per commit (hopefully a small constant too).
So of course it’s much better as it’s O(N * T) vs O(N * T * M) if you would test everything after each commit.
This means that test suites are good to prevent some bugs from being committed and they are also quite good to tell you that you have some bugs. But they are not so good to tell you where some bugs have been introduced. To tell you that efficiently, git bisect is needed.
The other nice thing with test suites, is that when you have one, you already know how to test for bad behavior. So you can use this knowledge to create a new test case for "git bisect" when it appears that there is a regression. So it will be easier to bisect the bug and fix it. And then you can add the test case you just created to your test suite.
So if you know how to create test cases and how to bisect, you will be subject to a virtuous circle:
more tests ? easier to create tests ? easier to bisect ? more tests
So test suites and "git bisect" are complementary tools that are very powerful and efficient when used together.
You can very easily automatically bisect broken builds using something like:
$ git bisect start BAD GOOD $ git bisect run make
例如:
$ git bisect run sh -c "make || exit 125; ./my_app | grep 'good output'"
另一方面,如果你經(jīng)常這樣做,那么可以使用腳本來避免太多的輸入。
以下是一個示例腳本,它由 Junio Hamano [4] 使用的真實世界腳本稍微修改而來。
可以將此腳本傳遞給“git bisect run”以查找引入性能回歸的提交:
#!/bin/sh # Build errors are not what I am interested in.make my_app || exit 255# We are checking if it stops in a reasonable amount of time, so # let it run in the background..../my_app >log 2>&1 &# ... and grab its process ID.pid=$!# ... and then wait for sufficiently long.sleep $NORMAL_TIME # ... and then see if the process is still there.if kill -0 $pid then # It is still running -- that is bad. kill $pid; sleep 1; kill $pid; exit 1else # It has already finished (the $pid process was no more), # and we are happy. exit 0fi
很明顯,不要在明知故障的情況下提交更改,即使其他一些提交后來修復了損壞。
使用任何 VCS 在每個提交中只有一個小的邏輯更改也是一個好主意。
The smaller the changes in your commit, the most effective "git bisect" will be. And you will probably need "git bisect" less in the first place, as small changes are easier to review even if they are only reviewed by the committer.
Another good idea is to have good commit messages. They can be very helpful to understand why some changes were made.
These general best practices are very helpful if you bisect often.
First merges by themselves can introduce some regressions even when the merge needs no source code conflict resolution. This is because a semantic change can happen in one branch while the other branch is not aware of it.
For example one branch can change the semantic of a function while the other branch add more calls to the same function.
This is made much worse if many files have to be fixed to resolve conflicts. That’s why such merges are called "evil merges". They can make regressions very difficult to track down. It can even be misleading to know the first bad commit if it happens to be such a merge, because people might think that the bug comes from bad conflict resolution when it comes from a semantic change in one branch.
Anyway "git rebase" can be used to linearize history. This can be used either to avoid merging in the first place. Or it can be used to bisect on a linear history instead of the non linear one, as this should give more information in case of a semantic change in one branch.
Merges can be also made simpler by using smaller branches or by using many topic branches instead of only long version related branches.
And testing can be done more often in special integration branches like linux-next for the linux kernel.
A special work-flow to process regressions can give great results.
Here is an example of a work-flow used by Andreas Ericsson:
write, in the test suite, a test script that exposes the regression
use "git bisect run" to find the commit that introduced it
fix the bug that is often made obvious by the previous step
commit both the fix and the test script (and if needed more tests)
And here is what Andreas said about this work-flow [5]:
To give some hard figures, we used to have an average report-to-fix cycle of 142.6 hours (according to our somewhat weird bug-tracker which just measures wall-clock time). Since we moved to Git, we’ve lowered that to 16.2 hours. Primarily because we can stay on top of the bug fixing now, and because everyone’s jockeying to get to fix bugs (we’re quite proud of how lazy we are to let Git find the bugs for us). Each new release results in ~40% fewer bugs (almost certainly due to how we now feel about writing tests).
Clearly this work-flow uses the virtuous circle between test suites and "git bisect". In fact it makes it the standard procedure to deal with regression.
In other messages Andreas says that they also use the "best practices" described above: small logical commits, topic branches, no evil merge,… These practices all improve the bisectability of the commit graph, by making it easier and more useful to bisect.
So a good work-flow should be designed around the above points. That is making bisecting easier, more useful and standard.
One nice about "git bisect" is that it is not only a developer tool. It can effectively be used by QA people or even end users (if they have access to the source code or if they can get access to all the builds).
There was a discussion at one point on the linux kernel mailing list of whether it was ok to always ask end user to bisect, and very good points were made to support the point of view that it is ok.
For example David Miller wrote [6]:
What people don’t get is that this is a situation where the "end node principle" applies. When you have limited resources (here: developers) you don’t push the bulk of the burden upon them. Instead you push things out to the resource you have a lot of, the end nodes (here: users), so that the situation actually scales.
This means that it is often "cheaper" if QA people or end users can do it.
What is interesting too is that end users that are reporting bugs (or QA people that reproduced a bug) have access to the environment where the bug happens. So they can often more easily reproduce a regression. And if they can bisect, then more information will be extracted from the environment where the bug happens, which means that it will be easier to understand and then fix the bug.
For open source projects it can be a good way to get more useful contributions from end users, and to introduce them to QA and development activities.
In some cases like for kernel development it can be worth developing complex scripts to be able to fully automate bisecting.
Here is what Ingo Molnar says about that [7]:
i have a fully automated bootup-hang bisection script. It is based on "git-bisect run". I run the script, it builds and boots kernels fully automatically, and when the bootup fails (the script notices that via the serial log, which it continuously watches - or via a timeout, if the system does not come up within 10 minutes it’s a "bad" kernel), the script raises my attention via a beep and i power cycle the test box. (yeah, i should make use of a managed power outlet to 100% automate it)
We have seen that test suites an git bisect are very powerful when used together. It can be even more powerful if you can combine them with other systems.
For example some test suites could be run automatically at night with some unusual (or even random) configurations. And if a regression is found by a test suite, then "git bisect" can be automatically launched, and its result can be emailed to the author of the first bad commit found by "git bisect", and perhaps other people too. And a new entry in the bug tracking system could be automatically created too.
We saw earlier that "git bisect skip" is now using a PRNG to try to avoid areas in the commit graph where commits are untestable. The problem is that sometimes the first bad commit will be in an untestable area.
To simplify the discussion we will suppose that the untestable area is a simple string of commits and that it was created by a breakage introduced by one commit (let’s call it BBC for bisect breaking commit) and later fixed by another one (let’s call it BFC for bisect fixing commit).
For example:
...-Y-BBC-X1-X2-X3-X4-X5-X6-BFC-Z-...
where we know that Y is good and BFC is bad, and where BBC and X1 to X6 are untestable.
In this case if you are bisecting manually, what you can do is create a special branch that starts just before the BBC. The first commit in this branch should be the BBC with the BFC squashed into it. And the other commits in the branch should be the commits between BBC and BFC rebased on the first commit of the branch and then the commit after BFC also rebased on.
For example:
(BBC+BFC)-X1'-X2'-X3'-X4'-X5'-X6'-Z' /...-Y-BBC-X1-X2-X3-X4-X5-X6-BFC-Z-...
where commits quoted with ' have been rebased.
You can easily create such a branch with Git using interactive rebase.
For example using:
$ git rebase -i Y Z
and then moving BFC after BBC and squashing it.
After that you can start bisecting as usual in the new branch and you should eventually find the first bad commit.
For example:
$ git bisect start Z' Y
If you are using "git bisect run", you can use the same manual fix up as above, and then start another "git bisect run" in the special branch. Or as the "git bisect" man page says, the script passed to "git bisect run" can apply a patch before it compiles and test the software [8]. The patch should turn a current untestable commits into a testable one. So the testing will result in "good" or "bad" and "git bisect" will be able to find the first bad commit. And the script should not forget to remove the patch once the testing is done before exiting from the script.
(Note that instead of a patch you can use "git cherry-pick BFC" to apply the fix, and in this case you should use "git reset --hard HEAD^" to revert the cherry-pick after testing and before returning from the script.)
But the above ways to work around untestable areas are a little bit clunky. Using special branches is nice because these branches can be shared by developers like usual branches, but the risk is that people will get many such branches. And it disrupts the normal "git bisect" work-flow. So, if you want to use "git bisect run" completely automatically, you have to add special code in your script to restart bisection in the special branches.
Anyway one can notice in the above special branch example that the Z' and Z commits should point to the same source code state (the same "tree" in git parlance). That’s because Z' result from applying the same changes as Z just in a slightly different order.
So if we could just "replace" Z by Z' when we bisect, then we would not need to add anything to a script. It would just work for anyone in the project sharing the special branches and the replacements.
With the example above that would give:
(BBC+BFC)-X1'-X2'-X3'-X4'-X5'-X6'-Z'-... /...-Y-BBC-X1-X2-X3-X4-X5-X6-BFC-Z
That’s why the "git replace" command was created. Technically it stores replacements "refs" in the "refs/replace/" hierarchy. These "refs" are like branches (that are stored in "refs/heads/") or tags (that are stored in "refs/tags"), and that means that they can automatically be shared like branches or tags among developers.
"git replace" is a very powerful mechanism. It can be used to fix commits in already released history, for example to change the commit message or the author. And it can also be used instead of git "grafts" to link a repository with another old repository.
In fact it’s this last feature that "sold" it to the Git community, so it is now in the "master" branch of Git’s Git repository and it should be released in Git 1.6.5 in October or November 2009.
One problem with "git replace" is that currently it stores all the replacements refs in "refs/replace/", but it would be perhaps better if the replacement refs that are useful only for bisecting would be in "refs/replace/bisect/". This way the replacement refs could be used only for bisecting, while other refs directly in "refs/replace/" would be used nearly all the time.
Another possible improvement to "git bisect" would be to optionally add some redundancy to the tests performed so that it would be more reliable when tracking sporadic bugs.
This has been requested by some kernel developers because some bugs called sporadic bugs do not appear in all the kernel builds because they are very dependent on the compiler output.
The idea is that every 3 test for example, "git bisect" could ask the user to test a commit that has already been found to be "good" or "bad" (because one of its descendants or one of its ancestors has been found to be "good" or "bad" respectively). If it happens that a commit has been previously incorrectly classified then the bisection can be aborted early, hopefully before too many mistakes have been made. Then the user will have to look at what happened and then restart the bisection using a fixed bisect log.
There is already a project called BBChop created by Ealdwulf Wuffinga on Github that does something like that using Bayesian Search Theory [9]:
BBChop is like
git bisect
(or equivalent), but works when your bug is intermittent. That is, it works in the presence of false negatives (when a version happens to work this time even though it contains the bug). It assumes that there are no false positives (in principle, the same approach would work, but adding it may be non-trivial).
But BBChop is independent of any VCS and it would be easier for Git users to have something integrated in Git.
We have seen that regressions are an important problem, and that "git bisect" has nice features that complement very well practices and other tools, especially test suites, that are generally used to fight regressions. But it might be needed to change some work-flows and (bad) habits to get the most out of it.
Some improvements to the algorithms inside "git bisect" are possible and some new features could help in some cases, but overall "git bisect" works already very well, is used a lot, and is already very useful. To back up that last claim, let’s give the final word to Ingo Molnar when he was asked by the author how much time does he think "git bisect" saves him when he uses it:
a
lot
. About ten years ago did i do my firstbisection
of a Linux patch queue. That was prior the Git (and even prior the BitKeeper) days. I literally days spent sorting out patches, creating what in essence were standalone commits that i guessed to be related to that bug. It was a tool of absolute last resort. I’d rather spend days looking at printk output than do a manualpatch bisection
. With Git bisect it’s a breeze: in the best case i can get a ~15 step kernel bisection done in 20-30 minutes, in an automated way. Even with manual help or when bisecting multiple, overlapping bugs, it’s rarely more than an hour. In fact it’s invaluable because there are bugs i would never eventry
to debug if it wasn’t for git bisect. In the past there were bug patterns that were immediately hopeless for me to debug - at best i could send the crash/bug signature to lkml and hope that someone else can think of something. And even if a bisection fails today it tells us something valuable about the bug: that it’s non-deterministic - timing or kernel image layout dependent. So git bisect is unconditional goodness - and feel free to quote that ;-)
Many thanks to Junio Hamano for his help in reviewing this paper, for reviewing the patches I sent to the Git mailing list, for discussing some ideas and helping me improve them, for improving "git bisect" a lot and for his awesome work in maintaining and developing Git.
Many thanks to Ingo Molnar for giving me very useful information that appears in this paper, for commenting on this paper, for his suggestions to improve "git bisect" and for evangelizing "git bisect" on the linux kernel mailing lists.
Many thanks to Linus Torvalds for inventing, developing and evangelizing "git bisect", Git and Linux.
Many thanks to the many other great people who helped one way or another when I worked on Git, especially to Andreas Ericsson, Johannes Schindelin, H. Peter Anvin, Daniel Barkalow, Bill Lear, John Hawley, Shawn O. Pierce, Jeff King, Sam Vilain, Jon Seymour.
Many thanks to the Linux-Kongress program committee for choosing the author to given a talk and for publishing this paper.