At the time I was building scc and quicly discovered one of the main bottlenecks in the application was walking the file tree finding files to process. This resulted in me exploring other file walk implementations for Go in order to remove the bottleneck.
The conclusion I came to was to use github.com/karrick/godirwalk which was much faster than any of the other implementations at the time. That conclusion was written in 2018 however, and there have been multiple Go releases since then including performance changes related to files https://golang.org/doc/go1.16#os As such it seemed like a good time to revisit the tests and see if anything has changed.
All code for the tests can be found on Kablamo's GitHub. The tests were written to recursively count the number of files and directories from where the application was run and print the total count at the end. The output for each application should be identical, and this was confirmed for all tests run. Because some walk implementations use goroutines atomic.AddInt64 operations were used when counting for all implementations in order to keep the comparison as close as posible.
The first test was run against a shallow checkout of the linux kernel commit d999ade1cc86cd2951d41c11ea769cb4452c8811 with the excellent hyperfine command line tool used to run the tests. All code was compiled using Go 1.17 and run on a MacBook Air 2020 M1 with 16 GB of RAM.
bash1# what version of Go compiled everything?23$ go version4go version go1.17 darwin/arm6456# verify the output of each application is the same78$ filepathwalk && filepathwalkdir && iafan && karrick && michealtjones && readdir978498107849811784981278498137849814784981516# run the benchmark using hyperfine1718$ hyperfine 'filepathwalk' 'filepathwalkdir' 'iafan' 'karrick' 'michealtjones' 'readdir'19Benchmark #1: filepathwalk20Time (mean ± σ): 236.4 ms ± 3.3 ms [User: 54.7 ms, System: 186.7 ms]21Range (min … max): 232.9 ms … 246.0 ms 12 runs2223Benchmark #2: filepathwalkdir24Time (mean ± σ): 141.0 ms ± 1.3 ms [User: 42.5 ms, System: 101.1 ms]25Range (min … max): 139.2 ms … 145.9 ms 20 runs2627Benchmark #3: iafan28Time (mean ± σ): 93.5 ms ± 3.4 ms [User: 117.0 ms, System: 526.8 ms]29Range (min … max): 91.4 ms … 110.0 ms 26 runs3031Benchmark #4: karrick32Time (mean ± σ): 263.4 ms ± 1.5 ms [User: 57.0 ms, System: 213.5 ms]33Range (min … max): 260.9 ms … 266.4 ms 11 runs3435Benchmark #5: michealtjones36Time (mean ± σ): 91.8 ms ± 3.8 ms [User: 106.7 ms, System: 549.3 ms]37Range (min … max): 90.0 ms … 109.6 ms 26 runs3839Benchmark #6: readdir40Time (mean ± σ): 117.8 ms ± 2.8 ms [User: 22.5 ms, System: 92.9 ms]41Range (min … max): 112.1 ms … 124.3 ms 23 runs4243Summary44'michealtjones' ran451.02 ± 0.06 times faster than 'iafan'461.28 ± 0.06 times faster than 'readdir'471.54 ± 0.07 times faster than 'filepathwalkdir'482.58 ± 0.11 times faster than 'filepathwalk'492.87 ± 0.12 times faster than 'karrick'50
The results show that the parallel implementations now edge out all other implementations. This is the opposite of what I experienced in 2018.
Interestingly what was the fastest implementaton in 2018 github.com/karrick/godirwalk is now the slowest. It also has the biggest difference in API implementaion and as such should probably not be used anymore. Pleasingly filepath.Walk has improved a lot since I last tried this, and is now fast enough that it should be acceptable for most use cases.
For the remaining implementations the margin of difference is not as large as you would expect when compared to the new ReadDir implementation added in Go 1.16.
Wondering if the results might be due to not having enough data to work with, I tried the implementations against all my projects. With just shy of 180,000 files in different directories, it's about 5.2 GB of content.
bash1# whats the size of the content we are about to look over23$ du -hs45.2G .56# independently verify how many files we should be scanning78$ find . | wc -l91797091011# verify the output of each application is the same1213$ filepathwalk && filepathwalkdir && iafan && karrick && michealtjones && readdir1417970915179709161797091717970918179709191797092021# run the benchmark using hyperfine2223$ hyperfine 'filepathwalk' 'filepathwalkdir' 'iafan' 'karrick' 'michealtjones' 'readdir'24Benchmark #1: filepathwalk25Time (mean ± σ): 530.7 ms ± 7.9 ms [User: 150.7 ms, System: 398.9 ms]26Range (min … max): 525.2 ms … 550.5 ms 10 runs2728Benchmark #2: filepathwalkdir29Time (mean ± σ): 346.6 ms ± 21.5 ms [User: 133.6 ms, System: 230.4 ms]30Range (min … max): 329.2 ms … 401.8 ms 10 runs3132Benchmark #3: iafan33Time (mean ± σ): 303.7 ms ± 47.7 ms [User: 290.7 ms, System: 1136.0 ms]34Range (min … max): 268.3 ms … 435.5 ms 10 runs3536Benchmark #4: karrick37Time (mean ± σ): 1.750 s ± 0.017 s [User: 339.9 ms, System: 1434.1 ms]38Range (min … max): 1.730 s … 1.784 s 10 runs3940Benchmark #5: michealtjones41Time (mean ± σ): 276.3 ms ± 18.3 ms [User: 271.4 ms, System: 1229.3 ms]42Range (min … max): 263.3 ms … 308.7 ms 10 runs4344Benchmark #6: readdir45Time (mean ± σ): 268.7 ms ± 1.7 ms [User: 60.1 ms, System: 219.6 ms]46Range (min … max): 266.6 ms … 271.6 ms 11 runs4748Summary49'readdir' ran501.03 ± 0.07 times faster than 'michealtjones'511.13 ± 0.18 times faster than 'iafan'521.29 ± 0.08 times faster than 'filepathwalkdir'531.97 ± 0.03 times faster than 'filepathwalk'546.51 ± 0.08 times faster than 'karrick'55
This shows that the lower overhead promised in Go 1.16 is indeed the case. The new implementation is the fastest here, and that's without appling goroutines which could yield more performance for those willing to do so.
In 2021 the Go native file walk implementations beat any library. It's clear that filepath.Walk has improved enough that its no slower than any other similar implementaion I tried, and the new ReadDir implementation is as fast or faster than any other implementation I tried even without Go routines. It's wonderful to see such baseline performance improvements applied to Go, and it seems even a simple recompile with the latest Go version will give some impressive disk performance improvements.