
Would love to see how bfs compares to fdir[0] for directory traversal. Even though fdir is using Node.js underneath, the comparisons I have done with fd & find are pretty close. Of course, bfs would probably be quite a bit faster...but how much faster exactly?

Disclaimer: I am the developer of fdir.

[0] https://github.com/thecodrr/fdir



From a quick test, about 5x faster:

    tavianator@tachyon $ cat bench.mjs
    #!/usr/bin/env node

    import { fdir } from "fdir";

    console.log(new fdir().withFullPaths().crawl("/home/tavianator/code/android").sync().length);
    tavianator@tachyon $ hyperfine -w1 "node ./bench.mjs" "bfs ~/code/android -false"
    Benchmark 1: node ./bench.mjs
      Time (mean ± σ):      2.073 s ±  0.031 s    [User: 1.372 s, System: 1.260 s]
      Range (min … max):    2.022 s …  2.128 s    10 runs

    Benchmark 2: bfs ~/code/android -false
      Time (mean ± σ):     417.5 ms ±   6.8 ms    [User: 592.3 ms, System: 2487.7 ms]
      Range (min … max):   405.2 ms … 429.7 ms    10 runs

    Summary
      bfs ~/code/android -false ran
        4.97 ± 0.11 times faster than node ./bench.mjs

Is there a way to call it that doesn't require holding all the paths in memory simultaneously?


You are using fdir synchronously. Try this:

    console.log(await new fdir().onlyCounts().crawl("/home/tavianator/code/android").withPromise());
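
To illustrate the general idea of counting without keeping every path in memory, here is a minimal sketch in plain Node.js. It is not how fdir is implemented; the function name `countEntries` is made up for this example. It reads one directory entry at a time with `fs.opendirSync` and `Dir.readSync`, so the only thing it holds on to is the stack of directories still waiting to be visited.

```javascript
// Plain Node.js sketch (NOT fdir's implementation): count entries under a
// directory while keeping only a stack of pending directories in memory.
const fs = require("node:fs");
const path = require("node:path");

// `countEntries` is a hypothetical helper name used only for illustration.
function countEntries(root) {
  let files = 0;
  let directories = 0;
  const pending = [root]; // directories still to visit

  while (pending.length > 0) {
    const current = pending.pop();
    let dir;
    try {
      dir = fs.opendirSync(current);
    } catch {
      continue; // unreadable or vanished directory: skip it
    }
    try {
      let entry;
      // readSync() returns one Dirent at a time, or null when exhausted.
      while ((entry = dir.readSync()) !== null) {
        if (entry.isDirectory()) {
          directories++;
          pending.push(path.join(current, entry.name));
        } else {
          files++; // regular files, symlinks, sockets, and so on
        }
      }
    } finally {
      dir.closeSync();
    }
  }
  return { files, directories };
}
```

Calling `countEntries("/some/dir")` returns an object with `files` and `directories` counts. The root itself is not counted, and symlinks are counted as files rather than followed, which also avoids loops.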


I tried it on my end. Built bfs from source using `make release`:

    $ hyperfine -w1 "NODE_ENV=production node ./fdir.mjs" "./bin/bfs /home/thecodrr/ -false"

    Benchmark 1: NODE_ENV=production node ./fdir.mjs
      Time (mean ± σ):     965.5 ms ±  53.0 ms    [User: 703.0 ms, System: 1220.5 ms]
      Range (min … max):   858.4 ms … 1041.3 ms    10 runs

    Benchmark 2: ./bin/bfs /home/thecodrr/ -false
      Time (mean ± σ):      1.530 s ±  0.127 s    [User: 0.341 s, System: 2.282 s]
      Range (min … max):    1.401 s …  1.808 s    10 runs

    Summary
      'NODE_ENV=production node ./fdir.mjs' ran
        1.58 ± 0.16 times faster than './bin/bfs /home/thecodrr/ -false'

    $ cat fdir.mjs
    #!/usr/bin/env node

    import { fdir } from "fdir";

    console.log(await new fdir().onlyCounts().crawl("/home/thecodrr").withPromise());

For some reason, reducing UV_THREADPOOL_SIZE to 2 gives the best result on my machine (I have heard the opposite about macOS):

    $ hyperfine -w1 "UV_THREADPOOL_SIZE=2 NODE_ENV=production node ./fdir.mjs" "NODE_ENV=production node ./fdir.mjs" "./bin/bfs /home/thecodrr/ -false"
    Benchmark 1: UV_THREADPOOL_SIZE=2 NODE_ENV=production node ./fdir.mjs
      Time (mean ± σ):     355.8 ms ±  16.1 ms    [User: 479.4 ms, System: 356.0 ms]
      Range (min … max):   328.3 ms … 387.5 ms    10 runs

    Benchmark 2: NODE_ENV=production node ./fdir.mjs
      Time (mean ± σ):     935.4 ms ±  52.7 ms    [User: 695.8 ms, System: 1176.5 ms]
      Range (min … max):   850.6 ms … 1031.9 ms    10 runs

    Benchmark 3: ./bin/bfs /home/thecodrr/ -false
      Time (mean ± σ):      1.534 s ±  0.104 s    [User: 0.353 s, System: 2.307 s]
      Range (min … max):    1.428 s …  1.773 s    10 runs

    Summary
      'UV_THREADPOOL_SIZE=2 NODE_ENV=production node ./fdir.mjs' ran
        2.63 ± 0.19 times faster than 'NODE_ENV=production node ./fdir.mjs'
        4.31 ± 0.35 times faster than './bin/bfs /home/thecodrr/ -false'

Another factor to take into account is that I ran all of this on a WSL instance, which may or may not affect performance. However, since both programs are running on WSL, the comparison between them should still be fair.
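
For anyone who wants to try other pool sizes, here is a rough sketch of a helper that launches a Node script with a chosen `UV_THREADPOOL_SIZE` and reports wall-clock time. The helper name `runWithPoolSize` and the list of sizes in the usage comment are assumptions for illustration. The variable has to be set in the environment of the child process because libuv reads it when the pool is first created (the default size is 4). A single timed run is noisy, so this is only for a quick look; hyperfine remains the better tool for real measurements.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: run `node <nodeArgs...>` with the given libuv
// thread pool size and return the elapsed wall-clock time in milliseconds.
function runWithPoolSize(size, nodeArgs) {
  const start = process.hrtime.bigint();
  const result = spawnSync(process.execPath, nodeArgs, {
    // Copy the current environment and override only the pool size.
    env: { ...process.env, UV_THREADPOOL_SIZE: String(size) },
    encoding: "utf8",
  });
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { size, elapsedMs, status: result.status, stdout: result.stdout };
}

// Example sweep (the sizes are arbitrary picks):
// for (const size of [1, 2, 4, 8]) {
//   const { elapsedMs } = runWithPoolSize(size, ["./fdir.mjs"]);
//   console.log(`UV_THREADPOOL_SIZE=${size}: ${elapsedMs.toFixed(1)} ms`);
// }
```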


At a glance: bfs gets through 7.6 million files in under 2.xx seconds, while fdir handles 1 million in 1 second. Per file, that makes bfs roughly 2-3 times faster than fdir.
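
The per-file comparison above can be worked through as a throughput ratio. This is only a sketch: the exact bfs time is not given, so it assumes "2.xx seconds" means somewhere between 2 and 3 seconds and computes both ends of that range. The function names are made up for the example.

```javascript
// Entries processed per second.
function throughput(entries, seconds) {
  return entries / seconds;
}

// Ratio of bfs throughput to fdir throughput for a given bfs run time.
function bfsSpeedup(bfsSeconds) {
  const fdirRate = throughput(1_000_000, 1); // fdir: 1 million files in 1 second
  const bfsRate = throughput(7_600_000, bfsSeconds); // bfs: 7.6 million files
  return bfsRate / fdirRate;
}

// ASSUMPTION: the bfs run took between 2 and 3 seconds.
console.log(bfsSpeedup(3).toFixed(2)); // slower end of the assumed range
console.log(bfsSpeedup(2).toFixed(2)); // faster end of the assumed range
```

Under that assumption the ratio lands between about 2.5 and 3.8, which is in the same ballpark as the estimate above. Note that the two runs covered different directory trees on different machines, so this compares rates, not a like-for-like benchmark.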



