
The purpose of running SHA is to determine whether the input can be trusted though. But it is true that relatively few files circulate at or above that size.



And even if they are, you probably aren't going to do a digest on the whole thing all at once.


IDK, I see a lot of "non-production" code that likes to read whole files into an in-memory buffer rather than bothering with streaming. With modern memory sizes it is entirely possible to read a 4GiB file into memory without issues.
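For contrast, a minimal sketch of the streaming approach: hash the file in fixed-size chunks so memory use stays constant regardless of file size. The chunk size and function name here are illustrative, not from any particular codebase.

```python
import hashlib

def sha256_streaming(path, chunk_size=1 << 20):
    """Digest a file in 1 MiB chunks instead of f.read()-ing it whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # The walrus operator keeps the loop tight: read until EOF.
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()
```

This produces the same digest as `hashlib.sha256(f.read()).hexdigest()`, just without materializing the whole file.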


Thinking about it, it's strange that hashlib doesn't have a function that accepts a file object as input, to make the efficient method the default pit of success. Naive implementations like hash(f.read()) are all too common. Such an API would also allow the entire tight loop to run outside of slow interpreted code.
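For what it's worth, Python 3.11 did add `hashlib.file_digest(fileobj, digest)`, which takes exactly such a file object. A rough back-port sketch for older versions might look like this (the buffer size and function name are illustrative):

```python
import hashlib

def file_digest(fileobj, algorithm="sha256", _bufsize=2**18):
    """Sketch of a hashlib.file_digest-style helper: digest a binary
    file object chunk by chunk, reusing one preallocated buffer."""
    h = hashlib.new(algorithm)
    buf = bytearray(_bufsize)
    view = memoryview(buf)
    while True:
        # readinto avoids allocating a fresh bytes object per chunk.
        n = fileobj.readinto(view)
        if not n:
            break
        h.update(view[:n])
    return h
```

Usage would be e.g. `file_digest(open(path, "rb")).hexdigest()`; anything with a `readinto` method (including `io.BytesIO`) works.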

My guess at the rationale: if you're reading a file, you usually also want the content for something else, not just its hash.



