> Why not compute the file hash on your local machine before encryption, and check that hash against a master dupe list (hash, dupe_count) of all hashes from all users' pre-encrypted local files?
You could do this, but it would still be possible to determine which users have a copy of a particular file (or a piece of a file).
> Secondly, I cannot see how this requires there to be an index of users hashes. Surely one could store hashes with reference count, increment when a user adds, decrement when a user deletes. The user ID isn't necessary for a reference counter.
On the surface, this looks like it would undermine the claim I've just made. In practice, though, I think the owners could still be identified. For example, the Government could require the provider to wait and watch until a user downloads a file (or piece of a file) keyed by the hash of the piece whose owners need to be identified. Given that this is feasible, I don't see any point in implementing this measure, and not doing it would help to maintain data integrity.
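To make that concrete, here is a minimal sketch of the reference-counted scheme and of the "wait and watch" problem. Everything in it is hypothetical: `DedupStore`, `users_seen_fetching`, and the in-memory `access_log` are names I've made up for illustration, SHA-256 is just an assumed choice of hash, and the log stands in for whatever monitoring a provider could be compelled to switch on. It is not a description of how any real service works.

```python
import hashlib


class DedupStore:
    """Hypothetical dedup store: hash -> (data, refcount), with no owner index."""

    def __init__(self):
        self.blobs = {}       # content hash -> stored bytes
        self.refcount = {}    # content hash -> number of references
        self.access_log = []  # (user_id, hash) pairs observed at download time

    def add(self, data):
        # No user ID is recorded here; only the counter changes.
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return digest

    def delete(self, digest):
        if digest not in self.blobs:
            raise KeyError(digest)
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            # Last reference gone, so the blob itself can be dropped.
            del self.refcount[digest]
            del self.blobs[digest]

    def download(self, user_id, digest):
        # Requests are authenticated, so the server sees who asks for which
        # hash, even though nothing in the stored index ties the two together.
        self.access_log.append((user_id, digest))
        return self.blobs[digest]


def users_seen_fetching(store, target_digest):
    """Return every user observed downloading the piece with this hash."""
    return {user for user, digest in store.access_log if digest == target_digest}


if __name__ == "__main__":
    store = DedupStore()
    digest = store.add(b"some shared file")
    store.add(b"some shared file")  # second user: counter goes to 2, no new blob
    store.download("alice", digest)
    store.download("bob", digest)
    print(sorted(users_seen_fetching(store, digest)))  # prints ['alice', 'bob']
```

The stored index never holds a user ID, yet anyone who can observe downloads recovers the owners of a given hash anyway, which is the point of the paragraph above.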