With respect to online conversations - most of them are on the open web, where anyone can see them. I don't care if their content gets out. Private conversations should be kept between their participants, their host, and their host's infrastructure provider.
More saliently however, many of these screenshots contain incidental data which I wouldn't necessarily want to be centralized off of my own hardware. This ranges from the identities of multiple alt-accounts, who they follow on social media, to generic information about my social graph. They also include receipts of much of my online transaction history.
While I'm under no delusion that much of that data doesn't travel all over the universe via data brokers and information sharing agreements, I'm just not comfortable directly handing it all to any one company.
If I were working on a commercial project, I'd leap at the opportunity to outsource the task of content transcription - it would save me time and money, and quite probably give me better results.
But since I want to feed it all into my personal archive, which runs on my own hardware and is as much a learning project as it is a utility, and since I like to keep my personal life as personal as possible, I make a point of keeping everything self-hosted wherever possible.
I'll fully admit that it's paranoid, labor-intensive, likely ineffectual, and by most measures a bit excessive.
But there are few places where one is at liberty to draw a line in the sand anymore with how their data is distributed. This is simply where I've chosen to draw one of mine.
Look, I fully agree with you if that is what you want and you are fully aware of the trade-off you're making.
If you pull this off, you are a very talented and skilled engineer. I hope you open-source your solution so friction is removed for other people facing a similar dilemma in the future.
Our time is the only currency we have, and we can pursue activities we love or fear. The line between paranoia and choosing personal freedom is thin and very personal.
I came to the conclusion that I have spent too much time on home-grown solutions for problems others have solved better and more cheaply. Getting from "it works 80% of the time" to "it works 99% of the time and I can blindly trust my infra" is the difference between a weekend and a year of full-time work.
I chose G Suite because at least Google offers me a paid option to exclude my account from their advertising data monetization branch.
I really do respect that you're making a deliberate effort here.