Hacker News | noiszytech's comments

This is an awesome (and thorough) response and a great idea. I totally agree about the arms race and how AI + crowdsourced data could be applied to create much more realistic fake viewing patterns. I'm super-glad that this conversation is taking place.

But, wouldn't collecting viewing habits and then using AI to define (and emulate) real-looking behavior immediately put the developer(s) in that moral grey area that so many algorithms occupy? Technically it could be done, and it would be fascinating to work on, but we'd have to start with a huge browsing dataset (creepy) and then process it to figure out the patterns (exactly what this tool is trying to subvert), and then feed that back as output from within the user's browser (probably feeding back indistinguishable-but-AI-driven data and creating a loop). It's a murky space to wade into, and one that needs a lot more conversation.

Instead I decided to just keep it simple. The first page is chosen randomly from a list of (user-approved) sites. A link on that page is chosen randomly from the list of links that open in the same window and point to the same domain, and that's clicked. That's repeated a somewhat-random number of times, usually about 2-7 times, before a new site is chosen from the user-approved list, and the process starts over.
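For anyone who wants to see that loop spelled out, here is a rough Python sketch of the selection logic described above. It is not the plugin's actual code: the names `LinkCollector`, `same_window_same_domain_links`, and `random_walk` are made up for illustration, `fetch` is a hypothetical stand-in for "load the page and return its HTML", and treating a missing or `_self` target as "opens in the same window" is an assumption.

```python
import random
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects (href, target) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr = dict(attrs)
            if attr.get("href"):
                self.links.append((attr["href"], attr.get("target")))


def same_window_same_domain_links(html, page_url):
    """Return absolute URLs that stay on page_url's domain and open in the same window."""
    collector = LinkCollector()
    collector.feed(html)
    domain = urlparse(page_url).netloc
    result = []
    for href, target in collector.links:
        absolute = urljoin(page_url, href)
        # Assumption: no target (or "_self") means the link opens in the same window.
        opens_here = target in (None, "", "_self")
        if opens_here and urlparse(absolute).netloc == domain:
            result.append(absolute)
    return result


def random_walk(approved_sites, fetch, rng, min_clicks=2, max_clicks=7):
    """Pick a random approved site, then follow a random number of same-domain links."""
    url = rng.choice(approved_sites)
    visited = [url]
    for _ in range(rng.randint(min_clicks, max_clicks)):
        candidates = same_window_same_domain_links(fetch(url), url)
        if not candidates:
            break  # dead end: the caller starts over with a new site
        url = rng.choice(candidates)
        visited.append(url)
    return visited
```

Calling `random_walk` again once it returns gives the "new site, start over" step. Passing in `fetch` and `rng` keeps the sketch testable without touching the network.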

Check out Cathy O'Neil's definition of "Weapons of Math Destruction" (good overview of her book here: http://money.cnn.com/2016/09/06/technology/weapons-of-math-d...) - I'd love to hear your thoughts on that framework for determining the morality of algorithms.


Hey there - I like your project. Sounds like our tools are complementary, and you make an interesting point about the browser-vs-plugin based approach. I do think it's worthwhile to specifically browse news sites - I believe they're the worst "filter bubble" offenders right now, and this can help break that. The more efforts there are in this space, the better. Thanks for sharing your project!


Cool! I really think this is a "the more the merrier" space. More tools = good.


It does suck some bandwidth - and your point is well taken, that would be a great technique too. It loads about one new page per minute, so it shouldn't be a crazy amount (unless pages contain video, for example). The thing is, anything less than visiting from your own browser on your own machine and doing user-like things (like waiting and then clicking again) is easily filter-out-able for most tracking tools. The idea here is to purposefully associate the "noise" with your online footprint.


Hi - Noiszy creator here. You're right about the scripts - we're using GTM and Google Analytics. I think it's 100% legit for sites (like mine) to want to know how much traffic they're getting - this to me is a good use of data. I work in the analytics field and actually I love data - I just don't like AI using data in creepy ways. The thing is, there is a big moral grey area between "count visits" and "creepy targeting based on everything you do". My hope is that by disrupting the data, we can make the conversation happen about the right things to do with algorithms. I don't have the answers but I hope to be a part of the discussion.


> I think it's 100% legit for sites (like mine) to want to know how much traffic they're getting - this to me is a good use of data.

As others have pointed out, it's not about you getting the data, it's about the Analytics services you're using getting the data.

I'll also point out that if all you want is to know how much traffic you're getting, you can do it with far more ease by simply looking at your server logs. Why server-side analytics aren't used more I have no idea.
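As a rough illustration of the server-log approach, here is a minimal Python sketch that counts successful page requests and distinct client IPs per day. It assumes the log is in Common/Combined Log Format (the Apache and nginx default); the function name `summarize` and the choice to count only `GET` requests with 2xx status are assumptions for the example, not a standard.

```python
import re
from collections import Counter

# Assumes Common/Combined Log Format, e.g.
# 203.0.113.5 - - [10/Mar/2017:13:55:36 +0000] "GET /about HTTP/1.1" 200 2326
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def summarize(lines):
    """Return {day: (hits, distinct_ips)} for successful GET requests."""
    hits_per_day = Counter()
    ips_per_day = {}
    for line in lines:
        m = LOG_LINE.match(line)
        # Skip unparseable lines, non-GET requests, and non-2xx responses.
        if not m or m["method"] != "GET" or not m["status"].startswith("2"):
            continue
        hits_per_day[m["day"]] += 1
        ips_per_day.setdefault(m["day"], set()).add(m["ip"])
    return {day: (hits, len(ips_per_day[day])) for day, hits in hits_per_day.items()}
```

With a real log you would pass an open file handle, e.g. `summarize(open(path))`, where `path` is wherever your server writes its access log. Distinct IPs are only a crude proxy for visitors, and bots are not filtered out here.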


Your website also appears to be sending data to Facebook with App ID 314192535267336. This might be some Squarespace thing. It qualifies as "creepy" to me that you're tracking more than you even know, and it's connected back to my Facebook identity!


I get it. Everyone, everywhere, is collecting data - whether it's CNN.com or Noiszy.com or Google Analytics. While you may be using Google Analytics for basic tracking of anonymous visitors, Google may be using it for "creepy targeting based on everything you do".

I'm really not opinionated (I think the plugin is interesting and am completely ok with being tracked); I just found it a bit hypocritical to create a product against tracking while simultaneously providing tracking data to Google.


I think of it as a product to deter AI processing of your data, not a product against data itself. I, too, am ok with being tracked - but being ok with data collection =/= blanket approval of all data processing & use.

Thanks for raising this point.


Google Analytics most certainly does feed user data into "an AI" and use it to track users and sell ads targeting them. I'm not sure what you're getting at here.


Does Google Analytics' privacy policy even allow Google to do that?


They are certainly allowed to according to their TOS: https://www.google.com/analytics/terms/us.html

"Google and its wholly owned subsidiaries may retain and use, subject to the terms of its privacy policy (located at https://www.google.com/policies/privacy/), information collected in Your use of the Service"

Also read: https://www.google.com/policies/privacy/partners/


Check out Piwik (https://piwik.org/)! It'll do what you need but the data at least stays on your server.


It's not about your own intentions. You are using third party services which track and collect data for their own purposes.


The problem is, Noiszy's fake visits corrupt your own web analytics data and the analytics data of every other site on the web.

This is all or nothing. You cannot separate the "good" use of data from the "bad" use. It's all the same data. Only the usage is different.


You're right that it's the same data, but I think we absolutely must address how to separate the "good" use of data vs the "bad" use. Especially since, for most people, the data is already out there and there's no way to roll that back. I hope we're going to be talking a lot more about the ethics and morality of algorithm development.


Hi, Noiszy creator here. You're totally right - Firefox is next!

