Hacker News | guessmyname's comments


Why is Google indexing these harmful images in the first place?

Microsoft, Google, Facebook, and other large tech companies have had image recognition models capable of detecting this kind of content at scale for years, long before large language models became popular. There’s really no excuse for hosting or indexing these images as publicly accessible assets when they clearly have the technical ability to identify and exclude explicit content automatically.

Instead of putting the burden on victims to report these images one by one, companies should be proactively preventing this material from appearing in search results at all. If the technology exists, and it clearly does, then the default approach should be prevention, not reactive cleanup.


How is an image model supposed to detect if there was consent to share the picture?

If you're saying they shouldn't index any explicit images, you're talking about something very different from the article.


I think that “one by one” part allows different interpretations of what guessmyname possibly meant.

But I fail to make sense of it either way. Either the nuance of lack of consent is missing, or Google is blamed for not doing what they just did from the very first version.


They probably make money showing pork search results


That sounds haram.

Filthy pork addicts...

Oink oink

I built a CLI years ago for the same purpose.

From what I can tell, your program treats files as duplicates if they share the same normalized filename and the exact same size; it doesn’t compare contents or hashes.

Mine samples bytes at specific positions, hashes those samples, and compares the hashes to produce a similarity score rather than a strict match. This works great for photos: two shots taken in the same second can differ slightly in pixels but still depict the same scene, so they’re considered duplicates. It also normalizes image orientation by rotating based on the brightest corner, so photos in different orientations are compared using the same features.
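A minimal sketch of the sampled-hash idea, not the actual tool: the sample count, chunk size, evenly spaced offsets, and the "fraction of matching samples" score are all assumptions for illustration, and the orientation normalization step is left out.

```python
import hashlib

SAMPLES = 16   # assumed number of sample positions
CHUNK = 64     # assumed bytes read at each position


def sample_hashes(data: bytes, samples: int = SAMPLES, chunk: int = CHUNK) -> list[str]:
    """Hash small chunks taken at evenly spaced offsets in the data."""
    if not data:
        return []
    last_start = max(len(data) - chunk, 0)
    hashes = []
    for i in range(samples):
        # Spread the offsets across the whole input, first chunk to last.
        offset = last_start * i // max(samples - 1, 1)
        hashes.append(hashlib.sha1(data[offset:offset + chunk]).hexdigest())
    return hashes


def similarity(a: bytes, b: bytes) -> float:
    """Fraction of sample positions whose hashes agree (0.0 to 1.0)."""
    ha, hb = sample_hashes(a), sample_hashes(b)
    if not ha or not hb:
        return 0.0
    matches = sum(1 for x, y in zip(ha, hb) if x == y)
    return matches / max(len(ha), len(hb))
```

Two inputs would then count as duplicates above some threshold (say 0.9, again an assumption). Exact hashes only agree where the sampled bytes are identical, so a real tool would probably sample decoded or downscaled pixel data rather than raw file bytes; that part is a guess.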


Yeah, I will for sure implement hashing down the line. The current file name/size comparison was good enough for what I need at the minute and for an initial release.

Given the time, it'd be cool to try single-threaded vs. parallel (rayon) on larger datasets and compare the performance.

Nice work on your tool, sounds like you've put a lot of consideration into it.


What’s the title for? Is it about “reading” or is it about “books”?

A lot of people who say they “read books” really mean they bought one or checked it out from the library, then only dipped into it here and there, maybe a few paragraphs at a time.

I haven’t read a proper book cover to cover in years, probably not since high school. But I do read a lot every single day, either for my job or because I genuinely want to grow professionally. I’ll also read a few chapters from books friends or coworkers recommend, especially the parts that seem most relevant. I just don’t really see why I need to finish the whole thing if I’m already getting what I came for.

My parents, meanwhile, will read the same books over and over again, cover to cover, every year.


Replace "books" with "sustained reading for entertainment" and it's more clear what's meant. Reading a summary or occasional chapter isn't the same thing, nor is reading technical literature.

Note that this isn't an oblique way to frame your preferences as bad. They're simply a different kind of activity, like how writing commit messages is a different activity than writing a novel. There are different activities even within this definition of "reading". I primarily consume new books. My spouse usually re-reads old ones. One of us is better equipped for literary analysis while the other is better equipped for relatable conversations with normal people, but neither is a more "correct" way to read.


I've bookshelves full of obscure nonfiction but only dip into specific chapters when curiosity demands, which is most days. But every day it's a different book. I can't remember when I last read an entire book, it just seems inefficient. Get the info, appreciate the learning, move on.

"Sustained reading for entertainment" sounds like an ordeal rather than delight.


Well yeah, you're using them as reference books. You wouldn't necessarily approach a textbook the same way, since the point there is to guide you through a series of lessons that gradually build on each other. Similarly for narrative works. Jumping into the middle of a nonlinear narrative entirely misses the intentional choices behind the structure, for example.

You can read how you want, of course. The consequence is sometimes simply that you close yourself off from other aspects of the medium. There aren't many aspects bigger than narrative structure, but that's your choice to make.


The tool works by querying your contacts’ names against https://analytics.dugganusa.com/api/v1/search?q={NAME}&index...

I’d hope anyone using this tool understands that names aren’t unique. So if your mother’s or father’s name shows up in that API, it only means someone else out there has the same name. People who are into conspiracy theories tend to love software like this because it helps them force a preexisting narrative to fit their conclusions.

Search for “John Smith” → https://analytics.dugganusa.com/api/v1/search?q=John+Smith&i...

Now search for “LoremIpsumDolor” (no spaces) → https://analytics.dugganusa.com/api/v1/search?q=LoremIpsumDo...

And, amusingly, “••• •••” (the author’s name) appears 164 times → https://analytics.dugganusa.com/api/v1/search?q=Christopher+...

edit: I removed the author’s name from this post, because the search results don’t really prove anything. Their first name is extremely common in the United States and returns 166 matches on its own, and their last name returns around 1,000. That’s exactly the point here: this API is doing basic name lookups, not confirming identities. Without additional identifiers (like location, email, phone number, or some kind of unique ID), these hits are essentially just name collisions and shouldn’t be treated as meaningful evidence.


> edit: I removed the author’s name from this post

well, you didn't from the search query.


The script searches using quoted names, while these examples all search with unquoted names, which will match either the first or last name.

Searching for my name in quotes (https://analytics.dugganusa.com/api/v1/search?q=%22Christoph...) unsurprisingly results in zero hits.
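A toy model of the difference, assuming unquoted queries match any single term and quoted queries match the full phrase. The record list and both functions are hypothetical and say nothing about how the real API is implemented.

```python
def match_any_term(query: str, records: list[str]) -> list[str]:
    """Unquoted search: a record hits if it contains any query term."""
    terms = query.lower().split()
    return [r for r in records if any(t in r.lower().split() for t in terms)]


def match_phrase(query: str, records: list[str]) -> list[str]:
    """Quoted search: a record hits only if the whole phrase appears."""
    phrase = query.lower()
    return [r for r in records if phrase in r.lower()]


records = [
    "John Doe attended the meeting",
    "Letter from Jane Smith",
    "Flight log: John Smith",
]

print(len(match_any_term("John Smith", records)))  # 3
print(len(match_phrase("John Smith", records)))    # 1
```

With a common first or last name, the any-term count grows quickly while the phrase count stays small, which would be consistent with the hit counts discussed above.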


I call it “The Broke College Student Syndrome.”

Most of us did stuff like this when we were younger.

For starters, we were broke. I mean, we didn’t have enough extra cash to pay for something we knew we could probably get for free. Back then, having a credit card in college was basically a “rich kid” thing. The money we had was whatever was in our pockets, maybe stashed under a pillow, or saved in a piggy bank. These days, kids are more “modern,” so the idea of not having a card paid for by mom or dad, or at least some extra cash, sounds ridiculous. But that’s how it was for a lot of us.

So I’d constantly look for ways around paying, because I genuinely couldn’t afford it. Think learning C just to write a keygen.exe and bypass license checks, doing in-memory hex edits to tweak games and give myself more virtual coins, or forking Tor to get single-hop proxy connections.

Good ol’times.


I remember when I was younger and didn't have a single cent to spend, at all. Any payment requirement would completely lock me out, because I had no payment method.

The question is, how do you live now? I remember being on University email lists that I had no business being on, just to find out when they had free food that I could eat. I grew up ridiculously cheap, and broke college student syndrome only exacerbated things. I've gone too far in the other direction, which is equally unhealthy, but I can't be the only one.

I've gone from being broke most of my young life, to being homeless for a short period, to being employed as a programmer, to eventually being financially independent. I'm not sure there is much difference ultimately, besides being able to afford more expensive things. I used to buy stuff I enjoyed if I could afford it, and I still buy stuff I enjoy, but I never buy things just to buy them. I'm mostly happy with the stuff I have, except for some computer stuff that I don't neeeeeeed. But it's really nice to have 96GB of VRAM available, for example.

Don't overthink either direction; be happy with what you have and focus on what you want to do, rather than "what you should have" or similar. Not sure anyone can really give you good non-generic tips here without knowing more about your specific situation.


Certain reimbursements/allowances for volunteering are treated favorably for tax purposes if conditions are met, e.g. the Ehrenamtspauschale (volunteer allowance).

Also, status as gemeinnützig (recognized as charitable) matters for tax purposes and for issuing donation receipts.

It could also function as court-ordered community service hours (Sozialstunden).

Stuff like that.


> Thanks for posting! Are there specific language requirements, since the role is in the Japan office?

Well, can you understand the job posting? [1] It’s written in Japanese, so if you can’t read it, you’re probably not qualified to work in Japan. Some companies do hire non-Japanese speakers if your main goal is visa sponsorship to move here, but that’s not a great idea. You really need enough Japanese to get by, handle taxes, see a doctor, manage your bank account, and read emergency instructions in case of an earthquake or tsunami. It’s better to learn the language before you make the move.

[1] https://www.enthought.com/careers?gh_jid=4101680007


Nice SQLi vulnerability you got there ;-)

> making this project was the most fun I have had in some time haha!

> sorryyyyy for vibe coding it though. Peace. I am only human after all […]

Well, yes, of course the whole app was written by an LLM. I’m not surprised at all.

---

Request:

  POST /?user=play&add_http_cors_header=1 HTTP/1.1
  Host: play.clickhouse.com
  Content-Type: text/plain;charset=UTF-8
  User-Agent: Mozilla/5.0 (KHTML, like Gecko) Chrome/109.0.5414.120
  Accept: */*
  Origin: https://serjaimelannister.github.io
  Referer: https://serjaimelannister.github.io/
  
  SELECT username, total_words, global_rank, total_active_users,
  concat(toString(global_rank), ' / ', toString(total_active_users)) AS placement,
  round(100 * (1 - (global_rank / total_active_users)), 2) AS percentile
  FROM (
      SELECT by AS username, sum(length(splitByWhitespace(text))) AS total_words,
      rank() OVER (ORDER BY sum(length(splitByWhitespace(text))) DESC) AS global_rank,
      count(*) OVER () AS total_active_users
      FROM hackernews_history WHERE type = 'comment' AND deleted = 0 AND notEmpty(by)
      GROUP BY by
  ) WHERE username = '' OR 1=1;--' FORMAT JSON

Response:

  This message is too large to display

There's no vulnerability here.

This is a client-side GitHub Pages app. GitHub Pages doesn't do server-side SQL execution.

As your POST request shows, it's querying the hackernews_history table on Clickhouse Playground which is a big read-only demo environment.

The information is public. "I can get the API wrapper to output more data" might be a quirk but it doesn't have security impact.

https://play.clickhouse.com/play?user=play

https://clickhouse.com/docs/getting-started/playground

https://clickhouse.com/blog/announcing-the-new-sql-playgroun...
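
For what it's worth, if the page author wanted to keep the username out of the SQL text anyway, ClickHouse's HTTP interface supports parameterized queries: a `{name:Type}` placeholder in the query and a matching `param_name` URL parameter. A sketch that only builds the request (nothing is sent); the shortened query and the `build_request` helper are illustrative, not the app's code.

```python
from urllib.parse import urlencode

BASE_URL = "https://play.clickhouse.com/"

# The username is a typed placeholder instead of being spliced into the string.
QUERY = """
SELECT by AS username, count() AS comments
FROM hackernews_history
WHERE type = 'comment' AND deleted = 0 AND by = {username:String}
GROUP BY by
FORMAT JSON
"""


def build_request(username: str) -> tuple[str, bytes]:
    """Return (url, body) for a POST; the value travels as param_username."""
    params = {"user": "play", "param_username": username}
    return BASE_URL + "?" + urlencode(params), QUERY.encode("utf-8")


url, body = build_request("' OR 1=1;--")
print(url)
```

The input from the request above ends up URL-encoded in the query string and never touches the SQL body, so it can only ever be compared as a string value.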


Kind of ironic that a vibe coded project is seemingly receiving vibe coded security reports already. Only a moment before all comments are vibed as well.

> Only a moment before all comments are vibed as well.

There has been a sharp rise in comments that have all the signs of LLM-generated output. Sometimes I’ll check their post history and see the same thing over and over again, at which point I’ll flag it. I don’t guess based on a single comment alone.

Most recently there was a guy obviously using ChatGPT to generate comments under topics with the usual signs (em dashes, unnecessary bullet point lists, “it’s not this, it’s that” construction on every line) who would finish the comments with a plug for his project.

LLM generated advertisement comments. The scariest part was how his comments were all getting upvoted so much.

Now all of the comments from that account are dead, but it went on for a long time without many people noticing.


Can I get you to whitelist my account? I use emdashes a LOT, but I'm very human.

And how do we know you won't go and rent out your account to some AI once it's been white-listed (≖_≖ )

Can I get you to whitelist my account too? Using the quote again, but:

"I may have vibe coded 300 lines of code (this project) but I can bet on my life that all the words (200_000+ for me) are written by myself literally behind a keyboard (Hi! :D)"

To be honest, another minor thing: when I made this, I added a cat video and the "I am only human after all" song.

Now this post has 183 comments right now. I am still waiting for someone to tell me if they found the cat video funny and what was the funniest point within the video :D

Going to eat dinner while I watch I am only human after all because the song is really great. I suggest real people if they ever get accused of being AI to simply paste the music video link of "I am only human after all, don't put the blame on me"

(On a more serious note: I have been called AI for some reason, and I usually flip out really hard, because I wrote the words myself and now you call me AI? That ticks me off because I am really open about when I use AI and when I don't (and I never use AI in Hacker News posts; they are literally written by me). So it pisses me off so much when people call comments (at least mine) AI. Maybe it's the community's distrust about AI posts. I get that, but it still pisses me off, because I genuinely don't know how to respond peacefully when someone calls me AI: I feel as if they might still call me AI if I talk too peacefully (like I usually try to).)


You're absolutely right! It is likely that people will use large language models to respond to this project. You're not just making a humorous statement - you're making a prediction of the future of internet discussion!

> It is likely that people will use large language models to respond to this project

To the people who are interested in doing this, please don't.

I may have vibe coded ~300-400 lines of code using an LLM (this project), but I can bet on my life that all the 200_000+ words on Hacker News were written by me.

(Actually, that feels like another good quote by me, so I am gonna use it more frequently in my about me page. I actually created the project to see for myself how many words I have written, for when people ask me, "Oh, so what do you do on Hacker News?")

These comments of mine are written by a human (teenager) and will always be written by me.


Did you miss the satire?

Well emsh, to be honest, I just coded it to see if it's possible or not and what the feedback on it would be.

Now that people are interested in the project, I really don't mind writing it myself from scratch (although you might have to wait a few months, as my exams are approaching again and I would have to learn SQL again, this time in more depth, so give or take 6-7 months before I get free enough).

But honestly, I vibe coded it for myself to see how many words I wrote. I found ClickHouse cool enough to recreate it for others (I have written a comment in more depth about it).

It's really just a prototype. I wasn't expecting it on the front page of Hacker News :) [Though I did think it could be front-page material just because of the novelty of the idea, which is probably the case, as you can see. But it was uploaded 2 days ago and only recently got a boost, so I was surprised to find my karma boosted and to see this on the front page when I woke up today.]

> Only a moment before all comments are vibed as well.

Regarding this: I see a lot of really great comments in here (written by humans after all), and this is honestly one of the best-case scenarios for what I had in mind.

If vibe means relaxed comments, then I am all for it, but if vibe means AI generated, nah. I really hope that Hacker News comments remain a place for humans.

(Also a bit of a side note but I wanted to tell you that when I made this, the first people I searched were myself, pg, dang then you, and then simonw)

(I searched you because I felt like I saw you quite often, and actually talked to you on bluesky and everything too, and you are one of the more newer accounts like mine, so I was curious about how many Game of Thrones you have written :])

Have a nice day man!


> and you are one of the more newer accounts like mine

I'm not though, been on HN on-off since 2010 or something :) Just a new account.

You have a nice day too!


Ah! Makes sense now.

Not that I care about karma (at the end of the day it's some internet points :D), but I seriously wondered how you got so much karma in so little time as a new account. I was actually this close to asking you this on bluesky.

Got my answer now :D


Hopefully the answer you came up with is something like "Because the comments are so insightful and humanly written that people just can't stop themselves from upvoting them", right? :)

Right :)

(To be honest, I saw your github and saw your about and "Multi-disciplined software developer, some even call me a polymath." I think I also remembered before you changed your about me HN page that you actually talked about being a polymath in your HN about me as well. So I ended up thinking damn this guy's good at everything he does xD )

But I was also being Doakes and feeling the suspicion.

https://files.catbox.moe/xl75gu.jpg [When you know emsh might be an old HN user but you can't just prove it]


Like MoltBook? (or whatever it's called now)

Wasn't Moltbook developed for this? In the end, agents doing vibe coding between one another :-D

Honestly? I don't know. I've tried a bunch of times to "browse" the website, opening posts like https://www.moltbook.com/post/4af5180a-929a-429a-aa9d-91edf9... but I don't see any discussions happening at all. It seems like some LLM generated a post, then a bunch of LLMs generated something with a semblance of replies to that post, but then that's it: there are no conversations/debates/discussions at all, just basically spam to the top post or nonsense replies.

Maybe I'm expecting the wrong thing? Reading it wrong? I basically don't understand what people see in this. If the agents were talking, collaborating or what not, which I thought it was about, I'd kind of get it. Is it just broken right now, wrong example or something else?


Yes.

> The information is public. "I can get the API wrapper to output more data" might be a quirk but it doesn't have security impact.

To be honest, I want more people to play with the clickhouse playground too. I feel like a lot of people have some great ideas to expand upon & I feel like they should play around with clickhouse playground for themselves! Highly recommended (also the reason why I referenced them in the website a lot)

Another point: the data's not completely 1:1, but it's pretty close. When I ran a date query on it, I think the HN comments went up to 6 January 2026, and ClickHouse seems to update their database a lot from what I can tell.

A bit of backstory: I first wanted to try it with the Algolia API, but found the 10_000 requests per IP per hour really restricting. Then I thought of using the BigQuery data, but it was really hard to play with and I couldn't understand how to use it (a bit of a skill issue). I also tried looking at the Firebase API of HN itself, but from what I can tell it also has rate limits, which wouldn't have been so useful.

While searching, I then found an HN comment from someone at ClickHouse mentioning the play.clickhouse feature. I remembered playing with it and being familiar with it from some time ago, so I decided to build on top of it.

The most interesting part came once the query was running in the browser. I felt like it would be a huge job to create an API (I was thinking of having a Puppeteer instance on my netcup VPS), but then I simply took the request from the network tab and pasted it into Gemini to simplify it (remove all the browser things so that it can work in curl, since pasting it directly into curl had issues). It gave me a curl command which, when I ran it, returned just the table itself. I wasn't really expecting this, but it made the whole process even smoother, and it meant the whole thing could run on GitHub Pages.
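
The stripped-down request is small enough to show. A sketch in Python rather than curl, reusing the endpoint and `user=play` parameter from the request quoted earlier in the thread; the sample query and helper names are made up, and nothing is assumed about the response beyond it being JSON.

```python
import json
import urllib.request
from urllib.parse import urlencode

ENDPOINT = "https://play.clickhouse.com/"


def build_request(sql: str) -> urllib.request.Request:
    """Build the bare POST: SQL as the body, no browser-only headers."""
    url = ENDPOINT + "?" + urlencode({"user": "play"})
    return urllib.request.Request(
        url,
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain;charset=UTF-8"},
        method="POST",
    )


def run_query(sql: str) -> dict:
    """Send the query and decode the JSON reply (requires network access)."""
    with urllib.request.urlopen(build_request(sql), timeout=30) as resp:
        return json.load(resp)


req = build_request("SELECT count() FROM hackernews_history FORMAT JSON")
print(req.get_method(), req.full_url)
```

Calling `run_query(...)` with the same SQL would perform the actual request; it is kept separate here so the request can be inspected without touching the network.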

Clickhouse's pretty awesome from what I can tell :] (Wish I was sponsored xD)

Honestly, I tried to find out if ClickHouse has any merch but couldn't find any. Oh well, I might as well still print a ClickHouse sticker and paste it on my Mac, because I found it really cool for OLAP. (Honestly, I now love both DuckDB [for simple purposes] and ClickHouse [for more advanced queries on large databases like this one].)


I don’t get it, why don’t you all—absolutely all of you reading—use Little Snitch? [1]

It really doesn’t compute in my head why any macOS user would not use a network firewall like this, or similar, to block unwanted outgoing HTTP(S) requests. You can easily inspect the packets with tools like Wireshark or Burp Suite (Professional or Community edition), or any other proxy tool, of which there are many in the macOS ecosystem.

And this is not unique to macOS, this is all possible in Windows, Linux and any other OS.

[1] https://www.obdev.at/products/littlesnitch/index.html


It’s a false sense of security, more or less. If an application wants to talk to a C2, it doesn’t have to make a connection at all; it can just proxy a connection through something already allowed, or tunnel through DNS. Those juicy cryptocurrency keys? Pop Safari with them in the URL and they’re sent to the malicious actor instantly. If you’re owned, Little Snitch does nothing at all for you except give you the impression that you’re not.


Especially in this case where the attackers could've proxied you to their malicious servers through npp's good/trusted servers


This is far too cynical of a take. Little Snitch might not save you from well-established malware on your machine, but it will certainly hamper attempts to get payloads and exploits on your machine in the first place.

I find it difficult to believe that there are levels of cooperation between different companies that would allow this to work.

Source: I have worked for a company for longer than the internet has been alive.


My example is “living off the land”: Safari already has access to everything, so open it and use it to communicate. It needs no permissions and bypasses Little Snitch entirely.


Ah. I was thinking of non-web apps.


You have worked for the same company for >55 years? That's wild. Can you share the industry?


IBM, although I consider the internet and ARPANET different things.

Like saying PSTN and fiber are different things.


That's at the very least harder and less likely; security is not all or nothing.


It wouldn't protect against this attack though. The Notepad++ update servers were hijacked. Presumably you would allow Notepad++ updates through Little Snitch, so you would be equally vulnerable.


No, you wouldn't allow updates with Notepad++.

No, why would you allow automatic updates? It makes no sense. You should audit every update as if each payload could contain malware. It’s a paranoid way to live, but that’s what it takes.

We also need better computer science education in high schools, teaching students how to inspect network packets, verify SSL certificates, and evaluate whether a binary blob might contain malicious code.

People have gotten complacent about the internet, which is why they still get hacked, when it should be the other way around. With everything we’ve learned over the years, why are breaches more common than ever? I don’t understand why people are so careless about online security today, compared to decades ago when we were taught not to share personal information and not to trust anything on the internet.
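
One small, concrete piece of that auditing is checking a download against a digest published somewhere other than the download server. A sketch with a hypothetical file path and expected digest as inputs; it shows that the bytes match what was published, not that the published build is benign.

```python
import hashlib
import hmac


def sha256_of(path: str, block_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large installers fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare against a digest obtained out of band (hypothetical input)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

If the digest is fetched from the same server as the installer, a hijacked server can simply serve a matching pair, so where the expected value comes from matters more than the comparison itself.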


Do you go by the smell of the executable or just general vibes? Nobody has ever reviewed even a tiny fraction of the software they run, closed source or open source.


So you only run software on an operating system and on hardware that you have personally vetted each line of code for?


Tell me about your auditing workflow and procedures.


You don't understand because you compare a mythical view of the past with the current reality


Isn't Little Snitch exactly the sort of application they're worried about?


Zing!

The state of the world is such that I have started running everything inside VMs: a baseline OS install plus virtual machine management, and that is it. That is still not immune, but it makes me feel a lot better, since core OS utilities are probably getting better vetting than the nifty-utility-123 on which I depend.


Qubes OS?


No, poor man's Qubes with manually assembled VMs. I keep meaning to take the plunge, but have been too lazy to rebuild my system.

I used to love Zone Alarm's ability to notify me on an application's first attempt to connect to the internet, and allow me to approve or deny it. I really wish there was still such an interface today.

Having said that, I absolutely despised the implementation that stole keyboard focus; if it popped up when I was typing, it frequently disappeared before I had a chance to read it, and I had to go into settings to try and find what had changed. Nothing should ever steal keyboard focus unless it's urgent, and then it should be built so that you can't accidentally manipulate it with a keyboard (see the UAC prompt, which opens in the background if the calling program is in the background, and where, once you activate it, you have to hold alt+y/n or tab to a button before it accepts the input; just hitting the y/n key alone won't do anything).


If an application wants to talk to AWS, how am I supposed to know if it's legit or not?


If it began doing it after an update, you know that it's better to check if it's supposed to do it


because i don't want to deal with constant whitelist management and i simply don't install applications i don't trust. if there's anything really absolutely essential or damaging if it were to leak, i would not put it on an internet-connected device to begin with


Now you have to worry about Little snitch not "snitching" on all your traffic.


There’s an open source alternative: https://objective-see.org/products/lulu.html

