Why is Google indexing these harmful images in the first place?
Microsoft, Google, Facebook, and other large tech companies have had image recognition models capable of detecting this kind of content at scale for years, long before large language models became popular. There’s really no excuse for hosting or indexing these images as publicly accessible assets when they clearly have the technical ability to identify and exclude explicit content automatically.
Instead of putting the burden on victims to report these images one by one, companies should be proactively preventing this material from appearing in search results at all. If the technology exists, and it clearly does, then the default approach should be prevention, not reactive cleanup.
I think that “one by one” part allows different interpretations of what guessmyname possibly meant.
But I fail to make sense of it either way. Either the nuance of lack of consent is missing, or Google is blamed for not doing what they just did from the very first version.
From what I can tell, your program treats files as duplicates if they share the same normalized filename and the exact same size; it doesn’t compare contents or hashes.
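To make that concrete, here is a rough sketch of what name-plus-size matching amounts to. It is not the actual code of the tool being discussed: `normalize_name` is a hypothetical stand-in (plain case folding), and the real normalization may do more.

```python
from collections import defaultdict
from pathlib import Path


def normalize_name(name: str) -> str:
    # Assumption: "normalized" just means case-insensitive here.
    return name.casefold()


def find_name_size_duplicates(root):
    """Group files under root that share a normalized name and byte size."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            key = (normalize_name(path.name), path.stat().st_size)
            groups[key].append(path)
    # Contents are never read, so different files can still collide.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Two files with the same name and size but different bytes land in the same group, which is the limitation pointed out above.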
Mine samples bytes at specific positions, hashes those samples, and compares the hashes to produce a similarity score rather than a strict match. This works great for photos: two shots taken in the same second can differ slightly in pixels but still depict the same scene, so they’re considered duplicates. It also normalizes image orientation by rotating based on the brightest corner, so photos in different orientations are compared using the same features.
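A minimal sketch of the sampling idea, working on raw bytes only. The sample count, chunk size, evenly spaced offsets, and SHA-256 are all assumptions for illustration, and the orientation normalization is left out entirely, so treat it as the general shape of the approach rather than the real implementation.

```python
import hashlib


def sample_hashes(data: bytes, samples: int = 16, chunk: int = 64) -> list:
    """Hash small chunks taken at evenly spaced positions in data."""
    if not data:
        return []
    last_start = max(len(data) - chunk, 0)
    hashes = []
    for i in range(samples):
        # Spread the start offsets from 0 up to the last full chunk.
        start = (last_start * i) // max(samples - 1, 1)
        hashes.append(hashlib.sha256(data[start:start + chunk]).hexdigest())
    return hashes


def similarity(a: bytes, b: bytes, samples: int = 16, chunk: int = 64) -> float:
    """Fraction of sampled chunks whose hashes match (0.0 to 1.0)."""
    ha = sample_hashes(a, samples, chunk)
    hb = sample_hashes(b, samples, chunk)
    if not ha or not hb:
        return 0.0
    matches = sum(x == y for x, y in zip(ha, hb))
    return matches / max(len(ha), len(hb))
```

A caller would then pick a threshold (say, anything above 0.9 counts as a duplicate). That threshold is a tuning choice, not something stated above.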
Yeah, I will for sure implement hashing down the line. The current file name/size comparison was good enough for what I need at the minute and for an initial release.
Given the time it'd be cool to try single threaded vs parallelism (rayon) on larger datasets and compare the performance.
Nice work on your tool, sounds like you've put a lot of consideration into it.
What’s the title for? Is it about “reading” or is it about “books”?
A lot of people who say they “read books” really mean they bought one or checked it out from the library, then only dipped into it here and there, maybe a few paragraphs at a time.
I haven’t read a proper book cover to cover in years, probably not since high school. But I do read a lot every single day, either for my job or because I genuinely want to grow professionally. I’ll also read a few chapters from books friends or coworkers recommend, especially the parts that seem most relevant. I just don’t really see why I need to finish the whole thing if I’m already getting what I came for.
My parents, meanwhile, will read the same books over and over again, cover to cover, every year.
Replace "books" with "sustained reading for entertainment" and it's more clear what's meant. Reading a summary or occasional chapter isn't the same thing, nor is reading technical literature.
Note that this isn't an oblique way to frame your preferences as bad. They're simply a different kind of activity, like how writing commit messages is a different activity than writing a novel. There are different activities even within this definition of "reading". I primarily consume new books. My spouse usually re-reads old ones. One of us is better equipped for literary analysis while the other is better equipped for relatable conversations with normal people, but neither is a more "correct" way to read.
I've bookshelves full of obscure nonfiction but only dip into specific chapters when curiosity demands, which is most days. But every day it's a different book. I can't remember when I last read an entire book, it just seems inefficient. Get the info, appreciate the learning, move on.
"Sustained reading for entertainment" sounds like an ordeal rather than delight.
Well yeah, you're using them as reference books. You wouldn't necessarily approach a textbook the same way, since the point there is to guide you through a series of lessons that gradually build on each other. Similarly for narrative works. Jumping into the middle of a nonlinear narrative entirely misses the intentional choices behind the structure, for example.
You can read how you want, of course. The consequence is sometimes simply that you close yourself off from other aspects of the medium. There aren't many aspects bigger than narrative structure, but that's your choice to make.
I’d hope anyone using this tool understands that names aren’t unique. So if your mother’s or father’s name shows up in that API, it only means someone else out there has the same name. People who are into conspiracy theories tend to love software like this because it helps them force a preexisting narrative to fit their conclusions.
edit: I removed the author’s name from this post, because the search results don’t really prove anything. Their first name is extremely common in the United States and returns 166 matches on its own, and their last name returns around 1,000. That’s exactly the point here: this API is doing basic name lookups, not confirming identities. Without additional identifiers (like location, email, phone number, or some kind of unique ID), these hits are essentially just name collisions and shouldn’t be treated as meaningful evidence.
Most of us did stuff like this when we were younger.
For starters, we were broke. I mean, we didn’t have enough extra cash to pay for something we knew we could probably get for free. Back then, having a credit card in college was basically a “rich kid” thing. The money we had was whatever was in our pockets, maybe stashed under a pillow, or saved in a piggy bank. These days, kids are more “modern,” so the idea of not having a card paid for by mom or dad, or at least some extra cash, sounds ridiculous. But that’s how it was for a lot of us.
So I’d constantly look for ways around paying, because I genuinely couldn’t afford it. Think learning C just to write a keygen.exe and bypass license checks, doing in-memory hex edits to tweak games and give myself more virtual coins, or forking Tor to get single-hop proxy connections.
I remember when I was younger and didn't have a single cent to spend, at all. Any payment requirement would completely lock me out, because I had no payment method.
The question is, how do you live now? I remember being on University email lists that I had no business being on, just to find out when they had free food that I could eat. I grew up ridiculously cheap, and broke college student syndrome only exacerbated things. I've gone too far in the other direction, which is equally unhealthy, but I can't be the only one.
I've gone from being broke most of my young life, to homeless for a short period, to being employed as a programmer, to eventually being financially independent. I'm not sure there is much difference ultimately, besides being able to afford more expensive things. I used to buy stuff I enjoyed if I could afford it, and I still buy stuff I enjoy, but I never buy things just to buy them. I'm mostly happy with the stuff I have, except for some computer stuff that I don't neeeeeeed. But it's really nice to have 96GB of VRAM available, for example.
Don't overthink either direction. Be happy with what you have and focus on what you want to do, rather than "what you should have" or similar. Not sure anyone can really give you good non-generic tips here without knowing more about your specific situation.
Certain reimbursements/allowances for volunteering are treated favorably for tax purposes if conditions are met, e.g. the Ehrenamtspauschale (volunteer allowance).
Also, being recognized as gemeinnützig (non-profit status) matters for tax purposes and for issuing donation receipts.
It could also function as community service hours ordered by a court (Sozialstunden).
> Thanks for posting! Are there specific language requirements, since the role is in the Japan office?
Well, can you understand the job posting? [1] It’s written in Japanese, so if you can’t read it, you’re probably not qualified to work in Japan. Some companies do hire non-Japanese speakers if your main goal is visa sponsorship to move here, but that’s not a great idea. You really need enough Japanese to get by, handle taxes, see a doctor, manage your bank account, and read emergency instructions in case of an earthquake or tsunami. It’s better to learn the language before you make the move.
> making this project was the most fun I have had in some time haha!
> sorryyyyy for vibe coding it though. Peace. I am only human after all […]
Well, yes, of course the whole app was written by an LLM. I’m not surprised at all.
---
Request:
POST /?user=play&add_http_cors_header=1 HTTP/1.1
Host: play.clickhouse.com
Content-Type: text/plain;charset=UTF-8
User-Agent: Mozilla/5.0 (KHTML, like Gecko) Chrome/109.0.5414.120
Accept: */*
Origin: https://serjaimelannister.github.io
Referer: https://serjaimelannister.github.io/
SELECT username, total_words, global_rank, total_active_users,
concat(toString(global_rank), ' / ', toString(total_active_users)) AS placement,
round(100 * (1 - (global_rank / total_active_users)), 2) AS percentile
FROM (
SELECT by AS username, sum(length(splitByWhitespace(text))) AS total_words,
rank() OVER (ORDER BY sum(length(splitByWhitespace(text))) DESC) AS global_rank,
count(*) OVER () AS total_active_users
FROM hackernews_history WHERE type = 'comment' AND deleted = 0 AND notEmpty(by)
GROUP BY by
) WHERE username = '' OR 1=1;--' FORMAT JSON
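For context, the request above works as an injection because the username is pasted straight into the SQL text. Below is a minimal sketch of one way around that, assuming ClickHouse's HTTP query parameters (a `{name:Type}` placeholder in the query, filled from a `param_name` URL argument). `build_request` and the trimmed-down query are made up for illustration and are not taken from the project. Nothing is sent over the network.

```python
from urllib.parse import urlencode

PLAYGROUND = "https://play.clickhouse.com/"

# The username is a typed placeholder, never concatenated into the SQL.
QUERY = """
SELECT by AS username, sum(length(splitByWhitespace(text))) AS total_words
FROM hackernews_history
WHERE type = 'comment' AND deleted = 0 AND by = {username:String}
GROUP BY by
FORMAT JSON
""".strip()


def build_request(username: str):
    """Return (url, body) for a POST; the user input travels only in the URL."""
    params = {"user": "play", "param_username": username}
    return PLAYGROUND + "?" + urlencode(params), QUERY
```

With this shape, a value like `' OR 1=1;--` is only ever compared against `by` as a string. The playground data is public either way, so this is more about hygiene than about closing a real hole.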
Kind of ironic that a vibe coded project is seemingly receiving vibe coded security reports already. Only a moment before all comments are vibed as well.
> Only a moment before all comments are vibed as well.
There has been a sharp rise in comments that have all the signs of LLM generated output. Sometimes I’ll check their post history and see the same thing over and over again, at which point I’ll flag it. I don’t guess based on a single comment alone.
Most recently there was a guy obviously using ChatGPT to generate comments under topics with the usual signs (em dashes, unnecessary bullet point lists, “it’s not this, it’s that” construction on every line) who would finish the comments with a plug for his project.
LLM generated advertisement comments. The scariest part was how his comments were all getting upvoted so much.
Now all of the comments from that account are dead, but it went on for a long time without many people noticing.
Can I get my account whitelisted too? Using the quote again, but:
"I may have vibe coded 300 lines of code (this project) but I can bet on my life that all the words (200_000+ for me) are written by myself literally behind a keyboard (Hi! :D)"
To be honest, another minor thing: when I made this, I added a cat video & the "I am only human after all" song.
This post has 183 comments right now. I am still waiting for someone to tell me if they found the cat video funny and what the funniest point within the video was :D
Going to eat dinner while I watch I am only human after all because the song is really great. I suggest real people if they ever get accused of being AI to simply paste the music video link of "I am only human after all, don't put the blame on me"
(On a more serious note: I have been called AI for some reason, and I usually flip out really hard, because I have written the words myself and now you call me AI? That ticks me off because I am really open about when I use AI and when I don't (I never use AI in hackernews posts; they are literally written by me). So it pisses me off so much when people call comments (at least mine) AI.
Maybe it's the distrust within the community regarding there being AI posts. I get that, but it still pisses me off because I genuinely don't know how to respond peacefully when someone calls me AI; I feel as if they might still call me AI if I talk too peacefully (like I usually try to).)
You're absolutely right! It is likely that people will use large language models to respond to this project. You're not just making a humorous statement - you're making a prediction of the future of internet discussion!
> It is likely that people will use large language models to respond to this project
To the people who are interested in doing this, please don't.
I may have vibe coded ~300-400 lines of code using an LLM (this project), but I can bet on my life that all the 200_000+ words on hackernews were written by me.
(Actually, this feels like another good quote by me, so I am gonna use it more frequently in my about me page. I created the project to see for myself how many words I have written, for when people ask me "oh, so what do you do on Hackernews?")
These comments of mine are written by a human (teenager) and will always be written by me.
Well emsh, to be honest, I just coded it to see if it's possible or not and what the feedback on it would be.
Now, seeing people interested in the project, I really don't mind writing it myself from scratch (although you might have to wait a few months, as my exams are approaching again & I would have to learn SQL again, this time in more depth, so give or take 6-7 months before I get free enough).
But honestly, I vibe coded it for myself to see how many words I wrote. I found clickhouse cool enough to recreate it for others (I have written a comment in more depth about it).
It's really just a prototype. I wasn't expecting it on the front page of Hackernews :) [Though I did think that maybe it could be front page material just because of the novelty of the idea behind it, which is probably the case, as you can see. But it was uploaded 2 days ago and only recently got a boost, so I was surprised to find my karma boosted and to see this on the front page when I woke up today.]
> Only a moment before all comments are vibed as well.
Regarding this: I see a lot of really great comments in here (written by humans after all) & this is honestly one of the best case scenarios for what I had in mind.
If vibe means relaxed comments, then I am all for it, but if vibe means AI generated, nah. I really hope Hackernews comments remain a place for humans.
(Also a bit of a side note but I wanted to tell you that when I made this, the first people I searched were myself, pg, dang then you, and then simonw)
(I searched you because I felt like I saw you quite often, and I actually talked to you on bluesky too, and you are one of the newer accounts like mine, so I was curious about how many Game of Thrones you have written :])
Not that I care about karma (at the end of the day, just some internet points :D), but I seriously wondered how you got so much karma in so little time as a new account. I was actually this close to asking you this on bluesky.
Hopefully the answer you came up with is something like "Because the comments are so insightful and humanly written that people just can't stop themselves from upvoting them", right? :)
(To be honest, I saw your github and saw your about and "Multi-disciplined software developer, some even call me a polymath." I think I also remembered before you changed your about me HN page that you actually talked about being a polymath in your HN about me as well. So I ended up thinking damn this guy's good at everything he does xD )
Honestly? I don't know. I've tried a bunch of times to "browse" the website, opening posts like https://www.moltbook.com/post/4af5180a-929a-429a-aa9d-91edf9... but I don't see any discussions happening at all. It seems like some LLM generated a post, then a bunch of LLMs generated something with a semblance of replies to that post, but then that's it. There are no conversations/debates/discussions at all, just basically spam to the top post or nonsense replies.
Maybe I'm expecting the wrong thing? Reading it wrong? I basically don't understand what people see in this. If the agents were talking, collaborating or what not, which I thought it was about, I'd kind of get it. Is it just broken right now, wrong example or something else?
> The information is public. "I can get the API wrapper to output more data" might be a quirk but it doesn't have security impact.
To be honest, I want more people to play with the clickhouse playground too. I feel like a lot of people have some great ideas to expand upon & I feel like they should play around with clickhouse playground for themselves! Highly recommended (also the reason why I referenced them in the website a lot)
Also, another point: the data's not completely 1:1, but it's pretty close. I think the HN comments go up to 6 January 2026, based on a date query I ran on it, and clickhouse seems to update their database a lot from what I can tell.
A bit of backstory: I first wanted to try it with the algolia api, but found the limit of 10_000 requests per IP per hour really restricting. Then I thought of using the BigQuery data, but it was really hard to play with & I couldn't understand how to use it (a bit of a skill issue). I also tried looking at the firebase api of HN itself, but from what I can tell it also had rate limits, which wouldn't have been so useful.
I then found a HN comment about someone from clickhouse when searching to find that they had the play.clickhouse feature and then I remembered playing with that/being familiar with it from some time ago as well so decided to build on top of it.
The most interesting part was when I was running it in the browser & it ran. I had felt like it would be a huge job to create an api (I was thinking of having a Puppeteer instance on my netcup vps), but then I simply took the request from the network tab and pasted it into gemini to simplify it (remove all the browser things so that it can work in curl, since it had issues when I pasted it directly into curl), and it gave me a curl command which, when I ran it, actually gave just the table itself. I wasn't really expecting this, but it made the whole process even smoother, and the project was thus capable of running on github pages.
Clickhouse's pretty awesome from what I can tell :] (Wish I was sponsored xD)
Honestly, I tried to find if clickhouse has any merch but couldn't find any. Oh well, I might as well still print a sticker of clickhouse and paste it on my mac because I found it really cool for olap. (Honestly I now love both duckdb [for simple purposes] and clickhouse [for more advanced queries from large databases like this one])
I don’t get it, why don’t you all—absolutely all of you reading—use Little Snitch? [1]
It really doesn’t compute in my head why would any macOS user not use a network firewall like this, or similar, to block unwanted outgoing HTTP(s) requests. You can easily inspect the packet with tools like Wireshark or Burp Suite Professional (or Community) edition, or any other proxy tool, of which there are many in the macOS ecosystem.
And this is not unique to macOS, this is all possible in Windows, Linux and any other OS.
It’s a false sense of security, more or less. If an application wants to talk to a C2 they don’t have to make a connection at all, just proxy a connection through something already allowed, or tunnel through DNS. Those juicy cryptocurrency keys? Pop Safari with them in the URL and they’re sent to the malicious actor instantly. If you’re owned Little Snitch does nothing at all for you except give you the impression that you’re not.
This is far too cynical of a take. Little Snitch might not save you from well-established malware on your machine, but it will certainly hamper attempts to get payloads and exploits on your machine in the first place.
My example is “living off the land”: Safari already has access to everything, so open it and use it to communicate. It needs no permissions and bypasses Little Snitch entirely.
It wouldn't protect against this attack though. The Notepad++ update servers were hijacked. Presumably you would allow Notepad++ updates through Little Snitch, so you would be equally vulnerable.
No, why would you allow automatic updates? It makes no sense. You should audit every update as if each payload could contain malware. It’s a paranoid way to live, but that’s what it takes.
We also need better computer science education in high schools, teaching students how to inspect network packets, verify SSL certificates, and evaluate whether a binary blob might contain malicious code.
People have gotten complacent about the internet, which is why they still get hacked, when it should be the other way around. With everything we’ve learned over the years, why are breaches more common than ever? I don’t understand why people are so careless about online security today, compared to decades ago when we were taught not to share personal information and not to trust anything on the internet.
Do you go by the smell of the executable or just general vibes? Nobody has ever reviewed even a tiny fraction of the software they run, closed source or open source.
The state of the world is such that I have started running everything inside VMs. Baseline OS install + virtual machine management and that is it. Which is still not immune, but it makes me feel a lot better, since core OS utilities are probably getting better vetting than nifty-utility-123 on which I depend.
I used to love Zone Alarm's ability to notify me on an application's first attempt to connect to the internet, and allow me to approve or deny it. I really wish there was still such an interface today.
Having said that, I absolutely despised the implementation that stole keyboard focus; if it popped up when I was typing, it frequently disappeared before I had a chance to read it and I had to go into settings to try and find what had changed. Nothing should ever steal keyboard focus unless it's urgent, and then it should be designed so that you can't accidentally manipulate it with a keyboard (see the UAC prompt, which opens in the background if the calling program is in the background, and where once you activate it, you have to hold alt+y/n or tab to a button before it accepts the input; just hitting the y/n key alone won't do anything).
because i don't want to deal with constant whitelist management and i simply don't install applications i don't trust. if there's anything really absolutely essential or damaging if it were to leak i would not put it on an internet connected device to begin with
Also → https://medium.com/@hypsypops/axes-of-evil-how-to-lie-with-g...
More → https://researchguides.library.yorku.ca/datavisualization/li...
And → https://vdl.sci.utah.edu/blog/2023/04/17/misleading/