
In my experience it's not that better-quality results are found, but rather that low-quality results are skipped. There's a lot filtered by default + it's easy to click "block this domain" when you run into yet another stackoverflow copy. It means that when you're searching for code related things, you often get small relevant blogs in position 3+ rather than SEO spam.

For example searching for "current time in JavaScript" on Google, I get SO, MDN, and basically a lot of SEO spam sites. The same search on Kagi https://kagi.com/search?q=current+time+in+JavaScript&r=au&sh... ends with an actually interesting blog on position 5, link to moment.js on GH, further down posts about accuracy and about the Temporal API proposal, etc.



> it's easy to click "block this domain"

Friendly reminder that google had this feature a decade ago then removed it. Hopefully someone in the C-Suite got a few back pats for that decision.

https://searchengineland.com/google-brings-back-blocking-sit...


Google's gotta make money off Pinterest somehow, based on the most blocked domains on Kagi:

https://kagi.com/stats

Pinterest is a scourge on the modern web, worse than any other.


Might these stats be biased by the demographics that use Kagi? I don't use Pinterest, and never really encounter it on a daily basis, but some of my friends really like the site. Tbf, they might use stackoverflow more, but seeing Hacker News in the top 5 doesn't seem to reflect the average web user's usage...


I used pinterest and still hate(d) when it popped up on search.

Without too much conjecture I think the problem is that search-related web crawlers and users have very different experiences with pinterest. To the web crawler, the information is there and easily accessible. To a user, it might be behind a login, or part of an image description, or a sidenote, or whatever. The page doesn't exactly load with what you are looking for front and center.

Additionally, I don't use it through a browser, I use it in an app, so I'm not logged in on my browser.


I only recently discovered pinterest as a useful site, but only because a friend convinced me to create an account. It only becomes usable with an account, and it's even fun, but most people are probably like I was: they don't want to create an account and are annoyed by pinterest hijacking the "save image" function, redirecting you when you don't have an account and nagging you with a login wall. It really only becomes great if you give in (only took me 5 years or so).


There was a time when Google image search was dominated by Pinterest results but clicking through never took you to the photo. And most of the photos were rehosted copies of the original, so you wouldn't get that either.


I wonder if it is also biased in the sense that it's only certain people who customise these things. And that it's only the most common ones. You would never see my favourite car forum listed here, but bumping it in search results is where I see value with the feature.


> Might these stats be biased by the demographics that use Kagi?

yes, obviously.

but also: Selection Bias Is A Fact Of Life

https://www.astralcodexten.com/p/selection-bias-is-a-fact-of...


Do your friends who use it use Google to use it? Or do they go to the site itself.

I don’t use the site itself but the irritating thing is a first page full of useless Pinterest links when searching for something.


Of course they are biased. Just look at how NYTimes is both blocked and upvoted on Kagi. Some people like a source and others don’t, it’s that simple really.


For me, I ignore NYT because I'm not a subscriber and it's annoying to always hit the paywall.


It's worth checking your library for a digital pass to the NY times and other papers.


It's telling that the top sites on that list accurately map to companies that have been the most "successful" at blitzing the incentive structures of the current internet economics model.


I don't understand why people hate Pinterest in image search results. If the image is relevant to my search then I don't care who's hosting it. Can somebody explain to me what the problem is here? https://0x0.st/HO57.webm

(I don't have a pinterest account, if that makes a difference.)


I don't like pinterest because the images have no metadata. If I see something I'm interested in--like a piece of furniture--I have no way of knowing how to get more info on it.

Same thing with those displays that rotate earthporn with no info on the location. So annoying to see spectacular things and not know what they are.


It is indeed, hence “https://www.startpage.com/do/search?q=%s -site:quora.com -site:pinterest.com” as my default search syntax.
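
For anyone who'd rather generate that syntax than type it, here's a minimal Python sketch. It assumes the Startpage URL template from the comment above (with `%s` as the query placeholder); the function name `build_search_url` and the sample query are made up for illustration.

```python
from urllib.parse import quote_plus

# URL template copied from the comment above; %s marks where the query goes.
TEMPLATE = "https://www.startpage.com/do/search?q=%s"


def build_search_url(query, blocked_domains):
    """Append one -site: operator per blocked domain, then URL-encode the query."""
    exclusions = " ".join(f"-site:{domain}" for domain in blocked_domains)
    full_query = f"{query} {exclusions}".strip()
    return TEMPLATE.replace("%s", quote_plus(full_query))


url = build_search_url("mid century sideboard", ["quora.com", "pinterest.com"])
print(url)
```

Excluding more domains is just a longer list, though whether the engine treats a long chain of `-site:` operators the same as a plain query is a separate question.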


I have a strong memory that Quora used to be quite good, and then was suddenly horrendous. Is this just me?


Quora was excellent for SEO, so brands and companies started using Quora answers to pump their own products and services. It's completely useless now.


I wouldn't say that's the problem I've observed. around 2015 or so I remember consistently useful answers from Quora. it was Yahoo answers but better because people could justify with qualifications. I actually had an account briefly

then I'm not exactly sure when, but definitely by 2018, every answer I ever see on Google is either incorrect, answering the wrong question, or in broken English. commonly all three at once

I don't recall ever seeing a product placement, although admittedly I stopped clicking a long time ago

my educated guess is that due to some likely seo-related concern they deleted/archived/unlisted their old good quality answers in favour of newer more seo-friendly answers

it may not even be their fault, it's possible that Google's algorithm just doesn't drag up older answers from their website, but given my experience with the decision-making in the brief time I was a user there (e.g. removing the ability to add a description to questions) I suspect not


Not just you, it used to have solely high quality Q&A...then the yahoo answers folks migrated over


if this was the main cause, the expected result would be lots of highly-upvoted bad answers, as opposed to lots of scarcely-upvoted bad answers that somehow rank highly in Google searches


They started monetizing it by paying people to write answers, so it's now full of blogspam.


Quora used to be good but then they had to try to make revenue


Quora has plenty of potential for making consistent profit without whoring itself out, but consistent profit isn't enough in the post-Friedman world that we live in


I had to quit doing so, because I discovered that it didn't just exclude listed domains, but performed a totally different search. Locations or local results were largely missing, when I excluded some domains.


That's curious, please expand on this. Do you really mean it performed a search for a different thing? If so, have you figured out how it differed?

Or did it just have to perform a non-cached search and thus not only excluded said domains, but could also reorder the other results based on the current relevance, rather than cached relevance from the past that is being served to everyone else who doesn't exclude said domains?


When I search for “pizza hut”, I get: the info panel that shows Wikipedia intro and company social media profiles on the right, locations results and integrated map view as the second result. When I search for “pizza hut -site:pinterest.com”, I get none of those. In addition, results are listed in a different order.

PS: I'm blocking all ads.


Pinterest uses different domains for every country so you'd need to add around a dozen qualifiers to get the big ones excluded.


same here


I was wondering what would be the impact of being in the front page of HN, it's awesome that their stats are open: almost 2k new members!


I’d argue that Google is worse than Pinterest. We got here due to Google.


To be fair, google DID provide quality at scale in the past. I'm not sure the same can be said about Pinterest.


Due to linkrot, a lot of images that stood on independent websites, are now only available on Pinterest, which scraped and cached them before the original site went dark. Clicking through to the original links these days often leads to 404s.


The uBlacklist browser plugin adds support for blocking sites in Google searches.

https://iorate.github.io/ublacklist/docs


This looks really good, thanks. Works on other engines too. Goodbye Quora results.


will check it out, thanks!


This seems the most obvious value add feature for search results at both an individual level and for reviewing overall moderation.

I wonder what possible logic there could be to not allow it? The only one I can think of is that they don't want brigading to create a wider system block, but that seems easy enough to resolve.


Eventually someone was going to create an easy to list/share/subscribe list that individuals could easily add to their personal Google domain block list. Think EasyList.

At that point they would be bleeding ad revenue as all the nasty, fake, abusive, spammy websites would be insta blocked.

Imagine being able to add a list and all of a sudden half the SEO blogs are excluded from results. Assuming Google even allows it, they would then have to work even harder to find relevant content to your search query. They can't rely on throwing a huge wall of semi-relevant results that you have to wade through, generating ad impressions as you go along.
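
To make the idea concrete, here's a rough Python sketch of what applying such a shared list to a page of results could look like. This is not how Google, Kagi or uBlacklist implement it; the function names and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit


def is_blocked(url, blocklist):
    """True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blocklist)


def filter_results(urls, blocklist):
    """Drop every result whose host matches the blocklist, keeping the order."""
    return [url for url in urls if not is_blocked(url, blocklist)]


# Hypothetical result page and a one-entry shared blocklist.
results = [
    "https://www.pinterest.com/pin/1",
    "https://stackoverflow.com/q/1",
    "https://notpinterest.com/x",
]
kept = filter_results(results, {"pinterest.com"})
print(kept)
```

The check against `"." + domain` is there so that `www.pinterest.com` is caught while an unrelated host that merely ends in the same letters is not.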


The easiest is to look where the money is. Might be risky to let your custo^H products hide the pages where you have most revenue generating ads.


Counterpoint: That feature has very little utility to all but a tiny fraction of users. Those users can readily find other means (e.g. extensions) to achieve the same thing. In the interest of simplicity, it was the right call to remove this. I imagine it was pitched for its ability to gather feedback on search quality, but the type of people using the feature aren't representative.


> when you run into yet another stackoverflow copy

OMG. Why doesn't Google filter out the likes of geeksforgeeks for instance? How is it possible that it always comes before the genuine SO answer?

Even without offering the possibility to filter out a domain (which they had, and later removed), how does the ranking algorithm not see those horrible, zero value clones??


Misaligned incentives (in corporate terms, $$$).

I can't tell you what they are, but there are probably internal Google incentives to filter and internal Google incentives to not filter, and the ones to not filter are probably stronger.


My theory is that google went from ads in search results to ads on visited pages. By buying doubleclick etc they are suddenly incentivised to drive traffic to ad-supported websites.

Almost all the interesting factual websites are not ad-monetized. The SO spam etc are all scrapes of the factual websites with ads injected. If google simply deprioritized ad-supported websites the search results would be much cleaner, but the part of google that sells the ads on sites instead of in search results would throw a fit.


We could test this. Take a few hundred search queries, strip the pages that display Google ads, and see if the remainder of the search result is better or worse.

We'd need to get some humans in to rank the results, but that's not a big problem. "How well does this web page answer this query, on a scale of 1-10?"

With a collection of ranked pages, we can answer other questions as well. I'd be interested in running the same test but for google analytics, not google ads, as I think there might be a misaligned incentive there too.

It's worth bearing in mind that the stackoverflow clones may actually answer the query just as well as the original site - that is, it might be our definition of "a good result" that's out of whack (because we have an unnecessary bias towards the original source). I doubt this, but again it's something that's testable.
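
A minimal sketch of the comparison, assuming the human ratings are already collected as `(query, has_google_ads, rating)` rows. The data below is invented purely to show the shape of the calculation, and `compare_stripped` is a hypothetical name.

```python
from collections import defaultdict
from statistics import mean


def compare_stripped(rows):
    """Average per-query rating of the full result list vs. the list with ad pages removed."""
    by_query = defaultdict(list)
    for query, has_ads, rating in rows:
        by_query[query].append((has_ads, rating))

    full, stripped = [], []
    for results in by_query.values():
        kept = [rating for has_ads, rating in results if not has_ads]
        if not kept:
            # Every result for this query carried ads, so there is nothing to compare.
            continue
        full.append(mean(rating for _, rating in results))
        stripped.append(mean(kept))
    return mean(full), mean(stripped)


# Invented ratings on the 1-10 scale described above.
rows = [
    ("q1", True, 3), ("q1", False, 8), ("q1", True, 4),
    ("q2", False, 7), ("q2", True, 5),
]
full_mean, stripped_mean = compare_stripped(rows)
print(full_mean, stripped_mean)
```

With real data you'd want a few hundred queries and some significance test on the per-query differences rather than just two averages.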


Google searches are ranked by humans, it’s a contractor job


I don't doubt it, but obviously something's going wrong between the human-generated training data and the SERP, else why are we getting utter crap back?

(Or, as I said, it's our idea of what constitutes a good result that's wrong).


Aha. This makes a lot of sense for Google.

But the same websites show up in e.g. DDG (through Bing). As far as I know, neither DDG nor Microsoft makes a dime from ad-supported websites the way Google would, so why are these results not nuked similarly to what Kagi is doing?


Aha. Couldn't help but scratch my own itch. I wonder if DDG has a deal with Google where they get a cut of the ad profit if they are mentioned as a `ref` in the doubleclick ad request.

:path: /pagead/viewthroughconversion/796001856/?random=1695374589838&cv=11&fst=1695374589838&bg=ffffff&guid=ON&async=1&gtm=45be39k0&u_w=2704&u_h=1756&url=https%3A%2F%2Fwww.geeksforgeeks.org%2Fc-plus-plus%2F&ref=https%3A%2F%2Fduckduckgo.com%2F&hn=www.googleadservices.com&frm=0&tiba=C%2B%2B%20Programming%20Language%20-%20GeeksforGeeks&auid=68284397.1695374483&data=event%3Dgtag.config&rfmt=3&fmt=4

(Note the `ref=https%3A%2F%2Fduckduckgo.com%2F` parameter: what does this do?)

Hence providing the same incentives to keep shitty sites like geeksforgeeks in the results.

I guess also geeksforgeeks is incentivized to report these references, so that search engines and other linking services will continue to show their links.

To reproduce: 1. Go to duckduckgo.com and do a search that will turn up a geeksforgeeks website 2. click on the link 3. watch the network tab as requests are made to googleads.g.doubleclick.net and check the path.
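
If you copy the path out of the network tab, a few lines of Python will pull out the interesting parameters. This is only a sketch: the path below is a shortened version of the one quoted above, `referrer_info` is a made-up helper, and all it shows is what the browser sends to the ad endpoint, not whether any revenue sharing exists.

```python
from urllib.parse import parse_qs, urlsplit

# Shortened copy of the request path quoted above (most parameters omitted).
path = (
    "/pagead/viewthroughconversion/796001856/"
    "?url=https%3A%2F%2Fwww.geeksforgeeks.org%2Fc-plus-plus%2F"
    "&ref=https%3A%2F%2Fduckduckgo.com%2F"
    "&hn=www.googleadservices.com"
)


def referrer_info(request_path):
    """Return the decoded url, ref and hn query parameters (None if absent)."""
    params = parse_qs(urlsplit(request_path).query)
    # parse_qs maps each key to a list of values; keep the first one.
    return {key: params.get(key, [None])[0] for key in ("url", "ref", "hn")}


info = referrer_info(path)
print(info)
```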


Most other search engines train with a target of Google or with some form of reward which is bootstrapped on Google rankings. It makes Bing results implicitly have the same behavior as Google. DDG and others just use the Bing API, so Google's incentives pass on through.


That doesn't make much sense to me. Google's interests are not Microsoft's or DDG's interests, and to hold up Google as some sort of ground truth for the optimal search results for a given query is, as proven by Kagi, highly deluded and also quite subjective.

If true however, it does go to show that Google is really a monopolist in the search space as well... and to substantiate this claim would go a long way into proving that.


These sites exist precisely _because_ of their expertise in the toxic race-to-the-bottom SEO/SEM game that Google created.


What I don’t get is how many people are looking for stackoverflow answers while a) not aware of SO copycats and b) not running adblockers


Adblockers are not a defense against this, as those results are genuine search results.

I run uBlock origin (of course), am extremely aware that geeksforgeeks exists and is utter shit, and yet I get fooled now and again, which makes me very angry at that website, Google, myself, and the world in general...


But to make money those sites have to show ads


If I ran a seal-clubbing business I'd have to club seals to make money. The whole argument is that those sites don't exist to provide a good service yet sadly need to show ads to keep the lights on.


I’m just wondering to whom those ads get shown… not arguing that anyone should turn off their adblocker and keep them running

They are working hard to trick people into clicking on their links, but won’t most people who click those links be running an ad blocker? Are unsophisticated web users searching for questions answered on stack overflow?


This is my experience as well. Low quality junk is often not present, and if it does show up, it's two mouse clicks to never see that domain again.

Also the ability to promote high quality domains helps even more with this (though I have found one needs to be careful with pinning domains, as it can lead to irrelevant results being shown first because they have some of the same keywords).


Feels like I'm paying to do someone else's job.


Well, your alternative is that the job doesn't get done, for free.


> Well, your alternative is that the job doesn't get done, for free.

You can do it with uBlacklist [1]. See also [2].

[1]: https://chrome.google.com/webstore/detail/ublacklist/pncfbmi...

[2]: https://github.com/rjaus/awesome-ublacklist


I didn't know about that, thanks! Excellent.


> yet another stackoverflow copy

I never got why these even ever appear in Google search results (or any search results, really). It feels like it would be super trivial to identify sites that are scraped copies of other sites. Granted, without foreknowledge, the engine doesn't know which is the original. But at the very least this can be determined by a human once, and then the problem goes away forever for that particular site.
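
As a rough illustration of how cheap the basic check is, here's a Python sketch that compares two pages by word shingles and Jaccard similarity. It's a textbook near-duplicate technique, not a claim about what any search engine actually runs, and the sample strings are made up.

```python
import re


def shingles(text, k=3):
    """Set of all k-word sequences in the text, lowercased and stripped of punctuation."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Share of k-word shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "Use new Date() to get the current date and time in JavaScript."
scraped = "use new Date() to get the current date and time in javascript"
unrelated = "Pinterest results dominate image search for furniture queries"

print(jaccard(original, scraped))
print(jaccard(original, unrelated))
```

A score near 1.0 flags a likely copy. Working out which side is the original would still need something extra, such as first-crawled dates or the one-time human decision suggested above.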


Maybe that scraped copy leverages doubleclick so its success is aligned with Google interests, sometimes even more than the original website.


That ship has already sailed, they are already using AI in mass to generate original looking content.


Trying to find any guides for anything in Baldur's Gate 3 returns page upon page of AI generated garbage, a sure sign of things to come.


Funny that you mention this game. bg3.wiki, the community wiki, had a lot of trouble with SEO. It got ignored or pushed down in the search results for a very long time, while the awful Fextralife wiki that includes a Twitch view botting iframe on every page was always first.


Sadly even the Fextralife wiki (garbage that it is) is better than most of the other results and that's still drowned out by the AI spam.


At this point it's just safer to treat any content newer than about a year ago as highly suspect. Bots and fake content have been around for years, but things changed when ChatGPT and the copycats went live.



"En masse", which is French for "in mass".


The blue ribbon chef was said to be the cream of the cream, so the restaurant owner was happy for him to have white card over the place. He arranged an outside the work of fatty liver, a main course of rooster of wine with eat all, and as the blow of mercy: burned cream; the full menu was a feat of strength! He made sure to wish the diners good appetite. However, when the owner visited from her foot on the ground she turned into a terrible child and demanded mouth amusers and crescents. She hated the decorative objects of art made of chewed paper.

(When we steal from French, we don't translate it to English, it becomes English).


Well, there are loanwords, and there are calques.


it's kind of upsetting that the first to benefit from LLMs are the scum of the internet.


It's Google's fault. They are the ones who make this a viable business model. They pay for the ads, and they pollute their search results with this garbage.

100% Google who are destroying this part of the internet.


Google gave and google took


The love of money


How is the search engine not able to tell which is the original? Originals always exist earlier, not to mention that SO.com's domain rank is way higher than those spam sites that have existed for fewer years.


Even if it wasn't easy to detect SO rip-offs, surely Google engineers see them all the time when they perform searches.


Is this after you've done a lot of blocking (or other customisation)? For me the top Kagi results are mostly similar to the Google ones, and when I scroll down a bit Kagi doesn't save me from articles with openings like

> Time is an important part of our life and we cannot avoid it. In our daily routine, we need to know the current date or time frequently.

and

> Time, a measure of the passing of moments.


Not too much. I've got maybe 5 straight up SO clones blocked, but that's about it.


> “searching … on Google, I get SO, MDN and basically a lot of SEO spam sites”

I get the same in Kagi clicking your link above.

Both the 1st and 3rd results are SO. The 2nd result is MDN.

I’m confused, so what’s different between paid Kagi and free Google search then?

(Note: I’m not hating on Kagi, I’m just genuinely wanting to understand)


SO and MDN are the good results, together with the blog Kagi gave in 5th place. On Google you got the first two, then a lot of spam and none of the other good results.


That's still an example of better quality results that should be quantifiable: that's ranking. We have things like precision@n/NDCG@n/etc. where it should be straightforward to show a metric for some smaller n where Kagi beats Google since it doesn't show some set of irrelevant/low quality results interspersed.
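
For reference, both metrics are only a few lines of Python. This sketch assumes you have per-result relevance judgments in ranked order. The two lists below are invented, and the ideal ordering for NDCG is taken from the judged results themselves, which is a simplification.

```python
import math


def precision_at_n(relevances, n):
    """Fraction of the top-n results judged relevant (relevance > 0)."""
    return sum(1 for rel in relevances[:n] if rel > 0) / n


def dcg_at_n(relevances, n):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:n]))


def ndcg_at_n(relevances, n):
    """DCG normalised by the DCG of the same judgments in ideal (sorted) order."""
    ideal = dcg_at_n(sorted(relevances, reverse=True), n)
    return dcg_at_n(relevances, n) / ideal if ideal > 0 else 0.0


# Invented judgments: 1 = useful result, 0 = spam/clone, listed in ranked order.
engine_a = [1, 1, 0, 0, 1]  # good results with junk interspersed
engine_b = [1, 1, 1, 0, 0]  # same good results, junk pushed down

print(precision_at_n(engine_a, 3), precision_at_n(engine_b, 3))
print(ndcg_at_n(engine_a, 5), ndcg_at_n(engine_b, 5))
```

The second list scores higher on both metrics even though it contains the same number of good results, which is exactly the "junk interspersed" effect being described.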


> yet another stackoverflow copy

I get those in Google as well. But tbh, I don't care. If I'm looking for "current time in JavaScript", I don't care if the answer comes from stackoverflow or any of its clones. It's not like I want to interact with that site somehow. I just want answers. If I want interaction, I obviously go to stackoverflow directly.

It might matter that I'm using Ad-blockers, so maybe if I didn't, those sites would feed me obnoxious popups and malware, but as it stands, I don't see any difference...


I just did exactly this search on Google. The first result was this -[0] which is exactly spot on. Not sure if it is because I use Brave browser which also blocks ads on websites.

[0] - https://tecadmin.net/get-current-date-time-javascript/#:~:te...


The result on Google is indeed correct, but I was posting a trivial example that was supposed to show the variety of answers/sources not the accuracy of the top one. For that, they're fairly similar, although Kagi seems to prefer the higher signal-to-crap ratio.


You clearly use an ad blocker. That page is over 50% ads.


Wow. I just loaded it and then turned off the adblocker and reloaded it. It's like you need another search engine just to find the content in the page hidden amongst all those ads.

I can't believe some people actually use the internet like that all the time.


Yes, people do. And it’s the least technically literate people with “outdated” machines and bad connections that slog through the web like this. They don’t know what to trust and often fall prey to deceptive tactics.


There's also a class of people who can just filter out all those distractions much better than many of us, and have a high tolerance for slowness.


Strange, I don't see a single ad for this query (current time in javascript), using Chrome logged into my Google account. Results are good as well.


Well I did not see any thanks to Brave and the content was spot on and that is all that really matters to me


My goodness, I thought you were exaggerating. I've been using ad blockers for so long, I forgot the web had this many ads. Or has it just gotten worse over time?


Well, I still remember times when 50% of google results weren’t ads.

Interestingly, Bing almost doesn’t display search ads, and the search results are becoming even better than Google. I haven’t had a need to use google for a few months now.


I wonder if adblockers have contributed to this. In theory we users can reward non-horrible advertisers by whitelisting their ads, but in practice we tend to block as much as possible. The remaining ad-viewing audience will be partly composed of people who are ethically opposed to adblocking or are held back by a lack of tech knowledge, but it will also be relatively insensitive to ads (both in the sense of being able to put up with a lot, and in the sense of requiring a lot to attract their attention).


> ethically opposed to adblocking

How can you be ethically opposed to something that ruins your experience? It’s obviously their choice, I just can’t imagine browsing without adblock, I’m ethically opposed to the pages filled with crap I guess.


Those who go through the effort to make good web content, and who pay the costs for a web server deserve to be paid. So ethically I should not block ads.

I block them anyway because ads also have an ethical contract with me that they have broken. They need to not take up too many resources on my computer, not make noise when the website otherwise has no noise content, not install malware, and be for legitimate products not scams. Probably more as well, but the above are things I regularly caught ads doing before I got an ad blocker.


If Chrome, Edge, Safari all came with uBlock by default, what percentage do you think would be "ethically opposed" enough to disable the extension? How many would turn it right back on?


I think it depends on the site. I remember early 00s where many download sites would have ads with a download button, or pop-ups that blasted sound like YOU JUST WON. Now I think that sorta thing has been normalized to even non shady sites. My primary use of ad blocker is so that I don't get random autoplaying videos.


It has gotten significantly worse.


And by that, you are telling Google you liked that result (by clicking on it), even if in the end ad revenue is not increased by your visit. Maybe Google gives less weight to signals coming from ad-blocking browsers; that I don't know.


But with an extension I can have a personal garbage block list or hide/collapse website preview without removing the result completely that works on other search engines

Btw, can you hide text preview on Kagi instead of removing the domain completely (in case you're not certain the website is garbage and sometimes want to check the results, but just want them less visible)?


Wait? I can kill yummly results when looking for a recipe? Ok, I'm in.


The fact that Kagi has a blocklist tells me their algo isn’t any superior.

The best kind of search engine is the kind that can read your mind (by inferring your intention or something)


Maybe for you that is better, but I want word negation to work, and "verbatim" to actually require words I specify to be on the page. Word stemming would be good as an option.

Sure, I'd like Booleans to work again, and intitle:.

That said, Google could probably make an inferred search interpretation work well if they wanted to return results that were good for the user rather than return results that optimise their ad revenue.


Why stop there? The best mind-reading search engine is one that doesn't even let me type queries, it tells me what I need to know before I even know I need to know it. The fact that all search engines still have query fields tells me they all still suck at reading my mind.


I feel that Google's going downhill ever since they started to try reading my mind.



