Back when I was an SRE at Google for their web crawler, I thought they should de-index Pinterest (did not get any traction with this). I still think they should de-index Pinterest. I add "-site:pinterest.*" to all my image searches.
You would have made my week if you had managed to convince them to do that. At the worst point, Pinterest's spamming of Google Image Search pretty much rendered the service next to useless.
It wasn't a "growth hack"; it was spam, plain and simple.
Yeah I don’t get it. Panda was supposed to clean up results - and it did for a while. Seems priorities changed. Kinda makes me wonder if someone higher up at Google is invested in Pinterest.
Panda was led by Amit Singhal, who was very serious about search relevance. He was kicked out for his sexual behavior though (and also because the leadership wanted to go towards more complex machine learning, while Amit believed in the more explainable models at that time).
They probably look at the metrics and see that a lot of people are going to the Pinterest results. That the results are garbage and undermine the Google images search experience is much harder to measure.
If only there was a way to track the number of seconds they spent on pinterest until the back button was hit and... factor that in as a feature in ML. voila.
I don't think it's that simple. When a user goes back to image search results, is it because they are unsatisfied with the result they clicked on, or are they looking for even more, hoping to compare? And when a user closes out instead of doing back, is it because they got what they wanted or because the results were so garbage they gave up? Based on my own usage of image search results I don't think you would be able to tell when I was happy with the results or when I wasn't. But you can probably tell that my usage of image search went down when it started getting shitty due to this spam.
What are your use cases for image search? Mine are pretty straightforward: find a higher quality/uncropped version, or find the source (article, recipe, or photo set). For use case #1, I click through, save, and exit. For use case #2, I engage like I would for web (text) search.
#1 might be confused for not finding the intended result, but could be discerned after looking a bit more.
Also, if a lot of people add in "-site:pinterest.*" (or click on pinterest sites but come back, then find the intended result) that might be a hint it's not adding real value.
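To make the "click, then come back" signal concrete, here is a minimal sketch of how it could be computed per domain. Everything in it is an assumption for illustration: the `Click` record, its field names, and the 10-second threshold are hypothetical, and nothing here reflects how any real search engine measures or ranks results.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Click:
    """One click on an image search result (hypothetical log record)."""
    domain: str
    dwell_seconds: float        # time spent before leaving the page
    returned_to_results: bool   # True if the user hit "back" to the results


def short_click_rate(clicks, threshold_seconds=10.0):
    """Fraction of clicks per domain that were quick returns to the results."""
    totals = defaultdict(int)
    short = defaultdict(int)
    for click in clicks:
        totals[click.domain] += 1
        # A "short click" here means the user came back quickly.
        if click.returned_to_results and click.dwell_seconds < threshold_seconds:
            short[click.domain] += 1
    return {domain: short[domain] / totals[domain] for domain in totals}


sample = [
    Click("pinterest.com", 3.0, True),
    Click("pinterest.com", 40.0, True),
    Click("example.org", 5.0, False),
]
rates = short_click_rate(sample)
print(rates)
```

As the replies point out, a quick return is ambiguous on its own (the user may just be comparing results), so a number like this would at best be one noisy feature among many.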
I think my use cases are pretty varied: anything from looking for pictures of a place I'm considering going hiking, looking for different views of a product I'm considering purchasing, looking for other examples of an artist's art, to trying to figure out where some image originated.
No one said it had to do with AI. The point being made above is that while you and I may dislike Pinterest, the data may be showing that to most users, it's actually very valuable.
If anything, ML would be able to detect my and your use cases and intelligently hide Pinterest for us while showing it to others. So chances are they aren't using ML yet, which is why it's just favoring the majority rather than us (the minority).
It could be better than what we have now, assuming somewhat smart AI. Right now, data gives plausible deniability, and is the universal scapegoat - you get to do whatever you want, backing this up with misapplied A/B tests and telemetry, with zero statistical rigor. And if someone complains about your decisions, you point to these tests and that telemetry, and cry "we're a data driven company, and the data proves we're right".
I mean, it would help to be honest about the metrics that the A/B tests are evaluated against. If you optimize purely for growth and engagement, then your test results will reflect that, even more so if your tests are scientifically sound.
I'm continually surprised how companies manage to sell "let's try to keep users glued to the screen as long as possible" as doing something for the general good.
I have little hope that the people not understanding their data today and lacking any statistical rigor will be able to train an AI to do better. Figuring out whether an AI gives you garbage results seems to be a harder problem than figuring out whether your own results are garbage.
But isn’t this a sign that it’s actually useful for people? At the same time I have to admit that google image search is probably their worst vertical.
If people are actually navigating to the results and not bouncing immediately due to the login wall that is a sign that people are finding them useful.
Is it that weird to believe that lots of people have Pinterest logins?
With image search more than other types of search IMO, because the source of the image is less prominent than the image itself, so ignoring specific domains is harder. I've clicked Pinterest links over and over in google search when on my phone when I had no interest in them, but it's hard to avoid because it isn't always clear what is what.
I run a plugin in Chrome that does that automatically for image searches. The pinterest results are pure trash and never provide me with what I'm looking for, but take up an excessive quantity of results for every search.
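For anyone curious what such a plugin boils down to, here is a minimal sketch of the query rewriting step. The function name `exclude_sites` is made up for illustration; the only thing taken from the thread is the "-site:pinterest.*" operator itself, and a real extension would also need to hook the search box or URL, which is not shown.

```python
from __future__ import annotations


def exclude_sites(query: str, domains: list[str]) -> str:
    """Append a -site: operator for each domain unless it is already there."""
    terms = query.split()
    for domain in domains:
        operator = f"-site:{domain}"
        if operator not in terms:
            terms.append(operator)
    return " ".join(terms)


print(exclude_sites("mid-century armchair", ["pinterest.*"]))
# prints: mid-century armchair -site:pinterest.*
```

The check for an existing operator keeps the rewrite idempotent, so running it twice on the same query does not stack duplicate exclusions.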
I wish we could do the same for Shutterstock and all these useless stock image websites that have totally hacked Google image search yet Google won't do anything about them for some reason.
How do you search for previews of commercially available images then?
What I would like is an open protocol for media licenses. Some media-license.txt file with names of images (and wildcards) and license info.
hero.jpg CC0
*.PNG PD # public domain
# A free preview of a proprietary image.
Teaser.jpg preview
Everything not marked is considered proprietary, like now by default.
Then sites like shutterstock could clearly mark their previews of real images, and galleries of free images could mark their images as free to use, etc. This could be reflected in the search engines' UI.
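A rough sketch of how a crawler might read such a file, assuming the format implied by the example: one "pattern license" pair per line, "#" starting a comment, shell-style wildcards, and the first matching rule winning. None of this is an existing standard; the function names, the case-sensitive matching, and the first-match rule are my own assumptions.

```python
from fnmatch import fnmatchcase

SAMPLE = """\
hero.jpg CC0
*.PNG PD # public domain
# A free preview of a proprietary image.
Teaser.jpg preview
"""


def parse_media_license(text):
    """Return a list of (pattern, license) rules from media-license.txt text."""
    rules = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            continue  # assumption: silently skip malformed lines
        rules.append((parts[0], parts[1]))
    return rules


def license_for(filename, rules, default="proprietary"):
    """First matching rule wins; unmarked files fall back to proprietary."""
    for pattern, license_tag in rules:
        if fnmatchcase(filename, pattern):  # case-sensitive wildcard match
            return license_tag
    return default


rules = parse_media_license(SAMPLE)
print(license_for("hero.jpg", rules))
print(license_for("logo.PNG", rules))
print(license_for("banner.jpg", rules))
```

Whether matching should be case-sensitive (so that "*.PNG" does not cover "logo.png") is exactly the kind of detail a real spec would have to pin down.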
Huh? I don't see anything wrong with that. If you search for an image and Google shows you that photo on a stock imagery site, I'd count that as a win -- they've correctly pointed you to the source of the image, rather than a downstream licensee of the image.
No. Like Pinterest, these sites too have heavily gamed image search and often end up pushing out relevant search results.
Also the results are often useless. For example, they upscale 320x240 images into several resolutions like 1280x720, etc., just to make sure they beat all search filters.
And the many TV/film sites that list "release date", "air date", renewal, season N+1, and so forth, in their page titles & tags even though the page does not contain that information.
Just mentioning that Pinterest used to boast on their engineering blog how they designed the non-dismissable nag-screen [1] and how they increased engagement by making their landing pages more confusing [2].
This is fascinating in a potentially terrible way. In summary: when an image is posted, a reverse-image search is done and top results are scraped then added to a "More like this" section for the post. This ranks highly because it's exactly what Google already associates with that image, except now all in one page instead of across multiple pages.
Applying this same method to other content (blog posts/pages and video posts) would, presumably, also work. The terrible part of all this, is that it would create more junk posts ranking higher and stratifying the word association of that content with the respective search terms. New content could end up having zero chance of ever ranking highly because another factor of result ranking is content age (older content is weighted heavier).
> Applying this same method to other content (blog posts/pages and video posts) would, presumably, also work.
I believe reddit started recently doing something like this for text and as a result has made google searching for reddit posts essentially useless.
Basically when you view any post/thread while logged out there are a bunch of other threads shown on the page that have related text, with all the text of their posts included collapsed in the HTML.
The result is that when you search for the context of any reddit post you get hundreds of results on vaguely similar topics which don't contain the post that you're looking for (unless you log out and count hidden text collapsed under other threads).
I assumed that the practice was intentional to make reddit be more heavily represented in google results.
Reddit staff doesn't actually use the site that much (or at least that was my impression a couple years ago after having a meeting in the office and finding that I knew a lot more about the meme-art that users sent them than their staff did)-- so it wouldn't be shocking that they'd be indifferent to making google search unusable for users.
For some years now, Google Images has been pretty awful to navigate, all thanks to Pinterest. I do hope Google is just unaware of the mentioned "hack", because the other option is they do not care because Pinterest helps them through ads.
Note: I have not checked whether Pinterest has Adsense since I use Adblock. I'm going by some of the comments here.
A couple of days ago I was trying to find better image search websites because Google removed the feature to find images of an exact size. I was brought to Bing, but it also lacks that feature. It kept me up that night thinking how to make a better service and whether I'd have to scrape Google, which made me think of webcrawlers, site indexing... It's turtles all the way down; I can't imagine a solo developer (or startup) pulling it off. I know there are decentralized search attempts, but anything decentralized simply does not work for 2020.
So I ask here on HN, how could a power-user friendly http image search service work without depending on big corps? Is it just impossible and we need to keep praying billionaire CEOs will listen to us or it's simply something no one has done yet?
The first problem is monetization. Do you think people would pay a yearly subscription fee for a better image search? I'm not sure they would, but maybe.
This doesn't make a lot of sense to me, and it doesn't seem to be backed by any evidence, either in the original Twitter thread or in the linked blog post.
I think far more likely what's happening is Pinterest is actively crawling the web for images and surrounding text. They will then perform their own reverse image search on _their own_ database, and add this context text to their existing copy of the image, or create a new one if one is not present.
Why would they rely on a user submitting a picture then go to the trouble of reverse searching it? I think they are much more proactive than that.
A lot of business accounts sign up to activate Rich Pins, which adds a few additional features when people pin from your site.
When you are signed up for this they will scrape the page the pinned image comes from to collect some extra info on the page to show along with the pin.
I’m not sure if they do this when you are not signed up for Rich Pins though.
Pinterest does a lot of SEO stuff that is interesting.
For one of my boards that ranks #1 in Google I've found that the page Google indexes is quite a bit different than the one I see as a logged-in user.
One of the differences is that they display the text content associated with the pin. This is also used as the image alt text, but then appended with a bunch of keywords.
They also link to other people's boards which have names related to the images so it looks like "tags", but I have the feeling it is probably a mix of keyword stuffing/linking to other content for Google to follow.
The page title is also adjusted to include something like, "237 Best ________ images in 2020" followed by the board name.
In your opinion, is this a growth hack you'd suggest one of your customers try? Or does it cross a line in your opinion?
"SEO Optimization" is difficult because it's such an imperfect science (or perhaps not, based on your company name!) What are a couple examples or anecdotes of SEO "hacks" / optimizations that are least obvious or things most people wouldn't think of, that have had the most significant impact on SEO?
"Growth hack" is an industry euphemism for "unethical practice." You can tell it doesn't literally mean "growth hack" because it's being applied here to the website that Alexa currently ranks #152 worldwide and the typical excuses about how you need to get a little dirty when you're a tiny business just getting started don't remotely hold water.
Not all growth hacking is unethical. WePay's $100 bills in a big block of ice dropped off at the PayPal conference comes to mind. Or Dropbox giving you extra free quota for going through their tutorial, installing on a new device, or inviting a friend (where the friend also gets extra free quota).
I don't know about the other ones, but giving you extra when you get someone else to join is unethical. MCI started this trend with their "friends and family" promotion, and it really pissed people off. People who don't use the system hate being pestered by friends and family to join something they don't want or need.
I've got something I've always been curious about. If the website was penalized by Google, is it possible to rank _after_ the penalty is lifted?
Edit: the story behind this question. A friend of mine has a local business. He once paid an SEO agency to do optimizations for his website. After a while, the website was penalized because of some shady stuff the agency did (paid links?). The penalty is long gone, but it looks like Google never forgets anything, so the website is sitting there on 5+ pages for pretty much any keyword.
Yep, this is called a "manual action" by Google, and they'll often tell you exactly what you're doing that they find offensive. Usually this involves buying spammy backlinks. You can resolve the issue and ask them to re-evaluate your site by submitting a Reconsideration request.
Engagement on links clicked in Gmail is likely a ranking factor in Google search.
We've done studies with popular e-mail newsletters that showed URLs performing better in search on days when they've been sent out to hundreds of thousands of people.
At the end of the day, Google is a private company. They get to set the rules and police search results how they see fit. I don't think Pinterest is doing anything unethical. I do think it is spammy, though.
But is spamming not in its essence 'unethical'? It might not be illegal, but it is morally at least questionable and many will say it's unambiguously immoral.
I just started a new job where the company is heavily invested into SEO. Being new to the subject, what resources would you suggest for a webdev to first learn the technical fundamentals and then to get deep into the weeds?
Hey, I run the Growth team at Pinterest. I just wanted to comment on the article to be very clear and say we have never scraped Google search results either currently or at any time in the past.
First, if you insist that you have not scraped data then can you offer an alternative explanation for the data being presented here?
Second, it's pretty clear that the majority of at least the tech community hates what you do with the regwall. Do you not know how bad your site's behavior makes GIS for those of us who aren't interested in joining it? Or do you just not care?
I'm guessing that they use one or more third parties that do indeed scrape Google results but provide plausible deniability in the process. And I'd look very closely at the employment histories of principals at those nominal "third" parties. It's not that uncommon in this industry to find companies with exactly one customer, staffed entirely by ex-employees of that one customer.
At the point where you're actively polluting google search results as if you had such a scraping program, is this a meaningful distinction to make for anyone other than Google's TOS compliance team?
Not specific to the article but about the growth team structure at Pinterest.
Is growth an engineering/product led org? Does it have a marketing leader as well?
I've come across a couple instances where growth responsibilities seem not in the land of marketing and I'm curious how those teams are structured and why.
> we have never scraped Google search results either currently or at any time in the past.
Just an observation that this very specific denial without offering anything else makes me think that what they _actually_ do is probably controversial as well.
There's a Chrome extension [1] just to filter out Pinterest results, because Pinterest fills the results with crap. As the head of the team responsible for the need for this extension, are you proud of the work you do?
There are certain sites like Wikipedia that improve the web's utility for all who use it, small and giant companies alike. Google search is improved by Wikipedia's existence.
And then there are sites like Pinterest that degrade the web's utility for the vast majority of people on Earth. It actually harms Google's search experience (obviously image search) and frustrates those of us who try to benignly browse images. Why does Google let Pinterest get away with its user-hostile approach?
Given that Google takes MusicBrainz's data and uses it to display infoboxes but doesn't usually list MB itself, I think the motivation for indexing Pinterest is indeed ad revenue.
If you generate money for them, they index you, even if the content is bad, and vice versa.
This question seems not so obvious to me. Or at least, the incentives are reversed from what I assume you're suggesting.
If Pinterest is a buyer of Google's ad services, Google has a strong interest in ensuring that Pinterest doesn't get free organic results. More organic results, less need for ad spend.
"The licenses Wikipedia uses grant free access to our content in the same sense that free software is licensed freely. Wikipedia content can be copied, modified, and redistributed if and only if the copied version is made available on the same terms to others and acknowledgment of the authors of the Wikipedia article used is included (a link back to the article is generally thought to satisfy the attribution requirement; see below for more details). Copied Wikipedia content will therefore remain free under an appropriate license and can continue to be used by anyone subject to certain restrictions, most of which aim to ensure that freedom."
Besides, Wikipedia isn't an ad-supported site, so Google's practice of including their text in the search result doesn't cost them ad revenue. You could argue that it means people will see the banner requesting donations less often, but that's compensated for by Google's own donations to them. So while I have definite concerns about some of Google's other practices, that one is no big deal.
No, not directly, though they have donated to the relevant foundation, but in any case that isn't quite the same thing.
The use of Wikipedia's information in search results is generally useful to the user.
Pinterest's spamming of search results can harm the user experience, making it harder to find content that is hosted elsewhere. Often the content elsewhere is better than that found on Pinterest because it is more likely to be the original source (or at least cite the original).
Any chance of a summary for those who refuse to play TC/Yahoo's privacy game (which last time I did look in more depth made it pretty much impossible to completely opt out of tracking) beyond seeing the interstitial and hitting back?
Please don't post in the flamewar style to HN. It's the opposite of what we're hoping for here. I understand that it's frustrating to be in the minority and represent a contrarian view to an internet forum, but if you don't want to discredit the truth you're arguing for, the only way is to provide correct information and correct errors neutrally.
You may have already discerned, but I don't give two shits about the style of my comments on this site. The Internet is a bathroom stall, and comments are the graffiti. Taking them as seriously as you do is ridiculous.
> because Pinterest has done such a good job of labeling and categorizing images
No. Nobody cares about the deduplication part. Good for them. And if they used their own users' content to increase search accuracy for their own users, nobody would care either. What people care about is that they're actively seeking out and borrowing non users' content (metadata and descriptions) to trick Google image search and pull in other non users who then can't even see the pictures they clicked for.
> Show me on the doll where it touched you.
Interesting how such an irrelevant metaphor comes so readily to mind for you, and that you think it's something to joke about. Hint: it's a super bad look for you.
And I'm sure that's a very great experience when you have a Pinterest account. For the rest of us it makes half of Google image search useless because we can't view the original site image thanks to not having an account.