I would like to share some notes about a recent experiment my agency ran. As a result of it, Google believed our website to be the canonical version of its own Search Engine Optimization Starter Guide PDF, and ranked us instead of its own content for “Search Engine Optimization” and thousands of other phrases.
We do a lot of testing in-house, both for our SEO Spider software and for our agency's clients. This particular experiment was purely for fun, to highlight an issue we had found, and was not intended to hurt anyone or make any real profit.
I had previously contacted Google after noticing strange behavior in their search results: something was wrong with how the SEO Starter Guide PDF ranked for related terms like “SEO” and “google SEO guide”.
— Dan Sharp (@screamingfrog) November 7, 2016
Our searches returned the Starter Guide PDF, but the results pointed to various other websites that had uploaded it rather than to Google’s own website. For some reason Google wasn’t ranking its own page; other websites were displayed instead, using Google’s content.
Here is a view of some of the sites ranking in the UK. As Google changed which site it treated as canonical, each site seemed to knock the previous one out of the search results.
We decided to investigate why Google’s page was not indexed and why other pages were appearing in its place. We noticed that Google appeared to be using a 302 temporary redirect to its Search Engine Optimization Starter Guide, which is hosted on a different domain.
A 302 redirect should mean that the original URL on google.com is indexed rather than the target URL hosted on static.googleusercontent.com.
However, neither URL was indexed, and Google seemed to be having trouble understanding and indexing the original content and canonicalizing the URLs. Google wasn’t using “noindex”, and nothing was blocked via robots.txt. Other content on the subdomain was indexed. There also didn’t seem to be any conflicting directives on the page or in the HTTP headers, whether canonicals or anything else.
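The checks described above can be sketched roughly as follows. This is a minimal illustration, not the tooling we used: the function name `indexing_blockers`, the `robots_disallowed` flag and the example.com URLs are all hypothetical, and the header parsing is deliberately simplified.

```python
import re

# Simplified pattern: only matches when rel="canonical" is the first parameter
# after the URL in a Link header value.
_CANONICAL = re.compile(r'<([^>]+)>\s*;\s*rel\s*=\s*"?canonical"?', re.IGNORECASE)


def indexing_blockers(url, headers, robots_disallowed=False):
    """Return a list of reasons a URL might be kept out of the index.

    `headers` is a plain dict of HTTP response headers. `robots_disallowed`
    is supplied by the caller (an assumption: robots.txt is checked elsewhere).
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    reasons = []

    if robots_disallowed:
        reasons.append("blocked by robots.txt")

    # X-Robots-Tag can carry several comma-separated directives. User-agent
    # prefixes such as "googlebot: noindex" are not handled in this sketch.
    directives = [d.strip().lower() for d in lowered.get("x-robots-tag", "").split(",")]
    if "noindex" in directives or "none" in directives:
        reasons.append("noindex in X-Robots-Tag")

    # A canonical that points elsewhere conflicts with indexing this URL.
    match = _CANONICAL.search(lowered.get("link", ""))
    if match and match.group(1) != url:
        reasons.append("canonical points to " + match.group(1))

    return reasons
```

An empty list corresponds to what we saw for Google's guide: no obvious directive explaining why the URL was missing from the index.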
Google states that the PageRank flow is the same whether it’s a 302 temporary redirect or a permanent 301 redirect. So, in theory, the original URL should have been indexed and ranked, but it wasn’t.
Any type of redirect should pass PageRank in a similar way, but Gary Illyes says a 301 helps with canonicalization.
— Gary Illyes ᕕ( ᐛ )ᕗ (@methode) August 5, 2016
From previous experiments, we knew that identical content could be hijacked, but generally only by a more authoritative website. Google’s SEO Starter Guide has about 2,100 root domains linking to the original URL and another 485 linking to the redirect target (HTTP and HTTPS combined), so it has a lot of visibility and is a very powerful page.
The Starter Guide is also on google.com, which is very popular. However, the final redirect target was on a different domain.
Obviously, Screaming Frog’s website is not as authoritative as Google’s, but Google had previously been displaced by much less authoritative websites because of the issues described above.
So we ran a short experiment and simply uploaded Google’s SEO Starter Guide to our own domain. We then submitted it for indexing via Google Search Console and largely forgot about it.
A week later we found that we had hijacked Google’s own rankings (and those of the previous hijackers, thanks to our higher “authority”). Their algorithms seemed to believe we were now the canonical source for their own content. Our URL was returned under info: and cache: queries for any of Google’s URLs.
We hijacked hijackers — and Google.
We are a UK site, but in the US we ranked 4th for ‘Search Engine Optimization’ and in the top 10 for ‘SEO’. Out of the top 50.
The PDF ranked for “Google SEO”, “Google SEO Guide”, “www google com”, and every other phrase for which Google’s content should have appeared.
The PDF also ranked for many other brand-type queries in the UK and US. This is courtesy of SEMrush (especially the US screenshots).
Sistrix also highlighted the sudden “new” keywords we were ranking for organically.
Google Search Console recorded around 800,000 impressions for the PDF, mostly within a four-day period.
The experiment received a lot of attention at the time, after we tweeted about it.
We continued to monitor over the next few days to see whether Google made any changes to correct indexing, canonicalization and ranking. About 48 hours later, we noticed that Google’s guide URLs, which previously returned no results, had started ranking and were apparently indexed (and appearing under site: queries).
We then found that Google had added an HTTP canonical on the PDF pointing to the original URL, which got it indexed.
However, our copy still appeared as the canonical under info: queries and kept ranking for those queries. This meant that both guides ranked in the search results, with ours often outranking Google’s.
We expected this to change, with Google becoming the canonical again and our page dropping out of the rankings. Yet for a further five days we appeared alongside Google in the search results for thousands of queries. After that, our PDF disappeared from the search results and the experiment ended fairly quickly.
Final thoughts
First of all, I don’t recommend messing with other people’s content. This is simply an unusual and interesting case study rather than a viable strategy or tactic to achieve higher rankings. It can be very difficult.
We have various theories and ideas internally, but I would like to close with three points.
1. 302 redirects are not (entirely) to blame
Initially, we thought the 302 redirect might be the root cause, but Google has assured us that using a 302 redirect is fine. We believe several contributing factors relate to how the files are hosted.
We found some quirks, with the URLs changing over time (based on the values given in the Accept-Language header) and poor canonicalization to HTTPS.
2. Use canonicals
Using canonicals to help with indexing is very sensible. As soon as Google updated the PDF’s HTTP canonical to point to a single URL, it was indexed.
Crawlers can scan your site for missing canonical link elements and HTTP header canonical links.
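As a rough illustration of the kind of check a crawler performs for canonical link elements, here is a minimal sketch using only Python's standard library. The names `CanonicalFinder` and `find_canonicals` are hypothetical, and a real crawler would also inspect the HTTP response headers, which this sketch does not do.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> element in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return all canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.canonicals
```

An empty result flags a page with no canonical link element, and more than one entry flags conflicting canonicals that are worth reviewing.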
For PDFs and documents, you can easily set up HTTP canonicals using .htaccess or similar.
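An HTTP canonical is an ordinary `Link` response header with `rel="canonical"`. The sketch below shows only the shape of that header. The function name `canonical_link_header` and the example.com URL are hypothetical, and how you attach the header depends on your server.

```python
def canonical_link_header(canonical_url):
    """Return a (name, value) pair for an HTTP canonical Link header."""
    # Angle brackets delimit the URL inside the header value, so a URL that
    # contains them would produce a malformed header.
    if "<" in canonical_url or ">" in canonical_url:
        raise ValueError("URL must not contain angle brackets")
    return ("Link", '<{}>; rel="canonical"'.format(canonical_url))


# Example: the header a PDF response could carry to point at its preferred URL.
name, value = canonical_link_header("https://example.com/guide.pdf")
print("{}: {}".format(name, value))
```

On Apache, the `Header` directive from mod_headers is one way to emit a header like this from .htaccess for a specific file.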
3. In rare cases, hijacking can occur
Under certain circumstances, a page’s ranking can be hijacked by another domain hosting identical content. While this is generally unlikely, there may be things Google could improve in how it ranks the original source.
The opinions expressed in this article are those of the guest author and not necessarily those of Search Engine Land.