The search marketing community is trying to make sense of a leaked Yandex repository containing files listing what appear to be search ranking factors.
Some may be looking for actionable SEO clues, but that’s probably not the real value.
The general consensus is that the leak is most useful for understanding how search engines work in general.
If you want hacks and shortcuts, you won’t find them here. But if you want to understand more about how search engines work, there’s gold.
— Ryan Jones (@RyanJones) January 29, 2023
There Is a Lot to Learn
Ryan Jones (@RyanJones) thinks this leak is a big deal.
He has already loaded some of the Yandex machine learning models onto his own machine for testing.
Ryan is convinced there’s a lot to learn, but says that just looking at a list of ranking factors isn’t enough.
Ryan explains:
“Yandex is not Google, but there is a lot to be learned from this in terms of similarities.
Yandex uses a lot of technology invented by Google. They refer to PageRank by name and use things like MapReduce and BERT.
Obviously, the factors are different and the weights applied to them are different, but the computer science methods of analyzing text for relevance, linking text and performing calculations are very similar across search engines.
I think you can get a lot of insight from the ranking factors, but just looking at the leaked list isn’t enough.
Looking at the default weights applied (before ML), there are negative weights that SEOs would assume are positive, and vice versa.
Also, far more ranking factors are calculated in the code than are mentioned in the list of ranking factors.
That list seems to cover only static factors and doesn’t take into account how relevance is calculated for a query or the many dynamic factors involved in that query’s result set.”
Over 200 ranking factors
Based on the leak, it is commonly repeated that Yandex uses 1,923 ranking factors (some say fewer).
Link Research Tools founder Christoph Cemper (LinkedIn profile) said he heard from friends that there are many more ranking factors.
Christoph shared:
“Friends have seen:
- 275 personalization factors
- 220 “web freshness” factors
- 3,186 image search factors
- 2,314 video search factors
There is a lot more to be mapped.
Perhaps the most surprising thing for many is that Yandex has hundreds of link-related factors.”
The point is, it’s far more than the 200+ ranking factors Google used to claim.
And even Google’s John Mueller said that Google has moved away from talking about 200+ ranking factors.
Perhaps it will help the search industry stop thinking of Google’s algorithms in such terms.
Does Anyone Know the Entire Google Algorithm?
What’s surprising about the data breach is that the ranking factors were collected and organized in a very simple way.
This leak calls into question the notion that Google’s algorithm is so tightly guarded that no one, even at Google, knows the entire algorithm.
Is it possible that Google has a spreadsheet with over 1,000 ranking factors?
Christoph Cemper questions the idea that no one knows Google’s algorithm.
Christoph commented to Search Engine Journal:
“Someone said on LinkedIn that they couldn’t imagine Google ‘documenting’ their ranking factors in such a way.
But that is how a complex system like that has to be built. This leak comes from a very authoritative insider.
Google also has code that can be leaked.
The oft-repeated statement that even Google employees don’t know ranking factors always seemed silly to a techie like me.
Very few people know all the details.
But it must be in the code, because it’s the code that runs the search engine.”
What parts of Yandex are similar to Google?
The leaked Yandex files offer a glimpse into how the search engine works.
The data doesn’t show how Google works. However, it does offer an opportunity to see some of the ways a search engine (Yandex) ranks search results.
What is in the data should not be confused with what Google may use.
Still, there are interesting similarities between the two search engines.
MatrixNet is not RankBrain
One interesting insight that some people are digging up relates to a Yandex machine learning technology called MatrixNet.
MatrixNet is an older technology, introduced in 2009 (archive.org link to announcement).
Contrary to some claims, MatrixNet is not the Yandex version of Google’s RankBrain.
Google RankBrain is a limited algorithm focused on understanding the 15% of search queries Google has never seen before.
Bloomberg revealed RankBrain in a 2015 article. The article says that RankBrain was added to Google’s algorithm that year, six years after the introduction of Yandex MatrixNet (Archive.org snapshot of the article).
A Bloomberg article describes RankBrain’s limited purpose.
“When RankBrain finds an unfamiliar word or phrase, the machine can infer which words or phrases have similar meanings and filter the results accordingly, making it more effective at handling never-before-seen search queries.”
MatrixNet, on the other hand, is a machine learning algorithm that does a lot.
One of the things it does is categorize search queries and apply the appropriate ranking algorithms to them.
The following is from the 2016 English-language announcement of the 2009 algorithm:
“MatrixNet allows us to generate very long and complex ranking formulas that consider many different factors and their combinations.
Another important feature of MatrixNet is the ability to customize ranking formulas for specific classes of search queries.
By the way, tweaking the ranking algorithm for music search, for example, doesn’t compromise the ranking quality for other types of queries.
A ranking algorithm is like a complex machine with many buttons, switches, levers and gauges. In general, one turn of any single switch in the mechanism causes a global change in the entire machine.
However, MatrixNet allows tuning of specific parameters for specific classes of queries without a major overhaul of the entire system.
Additionally, MatrixNet can automatically select the sensitivity of a certain range of ranking factors. ”
MatrixNet does a lot more than RankBrain; clearly, they are not the same.
But what’s notable about MatrixNet is that the ranking factors are dynamic: it classifies search queries and applies different factors to different classes.
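The idea of classifying a query and then applying a class-specific ranking formula can be sketched in a few lines of Python. This is only an illustration of the concept described in the announcement, not Yandex's code: MatrixNet is not a simple weighted sum, and the class names, keywords, feature names and weights below are all hypothetical.

```python
# Hypothetical per-class weights; none of these names or values come from the leak.
CLASS_WEIGHTS = {
    "music": {"text_match": 0.5, "freshness": 0.1, "has_audio": 0.4},
    "news": {"text_match": 0.4, "freshness": 0.6},
}
DEFAULT_WEIGHTS = {"text_match": 0.8, "freshness": 0.2}


def classify_query(query):
    """Toy keyword-based classifier standing in for a learned query classifier."""
    words = set(query.lower().split())
    if words & {"song", "lyrics", "album"}:
        return "music"
    if words & {"news", "latest", "today"}:
        return "news"
    return "general"


def score(query, doc_features):
    """Score a document using the weights of the query's class."""
    weights = CLASS_WEIGHTS.get(classify_query(query), DEFAULT_WEIGHTS)
    return sum(w * doc_features.get(name, 0.0) for name, w in weights.items())


doc = {"text_match": 0.9, "freshness": 0.5, "has_audio": 1.0}
news_score = score("latest election news", doc)
music_score = score("beatles album lyrics", doc)
```

Because each class has its own weights, changing the "music" entry leaves the score of a news query untouched, which is the property the announcement highlights.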
MatrixNet is referenced in several ranking factor documents, so it is important to put MatrixNet in the right context so that the ranking factors are seen in the right light and are more meaningful.
To understand the Yandex leak, it may be helpful to read more about the Yandex algorithm.
Read: Yandex artificial intelligence and machine learning algorithms
Some Yandex Factors Match SEO Practices
Dominic Woodman (@dom_woodman) has some interesting observations about the leak.
Some of the leaked ranking factors are consistent with specific SEO practices, such as varying anchor text.
Vary your anchor text, baby!
4/× pic.twitter.com/qSGH4xF5UQ
— Dominic Woodman (@dom_woodman) January 27, 2023
Alex Buraks (@alex_buraks) published a huge Twitter thread about findings that resonate with SEO practices.
Optimizing internal links to minimize the crawl depth of important pages is one such factor Alex highlights.
Google’s John Mueller has long encouraged publishers to make sure important pages are prominently linked.
Mueller discourages burying important pages deep within the site architecture.
John Mueller shared in 2020:
“What happens then is that the home page is very important, and what the home page links to is generally very important as well.
And… the further a page is from the home page, the less important we probably think it is.”
Keeping important pages close to the main pages that site visitors enter through is important.
So if links point to the home page, the pages linked from the home page are viewed as more important.
John Mueller didn’t say that crawl depth is a ranking factor. He simply said that it signals to Google which pages are important.
The Yandex rule cited by Alex uses crawl depth from the home page as a ranking rule.
#1 Crawl depth is a ranking factor.
Keep important pages near the main page.
– Top pages: 1 click from main page
– Important pages: <3 clicks pic.twitter.com/BB1YPT9Egk
— Alex Buraks (@alex_buraks) January 28, 2023
It makes sense to think of the home page as the starting point for importance, and calculate that the deeper you click into the site, the less important it becomes.
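Crawl depth in this sense is simply the minimum number of clicks needed to reach a page from the home page, which a breadth-first search over the internal link graph can compute. The sketch below is an illustration for site audits, not anything taken from the leak; the site structure and URLs are hypothetical.

```python
from collections import deque


def crawl_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure: page -> pages it links to.
site = {
    "/": ["/products", "/blog"],
    "/products": ["/products/widget"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/products/widget", "/archive/old-page"],
}

depths = crawl_depths(site, "/")
# Pages three or more clicks deep are candidates for better internal linking.
deep_pages = sorted(p for p, d in depths.items() if d >= 3)
```

Running this on a real crawl export would show which important pages sit deeper than the "<3 clicks" guideline quoted in the tweet.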
There are also Google research papers and patents (Reasonable Surfer Model, Random Surfer Model) with similar ideas. They calculate the probability that a random surfer will land on a particular webpage simply by following links.
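The random surfer idea can be illustrated with a small power-iteration sketch. This follows the commonly described PageRank formulation with a damping factor, where 0.85 is the frequently cited default; it is not Google's or Yandex's production code, and the link graph is hypothetical.

```python
def random_surfer(links, damping=0.85, iterations=50):
    """Estimate the probability that a random surfer is on each page."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Otherwise the surfer follows one of the page's links at random.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical site: the home page links to two sections, one of which
# links to a deeper page; every page links back to the home page.
graph = {
    "/": ["/a", "/b"],
    "/a": ["/"],
    "/b": ["/", "/b/c"],
    "/b/c": ["/"],
}
rank = random_surfer(graph)
```

In this toy graph the home page ends up with the highest probability and the deepest page with the lowest, which mirrors the intuition that importance decays with distance from the home page.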
Alex also found a factor that gives more weight to links from main pages.
#3 Backlinks from the main page are more important than backlinks from internal pages.
Makes sense. pic.twitter.com/Mts9jHsRjE
— Alex Buraks (@alex_buraks) January 28, 2023
The SEO rule of thumb has long been to keep important content within a few clicks of your home page (or any internal page that attracts inbound links).
Yandex Update Vega… Related to expertise and authority?
Yandex updated its search engine in 2019 with an update named Vega.
The Yandex Vega update featured neural networks trained by topic experts.
The goal of this 2019 update was to bring expert and authoritative pages into the search results.
However, search marketers scrutinizing the leaked documents have yet to find anything that correlates with things like author bios, which some believe are related to the expertise and authoritativeness Google looks for.
Ryan Jones tweeted:
Second fun fact. Nothing I’ve found yet is on par with what many SEOs think E-A-T looks at. (Author bios/profiles, etc.)
— Ryan Jones (@RyanJones) January 30, 2023
Learn, Learn, Learn
We are in the early days of the leak, and the hope is that it will lead to a better understanding of how search engines work in general.
Featured image from Shutterstock/san4ezz