
Something is leaking in Google’s engine

Anders Bohman 30 May 2024

The big buzz in the SEO world this week has been the documents that claim to offer a glimpse into Google’s algorithm. For those of us working in SEO, this is of course extremely interesting, as we dedicate a large part of our lives to understanding how Google Search works.
In this article, we’ll go through what has happened, what the leaked documents reveal, and whether or not they can be trusted.

What happened

On May 5th, Rand Fishkin (SparkToro) received an email from a then-anonymous individual claiming to have access to leaked documents from inside Google Search. The source, who later identified himself as Erfan Azimi (EA Eagle Digital), has no direct affiliation with Google but was able to provide more than 2,500 pages of API documentation containing over 14,000 attributes (API features) that appear to originate from Google’s internal “Content API Warehouse.”

Rand Fishkin has had the leaked documents “verified” by former Google employees. However, it’s worth noting that “verified” in this case only means that ex-Googlers have stated that the documentation looks legitimate and appears to describe a genuine Google API.

What do the leaked Google documents contain?

Google is notoriously secretive about its algorithm, and it’s been said over the years that no one, not even those working at Google, fully understands how the algorithm works. From time to time, Google issues guidance on how website owners should approach its search engine, though this advice doesn’t always align with the reality we SEO professionals experience.

However, the leaked documents paint a different picture of Google’s algorithm than the one portrayed through official channels. Below, we’ll summarize the key insights that emerge from these documents.

It remains unclear how the ranking factors are weighted against each other; the documents only show that they exist as attributes in the API.
Content can be downgraded for several reasons:
Manipulated or low-quality links
Exact-match domains
Unsatisfied users (based on user behavior data)
Pornographic content

Links remain important for ranking, and PageRank is still active and relevant.
Google uses something referred to as “site authority” to assess websites.
User data, especially click-through rate (CTR), influences rankings. (Google differentiates between good and bad clicks, for example.)
Content freshness matters to Google.
Meta titles are still relevant for ranking.
Data from Google’s Chrome browser is used for ranking.
Google considers authorship in its evaluations.
Brand awareness plays a significant role. (Well-known brands outside of Google Search hold an advantage.)
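
To make the click-related point above more concrete, here is a minimal Python sketch of what CTR is and how “good” and “bad” clicks could be told apart. CTR itself is simply clicks divided by impressions. Everything else is an assumption made for illustration: the Click class, the good_click_share function, and the 30-second dwell-time threshold are hypothetical and are not taken from the leaked documents, which do not reveal how Google actually classifies clicks.

```python
from dataclasses import dataclass


@dataclass
class Click:
    # Hypothetical signal: seconds spent on the page before returning to the results.
    dwell_seconds: float


def click_through_rate(clicks: int, impressions: int) -> float:
    """CTR = clicks / impressions; returns 0.0 when there are no impressions."""
    if impressions <= 0:
        return 0.0
    return clicks / impressions


def good_click_share(clicks: list[Click], min_dwell_seconds: float = 30.0) -> float:
    """Share of clicks counted as "good" under an assumed dwell-time threshold."""
    if not clicks:
        return 0.0
    good = sum(1 for c in clicks if c.dwell_seconds >= min_dwell_seconds)
    return good / len(clicks)


# Example: a result shown 80 times and clicked 4 times.
clicks = [Click(5.0), Click(45.0), Click(120.0), Click(8.0)]
ctr = click_through_rate(len(clicks), impressions=80)
share = good_click_share(clicks)
print(f"CTR: {ctr:.1%}, good-click share: {share:.0%}")
```

The takeaway for SEO work is that a high CTR alone may not be enough if most visitors immediately return to the search results, which is one plausible reading of the good/bad click distinction.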

As you may notice, this information contradicts many of the things Google has stated over the years. The claim that clicks are not used as a ranking factor, the denial of a so-called “sandbox” that holds back new sites, and the assertion that there is no such thing as “domain authority” within Google are all challenged by the contents of these documents.

Can the documents be trusted, and what does Google say?

At the time of writing, Google has not commented on the leak. It’s important to note that there is also no concrete evidence that this is a leak from Google Search. The documents suggest that they may be part of “an external API for building a document warehouse, as the name implies,” and are not necessarily related to how websites are ranked in Google Search.

Right now, much of the SEO community is actively testing and experimenting to determine whether there is any truth behind the leaked documents.
Since Google has neither confirmed nor denied the authenticity of the documents, it is likely that the truthfulness of the information they contain will remain uncertain.

Updated on January 22, 2026
