We finally (sort of) know how Google Search works
The Hindu
SEO expert Rand Fishkin receives leaked Google Search algorithm documents, revealing secrets and debunking company claims.
On May 5, Rand Fishkin, an SEO expert and the CEO of marketing research firm SparkToro, received an anonymous email whose sender made the wild claim of having access to API documents of Google’s Search algorithm. Given how secretive Google is about how its Search mechanism works, Mr. Fishkin was immediately sceptical of these extraordinary claims. After the two exchanged several emails, Mr. Fishkin spoke to the emailer over video call on May 24. Four days later, the source disclosed his identity: Erfan Azimi, the founder of a digital marketing agency and an SEO practitioner himself, who had plenty of mutual friends with Mr. Fishkin.
Over the call, Mr. Azimi showed Mr. Fishkin the documents, which ran to more than 2,500 pages of API documentation and contained 14,014 attributes or API features. While it isn’t confirmed who exactly put them up, the document history showed a “yoshi-code-bot /elixer-google-api” as the origin, which indicates that Google’s own internal Content API Warehouse documentation was possibly published on the repo by accident. The code was published on March 27 and stayed up until May 7, allowing enough time for the public to pick it up.
Even though the documents didn’t explicitly reveal what exactly prompted the Search algorithm to push a story up in ranking, they laid bare a list of factors that Google Search was definitely tracking, which in itself is revelatory. The secret sauce of Google’s algorithm has been as much a black box as that of a large language model or the human mind. Company execs have protected details around how the Search ranking works, going out of their way to lie about what’s important when publishing content, and deceiving marketing professionals, publishers and content makers, many of whose jobs revolve around “optimising content on Google Search.”
Mr. Fishkin shared the documents with Mike King, another SEO veteran and the CEO of marketing agency iPullRank, after which both published their own analyses of the leak, popularising findings valuable to an industry working in the dark. A lot of this involved debunking what Google employees had lied about.
On multiple occasions in the past, Google had explicitly stated that domain authority wasn’t considered as a focal point. But it turns out that Google has a feature called “siteAuthority”, even though there’s little clarity around how the metric is calculated.
Also, contrary to the company’s previous assertions that clicks aren’t used to calculate ranking, there is solid evidence now that clicks are very much a measure. During his testimony at the Department of Justice (DOJ) antitrust trial in November last year, Vice President of Search Pandu Nayak spoke about the NavBoost and Glue ranking systems, both of which use click-driven ways to boost, demote or reinforce a ranking in Search. Mr. Nayak revealed that Google had been employing NavBoost since 2005 and historically used 18 months of click data. Google reps have also stated earlier that “dwell time” wasn’t a feature, but NavBoost does indeed consider long clicks, which is basically the same thing.
Another major point is that Google may use Chrome data to determine rankings, something that it had denied earlier. Mr. King noted that Chrome appears in more than one module: one related to page quality scores has a site-level measure of views from Chrome, while another module that seems to be related to the generation of sitelinks has a Chrome-related attribute as well.