As an online marketer, I spend a fair bit of time staying up on the latest happenings in my industry. As I’ve read articles across various key industry blogs recently, it has become pretty clear that few if any SEOs really grasp the scope of what RankBrain is and how it’s working (and, even more importantly, where it’s going).
One of my hobbies is artificial intelligence research (maybe a bit more than a hobby, as I’m seriously considering starting a company in the AI space), and I read research papers and blog posts on the subject on a daily basis. While Google has been (and likely always will be) fairly closed-mouthed about exactly what RankBrain does and doesn’t do, it’s possible to find enough information to make a few educated guesses.
But first, a quick primer on AI and deep learning (as I understand it…I don’t consider myself an expert by any means). AI as it exists today is typically composed of four key components:
- A system for data analysis and pattern recognition (machine learning, deep learning, MCTS, etc.)
- A source of “sensory” input (primarily vision and/or speech for now)
- A goal or goals (such as, improving a score)
- Constraints (# of attempts, time limits, etc.; sometimes a goal can co-serve as a constraint)
The combination of goal(s) and constraint(s) allows the data being analyzed to be weighted, and this weighting can occur in either a supervised (human decides what does and doesn’t matter) or an unsupervised manner (machine decides what does and doesn’t matter).
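To make that distinction concrete, here’s a toy sketch (illustrative only, nothing Google-specific): the same six data points grouped first with human-supplied labels, then by a tiny unsupervised two-means clustering that receives no labels at all.

```python
# Supervised vs. unsupervised weighting, in miniature (illustrative only).
data = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]  # one numeric feature per item

# Supervised: a human supplies the rule/labels (items under 5 are "low").
labels = ["low" if x < 5 else "high" for x in data]

# Unsupervised: 1-D two-means clustering -- the machine finds the two
# groups from structure alone, iterating cluster centers until stable.
c1, c2 = min(data), max(data)
for _ in range(10):
    g1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
    g2 = [x for x in data if abs(x - c1) > abs(x - c2)]
    c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)

clusters = ["low" if abs(x - c1) <= abs(x - c2) else "high" for x in data]
print(labels == clusters)  # here both approaches recover the same grouping
```

The interesting part is the second half: no human told the machine where the boundary is, and on clean data it finds the same split anyway.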
Google, via their DeepMind acquisition and their Google Brain initiative, is putting a lot of their eggs into the unsupervised deep reinforcement learning basket, a great example of which is their Deep Q-learning program, which taught itself to master Atari games in record time.
In the case of this particular program, its sensory data was vision only, in the form of pixels on a screen, and its goal was to maximize its score. With just that, it was able to VERY quickly master complex games.
AI can take seemingly vague input (pixels) and a clear goal (increase score) and can find the right pattern just by reinforcement learning (learning from experience, both failure and success, much as humans do).
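The reinforcement-learning loop described above can be sketched with tabular Q-learning on a toy problem. This is a deliberately simplified stand-in: the real Deep Q-learning system used a neural network over raw pixels, but the reward-driven trial-and-error update is the same idea.

```python
import random

# Toy tabular Q-learning: an agent learns to walk right along a 1-D
# corridor of 5 cells to reach a reward at the far end. Sensory input is
# just the current cell; the goal is to maximize reward.
N_STATES = 5          # cells 0..4; reward sits at cell 4
ACTIONS = [-1, +1]    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def choose(state):
    if random.random() < EPSILON:                         # explore occasionally
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])      # otherwise exploit

random.seed(0)
for episode in range(200):
    state = 0
    while state != N_STATES - 1:
        action = choose(state)
        nxt = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if nxt == N_STATES - 1 else 0.0
        # Q-learning update: nudge the estimate toward the observed reward
        # plus the discounted value of the best next action.
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = nxt

# After training, the learned policy prefers +1 (move right) everywhere.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

Nobody hands the agent a strategy; it discovers “always go right” purely from failed and successful attempts, which is the same learning-from-experience dynamic the Atari program used at vastly larger scale.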
When it comes to the Internet, there are many, many trillions of data points, with more and more coming online each day. Google is like a giant vacuum, sucking in as much data as it can, and then using machine learning to look for patterns in the chaos.
For the longest time, those patterns were distilled by people into The Algorithm. Though changes were made to it constantly by a large team of engineers, it was still human driven and fairly limited in its abilities. No human or team of humans can possibly hope to consume and distill an exponentially growing pool of data, nor to figure out context in all possible scenarios.
There’s too much, and it’s growing and changing way too quickly…which brings us to RankBrain and why it’s so important to Google. Based on what I know of Google’s AI technology and research projects, here’s how I believe RankBrain (and/or the other AI pieces that are being used on search) might be impacting organic search.
For starters, I don’t think there is a static or even semi-static “list” of ranking signals any longer, and I don’t believe that RankBrain is actually a signal per se; rather, I believe it is the mechanism that defines what is and isn’t a signal for any given query (or at least that adjusts the weighting for those signals). The “list” of signals is likely being assembled and weighted for each query based on a huge number of personalization factors (location, search history, natural language analysis, etc.).
I think it is likely that this is being done on the fly, though Google claims RankBrain is run offline on older query sets. If it isn’t done on the fly yet, I see no reason why it won’t get to that point in the future.
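To illustrate what per-query signal weighting could look like, here’s a purely hypothetical sketch. Every signal name and weight below is invented for illustration, since nobody outside Google knows the real ones; the point is only that the weight vector is derived from the query’s context rather than fixed globally.

```python
# Hypothetical per-query signal weighting (all names/weights invented).
BASE_WEIGHTS = {"links": 0.4, "content_relevance": 0.3,
                "freshness": 0.1, "proximity": 0.2}

def weights_for_query(query, context):
    w = dict(BASE_WEIGHTS)
    if context.get("local_intent"):      # e.g. "pizza near me"
        w["proximity"] += 0.3
        w["links"] -= 0.2
    if context.get("news_intent"):       # e.g. a trending topic
        w["freshness"] += 0.3
    total = sum(w.values())
    return {k: v / total for k, v in w.items()}   # renormalize to sum to 1

def score(doc_signals, weights):
    return sum(weights[k] * doc_signals.get(k, 0.0) for k in weights)

w_local = weights_for_query("pizza near me", {"local_intent": True})
doc = {"links": 0.2, "content_relevance": 0.8,
       "freshness": 0.5, "proximity": 0.9}
print(round(score(doc, w_local), 3))
```

Under this (again, hypothetical) model, the same document scores differently for the same words typed by two different searchers, because the weights themselves moved before any document was scored.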
Because of this, the concept of tracking rankings probably needs to die. Your ranking for any given query will often vary based on a tremendous number of factors…some of which may not even be “factors” until the moment of your query. This isn’t new, but it’s becoming more and more apparent.
Also, it is known that Google injects “noise” into search results and rankings to mess with people trying to unravel their algorithm, and it’s not hard for Google to identify the source of rank scraping…so do you think your “rankings” as tracked by common tools are even close to accurate? Not likely.
RankBrain is pattern-matching for patterns that nobody else can see, because it has a point of view that nobody else has. It is particularly exceptional at processing one-off queries (queries that are new to Google). One of the great strengths of deep learning is that the more data it gets, the better it gets. The “algorithm” is changing itself constantly, based on a tremendous number of context points, and it is only going to get better, faster.
Because RankBrain sees patterns that aren’t otherwise obvious, and because it (probably) has access to numerous other AI tools from Google’s arsenal (image recognition, text analysis, etc.), it (or some variation of it, or another AI tool) is likely being used to find spam and manipulative tactics in new and sophisticated ways.
A few methods of spam detection might be:

- performing text analysis to find instances where the same writer produced content under different names but linked to common sites, exposing paid links from ghostwritten content or link networks
- finding sites on similar topics that link to similar things and use very similar images, identifying link networks
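The first of those methods boils down to authorship comparison. Here’s a crude sketch of the idea using bag-of-words cosine similarity; real stylometry uses far richer features (function words, syntax, character n-grams), so treat this as shape-of-the-method only, not a claim about Google’s actual tooling.

```python
import math
import re
from collections import Counter

# Represent each document as a word-frequency vector, then compare with
# cosine similarity: same-author text tends to reuse the same vocabulary.
def vectorize(text):
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

ghost_a = "I reckon the simplest tactic is honestly the simplest tactic overall"
ghost_b = "honestly I reckon the simplest overall tactic is the best tactic"
unrelated = "quarterly revenue grew while margins compressed across segments"

same = cosine(vectorize(ghost_a), vectorize(ghost_b))
diff = cosine(vectorize(ghost_a), vectorize(unrelated))
print(same > diff)  # prints True: similar word usage scores higher
```

Scale that up across billions of pages and add link-graph data, and “the same ghostwriter keeps linking to the same client sites” becomes a detectable pattern.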
Of course, since Google doesn’t want people to understand the full scope of what they can do on the spam fighting front, in many cases where they apply this sort of deep learning to penalization they might only devalue a link or slightly demote a site on the back-end rather than hitting it with a more clear or unilateral penalty.
There are, of course, numerous limitations to what machine learning can do, and you can still slide things under the radar if you’re careful, but it is getting harder and harder all the time.
Again, because of the various AI tools at its disposal, Google is better able to analyze everything on a page/site. It can understand exactly what is in your images, and how the content of the images matches up to your text content. It can compare your image usage to other pages and sites of similar topics for both similarities and differences.
AI can watch a video, understand the content and theme of the video, analyze vocal tonality and facial expressions for emotion, look at objects in the background to determine location, and numerous other insanely complex things.
It can analyze your text for readability, relevancy (across many factors), and even for knowledge and authority on a topic (by comparing your writing and word usage to that of known topical experts). It can then compare all of those data points to usability signals (CTR, bounce rate, search revision, etc.) for people who see your site or content in SERPs to match searcher intent to your specific content based on query, location, etc.
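Readability scoring, at least, is easy to demonstrate. Below is the classic Flesch reading-ease formula with a crude vowel-group syllable counter; it’s one tiny slice of what “text analysis” can mean, not a claim about how Google actually measures text.

```python
import re

# Flesch reading ease: higher scores mean easier text. Syllables are
# approximated by counting vowel groups, which is rough but serviceable.
def syllables(word):
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch(text):
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syl = sum(syllables(w) for w in words)
    n = len(words)
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syl / n)

simple = "The cat sat. The dog ran. We all had fun."
dense = ("Comprehensive algorithmic evaluation necessitates "
         "multidimensional contextualization of informational relevancy.")
print(flesch(simple) > flesch(dense))  # prints True: simpler prose scores higher
```

A score like this is trivially mechanical; the harder comparisons the paragraph describes (matching your vocabulary against known topical experts) would layer much heavier machinery on top of the same basic move: turn text into numbers, then compare.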
It can analyze code to find sites built by the same developer (coding style can be as unique as a fingerprint). It can process code to find things that were once hidden. Yes, it still struggles with some elements of this, but it’s getting better very, very quickly.
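Here’s a hypothetical sketch of “coding style as a fingerprint”: extract a few shallow style features from source text and compare them. A real system would lean on much deeper features (AST shape, naming habits, comment style); the feature names and sample snippets below are invented for illustration.

```python
import re

# Extract a tiny style fingerprint: indent width, quote preference, and
# whether the author uses snake_case identifiers.
def style_features(source):
    lines = source.splitlines()
    indents = [len(l) - len(l.lstrip(" ")) for l in lines if l.startswith(" ")]
    return {
        "indent": min(indents) if indents else 0,
        "single_quotes": source.count("'") >= source.count('"'),
        "snake_case": bool(re.search(r"\b[a-z]+_[a-z]+\b", source)),
    }

dev_a_site1 = "def load_data():\n  x = 'a'\n  return x\n"
dev_a_site2 = "def save_data():\n  y = 'b'\n  return y\n"
dev_b_site = 'def LoadData():\n    x = "a"\n    return x\n'

match = style_features(dev_a_site1) == style_features(dev_a_site2)
mismatch = style_features(dev_a_site1) == style_features(dev_b_site)
print(match, mismatch)  # prints: True False
```

Two sites sharing a fingerprint proves nothing on its own, but combined with shared hosting, templates, and link patterns, it’s one more signal pointing at a common operator.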
There’s certainly more, but this is a good basic rundown of what’s going on for Google on the AI front (and yes, some of what I outline above certainly falls outside the scope of RankBrain in particular…but still fits in the Google AI bucket).
Initially, RankBrain was given relatively little control of the overall search results, while A/B testing was performed to determine the quality of SERPs before and after. RankBrain was so insanely effective that it was very quickly given greater and greater control over SERPs. At this point, it could very well have full control over the SERPs…and if it doesn’t yet, you can bet your bottom dollar that it will if it continues to deliver higher quality results than other methods.
So what does this mean for SEOs?
It means that you really, really need to understand the vertical you’re working in, the customers you hope to serve, and how your customers and their searches and intent vary by location. You need to have experts writing topical content, not just the cheapest copywriter you can find.
You need to create best-in-class resources if you’re trying to answer complicated queries. You need to get your content translated by a native speaker if you plan to tackle other countries. You need to deeply analyze the sites that do well in your target vertical, to better understand how Google views that vertical.
In short, you need to stop trying so hard to game the system, and work harder to be the best possible result for any given topic that matters to you in the places that you care about…because Google’s AI systems are gunning for you 🙂