Today the Czech SEO community was honored with a Google Webmasters Hangout with Gary Illyes from Google. This 60-minute event, organized by Pavel Ungr and SEOloger, was packed with interesting information, so we decided to pick out the main topics discussed and put together an overview. We're also very happy that we had two representatives there (Filip Podstavec and Zdeněk Nešpor).
You can find the recording of the hangout here:
Let's start with a few interesting facts about Gary:
https://translate.google.com/#hu/en/Illyes (click the sound button)
And now on to the main topics of the discussion:
New Google Search Console
Hangout question: https://youtu.be/J_Eu9wj05fk?t=2m23s
Google doesn't have a time frame for the completion of the new Search Console, but it should be a matter of months. Their first focus is on the reports that are missing from the current version of Search Console; after that they'll work on an API update to provide developers with the new (and historical) data.
In the new GSC you can find the following URL statuses:
- Discovered - currently not indexed
- Crawled - currently not indexed
- Crawl anomaly
The first one means that Google has discovered the URL but hasn't crawled it yet. The second one (crawled - currently not indexed) means that the URL was discovered and also followed (and crawled) by Googlebot, but it is not in the index.
The last one, crawl anomaly, means that Google wasn't able to crawl the URL for some reason (e.g. a server error).
Indexed internal search results
Hangout question: https://youtu.be/J_Eu9wj05fk?t=8m53s
Google knows about this problem, but in some cases it makes sense to index internal search result pages and show them to users. Google divides such pages into only two kinds:
- Useful
- Useless
If they're useful for users, there is no reason not to show them in Google search results.
Does it make sense to implement structured data that is not explicitly mentioned in the Search Gallery?
Hangout question: https://youtu.be/J_Eu9wj05fk?t=13m11s
Generally... yes. It is something that can help Google understand your content better. :-)
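As a rough illustration (our own sketch, not something shown in the hangout), this is how such structured data could be emitted as JSON-LD; the schema.org "Service" type and all values below are placeholder assumptions:

```python
# Minimal sketch: build a JSON-LD block for a schema.org type that has no
# dedicated result type in the Search Gallery. All values are illustrative.
import json

structured_data = {
    "@context": "https://schema.org",
    "@type": "Service",  # assumed example type, not covered by the Search Gallery
    "name": "Technical SEO audit",
    "provider": {"@type": "Organization", "name": "Example Agency"},
    "areaServed": "CZ",
}

# Embed in the page <head>; Google can parse it even without a rich result.
print('<script type="application/ld+json">\n{}\n</script>'.format(
    json.dumps(structured_data, indent=2)
))
```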
Mobile-First Index
Hangout question: https://youtu.be/J_Eu9wj05fk?t=16m11s
Google doesn't have a timeline for the worldwide mobile-first switch, but webmasters can expect it in the coming months. So if you have a separate mobile version of your website with some obstacles related to mobile-first indexing, hurry up and finish your updates and fixes.
Google also has a great article about best practices for mobile-first indexing, which you can find here:
https://developers.google.com/search/mobile-sites/mobile-first-indexing
Does Google plan to share data about voice search in Search Console?
Hangout question: https://youtu.be/J_Eu9wj05fk?t=21m5s
This is something that Gary and his colleagues have talked about internally. The problem is that they don't see much value for webmasters in data like that.
But it seems that in the future Google will probably show us the ratio between full-text and voice searches for individual countries. For the US you can already find the share of voice searches on Think with Google.
Google's behaviour on spider traps
Hangout question: https://youtu.be/J_Eu9wj05fk?t=26m22s
1. Redirect loops
Google follows 5 redirects (actually it's 8, but they publicly say 5) and then stops (see the sketch below).
2. Infinite pagination
Google goes through the pagination for a few dozen URLs before it stops. For e-commerce websites Google tries to estimate which URL parameters are important and prioritizes them for crawling. It also tries to identify useless parameters and crawls them less.
Gary also doesn't recommend using a hash in the URL to prevent parameter crawling.
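To make the redirect limit concrete, here is a minimal sketch (our own illustration, not Google code) that follows a redirect chain hop by hop and gives up after 5 hops, roughly like Googlebot; the example URL is a placeholder:

```python
# Follow a redirect chain manually and flag chains longer than 5 hops.
import requests

MAX_HOPS = 5  # publicly stated limit; per Gary it is internally 8

def redirect_chain(url, max_hops=MAX_HOPS):
    chain = [url]
    for _ in range(max_hops):
        resp = requests.get(url, allow_redirects=False, timeout=10)
        if not resp.is_redirect:
            return chain, True   # chain resolved within the limit
        url = requests.compat.urljoin(url, resp.headers["Location"])
        chain.append(url)
    return chain, False          # still redirecting: likely a loop or trap

hops, resolved = redirect_chain("https://example.com/old-page")
print(hops, "resolved" if resolved else "gave up, like Googlebot would")
```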
How does Google work with social signals, especially from Facebook?
Hangout question: https://youtu.be/J_Eu9wj05fk?t=32m32s
Gary says that Google doesn't use metrics like shares and tweets as a ranking signal. The reason is that they need ranking signals that are reliable in the long run.
What is the best solution for lazy loading in terms of SEO?
Hangout question: https://youtu.be/J_Eu9wj05fk?t=35m32s
There is no perfect solution for lazy loading, but Gary recommends using <noscript> with the image, in combination with listing the image in an image sitemap. Thanks to <noscript>, the indexer can associate the image with the landing page.
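A minimal sketch of what that could look like (our own illustration; the URLs, the class name, and the data-src attribute are placeholder assumptions) generates both the <noscript> fallback and a matching image-sitemap entry:

```python
# Generate a lazy-loaded <img> with a <noscript> fallback plus an
# image-sitemap entry tying the image to its landing page.
IMG_URL = "https://example.com/images/product.jpg"
PAGE_URL = "https://example.com/product"

lazy_img_html = (
    '<img data-src="{0}" class="lazyload" alt="Product photo">\n'
    '<noscript><img src="{0}" alt="Product photo"></noscript>'
).format(IMG_URL)

image_sitemap_entry = (
    "<url>\n"
    "  <loc>{page}</loc>\n"
    "  <image:image>\n"
    "    <image:loc>{img}</image:loc>\n"
    "  </image:image>\n"
    "</url>"
).format(page=PAGE_URL, img=IMG_URL)

print(lazy_img_html)
print(image_sitemap_entry)
```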
Do we need to worry about spammy directories that automatically add our sites with a link to our website?
Hangout question: https://youtu.be/J_Eu9wj05fk?t=40m6s
In general Google can handle them, so you don't need to worry. What Google does with links like that is basically just ignore them. But if you want to be sure, use the Disavow tool.
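If you do go the disavow route, a minimal sketch (our own illustration; the domains and file name are placeholders) for generating the disavow file could look like this:

```python
# Write a disavow file listing spammy directories, one "domain:" rule per line,
# ready to be uploaded through the Disavow links tool in Search Console.
spammy_domains = ["spammy-directory.example", "auto-links.example"]

lines = ["# Spammy directories that added our site automatically"]
lines += ["domain:{}".format(d) for d in spammy_domains]

with open("disavow.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")
```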
Does Google use entity recognition and OCR algorithms on crawled images?
Hangout question: https://youtu.be/J_Eu9wj05fk?t=44m37s
Great news for high-authority websites! Google uses entity recognition algorithms on high-quality images on authoritative websites. And bad news for websites with low and medium authority, because these algorithms are costly, so Google doesn't use them on all images.
That means that for some authoritative websites Google is able to recognize whether an article about cars contains images of something related to cars or not.
QDF (Query deserves freshness)
Hangout question: https://youtu.be/J_Eu9wj05fk?t=52m38s
Yes, Google has something called the query-deserves-freshness algorithm, which can predict whether (and how much) people expect fresh content for a given query. It seems that this algorithm uses data from the news cluster.
How to deindex a lot of URLs fast
Hangout question: https://youtu.be/J_Eu9wj05fk?t=55m
An easy solution is to add these URLs to the Fetch and Render tool in Search Console and submit them to indexing. But if you have a lot of URLs, then Gary recommends adding a noindex directive to those URLs and adding them temporarily to the sitemap (or to a separate sitemap).
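As a minimal sketch (our own illustration; the URLs and file name are placeholders), such a temporary sitemap for the noindexed URLs could be generated like this:

```python
# Build a temporary sitemap listing only the URLs that have just been marked
# noindex, so Googlebot recrawls them quickly and drops them from the index.
from xml.sax.saxutils import escape

urls_to_deindex = [
    "https://example.com/old-category/",
    "https://example.com/duplicate-page/",
]

entries = "\n".join(
    "  <url><loc>{}</loc></url>".format(escape(u)) for u in urls_to_deindex
)
sitemap = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    + entries +
    "\n</urlset>"
)

with open("deindex-sitemap.xml", "w", encoding="utf-8") as f:
    f.write(sitemap)  # submit in Search Console, remove once the URLs are gone
```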
Does an AMP version of the content on an alternative URL affect the crawl budget?
Hangout question: https://youtu.be/J_Eu9wj05fk?t=59m50s
Yes, Google counts it as a separate URL to crawl.