Gary Illyes’ Post

Analyst at Google

PSA from my inbox: check what traffic your firewalls and CDN are blocking. By far the most common issue in my inbox is related to firewalls or CDNs blocking googlebot traffic. If I reach out to the blocking site, in the vast majority of the cases the blockage is unintended. I've said this before, but want to emphasize it again: make a habit of checking your block rules. We publish our IP ranges so it should be very easy to run an automation that checks the block rules against the googlebot subnets. https://lnkd.in/e4DiAbx4

Googlebot and other Google crawler Verification | Google Search Central | Documentation | Google Developers

developers.google.com
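
The automation suggested in the post can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's tooling: it assumes the published range file is JSON with a `prefixes` list whose entries carry an `ipv4Prefix` or `ipv6Prefix` key, and that your firewall or CDN block rules can be exported as CIDR strings. The function names and the sample addresses (reserved documentation ranges, not real Googlebot subnets) are hypothetical.

```python
import ipaddress


def parse_googlebot_prefixes(doc):
    """Extract networks from a dict shaped like the published range file.

    Assumed shape: {"prefixes": [{"ipv4Prefix": "..."}, {"ipv6Prefix": "..."}]}
    """
    networks = []
    for entry in doc.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def find_blocked_googlebot_ranges(block_rules, googlebot_networks):
    """Return (block_rule, googlebot_network) pairs whose address space overlaps."""
    conflicts = []
    for rule in block_rules:
        # strict=False tolerates rules written with host bits set, e.g. 192.0.2.17/28
        rule_net = ipaddress.ip_network(rule, strict=False)
        for bot_net in googlebot_networks:
            if rule_net.version == bot_net.version and rule_net.overlaps(bot_net):
                conflicts.append((str(rule_net), str(bot_net)))
    return conflicts


# Placeholder data: documentation ranges standing in for real subnets and rules.
sample_doc = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/27"},
        {"ipv6Prefix": "2001:db8::/64"},
    ]
}
sample_block_rules = ["192.0.2.16/28", "198.51.100.0/24", "2001:db8::/32"]

conflicts = find_blocked_googlebot_ranges(
    sample_block_rules, parse_googlebot_prefixes(sample_doc)
)
for rule, bot_net in conflicts:
    print(f"block rule {rule} overlaps crawler range {bot_net}")
```

In practice you would load the real range file with `json.load` and feed in the block list exported from your firewall or CDN, then run the check on a schedule, since the published ranges can change.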

Kristine Schachinger

Consultant | SEO, Technical SEO, SEO Website Audits, Growth Strategies, Google Penalty Recovery, Accessibility, Usability, LLMs, & Social Media

7mo

I have a site tonight that is showing that in tools. How can I confirm that the tools are right, since they use an emulated Googlebot? Gary Illyes

Arne Böckenhauer

Marketing / SEO + Engineer | No, machines don't sell themselves. 10 years of SEO & marketing + 14 years in international companies as a mechanical engineer.

1y

I'm currently trying to filter bad bots and non-human traffic on purpose and only let good bots in, like Google. Because to me, it looks like content is being rehashed by AI, content spinners, etc. through that machine traffic, whatever is happening there.

Is it not possible to run all Googlebot crawler IP addresses on ASN 15169 instead of ASN 396982? Then I would just have a short, easily handled nginx rule:

    if ( $geoip2_data_autonomous_system_number ~* (15169|...)) { set $badbotasn 0; }

Some IPs run on ASN 396982 (Google Cloud), and I don't have a good feeling about that. The code is also much longer (I hope this is correct):

    if ($remote_addr ~* (34\.100\.182\.96|34\.101 ... .|35\.247\.243\.240)) { set $badbotasn 0; }

Indeed, something like this also comes from the Google Cloud platform:

    GET /robots.txt HTTP/1.1" 301 162 "-" "Apache-HttpClient/4.5.13 (Java/17.0.3)

I'm sure it's not John. :-) We have more bits, longer rules.

Fun fact: the last person who stole my content 1:1 also copied the internal links at the same time. This made the plagiarism check easy thanks to valuable link hints in GSC :-)

And a nice tool to see the reputation of an IP and what has happened: https://www.abuseipdb.com/check/35.245.188.175
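
One way to keep a rule like the `$remote_addr` regex in this comment short is to generate an nginx `geo` block from a list of CIDR prefixes, since `geo` matches on subnets rather than escaped address strings. The Python sketch below is only an illustration: the function name is hypothetical, the prefixes are reserved documentation ranges rather than real crawler subnets, and it assumes the commenter's convention that `$badbotasn` is 0 for allowed traffic and 1 otherwise.

```python
import ipaddress


def render_nginx_geo_block(variable, allowed_cidrs, default_value=1, allowed_value=0):
    """Render an nginx geo block that maps allowed CIDR prefixes to allowed_value."""
    lines = [f"geo ${variable} {{", f"    default {default_value};"]
    for cidr in allowed_cidrs:
        # ip_network validates the prefix and raises ValueError on malformed input
        network = ipaddress.ip_network(cidr)
        lines.append(f"    {network} {allowed_value};")
    lines.append("}")
    return "\n".join(lines)


# Placeholder prefixes; replace with the ranges you actually want to allow.
geo_block = render_nginx_geo_block("badbotasn", ["192.0.2.0/27", "2001:db8::/64"])
print(geo_block)
```

The generated text belongs in the `http` context of the nginx configuration, and regenerating it from the published range file avoids hand-editing a long alternation of escaped IPs.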

Baptiste Wallerich

SEO expert and full-stack developer | 125+ clients supported | 10 years of experience | NPS: 95

1y

Gary Illyes what does PSA stand for? Google is telling me "prostate-specific antigen" (Google France, mobile desktop) but I'm quite sure it's not what you intended 😅 (it's not a joke)

A little offtopic but still worth sharing. I'm consulting with a brand on a pending migration. It must be completed within 11 months. We are testing a lot of things. Different tech stacks. Different CDNs. And yes, different hosting providers. I could not believe it when a few days ago we had 100% packet loss (no traffic) on some (classic) ports. That's not even blocking a bot. It's no traffic at all for some services. Took me almost an hour to find out what was going on. And less than 2 minutes to rewrite the firewall rules. Problem solved! And then, a few minutes after solving this, I noticed another issue. This hosting brand has been removed from our shortlist.

Peter Macinkovic

Technical SEO Lead - Makes the SERPs dance with tech magic

1y

It would be great if we had better debugging tools. We have a persistent bug that *only* triggers on the real Google WRS, occasionally, on a few select pages. We were never able to identify the conditions that caused it (maybe Cloudflare rules, maybe the API not running; live tests were always fine), so we had to rely on a system of fallbacks to mitigate damage whenever it would randomly occur.

Adnan Islamovic

SEO - Marketing Specialist - 20+ years of SEO experience - Security Analyst - Offensive Security

1y

This is a common issue with shared hosting. Their firewalls block part of the legitimate traffic, along with Googlebot and whatnot.

Dido Grigorov

Student @ Stanford (AI Program) | Computer Science Student @IU of Applied Sciences | Backend Developer | SEO Expert

1y

This is a very common issue with improperly configured Cloudflare...

Damian S.

SEO & Marketing Strategist Beyond Marketing

1y

If you don't know, here is SEO Gold.

Louis Smith

Ecommerce SEO Consultant To Increase Your Traffic and Profits 📈 helped drive $80million+ MRR | Enterprise Shopify Search Engine Optimisation Specialist

1y

Gold dust in these documents, Gary, and every brand should be checking these! Thank you 🔥

Sergejs Ponomarjovs

Director of SEO | SEO Problem Solver | 20 Years XP | Award Winning

1y

Thank you for sharing
