Bot Traffic, Again
One of the annoying things I had happen the last time this blog was in active use was getting hammered by a rogue bot. It has happened again.
measure | count |
---|---|
blog hits from 12am March 1st to 2pm March 9th | 35121 |
blog hits in that time not from bots | 528 |
Hits by bot:
count | User-Agent |
---|---|
27543 | "Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)" |
4998 | "Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)" |
1001 | "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" |
449 | "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" |
216 | "Mozilla/5.0 (compatible; AhrefsBot/6.1; +http://ahrefs.com/robot/)" |
114 | "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.92 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" |
110 | "istellabot/t.1.13" |
74 | "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" |
65 | "Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)" |
37 | "msnbot/2.0b (+http://search.msn.com/msnbot.htm)" |
34 | "Mozilla/5.0 (compatible; SemrushBot/1.0~bm; +http://www.semrush.com/bot.html)" |
32 | "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.75 Safari/537.36 (compatible; SMTBot/1.0; +http://www.similartech.com/smtbot)" |
22 | "Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)" |
16 | "PHP-Curl-Class/8.0.1 (+https://github.com/php-curl-class/php-curl-class) PHP/7.0.33-0ubuntu0.16.04.12 curl/7.47.0" |
16 | "Mozilla/5.0 (compatible; SemrushBot/6~bl; +http://www.semrush.com/bot.html)" |
16 | "Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" |
16 | "SearchAtlas.com SEO Crawler" |
13 | "Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/2.0; +http://go.mail.ru/help/robots)" |
12 | "Mozilla/5.0 (compatible; Linespider/1.1; +https://lin.ee/4dwXkTH)" |
11 | "Jigsaw/2.3.0 W3C_CSS_Validator_JFouffa/2.0 (See <http://validator.w3.org/services>)" |
10 | "Validator.nu/LV http://validator.w3.org/services" |
10 | "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; AspiegelBot)" |
10 | "Mozilla/5.0 (compatible;Linespider/1.1;+https://lin.ee/4dwXkTH)" |
9 | "Mozilla/5.0 (compatible; SEOkicks; +https://www.seokicks.de/robot.html)" |
7 | "Googlebot-Image/1.0" |
6 | "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106" |
4 | "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534+ (KHTML, like Gecko) BingPreview/1.0b" |
4 | "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)" |
2 | "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Applebot/0.1; +http://www.apple.com/go/applebot)" |
2 | "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebit/53.7.36 (KHTML, like Gecko) Chrome/63.0.3239.0 Safari/537.36 (compatible; Linespider/1.1; +https://lin.ee/4dwXkTH)" |
2 | "Mozilla/5.0 (compatible; Pinterestbot/1.0; +http://www.pinterest.com/bot.html)" |
2 | "Mozilla/5.0 (compatible;AspiegelBot)" |
2 | "Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexMobileBot/3.0; +http://yandex.com/bots)" |
2 | "ltx71 - (http://ltx71.com/)" |
1 | "W3C_Validator/1.3 http://validator.w3.org/services" |
1 | "DomainStatsBot/1.0 (https://domainstats.com/pages/our-bot)" |
Hits by non-bots: 38 unique User-Agents (across ~500 hits)
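Counts like these can be pulled out of a web server access log with a short script. What follows is a minimal sketch, not necessarily how the numbers above were produced. It assumes the combined log format, where the User-Agent is the last double-quoted field on each line. The names `UA_RE`, `BOT_HINT`, `tally` and `report` are made up for illustration, and the bot test is a crude substring heuristic that will misclassify some clients.

```python
import re
from collections import Counter

# Assumption: combined log format, so the User-Agent is the last
# double-quoted field on the line.
UA_RE = re.compile(r'"([^"]*)"\s*$')

# Assumption: a crude heuristic. These substrings tend to appear in
# crawler User-Agents, but the list is neither complete nor exact.
BOT_HINT = re.compile(r"bot|crawl|spider|validator|curl|bingpreview|ltx71",
                      re.IGNORECASE)


def user_agent(line):
    """Return the User-Agent field of one log line, or None if absent."""
    match = UA_RE.search(line)
    return match.group(1) if match else None


def tally(lines):
    """Count hits per User-Agent, split into (bots, non_bots) Counters."""
    bots, non_bots = Counter(), Counter()
    for line in lines:
        ua = user_agent(line)
        if ua is None:
            continue  # skip lines that do not look like log entries
        target = bots if BOT_HINT.search(ua) else non_bots
        target[ua] += 1
    return bots, non_bots


def report(counter):
    """Format a Counter as 'count | "User-Agent" |' rows, busiest first."""
    return ['%d | "%s" |' % (n, ua) for ua, n in counter.most_common()]
```

To use it, open the log file (the path depends on the server), pass the file object to `tally`, and print the rows from `report(bots)`. `len(non_bots)` gives the number of unique non-bot User-Agents and `sum(non_bots.values())` gives their total hits.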
One user agent really stands out. And one other is suspicious. I'm talking about the two that hit my site more than world-famous Google.
I don't know everything MJ12bot does, but I do know that one thing it does is power paid access to "incoming" links reports via "Majestic Site Explorer": "Access raw exports from £79.99 a month". So let me get this straight: you crawl sites to sell people lists of who links to them? Why should I waste my bandwidth giving you pages?
But clearly it is MegaIndex that is abusive. At the .com version of the site I read: "MegaIndex is a powerful and versatile competitive intelligence suite for online marketing, from SEO and PPC to social media and advertising research." Again, this is a bullshit use of my resources (bandwidth, web server CPU) for some commercial enterprise that cannot benefit me.
So: another new plugin is born, browser_block. Goodbye MegaIndex. Goodbye Majestic.
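For illustration, the general idea of refusing requests whose User-Agent matches a blocklist can be sketched as a small piece of WSGI middleware in Python. This is not the browser_block plugin itself, and nothing here describes how that plugin is written. The names `BLOCKED`, `is_blocked` and `BlockBots` are hypothetical, and the two substrings are taken from the User-Agent strings in the table above.

```python
# Assumption: case-insensitive substring match against the User-Agent.
# The two entries come from the MegaIndex and MJ12bot rows in the table.
BLOCKED = ("megaindex", "mj12bot")


def is_blocked(user_agent):
    """Return True if the User-Agent contains any blocked substring."""
    ua = (user_agent or "").lower()
    return any(fragment in ua for fragment in BLOCKED)


class BlockBots:
    """WSGI middleware (hypothetical) that answers 403 to blocked bots."""

    def __init__(self, app):
        self.app = app  # the wrapped WSGI application

    def __call__(self, environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else goes through to the real application untouched.
        return self.app(environ, start_response)
```

Wrapping an existing application is then `application = BlockBots(application)`. A blocked bot gets a ten-byte response instead of a rendered page, which saves most of the bandwidth and CPU, though the request itself still reaches the server.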