Lmst

psa: "TikTokSpider" does not obey robots.txt if you're looking for something to block today (if you haven't blocked it already)

#webadmin #bots

psa: that "Thinkbot" user agent that you may have seen in your server logs does not obey robots.txt and should be blocked. rather than blocking its ip (it actually uses multiple ip addresses) as its user agent suggests, i would add a rule in your server config to give it a 403 on anything it tries to access

#webadmin #bots

in this house we don't turn off selinux because it's inconvenient, and if i catch you turning it off i will give you a talking to as i turn it back on and configure it properly while scowling profusely

#linux #webadmin

Hey people with websites:

I noticed that the URL inspector of the Google Search Console rather consistently claims a 'URL is not on Google', while anyone can actually find it using Google Search.

Why?!

I actually trusted the (very slow) tool and wasted a lot of time adding pages manually. Because Google also refuses to check the sitemap.xml on a regular basis.

#seo #google #webadmin #webmaster

Screenshot of error message described before

Let's Encrypt API is having trouble. 😄 The current status is orange; they are a little okay but still recovering due to high traffic. 😄 I want to make my next Mastodon instance that maybe lasts for ≤ 1 hour lol 😄. If God wills, He will grant my instance to last for ≥ 1 month. 😄

#LetsEncrypt #Mastodon #Fediverse #WebAdmin #TechHumor #SSL #API #WebHosting #ShortLivedInstance #SysAdminLife #DevOps #WebSecurity #CertificateProblems #HighTraffic #ExperimentalHosting

<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{HTTP:x-firefox-ai} !^$
    RewriteRule ^ - [F,L]
</IfModule>

:blobfoxthinking:

#webadmin #apache #apache2

website traffic continues to be like 80% bots

#webadmin #bots

Bookmarking this: https://billauer.co.il/blog/2025/05/phpbb-attack-bots-ip-addresses/

#tech #WebAdmin #Bots #Scraping #ScraperBots #DevOps #security

another day, another cloud hosting provider banned from the website for hosting exploit scanning bots

#webadmin #bots

the Chrome/129 bots still haven't figured out that i'm onto their bullshit and none of their requests are succeeding

#webadmin #bots

seems that the Chrome/129 botnet still hasn't figured out that i'm onto their bullshit and all they're getting for every request is a "please update your browser" page

#webadmin #bots

seems like a lot of the web scanning bots (the ones searching for admin login pages) are switching to a Chrome/121 user agent

* note: the Chrome/129 web bots are still present, and afaict are not looking for admin login pages or exploits, they seem to be trying to scrape all the site's content, probably for an llm or something (conjecture), but why they would go to the trouble of setting that up through what appears to be tor is beyond me

#webadmin #bots

i also found out that googlebot (the legit one operating from google's address space) will occasionally use a user agent of Chrome/99, despite using a more up to date ua most of the time

#webadmin

looking at the unusual Chrome/129 traffic again, it's definitely some kind of botnet, it's using hundreds of mostly unique ips, and there's periodic (hourly?) surges of requests. i'm sure there's some real people with outdated browsers getting mixed in with my apache rule, so i'm glad i didn't just do a straight block and went with a redirect page instead

#webadmin #bots

-A INPUT -s 220.187.192.0/19 -m comment --comment "bot using spoofed googlebot user agent" -j DROP

#webadmin #bots

lol someone has a bot scanning with GET /.DS_Store HTTP/1.1

#webadmin #bots

additionally, Mozilla/4.0 is only used by Internet Explorer in compatibility mode, so you may want to block that as well, since all versions of IE are deprecated at this point and shouldn't be used for normal web browsing. most versions of IE can't even connect to modern websites anymore, the only one that supports TLSv1.2 is IE11 on Win10, so if you have old protocols disabled it's almost guaranteed that any traffic you see with older IE user agents is not real (i do know there are ways to get older versions of IE working through a local tls proxy, i've done it myself for retrocomputing purposes)

although i'm not sure if edge in IE mode will use that user agent, i haven't played around with it much

#webadmin #bots

you can block user agents that match ^Mozilla but not ^Mozilla/[45]\.0 \(

* note: this will block some web crawlers such as bingbot/gptbot/claudebot as well because they put the AppleWebKit/537\.36 token in the wrong place (before the first parentheses), but you should be blocking at least the latter two of those anyway

#webadmin #bots

-A INPUT -s 168.220.247.0/24 -m comment --comment "bot using spoofed googlebot user agent" -j DROP

i am now naming and shaming for doing this

#webadmin #bots

psa: noticing a lot of unusual bot-like requests using the Chrome/129 user agent. rather than a straight up 403 block, i'm 302 redirecting them to a "please update your browser" page, in case any real users are still using an 8 month old browser for some reason

#webadmin #bots

#webadmin

Client Info