#AIbots

2025-05-18

Second day* online, and the bots are already pouncing on it. What a plague!
#AIbots #Bots #traffic #website
*Edit: not even 24 hours yet.

Screenshot showing the access statistics of a web instance: 1025 hits, and the site has been online for less than 24 hours.
2025-05-14

If you are interested in an easy and low-tech way to separate bots from humans in your logs without exposing any user data, you can do this with a little web server config and a single line of JavaScript.

michaelnordmeyer.com/a-simple-

#Bots #AIBots #LowTech #Analytics #SelfHosting #Blogging

Mr Tech King mrtechking
2025-05-06

AI bots shook Reddit. Now, user verification is planned to keep it human. But can Reddit protect its vital anonymity, a core promise to users?

Reddit's AI Bot Fight: Verification Up, Anonymity at Risk?
Mr Tech King mrtechking
2025-05-06

AI bots infiltrated Reddit's ChangeMyView, sparking a crisis. Now Reddit mulls user verification to stay human, risking its cherished anonymity. Can they balance authenticity & privacy?

Reddit Acts on AI Bots: New Verification, Anonymity Safe?
Miguel Afonso Caetano remixtures@tldr.nettime.org
2025-05-03

"When Reddit rebranded itself as “the heart of the internet” a couple of years ago, the slogan was meant to evoke the site’s organic character. In an age of social media dominated by algorithms, Reddit took pride in being curated by a community that expressed its feelings in the form of upvotes and downvotes—in other words, being shaped by actual people.

So earlier this week, when members of a popular subreddit learned that their community had been infiltrated by undercover researchers posting AI-written comments and passing them off as human thoughts, the Redditors were predictably incensed. They called the experiment “violating,” “shameful,” “infuriating,” and “very disturbing.” As the backlash intensified, the researchers went silent, refusing to reveal their identity or answer questions about their methodology. The university that employs them has announced that it’s investigating. Meanwhile, Reddit’s chief legal officer, Ben Lee, wrote that the company intends to “ensure that the researchers are held accountable for their misdeeds.”

Joining the chorus of disapproval were fellow internet researchers, who condemned what they saw as a plainly unethical experiment. Amy Bruckman, a professor at the Georgia Institute of Technology who has studied online communities for more than two decades, told me the Reddit fiasco is “the worst internet-research ethics violation I have ever seen, no contest.” What’s more, she and others worry that the uproar could undermine the work of scholars who are using more conventional methods to study a crucial problem: how AI influences the way humans think and relate to one another."

theatlantic.com/technology/arc

#AI #GenerativeAI #SocialMedia #Chatbots #AIBots #Reddit #MediaManipulation #Persuasion

2025-05-02

AI Crawlers stealing your content? Time to fight back! 💪

LLMs and AI bots are scraping the web, stealing your data, hogging bandwidth, and even crashing servers under aggressive loads.

Don’t let them freeload! The CrowdSec AI Crawlers Blocklist stops unwanted harvesting before it hurts your site’s performance or privacy.

Regain control over your digital assets: crowdsec.net/blog/protect-agai

#AIcrawlers #blocklists #threatintelligence #cybersecurity #infosec #AIbots #dataprotection

2025-04-21

The requests from residential and even mobile IPs, which I have been noticing for at least a year, might be explained by sketchy SDKs embedded in apps to form a botnet of at least 15 million IPs. These SDKs are embedded voluntarily by the app developers, because they get a percentage of the proceeds. The resulting networks are sometimes called a peer-to-business (P2B) service.

jan.wildeboer.net/2025/04/Web-

#AIBots #BadBots

2025-04-21

Today on my blog, a bad bot requested the same URL 30 times in less than a second, all from different IPs from residential and mobile ISPs with faked browser user-agents. I would understand if they had requested 30 different URLs in the same time frame, but the same URL 30 times?
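A pattern like this (one URL, many IPs, under a second) can be picked out of a log with a sliding window. The following is only a rough sketch in Python: the `(timestamp, ip, url)` tuple layout, the function name `find_bursts`, and the default thresholds are assumptions for illustration, not anything taken from the post.

```python
from collections import defaultdict


def find_bursts(hits, min_hits=30, window=1.0):
    """Find URLs requested at least `min_hits` times within `window` seconds.

    `hits` is an iterable of (timestamp_in_seconds, ip, url) tuples
    (an assumed layout; adapt it to your own log parser).
    Returns a dict mapping each bursty URL to the set of IPs in the burst.
    """
    by_url = defaultdict(list)
    for ts, ip, url in hits:
        by_url[url].append((ts, ip))

    bursts = {}
    for url, entries in by_url.items():
        entries.sort()  # order by timestamp
        start = 0
        for end in range(len(entries)):
            # Shrink the window until it spans at most `window` seconds.
            while entries[end][0] - entries[start][0] > window:
                start += 1
            if end - start + 1 >= min_hits:
                bursts[url] = {ip for _, ip in entries[start:end + 1]}
                break
    return bursts
```

A burst whose IP set is almost as large as the hit count matches the distributed pattern described here, which is why per-IP rate limiting alone would not catch it.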

If I were using a dynamic blog like #WordPress without proper caching, the blog could have gone down. Thanks to a static site generator like #Jekyll or #Hugo, I’m just annoyed, but not impacted.

#SSG #AIBots

Albin Larsson abbe98
2025-04-21

The vast majority of AI scraper traffic hitting the sites I maintain comes from Cloudflare’s, Amazon’s, Microsoft’s, or Google’s networks.

They must make so much money on all of this given that they host so much of the Internet.

Miguel Afonso Caetano remixtures@tldr.nettime.org
2025-04-18

"American police departments near the United States-Mexico border are paying hundreds of thousands of dollars for an unproven and secretive technology that uses AI-generated online personas designed to interact with and collect intelligence on “college protesters,” “radicalized” political activists, and suspected drug and human traffickers, according to internal documents, contracts, and communications 404 Media obtained via public records requests.

Massive Blue, the New York-based company that is selling police departments this technology, calls its product Overwatch, which it markets as an “AI-powered force multiplier for public safety” that “deploys lifelike virtual agents, which infiltrate and engage criminal networks across various channels.” According to a presentation obtained by 404 Media, Massive Blue is offering cops these virtual personas that can be deployed across the internet with the express purpose of interacting with suspects over text messages and social media.

Massive Blue lists “border security,” “school safety,” and stopping “human trafficking” among Overwatch’s use cases. The technology—which as of last summer had not led to any known arrests—demonstrates the types of social media monitoring and undercover tools private companies are pitching to police and border agents. Concerns about tools like Massive Blue have taken on new urgency considering that the Trump administration has revoked the visas of hundreds of students, many of whom have protested against Israel’s war in Gaza."

404media.co/this-college-prote

#AI #AIBots #SocialMedia #Bots #MassiveBlue #USA #PoliceState #Surveillance

2025-04-15

Extended to cover the use of the robots.txt file.

linux-nerds.org/topic/1689/ai-

#linux #nginx #aibots

Elliott Bledsoe elliottbledsoe
2025-04-14

Here's WTF happened last week:

eb.wtf/3G6WotT

1️⃣ More arts industry organisations come out with election asks.

2️⃣ Spotify’s Turn Up AUS campaign hopes to turn up the volume on Australian music.

3️⃣ AI web scraping is costing Wikimedia and lots of other content creators money in increased bandwidth.
@wikipedia @wikimediafoundation

A large question mark in two shades of orange and a large exclamation mark in two shades of green. Both sit on a bright pink background.
Cloudbooklet cloudbooklet
2025-04-14

🚨 OpenAI Week is HERE! 🚨

OpenAI is preparing to launch GPT-4.1 nano, alongside o4-mini and o3. Could this be the open-source model Sam Altman hinted at?

A smaller, more efficient GPT-4 could make AI more accessible, sparking new innovations for developers and researchers. Can't wait to see how this unfolds!

OpenA
AiBay aibay
2025-04-10

🤖 AI bots are laying siege to Wikimedia! Find out what is happening and how it is affecting us.

🔗 aibay.it/notizie/il-boom-dei-b

2025-04-06

something that is very close to my heart, and that makes me wonder why it still has to be an issue at all.

weact.campact.de/petitions/dee

Grumpy Old Techie 🕊️ grumpyoldtechie@hostux.social
2025-04-04

At least OpenAI is now trying to read robots.txt, but it is still abusively hitting the server from multiple IP addresses. They also want you to do a robots.txt setup specific to them. They also tried to bypass Cloudflare.
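To see what a bot-specific robots.txt rule actually permits, one option is Python's standard `urllib.robotparser`. This is only a sketch: the rules below are a made-up example, and using `ChatGPT-User` as the agent token is an assumption based on the user agent in the log line in this post. robots.txt is advisory, so this shows what a compliant crawler should do, not what a given bot does.

```python
from urllib.robotparser import RobotFileParser

# Example rules (an assumption for illustration): shut out one named
# agent and allow everyone else.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if `user_agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

For example, `is_allowed(ROBOTS_TXT, "ChatGPT-User/1.0", "/")` is false under these rules, while an ordinary browser user agent is allowed.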

52.159.249.97 - - [04/Apr/2025:19:41:53 +0200] "GET / HTTP/2.0" 444 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +openai.com/bot"

They do publish their IP ranges; we will see if they are telling the truth.
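Checking a logged client IP against a published list of ranges takes only a few lines with Python's `ipaddress` module. In this sketch the CIDR blocks are placeholders from the RFC 5737 documentation ranges, not real published ranges, and the helper names are made up for illustration; swap in whatever list the operator actually publishes.

```python
import ipaddress

# Placeholder CIDR blocks (RFC 5737 documentation ranges).
# Assumption: replace these with the ranges the bot operator publishes.
PUBLISHED_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def parse_networks(cidrs):
    """Turn CIDR strings into ip_network objects."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def client_ip(log_line):
    """Return the first field of a combined-format log line (the client IP)."""
    return log_line.split(" ", 1)[0]


def ip_in_ranges(ip, networks):
    """Return True if the address falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```

A request that carries the bot's user agent but comes from outside the published ranges is either spoofed or evidence that the list is incomplete.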

#ai #aibots

2025-04-04

Imagine: Text corpora, curated by #libraries, whether #OpenAccess or licensed, ready to use out of the #IDE of a #DataScientist in a secure environment. No need to download lots of data, no need to worry about #AIBots. Here it is:
blog.oceanprotocol.com/free-co

We already delivered a Proof-Of-Concept for @CrossAsia and will continue...

2025-04-03

This is a CPU graph of a web host that began having AI bots absolutely slam it starting at 4am UTC.

I blocked all Chrome user agents older than 120 at about 10:45 UTC.

These AI bots aren't using "nice" names like ChatGPT or AmazonBot. No, more like Chrome/116 or similar and they come from ALL OVER.

I am so tempted to put Iocaine or Nepenthes on the machine to generate Markov Chain garbage to poison the well, but I'd have to have Nginx map the older user agent string with regex. It probably could be done but this might piss off my employer. 😂
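The "older than Chrome 120" test comes down to pulling the major version out of the user-agent string. Here is a minimal sketch of that matching logic in Python; it is not the actual server config used here, and the function names and the default cutoff parameter are assumptions for illustration.

```python
import re

# Matches the major version in tokens such as "Chrome/116.0.0.0".
CHROME_RE = re.compile(r"Chrome/(\d+)")


def chrome_major(user_agent):
    """Return the Chrome major version as an int, or None if absent."""
    match = CHROME_RE.search(user_agent)
    return int(match.group(1)) if match else None


def is_stale_chrome(user_agent, minimum=120):
    """True if the UA claims to be Chrome with a major version below `minimum`."""
    major = chrome_major(user_agent)
    return major is not None and major < minimum
```

Other Chromium-based browsers also send a `Chrome/` token, so a rule like this catches outdated versions of those too, and real users on old browsers get blocked along with the bots.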

#SysAdmin #AIbots #FuckAI

A line graph of a web server's CPU utilisation over time. At 4am UTC it spikes to between 80 and 100% CPU, then drops to 10% or below at 10:45 am UTC, when I blocked the bots.
2025-03-30

Enabled Cloudflare AI Labyrinth

feddit.online/post/485431
