
Zero surprised that Google is scanning Gmail emails for AI training. 😕

#google #degoogle #gmail #ai #aiscraping

https://www.classaction.org/news/thele-v.-google-llc

Lawsuit: Reddit caught Perplexity “red-handed” stealing data from Google results https://arstechni.ca/cJ4u #Perplexity.ai #googlesearch #webscraping #aiscraping #Perplexity #Policy #google #reddit #AI

Although the bland "A.I."-generated voice is detestable in the extreme, the overall concept is mildly amusing if unoriginal. Plus, there's a special treat for #DoctorWho fans in trying to fathom where the contents of the Laser Tracking Room were illicitly scraped from...

https://www.youtube.com/watch?v=sZkB11pO9R8

#cats #Caturday #AIart #TARDIS #copyright #AIscraping

Pay-per-output? AI firms blindsided by beefed up robots.txt instructions. https://arstechni.ca/tpDy #reallysimplylicensing #rslstandard #aicrawlers #aiscraping #AItraining #Policy #google #openai #meta #RSS #xAI #AI

Reddit blocks Internet Archive to end sneaky AI scraping https://arstechni.ca/pTjk #ArtificialIntelligence #InternetArchive #aiscraping #AItraining #Policy #google #openai #reddit #AI

(⁠⌐⁠■⁠-⁠■⁠) Perplexity is using stealth, undeclared crawlers to evade website no-crawl directives

https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives

#perplexity #aisearch #aiscraping

Scraping for AI training may or may not be legal. But the effort crawlers put into evading detection and blocking is a smoking gun, an admission this scraping is not fair.

https://arstechnica.com/tech-policy/2025/08/ai-industry-horrified-to-face-largest-copyright-class-action-ever-certified

#AIscraping #scrapers #ai

Today, Meta's list of sites they've targeted for training their AI was leaked. We're on their list.

I do everything possible to block AI bots. I use Cloudflare AI bot protection. I block what I can. I don't know if they actually get to read anything, but they want to read us.

https://www.dropsitenews.com/p/meta-facebook-tech-copyright-privacy-whistleblower

#Meta #AIscraping

Small image from Meta's list of sites they target for AI scraping showing that files.hear-me.social is on their list

I objected to #vinted's updated terms and conditions because they mention data analysis via #Ki #AIscraping, meaning they want to process user data with it. Since the account was already deactivated anyway, I also asked for deletion. Now I get the reply that vinted has a legitimate interest and, according to Article 17 of the EU General Data Protection Regulation (GDPR), may continue to analyze pretty much all of my data. I am told I can take legal action if I want. (1/2)

Publishers rally against AI scraping at IAB Tech Lab summit: Mediavine CRO Amanda Martin joined Google, Meta, and 80+ media executives to establish technical standards forcing AI platforms to respect publisher consent and compensation. https://ppc.land/publishers-rally-against-ai-scraping-at-iab-tech-lab-summit/ #AIscraping #PublisherRights #IABTechLab #MediaExecutives #TechStandards

#KI is running riot on the net 🤖🪓 – #Admins are pushing back 🦸

My @campact column from May is right on topic again today!

> Many thanks to all the admins who fight tirelessly to protect us users and the planet from the greed of AI. I hope this text contributes to a better understanding of the topic.

👉 https://blog.campact.de/2025/05/ki-randaliert-im-netz-admins-halten-dagegen/

#SysAdmins #SystemadminAppreciationDay #FediAdmins #AI #KIScraping
#AIScraping #TDM #AdminLeiden #MastoAdmin #DataPoisoning #aitxt #GPT #GreenIT

🔍 / #software / #automation / #scraping

You can build some pretty insane applications using just #LLMs, even if you don't really know what you're doing. But what separates a good AI app from a great AI app is one thing, and that's data.

🐱🔗 https://laravista.altervista.org/CatLink/links/321

#catlink #SoftwareAutomation #SoftwareAutomationScraping #Python #BrightData #AIScraping #AI

Cloudflare wants Google to change its AI search crawling. Google likely won’t. https://arstechni.ca/eTm6 #ArtificialIntelligence #googleaioverviews #googlesearch #aiscraping #AItraining #cloudflare #aicrawler #googlebot #Google #Policy #google #AI

News Summary: Cloudflare Launches Pay Per Crawl for AI Scraping; Amazon Hits One Million Robots

You’ve heard, of course, of pay-per-view. And we are used to streaming revenue on a pay-per basis from the likes of Audible and Spotify. This week has seen the launch (admittedly at the moment in beta) of possibly the most transformative source of pay-per revenue…
https://selfpublishingadvice.org/cloudflare/

#AIscraping #Amazonrobots #Cloudflare #generativeAI #PayPerCrawl
@indieauthors

News Summary: Cloudflare Launches Pay Per Crawl for AI Scraping; Amazon Hits One Million Robots https://selfpublishingadvice.org/cloudflare/ #websitemonetization #Amazonrobots #generativeAI #PayPerCrawl #AIscraping #Cloudflare #News

Open-Access Blogs vs. Members-Only Blogs: Which Is Better?

https://genxnotes.com/post/id/open-access-blogs-vs-members-only-blogs-which-is-better

#AIScraping #Login

Pay up or stop scraping: Cloudflare program charges bots for each crawl https://arstechni.ca/MJGn #ArtificialIntelligence #aiscraping #AItraining #cloudflare #robots.txt #aicrawler #Policy #aibots #AI

A website appears to be scraping hashtags and creating AI articles, and then replying to the OG post

It stole one of my posts (https://oldfriends.live/@paul/114770093020700675) for its AI-created article, then spammed me from s00laiman@mastodon.social

It's doing it with #HashTagGames tags and other trending hashtags.

Edit: making links dead as it appears to serve malware now: www.trend247daily.com/articles

#MastoAdmin

Article created from scraped post: www.trend247daily.com/article/mastering-the-art-of-the-productive-day-wake-up-look-busy-go-to-bed

See this thread above, unless the AI content spammer deletes its reply and breaks the thread.

I don't know where it is getting its content: from its Mastodon account (s00laiman@mastodon.social), RSS, or the API. If it has an application, I would hope staff@mastodon.social and moderation@mastodon.social would shut it down from scraping the API.

#Spam #Fediblock #AIScraping