#Scraper

MinmiTheDinominmi@sfba.social
2025-12-09

Ok you beautiful nerds, can I pay one of you to build a #scraper of all the report cards of sfusd schools?

They all live in drives like this:
drive.google.com/drive/mobile/

Also can I see your code/any documentation when you’re done? This isn’t something I know how to do, but I’d like to learn and just don’t have the bandwidth to tinker it all out.

If interested, send me an idea of how much you’d charge.

#python #getFediHired

2025-12-06

A developer is gauging demand for a LinkedIn data-collection tool (scraper) that retrieves profile, job, and company information. They want to learn about use cases from the community. Similar solutions are also available for Facebook, Zillow, and Google Maps.

#LinkedInScraper #CôngCụ #ThuThậpDữLiệu #SideProject #PhátTriểnPhầnMềm #Developer #DataExtraction #Scraper

reddit.com/r/SideProject/comme

2025-11-17

Automating 40% of the job search process with a simple scraper and AI toolkit #TìmKiếmViệc #Automation #AI #Scraper #JobSearch #TựĐộngHóa #CôngNghệ

reddit.com/r/SideProject/comme

shom ✊🏽🐧📷🤿🏔️🪚shom@gts.shom.dev
2025-11-17

staying stealthy and reliable so your agents always get what they need without friction

"Staying stealthy"!? Hey #Mozilla, stealthy against whom? The companies trying to track users across the open web? That's the type of tech your charter calls for right!? No?

It can click, scroll, search, and submit just like a human, navigating complex flows with real-time feedback and adaptive behavior. Get full control of the web, without the complexity.

You're building tech for the parasites to scrape the open web "without friction"? You're mimicking the very users your foundation claims to put first against extractive systems. Do you have no sense of shame or at least irony!? We're looking to you to be the last bastion that can defend the web at scale and you're out here building a Trojan horse under the cover of darkness.

It's not just that I have lost respect, I've lost hope. I am not sorry for this very confrontational and scorched-earth post. I'm sad and seething!

via @elilla
https://transmom.love/@elilla/115564272417922503

#Mozilla #OpenWeb #AI #Bot #Scraper

2025-10-29

"fact-checked" indeed. #grokipedia #scraper #brokenpedia

Screenshot of a supposed article by Grokipedia: "Verbatim"

The site stated that it was fact-checked by Grokipedia 2 days ago. The content is copied verbatim from the original Wikipedia disambiguation page "Verbatim":

Verbatim
Look up verbatim in Wiktionary, the free dictionary. Verbatim means word for word. It may refer to:

    Verbatim (album), a 1996 album by Bob Ostertag
    "Verbatim" (song), a 2015 song by Blackbear
    Verbatim (brand), a brand of storage media and flash memory
    Verbatim (horse), an American racehorse
    Verbatim (magazine), edited by Erin McKean
    Verbatim theatre, a form of documentary theatre
Kiwixkiwix
2025-10-03

MWoffliner 1.17 has been released!
npmjs.com/package/mwoffliner/v

This is a minor new version of our flagship, but the changelog is still huge, with a great many bug fixes!
github.com/openzim/mwoffliner/

2025-09-04

A "poor-man's defense" in #bash against #Bots and #scraper, "carried over to Codeberg" as a repo 😜

blog.jakobs.systems/micro/2025

Screenshot of the repo in a web browser
Dave/Loebas :verified_pride:venthewolf@meow.social
2025-09-04

Bad morning,

I just woke up to find out that OpenAI scraped my entire website, ignoring robots.txt

#webmaster #scraper #openai
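
For readers unfamiliar with the robots.txt point in the post above: a well-behaved crawler is expected to check that file before fetching a page. The following is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `example.com` URLs, and the `SomeBrowser` agent name are made up for illustration and are not taken from the site mentioned in the post; `GPTBot` is used as the user-agent token for OpenAI's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, written for this example only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("GPTBot", "https://example.com/blog/post"))
    print(is_allowed("SomeBrowser", "https://example.com/blog/post"))
```

Nothing enforces these rules on the server side; the check only works if the crawler chooses to run it, which is the complaint in the post.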

❀𝓪𝓵𝓬𝓮𝓪𖤐alcea@alceawis.com
2025-08-31
@PeterKratz Apart from #errors, #KI / #AI does little that is really impressive..
Stolen datasets glued together

You have to dig up most of the specialized information yourself, and anyone who doesn't believe in #backups might as well not bother.

Hallucinated #Endpoints, #Scraper approaches that get you nowhere.

AI is also at "kid eats craft glue" level

for now.
(That doesn't stop #CEO s from firing lots of people and leaving the field to #DALLE #ChatGPT and #DeepSeek, though.)

The info from #Reddit: they love it 😹

•acws #acws
2025-08-17

Building a scraper on top of pg_net in PostgreSQL: I found myself in need of a way to chain requests together (which is what you do when you scrape). It is a simple library built on top of pg_net.

A diagram showing the architecture of what I am building.
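
To illustrate what "chaining requests" means in the post above, here is a minimal sketch in Python rather than PostgreSQL. It does not use or imitate pg_net or the author's library; the in-memory `FAKE_SITE`, the paths, and the function names are all hypothetical stand-ins for real HTTP calls.

```python
from typing import Callable

# Hypothetical in-memory "site" standing in for real HTTP responses.
FAKE_SITE = {
    "/index": {"links": ["/items/1", "/items/2"]},
    "/items/1": {"title": "first"},
    "/items/2": {"title": "second"},
}


def fetch(path: str) -> dict:
    """Pretend to perform a request and return the decoded response."""
    return FAKE_SITE[path]


def scrape_titles(fetch_fn: Callable[[str], dict] = fetch) -> list[str]:
    """Chain requests: the first response decides which requests come next."""
    index = fetch_fn("/index")  # request 1: the listing page
    # Requests 2..n are built from the links found in response 1.
    return [fetch_fn(link)["title"] for link in index["links"]]


if __name__ == "__main__":
    print(scrape_titles())
```

The point of the sketch is only the dependency between steps: the follow-up requests cannot be issued until the first response has been read.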
Austin Huang ❤️austin@mstdn.party
2025-08-12

Since people are dunking on #Meta #scraping again I'll share one tidbit: when @jonah and I were investigating some performance issues, I noticed that Meta-ExternalAgent was scraping /auth/sign_up and one specific invite link with different `accept` parameters (which indicate acceptance of rules). However, because Mastodon returns 200 (and shows the rules again) on invalid `accept` parameters, the #scraper just keeps going...
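
One way to picture the behaviour described above is the small simulation below. It is an interpretation, not Meta's or Mastodon's actual code: `fake_sign_up`, the `?accept=` query format, and the page text are assumptions, and the endpoint is modelled as always answering 200 with the same rules page. A crawler that only deduplicates by URL sees every new `accept` value as a new page, while one that also hashes the response body stops after the first copy.

```python
import hashlib

# Assumed page body; the real rules page will differ.
RULES_PAGE = "<html>Please read and accept the server rules</html>"


def fake_sign_up(accept: str) -> tuple[int, str]:
    """Stand-in endpoint: any `accept` value yields 200 and the rules page."""
    return 200, RULES_PAGE


def crawl(accept_values, dedupe_by_content: bool) -> int:
    """Return how many responses the crawler treats as new pages."""
    seen_urls, seen_hashes, kept = set(), set(), 0
    for value in accept_values:
        url = f"/auth/sign_up?accept={value}"  # hypothetical URL shape
        if url in seen_urls:
            continue
        seen_urls.add(url)
        status, body = fake_sign_up(value)
        if status != 200:
            continue  # an error status would end this branch of the crawl
        digest = hashlib.sha256(body.encode()).hexdigest()
        if dedupe_by_content and digest in seen_hashes:
            continue  # same content under a different URL
        seen_hashes.add(digest)
        kept += 1
    return kept


if __name__ == "__main__":
    print(crawl(range(5), dedupe_by_content=False))
    print(crawl(range(5), dedupe_by_content=True))
```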

Rod2ik 🇪🇺 🇨🇵 🇪🇸 🇺🇦 🇨🇦 🇩🇰 🇬🇱☮🕊️rod2ik
2025-07-24

#Darkdump - The #OSINT ( #Open #Source #Intelligence ) tool that digs through the #dark #web for you

Basically, you type a keyword, and the tool will scrape ( #scraper ) .onion sites to extract emails, metadata, keywords, images, and links to social networks

korben.info/darkdump-outil-osi

Rod2ik 🇪🇺 🇨🇵 🇪🇸 🇺🇦 🇨🇦 🇩🇰 🇬🇱 ☮🕊️rod2ik.bsky.social@bsky.brid.gy
2025-07-24

#Darkdump - The #OSINT ( #Open #Source #Intelligence ) tool that digs through the #dark #web for you. Basically, you type a keyword, and the tool will scrape ( #scraper ) .onion sites to extract emails, metadata, keywords, images, and links to social networks korben.info/darkdump-out...

Darkdump - L'outil OSINT qui f...

🍮 Vanilla(static)CMSlaravista@mastodon.uno
2025-07-07

🔍 / #software / #automation / #API

#ZenRows turns hours of tedious lead hunting into a fast, reliable, and automated data workflow that fills your data pipelines.

🐱🔗 laravista.altervista.org/CatLi

#catlink #SoftwareAutomation #SoftwareAutomationAPI #scraper #DataExtraction

2025-07-05

I hate #AI #scraper bots so much for forcing me to publish my work with ugly #watermarks and XML… What a hassle. x_X

2025-07-01

»Cloudflare Introduces Default Blocking of A.I. Data Scrapers«

Nice, but it will hardly work. The reason: advanced scrapers use browser emulation and rotating IPs to pass themselves off as real users and evade technical detection. Since this is only a server-side measure with no legal force, such actors can ignore the blocks easily and without consequences.

nytimes.com/2025/07/01/technol

#cloudflare #ai #ki #scraper

/kuk

eicker.TV ▹ Tech Newseickertv
2025-06-25

Mastodon prohibits training AIs on content from its own platform.

New […] against […] 🚫 […] explicitly prohibits, as of July 1, the […] and the use of user data for the […] of […] on its […].

Clear […] of the […] 🤝 The new rules prohibit automated tools such as […] and […] for harvesting data, with the exception of normal search engines and browsers. (1/2)
