#WebScraper

AI Daily Post @aidailypost
2025-12-07

Bright Data’s new API lets developers weave AI/ML models, LLMs and generative AI directly into web‑scraping workflows while keeping bots at bay. JavaScript‑ready, open‑source friendly, and built for seamless anti‑bot protection. Dive into the benchmarks and see how it powers smarter data pipelines.

🔗 aidailypost.com/news/bright-da

🍮 Vanilla(static)CMS @laravista@mastodon.uno
2025-11-19

🔍 / #software / #automation / #scraping

#WebScraper - The #1 web scraping extension

The most popular web scraping extension. Start scraping in minutes. Automate your tasks with our Cloud Scraper. No software to download, no coding needed.

🐱🔗 laravista.altervista.org/CatLi

#catlink #softwareautomation

2025-09-25

🤖 Getting a list of all the images in an HTML page with PHP
Building a web scraper in PHP to get the complete list of all the images at a URL...

👉 selectallfromdual.com/blog/1639

#html #php #webscraper
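
The linked post uses PHP; as a rough illustration of the same idea, here is a minimal Python sketch that lists every image referenced by an HTML page. The names `ImageCollector`, `list_images`, and `SAMPLE_HTML`, as well as the example URLs, are assumptions made for this sketch and are not taken from the linked article. It parses a hard-coded HTML string instead of fetching a live URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the src of every <img> tag, resolved against a base URL."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.images: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:  # skip <img> tags that have no src attribute
            self.images.append(urljoin(self.base_url, src))


def list_images(html: str, base_url: str) -> list[str]:
    """Return absolute URLs of all images found in the given HTML text."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    return parser.images


# Hypothetical page content, used in place of a real HTTP response.
SAMPLE_HTML = """
<html><body>
  <img src="/logo.png" alt="logo">
  <p>text <img src="https://cdn.example.com/a.jpg"></p>
  <img alt="no source">
</body></html>
"""

if __name__ == "__main__":
    for url in list_images(SAMPLE_HTML, "https://example.com/blog/"):
        print(url)
```

Relative paths are resolved with `urljoin`, so the result is a list of absolute URLs that could be downloaded or checked later.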

Philip Thomas @philipthomas
2025-08-28

8 Web Scraping & Crawling Tools with n8n Integration (Workflow Template as a Free Download)

Today we look at how you can do web scraping and crawling. We go through 8 different tools and connect each of them directly to n8n, so that you can process the extracted data further in a workflow.

youtube.com/watch?v=LP571gnIg7A

@reiver ⊼ (Charles) :batman: @reiver
2025-05-21

2/

Scraping (as in Web Scraping) is the act of extracting data from HTML web-pages where the data is NOT machine-legible.

If the data, even in an HTML web-page, is in a machine-legible format, then it is NOT scraping.

...

And, getting data in JSON (key-value pairs) is definitely NOT scraping — as JSON's purpose is to communicate data in a machine-legible manner.

CC: @404mediaco
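
To make the distinction in this post concrete, here is a small Python sketch that reads the same record in two ways: once from a JSON body, where the structure is already machine-legible, and once by pulling it out of presentation-oriented HTML. The payloads, field names, and class names are hypothetical. They do not represent Discord's API or the researchers' actual method.

```python
import json
from html.parser import HTMLParser

# Hypothetical payloads carrying the same information in two forms.
JSON_BODY = '{"author": "alice", "message": "hello"}'
HTML_BODY = (
    '<div class="msg"><span class="author">alice</span>: '
    '<span class="text">hello</span></div>'
)


def from_json(body: str) -> dict:
    """Machine-legible input: one standard call recovers the structure."""
    return json.loads(body)


class MessageScraper(HTMLParser):
    """Scraping: the structure has to be inferred from the page markup."""

    FIELDS = {"author": "author", "text": "message"}  # CSS class -> field name

    def __init__(self) -> None:
        super().__init__()
        self._field = None
        self.record: dict = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in self.FIELDS:
            self._field = self.FIELDS[css_class]

    def handle_data(self, data):
        if self._field:  # only keep text inside a recognised <span>
            self.record[self._field] = self.record.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def from_html(body: str) -> dict:
    scraper = MessageScraper()
    scraper.feed(body)
    return scraper.record


if __name__ == "__main__":
    print(from_json(JSON_BODY))
    print(from_html(HTML_BODY))
```

Both functions return the same dictionary, but only the HTML path depends on guesses about markup (tag names and class names) that can break whenever the page layout changes.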

@reiver ⊼ (Charles) :batman: @reiver
2025-05-21

1/

If these researchers used a typical HTTP-based API that returns JSON, then —

What these researchers did is NOT scraping.

CC: @404mediaco

RE: 404media.co/researchers-scrape

"Researchers published a massive database of more than 2 billion Discord messages that they say they scraped using Discord’s public API. The data was pulled from 3,167 servers and covers posts made between 2015 and 2024, the entire time Discord has been active."

N-gated Hacker News @ngate
2025-05-11

Oh joy, another "game-changing" named 🤖—because apparently, the internet was just crying out for one more script-kiddie to scrape and bloat their hard drives with HTML they’ll never use. Congrats, user, your contribution to the overload of useless data is truly groundbreaking. 🚀🎉
github.com/jaypyles/Scraperr

@reiver ⊼ (Charles) :batman: @reiver
2025-04-17

Enzyklopädie Roter Kreis @wissen@sozial.roter-kreis.de
2024-03-23

To find out which activities exist in particular fields of work across the federated association, the DRK (German Red Cross) is experimenting with web scraping the websites of its district and state associations.
➡️ drk-wohlfahrt.de/blog/eintrag/ ("Wie Data Science das DRK in der Wohnungslosenhilfe unterstützen kann", in English: "How data science can support the DRK in its services for homeless people")

#DRK #RotesKreuz #DataScience #DataScienceHub #Webscraping #Webscraper #DSSG #Wohlfahrt #Wohlfahrtspflege

Inautilo @inautilo
2024-02-15


The text file that runs the internet · Is a basic social contract of the web falling apart? ilo.im/15xzdk
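
Assuming the "text file" in this title refers to `robots.txt` (the post itself does not name it), a crawler can check that file before fetching a page. The sketch below uses Python's standard `urllib.robotparser` on a hypothetical `robots.txt` body. The bot names and URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from
# https://<host>/robots.txt instead of hard-coding it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def build_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set that can be queried."""
    rules = RobotFileParser()
    rules.parse(text.splitlines())
    return rules


if __name__ == "__main__":
    rules = build_rules(ROBOTS_TXT)
    for agent, url in [
        ("SomeCrawler", "https://example.com/index.html"),
        ("SomeCrawler", "https://example.com/private/report"),
        ("ExampleBot", "https://example.com/index.html"),
    ]:
        verdict = "allowed" if rules.can_fetch(agent, url) else "disallowed"
        print(f"{agent} -> {url}: {verdict}")
```

The file only states the site owner's wishes. Honoring it is up to the crawler, which is why the post calls it a social contract.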

_____

Vedran Mandić @vekzdran@hachyderm.io
2024-01-27

I am so happy with the first web application of my own that I have developed 🎉: Tris, a simple and free web crawler 🕸️ 🕷️ !

You can try it for free online: tris.fly.dev, limited to 3 parallel crawls and 100 links, with a path depth of 3.

The next thing I will add is a text input to set a target domain hhh, now I am making it hard! 🙈

#node #nodejs #web #webcrawler #crawler #seo #datatools #webscraper #scraping #seotools #seotool #tris #triswebcrawler #webapp #indie #indiedev
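
As an illustration of the limits mentioned in this post (3 parallel crawls, 100 links, depth 3), here is a minimal Python sketch of a breadth-first crawler bounded in the same three ways. Tris itself is a Node.js app and its implementation is not shown in the post, so everything here is an assumption: the `SITE` link graph, the `fetch_links` stand-in for an HTTP request, and the reading of "100 links" as a cap on discovered pages.

```python
import asyncio

# Hypothetical in-memory site: page -> outgoing links (no network access).
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/b"],
    "/b": ["/b/1"],
    "/a/1": ["/a/1/x"],
    "/b/1": [],
    "/a/1/x": ["/too-deep"],
    "/too-deep": [],
}


async def fetch_links(page: str) -> list[str]:
    """Stand-in for downloading a page and extracting its links."""
    await asyncio.sleep(0)
    return SITE.get(page, [])


async def crawl(start: str, max_depth: int = 3, max_links: int = 100,
                max_parallel: int = 3) -> list[str]:
    """Breadth-first crawl bounded by depth, page count, and concurrency."""
    seen = {start}
    order = [start]          # pages in the order they were discovered
    frontier = [start]       # pages to fetch at the current depth
    limiter = asyncio.Semaphore(max_parallel)

    async def limited_fetch(page: str) -> list[str]:
        async with limiter:  # at most max_parallel fetches run at once
            return await fetch_links(page)

    for _ in range(max_depth):
        if not frontier:
            break
        results = await asyncio.gather(*(limited_fetch(p) for p in frontier))
        next_frontier = []
        for links in results:
            for link in links:
                if link in seen or len(seen) >= max_links:
                    continue
                seen.add(link)
                order.append(link)
                next_frontier.append(link)
        frontier = next_frontier
    return order


if __name__ == "__main__":
    print(asyncio.run(crawl("/")))
```

Crawling level by level keeps the depth limit easy to reason about: pages found at depth 3 are recorded, but their own links are never fetched.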

Vedran Mandić @vekzdran@hachyderm.io
2024-01-25

I am so happy I got recommendations for fly.io here. I finally managed to deploy my Node.js web scraper app. World, meet Tris: tris.fly.dev

#webscraper #scraping #nodejs #seotools #seo

Inautilo @inautilo
2024-01-06
