#HTTrack

The biggest things I need AI to do for me: have a high initial Elo ranking, but also be trainable to scan all local docs, bring in lots of real-time data and open datasets 24/7, and display the results on a series of dashboards. #rag #pydantic #yacy #httrack #cached version #best stacks #free for commercial use #competitive intel #tailored data

GripNews
2025-03-18

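🌖 HTTrack Website Copier - Free Software Offline Browser (GNU GPL)
➤ HTTrack makes copying websites convenient, so you can browse web content offline at any time.
httrack.com/
HTTrack is a free offline browser utility that downloads a website from the Internet to a local directory, recursively building all directories and transferring HTML, images, and other files from the server to your computer. It preserves the original site's relative link structure.
+ This tool looks very practical; it lets me browse website content at any time without a network connection.
+ Many thanks to the people who developed such a practical tool. For anyone who often needs to refer to website content, it is definitely a great help.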

N-gated Hacker News (ngate)
2025-03-18

🌐🤦‍♂️ "Look Ma, I copied the entire internet! HTTrack, the digital hoarder's dream, lets you download the web so you can finally browse those cat memes offline. Because nothing screams cutting-edge technology like reading 2005 forum threads in 2023." 📂😂
httrack.com/

Verfassungklage@troet.cafe
2025-02-02

#HTTrack - The #Website Downloader

In this tutorial I show you how to save entire websites with HTTrack for offline access. Whether it is for your own backup or just for browsing without internet, I show you step by step how it works.

gnulinux.ch/httrack-der-websit

2025-02-01

HTTrack - The Website Downloader

In this tutorial I show you how to save entire websites with HTTrack for offline access. Whether it is for your own backup or just for browsing without internet, I show you step by step how it works.

#httrack #Curl #wget #Website #Linux

gnulinux.ch/httrack-der-websit

Hacker Public Radio (hpr@infosec.exchange)
2025-01-15

New Episode: hpr4293 :: HTTrack website copier software

Henrik uses the HTTrack software to get his own copy of websites.

Hosted by Henrik Hemrin on Wednesday, 2025-01-15 is flagged as Clean and is released under a CC-BY-SA license.

Tags: #httrack, #website, #software.

Today on the #HackerPublicRadio #Community #Podcast

#HPR ❤️ #CreativeCommons

hackerpublicradio.org/eps/hpr4

2025-01-13

I am looking for archive.org as a self hosted service.

I want to have an automated static copy of a website, which preserves old copied versions.

It should provide a #crawler and a web interface to access the archived versions of the website.

The use case is a lousy CMS which often destroys content. I want to be able to restore content from the archive and to have a static website copy in the worst case.

#SelfHosting #WebsiteArchive #Archive #OffsiteBackp #Backup #HTTrack #WebsiteCopy
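
The "preserve old copied versions" part of this request can be sketched roughly as follows. This is a minimal illustration in Python, not an existing archiving tool: `save_snapshot`, the one-directory-per-page layout, and the timestamped file names are all hypothetical choices made for the example, and the crawler and web interface are left out entirely.

```python
import hashlib
from datetime import datetime
from pathlib import Path
from typing import Optional


def save_snapshot(archive_root: Path, page_name: str,
                  content: bytes, when: datetime) -> Optional[Path]:
    """Store `content` as a new version of a page, keeping older versions.

    Returns the path of the new version, or None when the content is
    identical to the most recent stored version (nothing new to keep).
    """
    page_dir = archive_root / page_name
    page_dir.mkdir(parents=True, exist_ok=True)

    # File names are timestamps, so sorting by name is chronological.
    versions = sorted(page_dir.iterdir())
    new_digest = hashlib.sha256(content).hexdigest()
    if versions:
        latest_digest = hashlib.sha256(versions[-1].read_bytes()).hexdigest()
        if latest_digest == new_digest:
            return None

    # Assumption: at most one snapshot per second per page.
    target = page_dir / (when.strftime("%Y%m%dT%H%M%S") + ".html")
    target.write_bytes(content)
    return target
```

Restoring content destroyed by the CMS would then be a matter of reading an older file from the page's directory; unchanged crawls add nothing, so the archive only grows when the page actually changes.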

With HTTrack you can download complete websites to your Linux computer! 🌎

In my latest tutorial I show you how to install HTTrack and set up your first project. We use the user-friendly graphical variant HTTraQt, which makes it super easy to use.

youtu.be/jtis19cclhg

spacefun.ch/linux-videos#extra

#HTTrack #Linux #Webscraping #Tutorial

Reinder Dijkhuis Does Art (reinderdijkhuis@mastodon.art)
2024-10-19

Actually, lemme think out loud about what I need #HTTrack to do, before I forget. It needs to pull jpg, png and svg images, javascript and any external CSS*) from any level within the Comicfury.com domain, but external links need to be skipped for mirroring.

*)AFAIK all CSS within ComicFury is inline! A baffling decision but one that will make my life easier with this. But I may be mistaken.
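
The rule described above (fetch jpg, png, svg, JavaScript and external CSS from anywhere inside the domain, skip everything external) could be written down as a small predicate. This is only a Python sketch of the rule itself, not HTTrack's filter syntax; the function names are hypothetical, and treating subdomains of comicfury.com as "inside the domain" is an assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

ALLOWED_DOMAIN = "comicfury.com"
# Extensions named in the post: images, JavaScript, external CSS.
ASSET_EXTENSIONS = {".jpg", ".png", ".svg", ".js", ".css"}


def in_domain(url: str) -> bool:
    """True for comicfury.com itself and (by assumption) its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host == ALLOWED_DOMAIN or host.endswith("." + ALLOWED_DOMAIN)


def should_fetch_asset(url: str) -> bool:
    """True when the URL is inside the domain and has a wanted extension."""
    if not in_domain(url):
        return False  # external links are skipped for mirroring
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return suffix in ASSET_EXTENSIONS
```

Matching on the parsed hostname rather than on a substring keeps look-alike hosts that merely end in the same letters from counting as internal.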

Search engine on a stick would be a fun project: 1tb nvme enc, persistent, bootable, and you can spider your own sites in addition to the top 10k sites already crawled and indexed. yacy could stand to be much more automated. It is a bit of work to get it set up, not the config, just getting all the sites loaded. #httrack

Rihards Olups (richlv)
2024-02-19

HTTrack seems to be unmaintained (last release in 2017).

Any maintained recent opensource mirroring solution that can offload auth to a browser (for example, like can)?

httrack.com/

Bashinho - Sohn der Bash (bashinho@social.tchncs.de)
2024-01-18

Sometimes you want to back up an entire web presence. #Httrack is a good tool for that too, but the default settings need to be adjusted. #OSINT bashinho.de/2024/01/18/webseit

razzlom is always sad (razzlom@quietplace.xyz)
2023-09-08

Once again I am convinced that #wget is a great thing!

A small booru announced that it was shutting down, and I decided to save it for myself.

First I tried #HTTrack; it chugged along for half a day and saved only the HTML files.

wget refused to mirror the site at first, but I added -U and everything worked. In about 2 hours it downloaded the whole site and all the images.
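
For context, wget's -U option sets the User-Agent header sent with each request. The same idea in Python's standard library looks roughly like the sketch below; `build_request` and `fetch` are hypothetical helper names, and the post does not say which agent string was used, so the caller has to supply one.

```python
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a request that announces itself with a custom User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url: str, user_agent: str, timeout: float = 30.0) -> bytes:
    """Download one URL using the custom User-Agent (performs network I/O)."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Some servers refuse requests carrying a download tool's default agent string, which would be consistent with the mirror only working once the header was changed, though the post does not confirm the cause.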

Now I own ~1800 medium-quality images and have no idea what to do with them.
​:blobcatshrug:​

2023-08-31

I was going to cancel a web server from an old company, but wanted to keep the site somewhere (a PHP one).
Using httrack and gitlab pages I could do it quite easily! The site looks exactly the same, now static, and costs nothing to keep running (I only need to pay for the domain).

Some days I like technology, mainly the free/libre ones :)

#gitlab #floss #httrack

Had to use #HTTrack to backup a website.
So here is my two cents: a command line that downloads a little bit faster than the default options:

humanize.me/nerd/httrack.html

#backup #mirror #website

2023-04-10

Nothing but trouble trying to do #WebScraping of a Google site, not even with the famous #httrack

𝖑𝖔𝖗𝖊𝖓 𝖉𝖎𝖆𝖘 (lorendias@qoto.org)
2021-07-03

@jgoerzen Sounds like #httrack but a little better, although in my experience html sites are a PITA -- It would be nice if it was literally a .tar archive containing the website & files like how many file formats are archives in disguise.

2020-10-25

@papabjoern #httrack? With a web interface
