#Jsoup

Jonathan Hedley jhy@tilde.zone
2025-04-29

Very happy to announce that I've just released #jsoup 1.20.1!

Lots of improvements and bugfixes -- improved HTML parse rules to align with modern browsers, improved XML namespace handling, and a redesigned HTML pretty-printer for better consistency and customizability. This release also delivers performance optimizations, new API enhancements such as flexible tag definitions via TagSet, concise CSS selectors, and parser thread-safety improvements.

Big thanks to everyone who helped out.

jsoup.org/news/release-1.20.1

Jonathan Hedley jhy@tilde.zone
2025-03-04

Good news everybody! I just released jsoup v1.19.1. It adds http/2 request support, and has a bunch of other improvements and bug fixes.

jsoup.org/news/release-1.19.1
#jsoup

Jonathan Hedley jhy@tilde.zone
2025-01-08

The upcoming version of #jsoup, 1.19.1, will (finally!) support making http/2 requests, if you're running on Java 11+. It still works down to Java 8 if you need that.

It's a drop-in update with no changes required for existing Jsoup.connect() code, other than setting a system property (jsoup.useHttpClient) to enable.

The implementation uses Java's multi-release JAR feature to make requests via the HttpClient impl if it's available, or will fall back to the current HttpURLConnection. This also gives a path to http/3 support when that JEP lands in Java.
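To make the opt-in and fallback idea concrete, here is a minimal sketch using only the JDK. It is not jsoup's code: jsoup picks the implementation through its multi-release JAR, while this sketch swaps in a runtime reflection check to show the same "use HttpClient if present, otherwise HttpURLConnection" decision. The class and method names are hypothetical, and treating "true" as the enabling value of jsoup.useHttpClient is an assumption.

```java
import java.net.HttpURLConnection;

class TransportChoice {
    // Property name is taken from the post; "true" as the enabling value is an assumption.
    static final String USE_HTTP_CLIENT = "jsoup.useHttpClient";

    // True when java.net.http.HttpClient (added in Java 11) can be loaded on this runtime.
    static boolean httpClientAvailable() {
        try {
            Class.forName("java.net.http.HttpClient");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Hypothetical helper: returns the class name of the transport that would be used.
    static String chooseTransport() {
        boolean optedIn = Boolean.getBoolean(USE_HTTP_CLIENT);
        if (optedIn && httpClientAvailable()) {
            return "java.net.http.HttpClient";
        }
        // Fallback that is available on every supported Java version.
        return HttpURLConnection.class.getName();
    }

    public static void main(String[] args) {
        // Opt in before any requests are made, e.g. at application startup.
        System.setProperty(USE_HTTP_CLIENT, "true");
        System.out.println("Transport: " + chooseTransport());
    }
}
```

The same opt-in can presumably be given on the command line as -Djsoup.useHttpClient=true, so existing Jsoup.connect() code would not need to change.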

github.com/jhy/jsoup/pull/2257

Erik C. Thauvin ethauvin
2024-12-03
2024-11-06

Have you listened to #74 yet?
@javajuneau @dhinojosa @ianhlavats and @kito99 are joined by pilot and contributor, @lprimak. They discuss @devoxx Genie, Assistant, and more! pubhouse.net/2024/10/stackd74-

2024-10-30

On your next outing, listen to #74: @javajuneau @dhinojosa @ianhlavats and @kito99@mastadon.social are joined by pilot and contributor, @lprimak. They discuss @devoxx Genie, Assistant, Apple and much more! pubhouse.net/2024/10/stackd74-

2024-10-21

#74: But it’s soup

After a long hiatus, the whole gang is back! @javajuneau @dhinojosa @ianhlavats and @kito99@mastadon.social are joined by pilot and contributor, lprimak@mastodon.social. They discuss @devoxx Genie, Assistant, and much more! pubhouse.net/2024/10/stackd74-

Kito D. Mann kito99
2024-10-18

#74: But it’s soup

After a long hiatus, the whole gang is back! @javajuneau @dhinojosa @ianhlavats and @kito99 are joined by pilot and contributor, @lprimak. They discuss @devoxx Genie, Assistant, and much more! pubhouse.net/2024/10/stackd74-

Jonathan Hedley jhy@tilde.zone
2024-01-05

I've been working on a new feature for jsoup that I think is pretty cool: the new StreamParser lets you parse a document progressively with stream(), or lazily with selectNext(query). Elements are parsed from the backing input stream on demand, and when emitted will include all their children. This gives the benefits and simplicity of a DOM parser, but also lets you parse in chunks input that would otherwise cause out-of-memory exceptions, or terminate the parse early.

The actual parse tree is backed by the full HTML or XML parser, and so all that functionality remains (like implicit elements, source position tracking, error tracking, etc).

If you're interested in this, please take a look at the implementation, and try it out by installing a snapshot. It would be great to incorporate any initial feedback / bug-fixes prior to releasing it in the next version of #jsoup.

github.com/jhy/jsoup/pull/2096

Jonathan Hedley jhy@tilde.zone
2023-12-29

I just released jsoup 1.17.2! Mostly bug fixes this round.

jsoup.org/news/release-1.17.2
#jsoup

Jonathan Hedley jhy@tilde.zone
2023-11-27

I'm very happy to announce the launch of #jsoup version 1.17.1!

Out now with support for request-level authentication, attribute name & value source ranges, stream() iterable support, the :is() selector, and a bunch of other improvements and bug fixes.

jsoup.org/news/release-1.17.1

Tim Allison tallison
2023-09-26

And merged, it is!

newmediamax newmediamax
2023-08-31

Comparison of the most popular Java web scraping libraries of 2023: Jsoup, HtmlUnit, and Selenium

newmediamax.com/article/1izvre
