Just released FTS5 ICU Tokenizer for SQLite!
This C extension provides robust multilingual full-text search using the ICU library. It supports Chinese, Japanese, Thai, Arabic, Russian, Korean, Hebrew, Greek, and more, with proper word segmentation and language-specific normalization.
Works on Linux, macOS and Windows. Available now on GitHub (https://github.com/cwt/fts5-icu-tokenizer/).
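
If you want to try it from Python, here is a minimal sketch of loading a compiled FTS5 tokenizer extension and creating a search table with it. The library path `./libfts5_icu` and the tokenizer name `icu` are assumptions made for illustration, not taken from the project; check the repository README for the actual values. If the extension cannot be loaded, the sketch falls back to SQLite's built-in `unicode61` tokenizer so it still runs.

```python
import sqlite3

# Assumed values for illustration only; see the project's README
# for the real library file name and registered tokenizer name.
EXTENSION_PATH = "./libfts5_icu"
TOKENIZER_NAME = "icu"


def open_fts_db(extension_path=EXTENSION_PATH, tokenizer=TOKENIZER_NAME):
    """Open an in-memory database with an FTS5 table named `docs`.

    Returns (connection, tokenizer_used). Falls back to the built-in
    unicode61 tokenizer when the extension cannot be loaded.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Extension loading is disabled by default and may be missing
        # entirely in some Python builds (AttributeError).
        conn.enable_load_extension(True)
        conn.load_extension(extension_path)
        conn.enable_load_extension(False)
    except (AttributeError, sqlite3.Error):
        tokenizer = "unicode61"
    conn.execute(
        f"CREATE VIRTUAL TABLE docs USING fts5(body, tokenize='{tokenizer}')"
    )
    return conn, tokenizer


def search(conn, query):
    """Return the bodies of all rows matching an FTS5 query string."""
    rows = conn.execute(
        "SELECT body FROM docs WHERE docs MATCH ?", (query,)
    ).fetchall()
    return [body for (body,) in rows]
```

Typical use would be `conn, used = open_fts_db()`, then inserting rows into `docs` and calling `search(conn, "some term")`. Checking `used` tells you whether the ICU tokenizer was actually picked up or the fallback is in effect.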