Good news: more widespread languages tend to be more complex. Since underserved languages usually have fewer speakers, this suggests they could be learned well from a smaller corpus, which might help a bit with the rich-get-richer problem of LLMs and existing corpora.
https://phys.org/news/2025-02-complex-languages-efficient-communication.html