UKP Lab

The Ubiquitous Knowledge Processing Lab researches Natural Language Processing (#NLProc) with a strong emphasis on Large Language Models (#LLMs), Conversational AI and Question Answering · Department of Computer Science · @TU Darmstadt

2025-05-30

Are LMs more than their behavior? 🤔

Join our Conference on Language Modeling (COLM) workshop and explore the interplay between what LMs answer and what happens internally ✨

See you in Montréal 🍁

CfP: shorturl.at/sBomu
Page: shorturl.at/FT3fX
Reviewer Nomination: shorturl.at/Jg1BP

#nlproc #interpretability

Call for Papers, Interplay Workshop at COLM: June 23rd - submissions due. July 24th - acceptance notification. October 10th - workshop day.
2025-05-20

His work was supervised by Prof. Dr. Iryna Gurevych and Dr. Nils Reimers. The examination committee included Prof. Dr. Carsten Binnig (chair), Prof. Dr. Dan Roth (University of Pennsylvania, co-reviewer) and Prof. Dr. Kristian Kersting.

Kexin has contributed significantly to the UKP Lab's research and we are very excited to follow his next steps in academia or industry.

Congratulations again, Dr. Wang!

📚 Google Scholar Profile
scholar.google.com/citations?u

#UKPLab #PhDDefense #NLProc #TUDA

2025-05-20

🎓 Congratulations to Dr. Kexin Wang!

We are delighted to share that Kexin Wang has successfully defended his PhD thesis today at the Department of Computer Science, TU Darmstadt.

In his dissertation, titled "Improving Dense Retrieval on Domain Adaptation and Decontextualization", Kexin explored innovative methods to enhance dense retrieval systems, especially in settings where domain shift and decontextualization pose major challenges.
(1/🧵)

2025-05-19

And consider following the authors Jingcheng Niu, Subhabrata Dutta, Ahmed Elshabrawy, @harish (University of Bath) & Iryna Gurevych if you are interested in more information or an exchange of ideas.

2025-05-19

4️⃣ Finally, by investigating the internals of the LLMs 👩‍🔧 we find that

👉 the development of ICL on downstream tasks during pre-training correlates with the model learning to use a common subspace across tasks 📐 (see the sketch after this post)

👉 and that both developments follow a predictable pattern 📈✨

5/🧵
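The post does not spell out how the "common subspace" is measured, so here is a minimal, illustrative sketch of one way such a probe could look: collect hidden states for examples from two tasks and compare their top principal directions. The model (gpt2), the layer, the mean pooling, and the overlap metric below are assumptions for illustration, not necessarily the paper's actual analysis.

```python
# Illustrative sketch only: model choice, layer, pooling and the overlap metric
# are assumptions, not the paper's method (see arxiv.org/abs/2505.11004).
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

def hidden_states(sentences, layer=-1):
    # Mean-pooled hidden states for each sentence at the chosen layer.
    feats = []
    for s in sentences:
        ids = tokenizer(s, return_tensors="pt").input_ids
        with torch.no_grad():
            h = model(ids, output_hidden_states=True).hidden_states[layer][0]
        feats.append(h.mean(dim=0).numpy())
    return np.stack(feats)

def subspace_overlap(X, Y, k=2):
    # Top-k principal directions per task, then the average squared cosine
    # between the two subspaces (1 = identical subspace, 0 = orthogonal).
    Ux = np.linalg.svd(X - X.mean(0), full_matrices=False)[2][:k]
    Uy = np.linalg.svd(Y - Y.mean(0), full_matrices=False)[2][:k]
    return float(np.linalg.norm(Ux @ Uy.T) ** 2 / k)

task_a = ["The movie was great.", "I loved this film.",
          "A wonderful story.", "Terrible acting, sadly."]
task_b = ["2 + 2 = 4", "The capital of France is Paris.",
          "Water boils at 100 C.", "Triangles have three sides."]
print(subspace_overlap(hidden_states(task_a), hidden_states(task_b)))
```

A model that increasingly routes different tasks through shared directions would show this overlap rising across pre-training checkpoints.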

2025-05-19

3️⃣ ICL performance depends on how difficult a task is!

More distractors (more complex) → Lower Performance 😶

4/🧵

2025-05-19

2️⃣ ICL performance is NOT emergent 🛑

We observe that ICL development is both
➡️ GRADUAL and
➡️ PREDICTABLE
in complex tasks

3/🧵

2025-05-19

1️⃣ Olsson et al.'s (2022) work on ICL and Induction Heads (arxiv.org/abs/2209.11895) suggested that LLMs can follow patterns presented in-context using random token orderings:

[A][B] ... [A]→[B]

⛔️ BUT

💡 We show that frequency matters!

More frequent → Better ICL 🤨 (toy probe sketched below)

2/🧵
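To make the [A][B] ... [A]→[B] claim concrete, here is a toy probe: repeat a word pair in context and check whether the model completes the final [A] with the first token of [B], once with common words and once with rarer ones. The model (gpt2) and the hand-picked word lists are illustrative assumptions; the paper's evaluation is more careful than this sketch.

```python
# Toy probe only: gpt2 and the hand-picked word lists are assumptions for
# illustration, not the paper's experimental setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def induction_accuracy(pairs, repeats=3):
    # Build " A B A B ... A" and check whether the greedy next-token
    # prediction after the final A matches the first token of " B".
    hits = 0
    for a, b in pairs:
        prompt = (" " + a + " " + b) * repeats + " " + a
        ids = tokenizer(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            next_id = model(ids).logits[0, -1].argmax().item()
        hits += int(next_id == tokenizer(" " + b).input_ids[0])
    return hits / len(pairs)

frequent_pairs = [("house", "water"), ("day", "time"), ("people", "world")]
rare_pairs = [("obelisk", "fjord"), ("zygote", "quandary"), ("sieve", "gnarled")]
print("frequent tokens:", induction_accuracy(frequent_pairs))
print("rare tokens:", induction_accuracy(rare_pairs))
```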

2025-05-19

Is In-Context Learning (ICL) in LLMs Memorisation? Emergence? Some Algorithmic Capability? 🤔

📢 New work exploring ICL in LLMs: arxiv.org/abs/2505.11004

💡 Key Finding:
ICL capabilities are linked to token frequency 🤨

Strap in for the unexpected 🤯

A 🧵👇 #NLProc

2025-05-15

Read the full guest article on page 3 (in German):
👉 www.tu-darmstadt.de/media/daa_responsives_design/01_die_universitaet_medien/aktuelles_6/publikationen_km/hoch3/pdf/hoch3_2025_2.pdf

(2/2)

#UKPLab #LLMs #Reasoning #DeepSeek #AIResearch #TUDarmstadt

2025-05-15

🧠 How good are today's LLMs at reasoning – really?

In the latest issue of hoch³, the university magazine of TU Darmstadt, Prof. Iryna Gurevych and Irina Bigoulaeva from the UKP Lab examine the performance of DeepSeek-R1 and R1-Zero.

Their takeaway? Even cutting-edge models struggle with atypical question formats that deviate from standard training data – highlighting ongoing challenges in robustness and generalization for generative AI.
(1/🧵)

2025-05-13

His time at UKP Lab played an important role in shaping his interdisciplinary approach and his enduring interest in the more playful and puzzling sides of language.

Stay tuned as Tristan shares insights into his current projects and what he took with him from his time at UKP Lab.
(3/3)

2025-05-13

Today, Tristan leads the Computational Linguistics at Manitoba (CLAM) Lab at the University of Manitoba's Department of Computer Science. He is best known for his work on the computational analysis of puns, jokes, and other forms of linguistic creativity.
(2/🧵)

2025-05-13

We launched our Spotlight on #UKPLab Alumni series and are happy to continue it with @Logological!

Tristan was part of UKP Lab from 2011 to 2019, working on lexical semantics, especially figurative language and wordplay. This laid the groundwork for a career at the intersection of linguistics and humor.
(1/🧵)

2025-04-29

And consider following the authors Ahmed Elshabrawy, Yongxin Huang, Iryna Gurevych and Alham Fikri Aji if you are interested in more information or an exchange of ideas. (6/6)

See you in Albuquerque 🌵! #NAACL2025

2025-04-29

Statement-Tuning delivers powerful, data-efficient NLU – ideal for low-resource settings. Dive into our paper and code for all the details!

📄 Paper: arxiv.org/abs/2404.12897
💻 Code: github.com/afz225/statement-tu

(5/🧵)

2025-04-29

Remarkably, using as few as 1K statements per training task (just 16K examples in total) yields 96% of optimal performance, which is both efficient and robust. For generalization, increasing training task diversity is more effective than increasing data size per task.
(4/🧵)

2025-04-29

Our experiments show that multi-task, statement-tuned encoders rival SOTA LLMs (with up to 200× fewer parameters!) across 7 zero-shot NLU tasks.

We also outperform previous lightweight methods on accuracy, robustness to spurious patterns, and the range of supported tasks! 🚀 (3/🧵)

2025-04-29

Any discriminative task with a finite set of targets can be verbalized into statements. We fine-tune RoBERTa to classify statements as True or False. Just like LLM instruction-tuning, multi-task statement-tuning improves 0-shot generalization to unseen tasks. (2/🧵)
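A minimal sketch of what statement-based zero-shot classification looks like in code: verbalize each candidate label into a statement and keep the one the binary classifier finds most plausible. The sentiment template, label set and checkpoint are placeholders (roberta-base stands in for a statement-tuned model); see the paper and repo linked earlier in the thread for the actual setup.

```python
# Minimal sketch: template, labels and checkpoint are placeholders, not the
# exact Statement-Tuning setup (see arxiv.org/abs/2404.12897 and the repo).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# A statement-tuned checkpoint would be loaded here; roberta-base is a stand-in.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2).eval()

def verbalize(text, label):
    # Turn a discriminative example (here: sentiment) into a natural-language statement.
    return f"{text} The sentiment of this review is {label}."

def truth_score(statement):
    # Probability that the statement is True under the binary classifier.
    inputs = tokenizer(statement, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

def zero_shot_predict(text, candidate_labels):
    # Score one statement per candidate label and return the most plausible one.
    return max(candidate_labels, key=lambda lab: truth_score(verbalize(text, lab)))

print(zero_shot_predict("The movie was a delight from start to finish.",
                        ["positive", "negative"]))
```

During multi-task statement-tuning, many such verbalized tasks are mixed, so the True/False decision can transfer zero-shot to unseen tasks.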

2025-04-29

🚀 SOTA 0-shot models growing? Don't break the compute-bank! ⚡💡

Discover Statement-Tuning in our #NAACL2025 paper: we transform NLU tasks into natural language statements, letting small models like RoBERTa shine ✨ in zero & few-shot settings at a fraction of the cost. 🔥
(1/🧵)

#NLProc
