Ivan Habernal

Full professor at Ruhr-Universität Bochum | Leading the Trustworthy Human Language Technologies group | Playing the bass | He/him/his

2024-09-24

For the next edition of the PrivateNLP workshop (PrivateNLP '25), we'd like to include under-represented groups in the organizer team and/or the program committee! If that sparks your interest, please get in touch. Thanks for reposting :)

2024-04-04

Share your cool research on privacy and submit to the 5th Workshop on Privacy in NLP - this year co-located with ACL in Bangkok in August! The submission deadline is May 17; more in the CfP below:

sites.google.com/view/privaten

2024-03-18

"DP-NMT: Scalable Differentially Private Machine Translation"

TL;DR: a fast and scalable DP framework in JAX for transformer-based NMT

Paper: aclanthology.org/2024.eacl-dem
Code: github.com/trusthlt/dp-nmt

(3/3)

2024-03-18

"Answering legal questions from laymen in German civil law system"

TL;DR: legal QA for laymen in Germany - a new task & benchmark data

Paper: aclanthology.org/2024.eacl-lon
Code & Data: github.com/trusthlt/eacl24-ger

(2/3)

2024-03-18

TrustHLT is proudly presenting two papers at @eaclmeeting!!

I personally have FOMO :) but talk to Timour Igamberdiev if you're interested in differential privacy for NLP and to Mahammad Namazov if you're into legal NLP

(1/3)

2024-03-09

If you're also reviewing for #starsem2024, I've added their review template to the collection of "offline" markdown blank review forms -> feel free to reuse!
github.com/habernal/blank-peer

2024-03-08

@j2kun Cool -- looking forward to your new book. By the way, the first one was a game changer for me, many thanks for that!!

2024-03-06

@tedted ...and even if it is, don't tell your VCs :) "vintage math" or "traditional math" sounds much cooler

2024-03-06

@tedted "old-fashioned math" -- nice framing :)

2024-02-22

"How to win #SemEval2024 Starter Pack"

1) GPT-4
2) ... eh, that's it

Is it good news or bad news for research?

2024-01-22

DP-NMT has just been accepted to the EACL'24 demo track! 🎉 Talk to us in Malta if you're interested in privacy and machine translation. Super proud of the DP-NMT team led by Timour Igamberdiev 💪

Paper: arxiv.org/abs/2311.14465

Code: github.com/trusthlt/dp-nmt/

2023-07-17

@drgroftehauge I have no idea what they use for which products. Here, machinelearning.apple.com/rese, they mention 2, 4, and other values for various experiments.

2023-07-14

Will you give me your sensitive text data if I promise you differential privacy? And if yes, how "strongly" (ε) do I have to protect it?

Laypeople *do* understand DP risks for different ε, and won't give you anything for ε>4.5

arxiv.org/abs/2307.06708

w/ Chris Weiß, Frauke Kreuter
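For intuition (my back-of-the-envelope sketch, not part of the paper): ε bounds the factor e^ε by which one person's data can change the probability of any output, so the ε > 4.5 cutoff already corresponds to a factor of roughly 90:

```python
import math

# epsilon-DP guarantee: Pr[M(D) in S] <= exp(eps) * Pr[M(D') in S]
# for any two datasets D, D' differing in one person's data.
for eps in (0.5, 1.0, 4.5):
    print(f"eps = {eps}: outcome odds can shift by a factor of up to {math.exp(eps):.1f}")
```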

2023-05-24

@j2kun I think it should be the other way around: Learn software engineering first to better understand math (or, to understand math at all). C.S./C.Eng = structure, clarity, non-ambiguity. Math = messy code you inherited without docs :)

2023-02-23

@leon Oh these are hilarious!! Almost like genuine profs' replies :))

2023-02-22

@leon If this is a joke, I don't get it, but if it's real - can you pls send me the code? :)

2023-02-16

* With a couple of other sophisticated tricks and formal proofs, we need a much smaller privacy budget to get meaningful downstream performance!

Read more here: arxiv.org/abs/2302.07636

Try it yourself here (yeah, nothing beats full reproducibility and transparency in privacy-preserving NLP research): github.com/trusthlt/dp-bart-pr

(3/3)

2023-02-16

We found out that BART has a lot of redundancy in its latent representation!

* You can "zero out" up to 25% of the neurons -> it still regenerates the input

* This "pruning" can be learned on public data -> it largely reduces the sensitivity

(2/n)
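A minimal sketch of the pruning idea (my toy illustration; the variable names and random mask are assumptions - see the repo for the real, learned mechanism):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=768)       # toy stand-in for a BART latent representation

# Binary mask standing in for one learned on *public* data:
# zero out 25% of the neurons, keep the rest.
mask = np.ones_like(z)
mask[rng.choice(z.size, size=z.size // 4, replace=False)] = 0.0
z_pruned = z * mask            # fewer active dimensions -> lower sensitivity
                               # -> less DP noise needed for the same epsilon
```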

2023-02-16

We push the boundaries of state-of-the-art text rewriting under local differential privacy for text classification!

The biggest problem with noisifying latent representations?

Small models (SoTA)? -> Get much, much worse with noise

Big models? Huge sensitivity -> huge noise -> destroys utility

Retrain big models to have a smaller latent space? -> Extremely costly

(1/n)
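To make the sensitivity problem concrete, here's a toy local-DP mechanism (my sketch, not the paper's exact recipe): clip the latent vector to bound its sensitivity, then add Laplace noise calibrated to that bound:

```python
import numpy as np

def noisy_latent(z, eps, clip=1.0, rng=np.random.default_rng(0)):
    # Clip to L1 norm <= clip, so any two clipped inputs differ by at most 2*clip.
    z = z * min(1.0, clip / (np.abs(z).sum() + 1e-12))
    # Laplace mechanism: noise scale = sensitivity / eps.
    return z + rng.laplace(scale=2 * clip / eps, size=z.shape)

# Same clip and eps, but in higher dimensions the per-neuron signal shrinks
# while the per-neuron noise stays constant -> utility collapses.
small = noisy_latent(np.ones(64), eps=5.0)
big = noisy_latent(np.ones(1024), eps=5.0)
```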

2023-02-15

@tschfflr No need to get upset, just reply with a true consulting fee, that'll do
