Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
https://arxiv.org/abs/2502.17424
#HackerNews #EmergentMisalignment #NarrowFinetuning #LLMs #AIAlignment #MachineLearning
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs [pdf]
https://martins1612.github.io/emergent_misalignment_betley.pdf
#HackerNews #EmergentMisalignment #NarrowFinetuning #LLMs #AIAlignment #ResearchPDF