AI, safe for human beings

The UKP Lab is pioneering research on AI safety under the umbrella of the National Research Center for Applied Cybersecurity ATHENE. Generative AI has revolutionized how we consume and interact with information, but it also introduces serious safety risks. UKP develops novel solutions to make generative AI safe and compatible with societal values.

Our Research Areas

Fighting Multimodal Misinformation

Devising datasets and methods to create better fact-checking systems.

Interpretability & Controlling LLMs

Studying the internal mechanisms of LLMs to better control them.

Reliability & Robustness of LLMs

Improving the defenses of LLMs against jailbreak attacks.

Secure Code Generation

Preventing insecure code and vulnerabilities in AI-generated software.

Cultural Alignment

Ensuring AI models respect diverse cultural values and norms.

Privacy Risks

Protecting user privacy and preventing data leakage in AI applications.

Copyright & IP Risks

Addressing copyright and intellectual property issues in AI-generated content.

News

Our Team

Iryna Gurevych

ATHENE Distinguished Professor and Lab Director

Shivam Sharma

Postdoc

Subhabrata Dutta

Postdoc

Frank Niu

Postdoc

Federico Marcuzzi

Postdoc

Hiba Arnaout

Postdoc

Hovhannes Tamoyan

Ph.D. Student

Indraneil Paul

Ph.D. Student

Vatsal Venkatkrishna

Ph.D. Student

Hassan Soliman

Ph.D. Student

Jonathan Tonglet

Ph.D. Student

Justus-Jonas Erker

Ph.D. Student

Shivam Sharma

Ph.D. Student

Manisha Venkat

Ph.D. Student

Luke Bates

Ph.D. Student

German Ortiz

Ph.D. Student

Haritz Puerto

Ph.D. Student

Rachneet Sachdeva

Ph.D. Student

Anmol Goel

Ph.D. Student

Haishuo Fang

Ph.D. Student

Federico Tiblias

Ph.D. Student

Alireza Makou

Ph.D. Student

Irina Bigoulaeva

Ph.D. Student

Aishik Mandal

Ph.D. Student

Nils Dycke

Ph.D. Student

Cecilia Liu

Ph.D. Student

Former Members

Publications

Funding

We are grateful for the generous support of our projects by the National Research Center for Applied Cybersecurity ATHENE. This research has also been partially supported by the Research Center Emergency Responsive Digital Cities (https://www.emergencity.de/).

  • ATHENE Distinguished Professorship
  • Safeguarding LLMs against Misleading Evidence Attacks
  • Trustworthy and Explainable AI-generated Text Detection
  • Fake News and Conspiracy Theories
  • Quality Assurance of Biomedical Literature
  • Privacy-Aware Domain-Adaptive Medical NLP
  • Adversarial Attacks on NLP Systems
  • Protecting Privacy and Sensitive Information in Texts
  • Security in Large Language Models
  • Code Transformers and Knowledge Graphs for Vulnerability Detection