Foundations of Language Model Security

Theory, Practice, and Open Problems
EurIPS 2025 Workshop • December 6, 2025

About the Workshop

Add your questions for Discussion →

Language model security remains fundamentally misunderstood. While researchers have catalogued countless adversarial attacks and proposed numerous defenses, we've barely scratched the surface of why these vulnerabilities exist. The mathematical foundations that enable them, the internal mechanisms that process malicious inputs, and the gap between our benchmarks and actual security threats remain opaque.

This workshop will bring together researchers to share research and discuss the root causes of model vulnerability and how we might design secure and robust architectures from first principles.

Emphasizing foundational understanding over incremental improvements, we ask:

Our goal is to catalyze rigorous, cross-disciplinary discussion that advances the theoretical, empirical, and evaluative foundations of language model security.

Workshop Format

The workshop consists of four thematic blocks. Each block includes an expert keynote (45 minutes), two contributed talks (15 minutes), and an extended guided discussion (30 minutes) among participants, presenters, and domain experts. Our format prioritizes deep engagement and discussion over talk density.

3

Trade-offs in System-level defences against Prompt Injection

Keynote: Ilia Shumailov
4

AI Security in Industry

Keynote: Kathrin Grosse (IBM Research)
1

Securing Real-World AI Agents

Keynote: Luca Beurer-Kellner (Snyk)
2

Securing AI Agents with Information Flow Control

Keynote: Santiago Zanella-Béguelin (Microsoft Research Cambridge)
45 min
Expert Keynote
15 min
2 Contributed Talks
30 min
Guided Discussion
×4

Invited Speakers

Ilia Shumailov
AI Sequrity Company
ML Security & Privacy
Trade-offs in System-level defences against Prompt Injections
In this talk I will discuss what we currently know about prompt injection defences today, followed by what the next generation of defences is likely to look like.
Kathrin Grosse
IBM Research, Zurich
AI Security in Industry
LLM Security - An industrial and an end-users' perspective
LLMs are widely used, in both industry and by end users. Yet, on both ends of these spectrums, there are very different challenges, which we will discuss in this talk. On the one hand, companies must ensure the availability, confidentiality, and integrity of systems - on a level that supports both governance and audit-ability. On the other hand, end-users face different dilemmas when interacting with LLM. They are often unaware of pitfalls and engage, potentially unaware, in security-relevant behavior. Lastly, we briefly discuss AI security incident reporting as a possible solution to the open questions raised in both areas.
Luca Kellner
Snyk
Securing Real-World AI Agents
Securing Real-World AI Agents
AI agents introduce a paradigm shift in software security, moving from predictable, deterministic systems to components with non-deterministic behavior. This presents critical challenges, especially when agents are given autonomy and access to sensitive real-world computer systems (such as coding IDEs). This talk provides insight into Snyk's security research, detailing ongoing work to both exploit and secure real-world agents. Based on extensive red teaming against Model Context Protocol (MCP) systems, we illustrate key vulnerabilities and present a framework for effectively protecting against sophisticated attacks.
Santiago Zanella-Béguelin
Microsoft Research Cambridge
Confidential AI Team
Securing AI Agents with Information Flow Control
Indirect prompt injection attacks are the result of information flow security violations: we knowingly allow untrusted data to taint the context of an agent without constraining its actions. In this talk, I will describe how information flow control offers a robust, deterministic defense against prompt injection attacks. In a nutshell, by attaching confidentiality and integrity labels to data ingested by an agent and propagating these labels as the data is processed, information flow analysis can determine when confidential or untrusted data influences the actions of an agent. A runtime monitor can then enforce security policies before executing consequential actions (e.g., for tool-calling agents, based on the label of the context that generated a tool call and the labels of the arguments of the call).

Schedule

Location: IT University, Rued Langgaards Vej 7, 2300 Copenhagen, Aud. 2 (16 min away form Bella Center)

08:50 - 09:00
Opening Remarks
09:00 - 10:30
Block 1: Securing Real-World AI Agents
Keynote: Luca Beurer-Kellner (Snyk)
Contributed talk: HackAgent: An Open-Source Framework for AI Agent Security Testing
Contributed talk: From Traditional to Agentic: AI Security Models Autonomously Acting Language Models Against Emergent Threats
Guided discussion session (Aud. 2, Aud. 3, Aud. 4, 3C03)
10:30 - 11:00
Break
11:00 - 12:30
Block 2: Practical LLM Security
Keynote: Kathrin Grosse (IBM Research)
Contributed talk: NLP Security and Ethics, In the Wild
Contributed talk: Beyond Human-Like: Measuring Vulnerabilities from LLM Anthropomorphism across cultures
Guided discussion session (Aud. 2, Aud. 3, Aud. 4, 3C03)
12:30 - 13:30
Lunch Break
13:30 - 15:00
Block 3: Trade-offs in System-level defences against Prompt Injections
Keynote: Ilia Shumailov
Contributed talk: Systems thinking for LLM security
Contributed talk:A Compositional Algebra for Provable LLM Safety
Guided discussion session (Aud. 2, Aud. 3, Aud. 4, 3C03)
15:00 - 15:30
Break
15:30 - 17:00
Block 4: Securing AI Agents with Information Flow Control
Keynote: Santiago Zanella-Béguelin
Contributed talk: PATCH: Mitigating PII Leakage in Language Models with Privacy-Aware Targeted Circuit Patching
Contributed talk: Fairness-Safety Alignment in Audio Large Language Models
Guided discussion session (Aud. 2, Aud. 3, Aud. 4, 3C03)

Contributed Talks

We are excited to share that the following researchers which be presenting their work as lightning talks at the workshop. Thank you to all who submitted talks and supported the review process.

Block 2: Practical Security
NLP Security and Ethics, In the Wild
Heather Lent
Block 1: AI Agents
From Traditional to Agentic AI: Securing Autonomously Acting Language Models Against Emergent Threats
Alyssa Columbus
Block 3: System Defences
A Compositional Algebra for Provable LLM Safety
Sam Wang, Alasdair Paren
Block 1: AI Agents
HackAgent: An Open-Source Framework for AI Agent Security Testing
Nicola Franco
Block 2: Practical Security
Beyond Human-Like: Measuring Vulnerabilities from LLM Anthropomorphism across cultures
Siddharth Milind Pawar, Sarah Masud
Block 4: Privacy & Alignment
Fairness-Safety Alignment in Audio Large Language Models
Ranya Aloufi, Srishti Gupta, Lea Schönherr
Block 4: Privacy & Alignment
PATCH: Mitigating PII Leakage in Language Models with Privacy-Aware Targeted Circuit Patching
Anthony Hughes
Block 3: System Defences
Systems thinking for LLM security
Maia Fraser

Topics of Interest

Organizers

Egor Zverev
Egor Zverev
Institute of Science and Technology Austria
Aideen Fay
Aideen Fay
Microsoft & Imperial College London
Sahar Abdelnabi
Sahar Abdelnabi
Microsoft, ELLIS Institute Tübingen, MPI-IS & Tübingen AI Center
Mario Fritz
Mario Fritz
CISPA Helmholtz Center & Saarland University
Christoph H. Lampert
Christoph H. Lampert
Institute of Science and Technology Austria

Volunteers

Alexander Panfilov
Alexander Panfilov
ELLIS / IMPRS-IS

Discussion Chairs

Guided discussion in each thematic block will be moderated by a group of discussion chairs consisting of the speaker, organizers and volunteers who will guide the conversation between the workshop participants.

Block 1: Securing Real-World AI Agents
Luca Beurer-Kellner
Luca Beurer-Kellner
Snyk / Invariant Labs
Egor Zverev
Egor Zverev
ISTA
Marc Fischer
Marc Fischer
Snyk / Invariant Labs
Maura Pintor
Maura Pintor
University of Cagliari
Block 2: Practical LLM Security
Kathrin Grosse
Kathrin Grosse
IBM Research, Zurich
Egor Zverev
Egor Zverev
ISTA
Maura Pintor
Maura Pintor
University of Cagliari
Marc Fischer
Marc Fischer
Snyk / Invariant Labs
Block 3: Trade-offs in System-level defences against Prompt Injections
Ilia Shumailov
Ilia Shumailov
AI Sequrity Company
Alasdair Paren
Alasdair Paren
University of Oxford
Marc Fischer
Marc Fischer
Snyk / Invariant Labs
Block 4: Privacy and Alignment
Santiago Zanella-Béguelin
Santiago Zanella-Béguelin
Microsoft Research Cambridge
Srishti Gupta
Srishti Gupta
University of Cagliari
Anthony Hughes
Anthony Hughes
University of Sheffield

Supported By

ELSA
ELSA

Contact

For questions about the workshop, please contact:

egor [dot] zverev [at] ist.ac.at

EurIPS 2025 Workshop on Foundations of Language Model Security
December 6, 2025 • Copenhagen, Denmark