Language model security stands at a critical juncture. Despite extensive work on adversarial attacks and basic defenses, we still lack a deep understanding of the principles that drive these vulnerabilities: the mathematical and computational properties that create them, how model internals process adversarial inputs, and whether current evaluations capture real-world security risks.
This workshop brings together researchers in adversarial robustness, conversational and sociotechnical AI safety, and broader LLM security to move beyond surface-level observations—probing the mechanisms behind vulnerabilities and charting a path toward genuinely secure architectures.
Emphasizing foundational understanding over incremental improvements, we ask: What mathematical and computational properties give rise to these vulnerabilities? How do model internals process adversarial inputs? And do current evaluations capture the security risks that matter in practice?
Our goal is to catalyze rigorous, cross-disciplinary discussion that advances the theoretical, empirical, and evaluative foundations of language model security.
The workshop consists of four thematic blocks. Each block includes an expert keynote (45 minutes), two contributed talks (15 minutes each), and an extended guided discussion (45 minutes) among participants, presenters, and domain experts. Our format prioritizes deep engagement and discussion over talk density.
We invite short contributed talks that advance the foundations of language model security. We are especially interested in work that clarifies the mathematical and computational properties underlying vulnerabilities, sheds light on how model internals process adversarial inputs, and proposes evaluation frameworks that better capture real-world security risks.
Submissions will be assigned to thematic blocks by the organizers. Each submission receives reviews from three authors randomly selected from other thematic blocks. Selection emphasizes fit with the workshop's foundational themes, clarity, novelty of insight, and potential to generate discussion. We explicitly encourage work-in-progress and preliminary findings that advance foundational understanding.
For questions about the workshop, please contact:
egor.zverev@ist.ac.at
EurIPS 2025 Workshop on Foundations of Language Model Security
December 6-7, 2025 • Copenhagen, Denmark