
OverRef: Studying Over-Refusal in Large Language Models

Jan 1, 2025

OverRef is an ongoing project on over-refusal in large language models (LLMs): it studies when and why models refuse legitimate user queries, alongside benchmarking and dataset resources.

Last updated on Jan 1, 2025
Tags: Large Language Models, Evaluation, Benchmarking
Author: Sara Candussio, PhD student


© 2026 Me. This work is licensed under CC BY NC ND 4.0
