Research

Post-training Large Language Models for Diverse High-Quality Responses

Yilei Chen, Souradip Chakraborty, Lorenz Wolf, Ioannis Paschalidis, Aldo Pacchiano

ICLR 2026

In-Context Learning Approach for Pure Exploration

Alessio Russo*, Ryan Welch*, Aldo Pacchiano

ICLR 2026

Meet Me at the Arm: The Cooperative Multi-Armed Bandits Problem with Shareable Arms

Xinyi Hu, Aldo Pacchiano

AISTATS 2026

The Good, the Bad, and the Sampled: a No-Regret Approach to Safe Online Classification

Tavor Baharav*, Spyros Dragazis*, Aldo Pacchiano

AISTATS 2026 (spotlight)

Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward

Dipendra Misra*, Aldo Pacchiano*, Ta-Chung Chi, Ge Gao

NeurIPS 2025

Language Model Personalization via Reward Factorization

Idan Shenfeld*, Felix Faltings*, Pulkit Agrawal, Aldo Pacchiano

COLM 2025

Contextual Bandits with Stage-wise Constraints

Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett

JMLR 2025

Multiple-policy Evaluation via Density Estimation

Yilei Chen, Aldo Pacchiano, Ioannis Ch. Paschalidis

ICML 2025

On the Hardness of Bandit Learning

Nataly Brukhim*, Aldo Pacchiano*, Miroslav Dudik, Robert Schapire

COLT 2025

Adaptive Exploration for Multi-Reward Multi-Policy Evaluation

Alessio Russo, Aldo Pacchiano

ICML 2025

Active Preference Optimization for Sample Efficient RLHF

Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury

ECML-PKDD 2025

Second Order Bounds for Contextual Bandits with Function Approximation

Aldo Pacchiano

ICLR 2025

Pure Exploration with Feedback Graphs

Alessio Russo, Yichen Song, Aldo Pacchiano

AISTATS 2025

Learning Rate-Free Reinforcement Learning: A Case for Model Selection with Non-Stationary Objectives

Aida Afshar, Aldo Pacchiano

arXiv

A Theoretical Framework for Partially-Observed Reward States in RLHF

Chinmaya Kausik, Mirco Mutti, Aldo Pacchiano, Ambuj Tewari

ICLR 2025

Provable Interactive Learning with Hindsight Instruction Feedback

Dipendra Misra*, Aldo Pacchiano*, Robert E. Schapire

ICML 2024

Data-Driven Regret Balancing for Online Model Selection in Bandits

Aldo Pacchiano, Christoph Dann, Claudio Gentile

AISTATS 2024

A Unified Model and Dimension for Interactive Estimation

Nataly Brukhim, Aldo Pacchiano, Miroslav Dudik, Robert Schapire

NeurIPS 2023

Anytime Model Selection in Linear Bandits

Parnian Kassraie, Aldo Pacchiano, Nicolas Emmenegger, Andreas Krause

NeurIPS 2023

Experiment Planning with Function Approximation

Aldo Pacchiano, Jonathan Lee, Emma Brunskill

NeurIPS 2023

Supervised Pretraining Can Learn In-Context Reinforcement Learning

Jonathan Lee*, Annie Xie*, Aldo Pacchiano, Yash Chandak, Chelsea Finn, Ofir Nachum, Emma Brunskill

NeurIPS 2023