Research
Post-training Large Language Models for Diverse High-Quality Responses
Yilei Chen, Souradip Chakraborty, Lorenz Wolf, Ioannis Paschalidis, Aldo Pacchiano
arXiv
Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward
Dipendra Misra*, Aldo Pacchiano*, Ta-Chung Chi, Ge Gao
NeurIPS 2025
Language Model Personalization via Reward Factorization
Idan Shenfeld*, Felix Faltings*, Pulkit Agrawal, Aldo Pacchiano
COLM 2025
Contextual Bandits with Stage-wise Constraints
Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett
JMLR 2025
Learning to Explore: An In-Context Learning Approach for Pure Exploration
Alessio Russo*, Ryan Welch*, Aldo Pacchiano
arXiv
Multiple-policy Evaluation via Density Estimation
Yilei Chen, Aldo Pacchiano, Ioannis Ch. Paschalidis
ICML 2025
On the Hardness of Bandit Learning
Nataly Brukhim*, Aldo Pacchiano*, Miroslav Dudik, Robert Schapire
COLT 2025
Meet Me at the Arm: The Cooperative Multi-Armed Bandits Problem with Shareable Arms
Xinyi Hu, Aldo Pacchiano
arXiv
Adaptive Exploration for Multi-Reward Multi-Policy Evaluation
Alessio Russo, Aldo Pacchiano
ICML 2025
Active Preference Optimization for Sample Efficient RLHF
Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury
ECML-PKDD 2025
Learning Rate-Free Reinforcement Learning: A Case for Model Selection with Non-Stationary Objectives
Aida Afshar, Aldo Pacchiano
arXiv
A Theoretical Framework for Partially-Observed Reward States in RLHF
Chinmaya Kausik, Mirco Mutti, Aldo Pacchiano, Ambuj Tewari
ICLR 2025
Provable Interactive Learning with Hindsight Instruction Feedback
Dipendra Misra*, Aldo Pacchiano*, Robert E Schapire
ICML 2024
Data-Driven Regret Balancing for Online Model Selection in Bandits
Aldo Pacchiano, Christoph Dann, Claudio Gentile
AISTATS 2024
A Unified Model and Dimension for Interactive Estimation
Nataly Brukhim, Aldo Pacchiano, Miroslav Dudik, Robert Schapire
NeurIPS 2023
Anytime Model Selection in Linear Bandits
Parnian Kassraie, Aldo Pacchiano, Nicolas Emmenegger, Andreas Krause
NeurIPS 2023
Experiment Planning with Function Approximation
Aldo Pacchiano, Jonathan Lee, Emma Brunskill
NeurIPS 2023
Supervised Pretraining Can Learn In-Context Reinforcement Learning
Jonathan Lee*, Annie Xie*, Aldo Pacchiano, Yash Chandak, Chelsea Finn, Ofir Nachum, Emma Brunskill
NeurIPS 2023



















