Hi, I’m an incoming EECS PhD student at MIT. This summer, I’m working on continual learning at DeepSeek. Previously, I was a research fellow at Anthropic, where I worked on model distillation. I received my BA and MS from UC Berkeley, where I was advised by Dawn Song.

I’m broadly interested in LLM safety and post-training. Lately, I’ve been thinking about how to get denser reward signals from RL rollouts, and how to make LLMs better at AI research itself.