Guy Davidson

PhD Candidate

About Me

I’m a cognitive scientist and PhD candidate at the NYU Center for Data Science, advised by Brenden Lake and Todd Gureckis. My dissertation (writing in progress) offers theoretical, empirical, and computational advances in the study of goals: how do we represent, reason about, and come up with them? My recent work extends this line to study task representations in large language models using mechanistic interpretability tools (as a visiting researcher at Meta FAIR with Adina Williams). I’m currently on the job market for post-PhD roles, focusing on building LLM systems that better understand user intents and pursue complex objectives, or on continuing to contribute to other interpretability and alignment efforts.

In my non-academic life, I live with my wife Sarah and our dog Lila, and spend time making homemade hot sauces, playing board games, and lifting weights.

Download CV
Interests
  • Goal representation and generation
  • Human-like goals for agents
  • Computational cognitive science
  • Goal/intent inference in LLMs
Education
  • PhD in Data Science

    New York University

  • MPhil in Data Science

    New York University

  • BSc in Computational Sciences

    Minerva University

Projects
Publications
(2025). Do different prompting methods yield a common task representation in language models?
(2025). Goal Inference using Reward-Producing Programs in a Novel Physics Environment. Forthcoming in the Proceedings of the 47th Annual Meeting of the Cognitive Science Society, CogSci 2025.
(2025). Novel Goal Creation and Evaluation in Open-Ended Games. Forthcoming in the Proceedings of the 47th Annual Meeting of the Cognitive Science Society, CogSci 2025.
(2025). Goals as Reward-Producing Programs. Nature Machine Intelligence.
(2025). Generate-Feedback-Refine: How Much Does Model Quality in Each Role Matter? Deep Learning 4 Code @ ICLR 2025.
(2024). Toward Complex and Structured Goals in Reinforcement Learning. Finding the Frame @ RLC 2024.
(2024). Spatial relation categorization in infants and deep neural networks. Cognition.
(2024). Toward Human-AI Alignment in Large-Scale Multi-Player Games. Wordplay @ ACL 2024, Association for Computational Linguistics.
(2023). Generating Human-Like Goals by Synthesizing Reward-Producing Programs. Intrinsically Motivated Open-Ended Learning @ NeurIPS 2023.
(2022). Creativity, Compositionality, and Common Sense in Human Goal Generation. Proceedings of the 44th Annual Meeting of the Cognitive Science Society, CogSci 2022.
(2022). A model of mood as integrated advantage. Psychological Review.
(2021). Examining Infant Relation Categorization Through Deep Neural Networks. Proceedings of the 43rd Annual Meeting of the Cognitive Science Society, CogSci 2021.
(2020). Investigating Simple Object Representations in Model-Free Deep Reinforcement Learning. Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, CogSci 2020.
(2020). Systematically Comparing Neural Network Architectures in Relation Learning. Object-Oriented Learning (OOL): Perception, Representation, and Reasoning @ ICML 2020.
(2020). Sequential mastery of multiple visual tasks: Networks naturally learn to learn and forget to forget. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
(2019). Contrasting the effects of prospective attention and retrospective decay in representation learning. The 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making.
(2019). Momentum and mood in policy-gradient reinforcement learning. The 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making.