Sahand Rezaei-Shoshtari

PhD Candidate

About Me

I am a PhD candidate in Computer Science at McGill University and Mila, co-supervised by Prof. David Meger and Prof. Doina Precup.


My research focuses on reinforcement learning, in particular state abstraction, fairness, and transfer learning. I have also worked on projects involving materials discovery and reinforcement learning from human feedback.

Education

McGill University, Montreal, Canada

Sep. 2020 - Present

PhD in Computer Science

McGill University, Montreal, Canada

Sep. 2017 - Dec. 2019

Master of Engineering (Thesis) in Mechanical Engineering

Thesis: Learning Manipulator Dynamics for Control and Interaction Inference

University of Tehran, Tehran, Iran

Sep. 2012 - Sep. 2016

Bachelor of Engineering in Mechanical Engineering

Experience

Microsoft Research AI for Science, Montreal, Canada

Researcher

AI for Science

Microsoft Research AI for Science, Amsterdam, Netherlands

Research Intern

Deep reinforcement learning for chemical reaction discovery

Samsung AI Center, Montreal, Canada

Research Intern

Hypernetworks for meta imitation learning and meta reinforcement learning

Samsung AI Center, Montreal, Canada

Research Intern

Multimodal generative modeling for learning intuitive physics

Ubisoft La Forge, Montreal, Canada

AI Programmer

Deep reinforcement learning for automated video game testing

Samsung AI Center, Montreal, Canada

Research Intern

Object detection neural networks for human-robot interaction

Publications

Fairness in Reinforcement Learning with Bisimulation Metrics

Sahand Rezaei-Shoshtari, Hanna Yurchyk, Scott Fujimoto, Doina Precup, and David Meger. Pre-print. 2024.

Policy Gradient Methods in the Presence of Symmetries and State Abstractions

Prakash Panangaden*, Sahand Rezaei-Shoshtari*, Rosie Zhao*, David Meger, and Doina Precup. Journal of Machine Learning Research (JMLR). 2024.

Hypernetworks for Zero-shot Transfer in Reinforcement Learning

Sahand Rezaei-Shoshtari, Charlotte Morissette, Francois R. Hogan, Gregory Dudek, and David Meger. In Thirty-Seventh AAAI Conference on Artificial Intelligence. 2023.

Continuous MDP Homomorphisms and Homomorphic Policy Gradient

Sahand Rezaei-Shoshtari, Rosie Zhao, Prakash Panangaden, David Meger, and Doina Precup. In Advances in Neural Information Processing Systems (NeurIPS). 2022.

Learning Intuitive Physics with Multimodal Generative Models

Sahand Rezaei-Shoshtari, Francois R. Hogan, Michael Jenkin, David Meger, and Gregory Dudek. In Thirty-Fifth AAAI Conference on Artificial Intelligence. 2021.

Seeing Through your Skin: Recognizing Objects with a Novel Visuotactile Sensor

Francois R. Hogan, Michael Jenkin, Sahand Rezaei-Shoshtari, Yogesh Girdhar, David Meger, and Gregory Dudek. In The IEEE Winter Conference on Applications of Computer Vision (WACV). 2021.

Jacobian of Conditional Generative Models for Sensitivity Analysis of Photovoltaic Device Processes

Maryam Molamohammadi, Sahand Rezaei-Shoshtari, and Nathaniel Quitoriano. In Machine Learning for Engineering Workshop @ NeurIPS 2020.

Learning the Latent Space of Robot Dynamics for Cutting Interaction Inference

Sahand Rezaei-Shoshtari, David Meger, and Inna Sharf. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 2020.

Cascaded Gaussian Processes for Data-efficient Robot Dynamics Learning

Sahand Rezaei-Shoshtari, David Meger, and Inna Sharf. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 2019.

Select Projects

Unpaired RLHF

  • Reinforcement learning from human feedback (RLHF) for LLM fine-tuning using unpaired preference data.

Contextual Control Suite

  • A series of contextual Markov decision processes (MDPs) based on MuJoCo for continuous control tasks.

Gym Forest Fire

  • A Gym environment of a forest fire simulation for tackling wildfires with RL.

Motion Planning and Control Utilities for Kinova Jaco 2

  • A ROS package for the Kinova Jaco 2 robot with various control and motion planning utilities.

Select Awards

NSERC Canada Graduate Scholarship-Doctoral (CGS-D) Award.
Total amount of $105,000 over 3 years.

Fonds de Recherche du Quebec - Nature et Technologies (FRQ-NT) Award.
Total amount of $70,000 over 3.5 years.

Grad Excellence Award.
Amount of $7,000 per year. McGill University.

DeepMind Grad Award.
Amount of $25,000 per year. DeepMind and McGill University.

NeurIPS 2022 Outstanding Reviewer.
Top 8% of all reviewers.

ICML 2022 Outstanding Reviewer.
Top 10% of all reviewers.

National University Entrance Exam.
Ranked 19th. Iran.

Certifications

Trustworthy and Responsible AI Learning (TRAIL)
Hosted by Mila. Montreal, Canada. April 2023.

Simons Institute Mathematics of Online Decision Making Workshop
Hosted by Simons Institute. Virtual. October 2020.

CIFAR Deep Learning and Reinforcement Learning Summer School
Hosted by CIFAR and Amii. Edmonton, Canada. July 2019.