Anirudh S Chakravarthy

I'm a Senior Machine Learning Engineer at Cruise, working on multi-task learning and large models for long-tail recognition in autonomous driving. I received my Master of Science in Computer Vision (MSCV) from the Robotics Institute, Carnegie Mellon University (CMU), where I was advised by Prof. Deva Ramanan. My research at CMU focused on extending panoptic segmentation into the open world and discovering novel objects without explicit supervision.

Want to chat? Feel free to reach out!

Email  |  Resume  |  Google Scholar  |  LinkedIn


Education
  • M.S. in Computer Vision, 2022
    Carnegie Mellon University, USA

  • B.E. in Computer Science, 2021
    BITS Pilani, India

Research
Lidar Panoptic Segmentation in an Open World
Anirudh Chakravarthy, Meghana Reddy Ganesina, Peiyun Hu, Laura Leal-Taixé, Shu Kong, Deva Ramanan, Aljosa Osep
International Journal of Computer Vision, 2024
[pdf] [arxiv] [project page] [code]

Current Lidar Panoptic Segmentation (LPS) methods make an unrealistic assumption that the semantic class vocabulary is fixed in the real world, but in fact, class ontologies usually evolve over time as robots encounter instances of novel classes. To address this unrealistic assumption, we study LPS in the Open World (LiPSOW).

PROFIT: A Specialized Optimizer for Deep Fine Tuning
Anirudh Chakravarthy, Shuai Kyle Zheng, Xin Huang, Sachithra Hemachandra, Xiao Zhang, Yuning Chai, Zhao Chen
Under Review at ICML, 2025
[arxiv]

Fine-tuning pre-trained models has become invaluable in computer vision and robotics. We present PROFIT, one of the first optimizers specifically designed for incrementally fine-tuning converged models on new tasks or datasets.

YouMVOS: An Actor-centric Multi-shot Video Object Segmentation Dataset
Donglai Wei, Siddhant Kharbanda, Sarthak Arora, Roshan Roy, Nishant Jain, Akash Palrecha, Tanav Shah, Shray Mathur, Ritik Mathur, Abhijay Kemkar, Anirudh Chakravarthy, Zudi Lin, Won-Dong Jang, Yansong Tang, Song Bai, James Tompkin, Philip H.S. Torr, Hanspeter Pfister
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[project page] [pdf]

We introduce a new dataset and benchmark, YouMVOS, for multi-shot video object segmentation.

Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation
Anirudh Chakravarthy, Won-Dong Jang, Zudi Lin, Donglai Wei, Song Bai, Hanspeter Pfister
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021
[pdf] [arxiv] [code]

We identify mask quality as a bottleneck for video instance segmentation. To overcome this, we propose an attention-based network to propagate missing object instances. Our method significantly outperforms previous state-of-the-art algorithms using the Mask R-CNN backbone, achieving 36.0% mAP on the YouTube-VIS benchmark.

MRSCAtt: A Spatio-Channel Attention-Guided Network for Mars Rover Image Classification
Anirudh Chakravarthy*, Roshan Roy*, Praveen Ravirathinam*
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021
[pdf] [code]

We propose MRSCAtt (Mars Rover Spatial and Channel Attention), a network that jointly uses spatial and channel attention to classify images accurately. Using images taken by NASA's Curiosity rover on Mars, we show that our approach achieves state-of-the-art results, with 81.53% test set accuracy on the MSL Surface Dataset.

Work Experience
Cruise LLC
Machine Learning Engineer Feb 2023 - Present

Large Models for Long-tail Perception

Cruise LLC
Machine Learning Engineer Intern May 2022 - Aug 2022

Multi-task Learning for Long-tail Perception

Visual Computing Group, Harvard University
Research Intern May 2020 - July 2021

Video Instance Segmentation

Computer Vision and Robotics Lab, University of Illinois Urbana-Champaign
Research Intern May 2020 - Dec 2020

Vital Parameter Estimation

Projects
Self-Supervised Camera Pose Estimation with Geometric Consistency
[pdf] [code]

Existing camera pose estimation methods make use of ground-truth odometry as supervision, which may be expensive to obtain. In this work, we train a transformer-based pose estimation network in a self-supervised manner, leveraging advances in monocular depth estimation.

Is Monocular Vision Sufficient for Multi-View Visual Odometry?
[pdf] [code]

For visual localization, a robot often carries multiple cameras that can be used to infer odometry (known as multi-view visual odometry). Existing works either rely heavily on scene geometry or use complicated networks, which poses challenges for real-world generalization. In this work, we aim to develop simple yet strong baselines for multi-view visual odometry by fusing estimates from monocular visual odometry.

Constrained Humanification: Improving Multi-Person Reconstruction Using Temporal Constraints
[pdf] [code]

Multi-person 3D reconstruction is challenging, yet no prior work aims to disambiguate inter-person occlusions using temporal information. Motivated by this, we leverage optical flow as a cue to improve 3D human pose estimation in crowded scenes.

Latent Space Robustness of Generative Models
[project page] [code]

Generative models such as StyleGAN have shown very promising results. However, when using such GANs for face generation, we often encounter non-photorealistic generations (e.g., artifacts or outputs that are not face-like). In this project, we aim to formally establish the existence of such failure modes in GANs.

News
[Sept 2024] Our work on Lidar Panoptic Segmentation in an Open World has been accepted to IJCV 2024!
[Feb 2023] I joined Cruise full-time!
[Dec 2022] I graduated from Carnegie Mellon University with a Master's in Computer Vision!
[May 2022] I'll be joining Cruise over the summer.
[Mar 2022] Our work and dataset on multi-shot video object segmentation have been accepted to CVPR 2022!
[Oct 2021] I'll be joining the CMU Argo AI Center for Autonomous Vehicle Research as a Research Collaborator under the guidance of Prof. Deva Ramanan.
[Aug 2021] I began my Master's in Computer Vision (MSCV) at the Robotics Institute, Carnegie Mellon University (CMU).
[May 2021] I graduated from BITS Pilani, India with a Bachelor's in Computer Science.
[Apr 2021] Two papers accepted at CVPRW 2021!
[Aug 2020] I'll be joining UIUC CVRL under the guidance of Prof. Narendra Ahuja as a research intern.
[May 2020] I'll be joining Harvard VCG under the guidance of Prof. Hanspeter Pfister for my undergraduate thesis.

Source code from Jon Barron

This page has been accessed at least several times since 30th Dec 2022.