Anirudh S Chakravarthy

I'm a Machine Learning Engineer at Cruise, working on multi-task learning and large models for long-tail recognition in autonomous driving. I received my Master of Science in Computer Vision (MSCV) from the Robotics Institute at Carnegie Mellon University (CMU), where I was advised by Prof. Deva Ramanan. My research at CMU focused on extending panoptic segmentation to the open world and discovering novel objects without explicit supervision.

Want to chat? Feel free to reach out!

Email  |  Resume  |  Google Scholar  |  LinkedIn


  • M.S. in Computer Vision, 2022
    Carnegie Mellon University, USA

  • B.E. in Computer Science, 2021
    BITS Pilani, India

Lidar Panoptic Segmentation in an Open World
Anirudh Chakravarthy*, Meghana Reddy Ganesina*, Peiyun Hu, Laura Leal-Taixé, Shu Kong, Deva Ramanan, Aljosa Osep
Under Review at IJCV, 2024
[project page]

We introduce a new problem setting, Lidar Panoptic Segmentation in an Open World (LiPSOW), and establish protocols for evaluating methods under it. We propose Hierarchical LiDAR Panoptic Segmentation (HLPS), which combines geometric clustering with lidar semantic segmentation to achieve strong performance on LiPSOW.

PROFIT: A PROximal FIne Tuning Optimizer
Anirudh Chakravarthy*, Shuai Zheng, Xin Huang, Sachithra Hemachandra, Yuning Chai, Zhao Chen
Under Review at ECCV, 2024

We present PROFIT, one of the first optimizers designed specifically for converged models that need to be incrementally fine-tuned on a new task or dataset. Standard optimizers such as SGD and Adam make minimal assumptions because the model weights might be randomly initialized; PROFIT instead exploits the additional structure of a converged model to regularize the optimization process for better results.

YouMVOS: An Actor-centric Multi-shot Video Object Segmentation Dataset
Donglai Wei, Siddhant Kharbanda, Sarthak Arora, Roshan Roy, Nishant Jain, Akash Palrecha, Tanav Shah, Shray Mathur, Ritik Mathur, Abhijay Kemkar, Anirudh Chakravarthy, Zudi Lin, Won-Dong Jang, Yansong Tang, Song Bai, James Tompkin, Philip H.S. Torr, Hanspeter Pfister
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[project page] [pdf]

We introduce a new dataset and benchmark, YouMVOS, for multi-shot video object segmentation.

Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation
Anirudh Chakravarthy, Won-Dong Jang, Zudi Lin, Donglai Wei, Song Bai, Hanspeter Pfister
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021
[pdf] [arXiv] [code]

We identify mask quality as a bottleneck for video instance segmentation. To overcome this, we propose an attention-based network that propagates missing object instances. Our method significantly outperforms previous state-of-the-art algorithms using the Mask R-CNN backbone, achieving 36.0% mAP on the YouTube-VIS benchmark.

MRSCAtt: A Spatio-Channel Attention-Guided Network for Mars Rover Image Classification
Anirudh Chakravarthy*, Roshan Roy*, Praveen Ravirathinam*
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021
[pdf] [code]

We propose MRSCAtt (Mars Rover Spatial and Channel Attention), a network that jointly uses spatial and channel attention to classify images accurately. On images taken by NASA's Curiosity rover on Mars, our approach achieves state-of-the-art results, with 81.53% test-set accuracy on the MSL Surface Dataset.

Work Experience
Cruise LLC
Machine Learning Engineer Feb 2023 - Present

Large Models for Long-tail Perception

Cruise LLC
Machine Learning Engineer Intern May 2022 - Aug 2022

Multi-task Learning for Long-tail Perception

Visual Computing Group, Harvard University
Research Intern May 2020 - July 2021

Video Instance Segmentation

Computer Vision and Robotics Lab, University of Illinois Urbana-Champaign
Research Intern May 2020 - Dec 2020

Vital Parameter Estimation

Self-Supervised Camera Pose Estimation with Geometric Consistency
[pdf] [code]

Existing camera pose estimation methods make use of ground-truth odometry as supervision, which may be expensive to obtain. In this work, we train a transformer-based pose estimation network in a self-supervised manner, leveraging advances in monocular depth estimation.

Is Monocular Vision Sufficient for Multi-View Visual Odometry?
[pdf] [code]

For visual localization, a robot often carries multiple cameras, which can be used together to infer odometry (known as multi-view visual odometry). Existing works either rely heavily on scene geometry or use complicated networks, which poses challenges for real-world generalization. In this work, we aim to develop simple yet strong baselines for multi-view visual odometry by fusing estimates from monocular visual odometry.

Constrained Humanification: Improving Multi-Person Reconstruction Using Temporal Constraints
[pdf] [code]

Multi-person 3D reconstruction is challenging, yet no prior work aims to disambiguate inter-person occlusions using temporal information. Motivated by this, we leverage optical flow as a cue to improve 3D human pose estimation in crowded scenes.

Latent Space Robustness of Generative Models
[project page] [code]

Generative models such as StyleGAN have shown very promising results. However, when using such GANs for face generation, we often encounter non-photorealistic generations (e.g., images with artifacts or images that are not face-like). In this project, we aim to formally establish the existence of such failure modes in GANs.

[Feb 2023] I joined Cruise full-time!
[Dec 2022] I graduated from Carnegie Mellon University with a Master's in Computer Vision!
[May 2022] I'll be joining Cruise over the summer.
[Mar 2022] Our work and dataset on multi-shot video object segmentation has been accepted to CVPR 2022!
[Oct 2021] I'll be joining the CMU Argo AI Center for Autonomous Vehicle Research as a Research Collaborator under the guidance of Prof. Deva Ramanan.
[Aug 2021] I began my Master's in Computer Vision (MSCV) at the Robotics Institute, Carnegie Mellon University (CMU).
[May 2021] I graduated from BITS Pilani, India with a Bachelor's in Computer Science.
[Apr 2021] Two papers accepted at CVPRW 2021!
[Aug 2020] I'll be joining UIUC CVRL under the guidance of Prof. Narendra Ahuja as a research intern.
[May 2020] I'll be joining Harvard VCG under the guidance of Prof. Hanspeter Pfister for my undergraduate thesis.

Source code from Jon Barron
