Anirudh S Chakravarthy
I'm a Machine Learning Engineer at Cruise, working on multi-task learning and large models for long-tail recognition in autonomous driving.
I graduated with a Master of Science in Computer Vision (MSCV) from the Robotics Institute, Carnegie Mellon University (CMU), where I was advised by Prof. Deva Ramanan.
My research at CMU focused on extending panoptic segmentation into the open world and discovering novel objects without explicit supervision.
Want to chat? Feel free to reach out!
Email | Resume | Google Scholar | LinkedIn
Education
- M.S. in Computer Vision, 2022, Carnegie Mellon University, USA
- B.E. in Computer Science, 2021, BITS Pilani, India
Publications
Lidar Panoptic Segmentation in an Open World
Anirudh Chakravarthy, Meghana Reddy Ganesina, Peiyun Hu, Laura Leal-Taixé, Shu Kong, Deva Ramanan, Aljosa Osep
International Journal of Computer Vision, 2024
[pdf] [arXiv] [project page] [code]
We introduce a new problem setting, Lidar Panoptic Segmentation in an Open World (LiPSOW), and establish protocols for evaluating methods under it. We propose Hierarchical LiDAR Panoptic Segmentation (HLPS), which combines geometric clustering with lidar semantic segmentation to achieve strong performance on LiPSOW.
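To make the idea of pairing a semantic predictor with class-agnostic geometric grouping concrete, here is a minimal, illustrative Python sketch (not the paper's HLPS implementation): points predicted outside a hypothetical set of known classes are grouped into candidate novel instances with DBSCAN. The class IDs, clustering thresholds, and function names are assumptions for illustration only.

```python
# Illustrative sketch: cluster "unknown" lidar points into candidate novel instances.
import numpy as np
from sklearn.cluster import DBSCAN

KNOWN_CLASSES = {0, 1, 2, 3}  # hypothetical IDs of known semantic classes

def open_world_instances(points_xyz: np.ndarray, semantic_labels: np.ndarray) -> np.ndarray:
    """Assign instance IDs to points predicted as unknown by clustering them in 3D."""
    instance_ids = np.full(len(points_xyz), -1, dtype=np.int64)
    unknown_mask = ~np.isin(semantic_labels, list(KNOWN_CLASSES))
    if unknown_mask.any():
        # DBSCAN groups nearby unknown points into candidate novel objects.
        clusters = DBSCAN(eps=0.5, min_samples=10).fit_predict(points_xyz[unknown_mask])
        instance_ids[unknown_mask] = clusters
    return instance_ids

# Usage with random stand-in data:
pts = np.random.rand(1000, 3) * 50.0
sem = np.random.randint(0, 6, size=1000)
ids = open_world_instances(pts, sem)
```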
PROFIT: A PROximal FIne Tuning Optimizer
Anirudh Chakravarthy, Shuai Zheng, Xin Huang, Sachithra Hemachandra, Yuning Chai, Zhao Chen
Under Review at ECCV, 2024
We present PROFIT, one of the first optimizers designed specifically for converged models that must be incrementally fine-tuned on a new task or dataset. Unlike standard optimizers such as SGD or Adam, which make minimal assumptions because the model weights may be randomly initialized, PROFIT exploits the structure of a converged model to regularize the optimization process and obtain better results.
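As a rough illustration of proximal-style fine-tuning, the sketch below adds a penalty that pulls the weights toward the converged reference model during fine-tuning. This is an assumed, simplified PyTorch formulation for intuition only, not the actual PROFIT update rule; the function name, reference-state handling, and the lam hyperparameter are hypothetical.

```python
# Illustrative sketch: fine-tuning with a proximal penalty toward converged weights.
import torch
import torch.nn as nn

def finetune_step(model, ref_state, batch, loss_fn, optimizer, lam=1e-3):
    """One update that adds a proximal penalty ||theta - theta_ref||^2 to the task loss."""
    optimizer.zero_grad()
    inputs, targets = batch
    task_loss = loss_fn(model(inputs), targets)
    # Penalize drift from the converged reference parameters.
    prox = sum(((p - ref_state[name].to(p.device)) ** 2).sum()
               for name, p in model.named_parameters())
    (task_loss + 0.5 * lam * prox).backward()
    optimizer.step()
    return task_loss.item()

# Usage with a toy "converged" model:
model = nn.Linear(8, 2)
ref_state = {k: v.detach().clone() for k, v in model.state_dict().items()}
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
batch = (torch.randn(16, 8), torch.randint(0, 2, (16,)))
finetune_step(model, ref_state, batch, nn.CrossEntropyLoss(), opt)
```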
YouMVOS: An Actor-centric Multi-shot Video Object Segmentation Dataset
Donglai Wei, Siddhant Kharbanda, Sarthak Arora, Roshan Roy, Nishant Jain, Akash Palrecha, Tanav Shah, Shray Mathur, Ritik Mathur, Abhijay Kemkar, Anirudh Chakravarthy, Zudi Lin, Won-Dong Jang, Yansong Tang, Song Bai, James Tompkin, Philip H.S. Torr, Hanspeter Pfister
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[project page] [pdf]
We introduce a new dataset and benchmark, YouMVOS, for multi-shot video object segmentation.
Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation
Anirudh Chakravarthy, Won-Dong Jang, Zudi Lin, Donglai Wei, Song Bai, Hanspeter Pfister
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021
[pdf] [arXiv] [code]
We identify mask quality as a bottleneck for video instance segmentation. To overcome this, we propose an attention-based network that propagates missing object instances across frames. Our method significantly outperforms previous state-of-the-art algorithms that use a Mask R-CNN backbone, achieving 36.0% mAP on the YouTube-VIS benchmark.
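The sketch below illustrates the general idea of propagating object features across frames with cross-attention (queries from the current frame, keys and values from the previous frame). The module name, feature shapes, and hyperparameters are illustrative assumptions, not the exact architecture from the paper.

```python
# Illustrative sketch: cross-frame attention for propagating instance features.
import torch
import torch.nn as nn

class FramePropagation(nn.Module):
    """Propagate object features from the previous frame to the current one
    via multi-head cross-attention (queries: current frame, keys/values: previous)."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, curr_feats: torch.Tensor, prev_feats: torch.Tensor) -> torch.Tensor:
        # curr_feats: (B, N_curr, dim), prev_feats: (B, N_prev, dim)
        attended, _ = self.attn(query=curr_feats, key=prev_feats, value=prev_feats)
        return self.norm(curr_feats + attended)

# Usage with stand-in instance features:
prop = FramePropagation()
out = prop(torch.randn(2, 10, 256), torch.randn(2, 12, 256))
```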
MRSCAtt: A Spatio-Channel Attention-Guided Network for Mars Rover Image Classification
Anirudh Chakravarthy*, Roshan Roy*, Praveen Ravirathinam*
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021
[pdf] [code]
We propose MRSCAtt (Mars Rover Spatial and Channel Attention), a network that jointly uses spatial and channel attention to classify images accurately. Using images taken by NASA's Curiosity rover on Mars, we achieve state-of-the-art results with 81.53% test-set accuracy on the MSL Surface Dataset, outperforming prior methods.
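For intuition, here is a minimal CBAM-style module that applies channel attention followed by spatial attention to a feature map. It is a generic sketch of the mechanism, not the exact MRSCAtt design; layer sizes and names are assumptions.

```python
# Illustrative sketch: joint channel and spatial attention over a feature map.
import torch
import torch.nn as nn

class SpatioChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatial dims, re-weight channels.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: squeeze channels, re-weight spatial positions.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_mlp(x)
        spatial = torch.cat([x.mean(dim=1, keepdim=True),
                             x.amax(dim=1, keepdim=True)], dim=1)
        return x * self.spatial_conv(spatial)

# Usage on a stand-in feature map:
attn = SpatioChannelAttention(64)
out = attn(torch.randn(1, 64, 32, 32))
```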
Experience
Cruise LLC
Machine Learning Engineer, Feb 2023 - Present
Large Models for Long-tail Perception
Cruise LLC
Machine Learning Engineer Intern, May 2022 - Aug 2022
Multi-task Learning for Long-tail Perception
Visual Computing Group, Harvard University
Research Intern, May 2020 - July 2021
Video Instance Segmentation
Computer Vision and Robotics Lab, University of Illinois Urbana-Champaign
Research Intern, May 2020 - Dec 2020
Vital Parameter Estimation
Projects
Self-Supervised Camera Pose Estimation with Geometric Consistency
[pdf] [code]
Existing camera pose estimation methods rely on ground-truth odometry for supervision, which can be expensive to obtain. In this work, we train a transformer-based pose estimation network in a self-supervised manner, leveraging advances in monocular depth estimation.
Is Monocular Vision Sufficient for Multi-View Visual Odometry?
[pdf] [code]
For visual localization, robots often carry multiple cameras whose views can be used to infer odometry (multi-view visual odometry). Existing works either rely heavily on scene geometry or use complicated networks, both of which pose challenges for real-world generalization. In this work, we develop simple yet strong baselines for multi-view visual odometry by fusing per-camera monocular visual odometry estimates.
Constrained Humanification: Improving Multi-Person Reconstruction Using Temporal Constraints
[pdf] [code]
Multi-person 3D reconstruction is challenging, yet no prior work aims to disambiguate inter-person occlusions using temporal information.
Motivated by this, we leverage optical flow as a cue to improve 3D human pose estimation in crowded scenes.
Latent Space Robustness of Generative Models
[project page] [code]
Generative models such as StyleGAN have shown very promising results. However, when using such GANs for face generation, we often encounter non-photorealistic outputs (e.g., artifacts or images that are not face-like). In this project, we aim to formally establish the existence of such failure modes in GANs.