I am a Robotics graduate of the School of Computer Science at Carnegie Mellon University (Pittsburgh). I specialize in Computer Vision and Machine Learning, focusing on autonomous vehicles. I work at Motional (a self-driving car company) as a Tech Lead and Sr. Machine Learning Engineer. In my current role, I am responsible for developing state-of-the-art transformer-based ML/Computer Vision algorithms to build better and faster online perception systems. Together with my team, I ensure that the ML networks I develop are efficient enough to be deployed on our production cars. Our networks take multi-modal input and produce accurate, runtime-efficient perception output (bounding boxes, segmentation masks, and much more).

Previously, in 2019, I worked at Aptiv as a perception intern. There, I developed a novel deep-learning-based auto-annotation tool for HD maps, and we patented the technology. I have also worked as a researcher on an autonomous truck collision avoidance project at Carnegie Mellon, sponsored by Daimler Trucks North America, where I was responsible for the perception, tracking, and sensor fusion pipeline. Before coming to CMU, I worked in the R&D division of Maruti Suzuki.

I co-founded a startup at CMU called Vera and won a small NSF grant through a business competition.

I have several patents and published research papers, mostly first-authored. I also believe in giving back to the Machine Learning research community through research paper reviews and keynote tutorial sessions. When I am not doing Machine Learning, I read about human psychology, financial investments, the future of humanity, and macroeconomics.

Education

Carnegie Mellon University

Master’s in Robotics, School of Computer Science (2018-2020)

Specialized in Artificial Intelligence, Computer Vision, Computer Science, and Robotics.

Delhi Technological University

Bachelor’s in Technology (2011-2015)

Specialized in Artificial Intelligence and Robotics

Professional Experience

Motional (2020-Present), California USA

Tech Lead and Sr. Machine Learning Engineer (Perception)

Led a team of 5 Machine Learning and Robotics engineers and interns developing a state-of-the-art vision network that runs in real time on-car. This network uses eight surround-view cameras and predicts agents in the 3D world. Authored a blog post about this project, published on Motional’s portal.

Deployed numerous Vision-first Deep Learning models on-car using TensorRT optimizations.

Managed the reading group and ML literature surveys within a team of 40+ Perception Engineers.

Co-authored 13 patents and 9 selected research papers in computer vision and computer science domains.

Aptiv (Summer’19), Pennsylvania USA

Perception Intern (AI)

Developed a segmentation algorithm for road intersections to auto-annotate HD maps by fusing images and LiDAR intensity maps. This project also led to a patent.

The success of this internship project led to the formation of a team of three full-time software engineers.

Daimler Trucks North America (2018-2020), Oregon USA

Research Collaborator (Perception Lead)

Developed a reliable high-speed oncoming-collision prevention system for country roads in the CARLA simulator.

Vera (Fall’19), Pennsylvania USA

Co-founder

Developed an AI assistant for lawyers to help with paralegal tasks. Won an NSF grant through a business competition at Carnegie Mellon.

Maruti Suzuki (2015-2018), India

Assistant Manager, Research and Development

Led a smart mobility project and collaborated with Homologation teams and Japanese researchers.

MSSL Global RSA (Summer’13), South Africa

Summer intern

Developed a computer-vision-based part detection and quality assurance tool that helped reduce the downtime of a 6-degrees-of-freedom robotic arm at the paint shop.

Patents

MACHINE LEARNING-BASED FRAMEWORK FOR DRIVABLE SURFACE ANNOTATION

Inventors: Sergi Adipraja Widjaja, Venice Erin Baylon Liong, Zhuang Jie Chong, Apoorv Singh

US 11,367,289 B1

TRAINING MACHINE LEARNING NETWORKS FOR CONTROLLING VEHICLE OPERATION

Inventors: Apoorv Singh, Varun Kumar Reddy Bankiti

Application number: US 18/141,014

MACHINE LEARNING-BASED FRAMEWORK FOR DRIVABLE SURFACE ANNOTATION

Inventors: Sergi, Venice, Zhuang, Apoorv Singh

US 2023/0016246 A1

ENRICHING FEATURE MAPS USING MULTIPLE PLURALITIES OF WINDOWS TO GENERATE BOUNDING BOXES

Inventors: Jongwoo, Apoorv Singh, Varun

US 2024/0062520 A1

ENRICHING OBJECT QUERIES USING A BIRD’S-EYE VIEW FEATURE MAP TO GENERATE BOUNDING BOXES

Inventors: Jongwoo, Apoorv Singh, Varun

US 2024/0062520 A1

ENRICHING FEATURE MAPS USING MULTIPLE PLURALITIES OF WINDOWS TO GENERATE BOUNDING BOXES

Inventors: Jongwoo, Apoorv Singh, Varun

Application number: PCT/US2023/072389

AGGREGATION OF DATA REPRESENTING GEOGRAPHICAL AREAS

Inventors: Apoorv Singh, Varun, Jeongil, Akankshya

US 2024/0125617 A1

ITERATIVE DEPTH ESTIMATION

Inventors: Akankshya, Apoorv Singh, Varun

Application number: US 18/163,708

MULTI-MODAL SENSOR-BASED NAVIGATION USING BOUNDING BOXES

Inventors: Apoorv Singh

Application number: PCT/US2023/018569

Vision-RADAR fusion for DETR-like 3D detections

Inventors: Apoorv Singh, Varun

US 2024/0127596 A1

SCENE-DEPENDENT OBJECT QUERY INITIALIZATION STRATEGY USING TEMPORAL CONSISTENCIES

Inventors: Apoorv Singh, Varun

US 2024/0127597 A1

Augment Bird’s Eye multi-view-Camera detections with Perspective View detections

Inventors: Apoorv Singh

Application number: US 63/470,125

Surround-Vision & RADAR fusion strategy using Transformers

Inventors: Apoorv Singh

Application number: US 63/505,385

Research Publications

Surround-view vision-based 3D detection for autonomous driving: A survey. Paper Link

Authors: Apoorv Singh

Conference: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

Vision-radar fusion for robotics BEV detections: A survey. Paper Link

Authors: Apoorv Singh

Conference: 2023 IEEE Intelligent Vehicles Symposium (IV)

Transformer-based sensor fusion for autonomous driving: A survey. Paper Link

Authors: Apoorv Singh

Conference: Proceedings of the IEEE/CVF International Conference on Computer Vision

Training Strategies for Vision Transformers for Object Detection. Paper Link

Authors: Apoorv Singh

Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

3M3D: Multi-view, multi-path, multi-representation for 3D object detection. Paper Link

Authors: J Park, Apoorv Singh, V Bankiti

Conference: 2023 IEEE International Conference on Image Processing (ICIP), 1930-1934

A Review on Objective-Driven Artificial Intelligence. Paper Link

Authors: Apoorv Singh

Conference: arXiv preprint arXiv:2308.10135

End-to-end Autonomous Driving using Deep Learning: A Systematic Review. Paper Link

Authors: Apoorv Singh

Conference: arXiv preprint arXiv:2311.18636

Trajectory-Prediction with Vision: A Survey. Paper Link

Authors: Apoorv Singh

Conference: Proceedings of the IEEE/CVF International Conference on Computer Vision

Multi-agent Collaborative Perception for Robotic Fleet: A Systematic Review. Paper Link

Authors: Apoorv Singh, G Raut, A Choudhary

Conference: Accepted in ECCV’24 (Conference yet to happen)

Generative AI in Vision: A Survey on Models, Metrics and Applications. Paper Link

Authors: G Raut, Apoorv Singh

Conference: arXiv preprint arXiv:2402.16369

Traffic Policeman Gesture Recognition With Spatial Temporal Graph Convolution Network. Paper Link

Authors: Apoorv Singh, A Choudhary

Conference: 2023 IEEE Conference on Artificial Intelligence (CAI), 40-41

Augmenting Vision Queries with RADAR for BEV Detection in Autonomous Driving. Paper Link

Authors: Apoorv Singh

Conference: 2023 IEEE Conference on Artificial Intelligence (CAI), 53-54

AI Community Contributions

Publication Peer-Review

Location: NeurIPS’23, CVPR’23, ICRA’22, ICML’22, ICCV’23, IEEE CAI’23, AAAI’23, and many more, totalling ~85.

Master’s Admission Committee

Location: Robotics Institute, Carnegie Mellon University

Years: 2023, 2024, 2025.

Technical Judge

Location: Pittsburgh Regional Science & Engineering Fair, 2023

Technical Judge

Location: FIRST Robotics Competition, 2023

Conference Program Chairs

Location: NeurIPS 2022, CVPR’23 Precognition Workshop, AAAI-2023

Keynote/Panel Sessions

Keynote: Vision-based Perception for Autonomous Driving Link

Location: IV 2023, Alaska, USA

Keynote: Vision-based Perception for Autonomous Driving Link

Location: IV 2024, Jeju Island, Korea

Keynote: 1st Workshop on Cooperative Intelligence for Embodied AI Link

Location: European Conference on Computer Vision 2024 in Milan, Italy

Workshop Organizer. Link

Location: 2023 IEEE Conference on Artificial Intelligence (IEEE CAI)

Organized a workshop on autonomous vehicles, with a panel discussion and two guest lecturers, in Santa Clara, California, to promote research on autonomous driving.

Panelist on Session on AI for Autonomous Driving. Link

Co-panelists: Shivam Gautam, Fang-Chieh Chou, Aleksandr Petiushko, Sachithra Hemachandra

Location: 2023 IEEE Conference on Artificial Intelligence (IEEE CAI)
