Aditya Aggarwal

I am a master's student in the CSE department at UC San Diego (since Fall 2022). I am currently working at Google as a SWE Intern in Privacy, Safety and Security, focusing on reducing the OpEx cost of manual review systems. I am also a Graduate Researcher in the Cognitive Robotics Lab, working on the Home Robot project under the mentorship of Prof. Henrik Christensen.

Previously, I was a Research Intern at Microsoft Research, India, in the Technology and Empowerment group, where I worked closely with Dr. Nipun Kwatra and Dr. Mohit Jain at the intersection of HCI, computer vision, and healthcare. My primary focus was on developing a low-cost, smartphone-based diagnostic solution for detecting eye diseases. I have also contributed to multiple open-source projects for the organization Robocomp.

I graduated from IIIT Hyderabad in 2020 with a B.Tech (Honors), where I was advised by Prof. K Madhava Krishna and Prof. Ravi Kiran Sarvadevabhatla. I worked mainly on vision-guided robot navigation and human activity understanding.

I am currently looking for full-time software engineering / machine learning opportunities starting January 2024, so if you have a position available or just want to say hi, feel free to email me.

Email  /  GitHub  /  Google Scholar  /  LinkedIn  /  Resume

profile photo

Experience

Google Logo

Google


Software Engineering Intern (June 2023 - September 2023)
Sunnyvale, USA

During the internship, I developed a gRPC service and a MapReduce-based data processing pipeline to migrate manual review records from Google Shopping to a centralized database. These records were then used to generate customized analytics and real-time alerts. By decoupling producers and consumers with messaging queues, I improved the system's scalability, handling bursts of up to 1000 queries per second (QPS) and a daily influx of 5000 records. I also made the architecture extensible by supporting both push and pull integration frameworks, which reduced client onboarding time from 8 weeks to 2 weeks.

Tech Stack: Java, MapReduce, SQL
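
The production system ran on Google-internal infrastructure, so none of it can be shown directly. As a rough sketch of the queueing idea (hypothetical ReviewRecord type, with an in-memory BlockingQueue standing in for the real messaging system), a bounded queue lets the gRPC handler absorb short bursts while a single consumer drains records to the database at a sustainable rate:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical record type; the real pipeline used internal Google schemas.
record ReviewRecord(String id, String verdict) {}

public class ReviewIngest {
    // A bounded queue absorbs bursts (e.g. 1000 QPS) while the database
    // writer drains at its own sustainable rate.
    private final BlockingQueue<ReviewRecord> queue = new ArrayBlockingQueue<>(10_000);

    // Called by the gRPC handler for every incoming record.
    public boolean enqueue(ReviewRecord r) {
        return queue.offer(r); // non-blocking; reject (or retry) when full
    }

    // Single daemon thread writing to the centralized database.
    public void startConsumer() {
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    writeToDatabase(queue.take()); // blocks until work arrives
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }

    private void writeToDatabase(ReviewRecord r) {
        System.out.println("persisted " + r.id()); // placeholder for the SQL write
    }

    public static void main(String[] args) throws InterruptedException {
        ReviewIngest ingest = new ReviewIngest();
        ingest.startConsumer();
        ingest.enqueue(new ReviewRecord("r1", "approved"));
        Thread.sleep(100); // give the daemon consumer time to drain
    }
}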

Microsoft Logo

Microsoft Research


Research Intern (February 2021 - August 2022)
Bangalore, India

I led a project that automated the retinoscopy method by combining a smartphone with a retinoscope. I developed an Android app that recorded videos at 120 fps and integrated it with a cloud-deployed video analysis pipeline that estimated the refractive error of the eye. The result was a potential real-world medical tool that could simplify and enhance refractive error screening. In a clinical evaluation involving 128 patients, the system achieved a state-of-the-art mean absolute error of 0.75 ± 0.67D.

Tech Stack: Python, Android SDK, Javascript, React, Flask, SQL

Gojek Logo

Gojek


Product Engineer (June 2020 - January 2021)
Bangalore, India

I collaborated on a ride-hailing platform used by 4 million daily users, focusing on improving on-demand features and the booking process. I developed a scheduling service that allowed future ride bookings, raising the booking conversion rate from 90% to 95%. Additionally, I used the Firebase Remote Config APIs to enhance system extensibility and fault tolerance across the Android and iOS platforms without requiring app updates.

Tech Stack: Go, Ruby, Kotlin, SQL
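
The production integration used the Firebase Remote Config SDK inside the apps; the sketch below only illustrates the underlying pattern (hypothetical FeatureFlags class and flag name, not the Firebase API). Remotely fetched flags backed by hardcoded local defaults are what allow behavior to be toggled server-side, and to degrade gracefully when a fetch fails, without shipping a new app binary:

import java.util.Map;

// Generic sketch of the remote-config pattern (not the Firebase SDK):
// flags are fetched from a server, with local defaults as a fallback,
// so features can be toggled without an app update.
public class FeatureFlags {
    private static final Map<String, Boolean> DEFAULTS =
            Map.of("scheduled_rides_enabled", false); // hypothetical flag

    private volatile Map<String, Boolean> remote = Map.of();

    // In production this would be an SDK/HTTP fetch; on failure the previous
    // (or default) values stay in place, which is the fault-tolerance win.
    public void refresh(Map<String, Boolean> fetched) {
        if (fetched != null) remote = fetched;
    }

    public boolean isEnabled(String flag) {
        return remote.getOrDefault(flag, DEFAULTS.getOrDefault(flag, false));
    }

    public static void main(String[] args) {
        FeatureFlags flags = new FeatureFlags();
        System.out.println(flags.isEnabled("scheduled_rides_enabled")); // false (default)
        flags.refresh(Map.of("scheduled_rides_enabled", true));         // server-side toggle
        System.out.println(flags.isEnabled("scheduled_rides_enabled")); // true, no app update
    }
}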

Research

I'm interested in problems involving computer vision, machine learning, and robotics with real-world applications. My research vision is to enable embodied agents to perceive our dynamic world and make intelligent decisions.

project image

Towards Automating Retinoscopy for Refractive Error Diagnosis


Aditya Aggarwal, Siddhartha Gairola, Nipun Kwatra, Mohit Jain
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), Volume 6, Issue 3, 2022
arxiv / project page / code / virtual presentation / bibtex

Refractive error is the most common eye disorder and is the key cause behind correctable visual impairment. It can be diagnosed using multiple methods, including subjective refraction, retinoscopy, and autorefractors. Retinoscopy is an objective refraction method that does not require any input from the patient. In this work, we automate retinoscopy by attaching a smartphone to a retinoscope and recording retinoscopic videos with the patient wearing a custom pair of paper frames. The results from our video processing pipeline and mathematical model indicate that our approach has the potential to be used as a retinoscopy-based refractive error screening tool in real-world medical settings.

project image

Quo Vadis, Skeleton Action Recognition?


Pranay Gupta, Anirudh Thatipelli, Aditya Aggarwal, Shubh Maheshwari, Neel Trivedi, Sourav Das, Ravi Kiran Sarvadevabhatla
International Journal of Computer Vision (IJCV), Special Issue on Human Pose, Motion, Activities and Shape in 3D, 2021
arxiv / project page / code / video / bibtex

In this work, we study current and upcoming frontiers across the landscape of skeleton-based human action recognition. We benchmark state-of-the-art models on the NTU-120 dataset and provide a multi-layered assessment. To examine skeleton action recognition 'in the wild', we introduce the Skeletics-152 and Skeleton-Mimetics datasets. Our results reveal the challenges and domain gap induced by 'in the wild' action videos.

project image

Reconstruct, Rasterize and Backprop: Dense shape and pose estimation from a single image


Aniket Pokale*, Aditya Aggarwal*, KM Jatavallabhula, K Madhava Krishna
CVPR Workshop on Long Term Visual Localization, Visual Odometry, and Geometric and Learning-Based SLAM, 2020
arxiv / project page / code / virtual presentation / bibtex

In this work, we present a new system to obtain dense object reconstructions along with 6-DoF poses from a single image. We demonstrate that our approach—dubbed reconstruct, rasterize and backprop (RRB)—achieves significantly lower pose estimation errors compared to prior art, and is able to recover dense object shapes and poses from imagery. We further extend our results to an (offline) setup, where we demonstrate a dense monocular object-centric egomotion estimation system.

* Both authors contributed equally to this work.

project image

A principled formulation of integrating objects in Monocular SLAM


Aniket Pokale, Dipanjan Das, Aditya Aggarwal, Brojeshwar Bhowmick, K Madhava Krishna
Advances in Robotics (AIR), 2019
ACM Proceedings / video / bibtex

In this paper, we present a novel edge-based SLAM framework, along with category-level models, to localize objects in the scene as well as improve the camera trajectory. We integrate object category models into the core SLAM back-end to jointly optimize the camera trajectory and the object poses, shapes, and 3D structure. We show that our joint optimization recovers a better camera trajectory than Edge SLAM.

Open Source Contributions

project image

I have also contributed to the open-source organization Robocomp for the past three years. I have mentored multiple projects for Google Summer of Code (GSoC) 2020 and 2021, and my proposal was selected as a project for the GSoC 2019 program.

Mentoring and Reviewing
Sign Language Recognition - GSoC 2021
Hand Gesture Recognition - GSoC 2020
Human recognition (identification) using multi-modal perception system - GSoC 2020

Selected Proposal
Implemented a people identification component for an educational bot - GSoC 2019

Education

UCSD

UC San Diego

Master's in Computer Science and Engineering, CGPA: 3.9/4
San Diego, CA (2022 - Present)
Graduate Researcher at Cognitive Robotics Lab

IIIT Hyderabad

IIIT Hyderabad

B.Tech with Honors in Electronics and Communication Engineering, CGPA: 8.9/10
Hyderabad, India (2016 - 2020)
Undergraduate Researcher at CVIT Lab, Robotics Research Center

Projects

project image

Fault tolerant distributed system

Built a distributed file storage system using gRPC, with support for concurrent reads and writes. Implemented consistent hashing and the Raft consensus protocol to make the system scalable and resilient to the failure of up to half of the servers.

Check out the code here.
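
The repository contains the full system; as a minimal sketch of the consistent-hashing piece alone (hypothetical HashRing class with a simplified hash function), each key is stored on the first server clockwise from its position on a hash ring, so adding or removing a server remaps only the keys on that server's arc:

import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent-hashing ring: each server owns the arc of hash space
// preceding its positions, so removing one server remaps only its own keys.
public class HashRing {
    private final SortedMap<Integer, String> ring = new TreeMap<>();
    private static final int VNODES = 100; // virtual nodes smooth the load

    public void addServer(String server) {
        for (int i = 0; i < VNODES; i++)
            ring.put(hash(server + "#" + i), server);
    }

    public void removeServer(String server) {
        for (int i = 0; i < VNODES; i++)
            ring.remove(hash(server + "#" + i));
    }

    // First server at or after the key's position, wrapping around the ring.
    public String serverFor(String key) {
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
    }

    private int hash(String s) {
        // Placeholder hash; a real system would use e.g. MurmurHash or SHA-1.
        return Integer.MAX_VALUE & s.hashCode();
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing();
        ring.addServer("s1"); ring.addServer("s2"); ring.addServer("s3");
        System.out.println(ring.serverFor("/photos/cat.jpg")); // one of s1..s3
        ring.removeServer("s2"); // only keys owned by s2 move elsewhere
        System.out.println(ring.serverFor("/photos/cat.jpg"));
    }
}
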
project image

Mosquitoes vs Drones

Developed an end-to-end system that identified possible mosquito breeding grounds (waterlogged areas) in aerial images. I also implemented a path planning algorithm for a drone to reach the predicted locations and destroy the mosquito larvae.

Check out the entire workflow in this video.
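
The project's actual planner is shown in the video; as a simplified stand-in (hypothetical GridPlanner class on a 4-connected occupancy grid), a breadth-first search finds a shortest obstacle-free route from the drone to a predicted breeding site:

import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Illustrative shortest-path search on a 4-connected occupancy grid
// (1 = obstacle); returns the number of steps, or -1 if unreachable.
public class GridPlanner {
    public static int pathLength(int[][] grid, int sr, int sc, int tr, int tc) {
        int rows = grid.length, cols = grid[0].length;
        int[][] dist = new int[rows][cols];
        for (int[] row : dist) Arrays.fill(row, -1);
        Queue<int[]> frontier = new ArrayDeque<>();
        dist[sr][sc] = 0;
        frontier.add(new int[]{sr, sc});
        int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (!frontier.isEmpty()) {
            int[] cell = frontier.poll();
            if (cell[0] == tr && cell[1] == tc) return dist[tr][tc];
            for (int[] m : moves) {
                int r = cell[0] + m[0], c = cell[1] + m[1];
                if (r >= 0 && r < rows && c >= 0 && c < cols
                        && grid[r][c] == 0 && dist[r][c] == -1) {
                    dist[r][c] = dist[cell[0]][cell[1]] + 1; // expand frontier
                    frontier.add(new int[]{r, c});
                }
            }
        }
        return -1; // target unreachable
    }

    public static void main(String[] args) {
        int[][] grid = {
            {0, 0, 0},
            {1, 1, 0},
            {0, 0, 0},
        };
        // Fly from the top-left corner around the obstacles to the
        // predicted site at the bottom-left.
        System.out.println(pathLength(grid, 0, 0, 2, 0)); // 6
    }
}
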
project image

CANSAT 2018

Our team FalconX ranked 24th at the annual CANSAT competition organized by The American Astronautical Society (AAS) in Stephenville, Texas. The competition simulated a satellite launch in the form of a soft drink can with functioning power, sensors, and a communication system, alongside a large hen's egg representing a delicate instrument. Launched from an altitude of 700 m above the launch site, our probe made a series of manoeuvres to descend at a controlled rate and finally landed without breaking the egg.

Check out this blog for more details.

project image

ABU Robocon 2018

Our team built an autonomous bot that used a pneumatic system to throw a shuttlecock tied to a thread through a ring. It also coordinated with a manually operated bot, which picked up the shuttlecock and passed it to the automatic bot.
Check out the problem statement for the contest here.

Check this video for a demo.
project image

Game of Life

Implemented the popular zero-player simulation game demonstrating cellular automata. It takes an input configuration, and the user can observe how it evolves over time. I followed the MVC architecture and the TDD paradigm while working on this project.

Check out this repository for code implementation in Java and Ruby.
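
The linked repository has the full Java and Ruby implementations; the core update rule is small enough to sketch here as a stripped-down version without the MVC and view layers:

// One generation of Conway's Game of Life on a fixed (non-wrapping) grid.
public class Life {
    static boolean[][] step(boolean[][] cells) {
        int rows = cells.length, cols = cells[0].length;
        boolean[][] next = new boolean[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int n = liveNeighbors(cells, r, c);
                // Live cell survives with 2-3 neighbors; dead cell is born with 3.
                next[r][c] = cells[r][c] ? (n == 2 || n == 3) : (n == 3);
            }
        }
        return next;
    }

    static int liveNeighbors(boolean[][] cells, int r, int c) {
        int count = 0;
        for (int dr = -1; dr <= 1; dr++)
            for (int dc = -1; dc <= 1; dc++) {
                if (dr == 0 && dc == 0) continue;
                int nr = r + dr, nc = c + dc;
                if (nr >= 0 && nr < cells.length && nc >= 0 && nc < cells[0].length
                        && cells[nr][nc]) count++;
            }
        return count;
    }

    public static void main(String[] args) {
        // A "blinker": three live cells in a row oscillate between
        // horizontal and vertical orientations every generation.
        boolean[][] grid = new boolean[5][5];
        grid[2][1] = grid[2][2] = grid[2][3] = true;
        grid = step(grid); // now vertical: (1,2), (2,2), (3,2)
        System.out.println(grid[1][2] && grid[2][2] && grid[3][2]); // true
    }
}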



Design and source code from Jon Barron's website