Aditya Aggarwal

I am a master's student in the CSE department at UC San Diego (since Fall 2022). I am currently working at Google as a SWE Intern in Privacy, Safety and Security, focusing on reducing the OpEx cost of manual review systems. I am also a Graduate Researcher in the Cognitive Robotics Lab, working on the Home Robot project under the mentorship of Prof. Henrik Christensen.

Previously, I was a Research Intern at Microsoft Research, India, in the Technology and Empowerment group, where I worked closely with Dr. Nipun Kwatra and Dr. Mohit Jain at the intersection of HCI, computer vision, and healthcare. My primary focus was on developing a low-cost, smartphone-based diagnostic solution for detecting eye diseases. I have also contributed to multiple open-source projects for the organization Robocomp.

I graduated from IIIT Hyderabad in 2020 with a B.Tech (Honors), where I was advised by Prof. K Madhava Krishna and Prof. Ravi Kiran Sarvadevabhatla. I worked mainly on vision-guided robot navigation and human activity understanding.

I am currently looking for full-time software engineering / machine learning opportunities starting January 2024, so if you have a position available or just want to say hi, feel free to email me.

Email  /  GitHub  /  Google Scholar  /  LinkedIn  /  Resume

profile photo

Experience

Google Logo

Google


Software Engineering Intern (June 2023 - September 2023)
Sunnyvale, USA

During the internship, I developed a gRPC service and a MapReduce-based data processing pipeline to migrate manual review records from Google Shopping to a centralized database. These records were then used to generate customized analytics and real-time alerts. By decoupling producers and consumers with messaging queues, I improved the system's scalability, handling bursts of up to 1000 queries per second (QPS) and a daily influx of 5000 records. I also made the architecture extensible by supporting both push and pull integration frameworks, which reduced client onboarding time from 8 weeks to 2 weeks.

Tech Stack: Java, MapReduce, SQL
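
The production system ran on Google-internal infrastructure, so none of it can be shown directly. As a rough sketch of the queueing idea (hypothetical ReviewRecord type, with an in-memory BlockingQueue standing in for the real messaging system), a bounded queue lets the gRPC handler absorb short bursts while a single consumer drains records to the database at a sustainable rate:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical record type; the real pipeline used internal Google schemas.
record ReviewRecord(String id, String verdict) {}

public class ReviewIngest {
    // A bounded queue absorbs bursts (e.g. 1000 QPS) while the database
    // writer drains at its own sustainable rate.
    private final BlockingQueue<ReviewRecord> queue = new ArrayBlockingQueue<>(10_000);

    // Called by the gRPC handler for every incoming record.
    public boolean enqueue(ReviewRecord r) {
        return queue.offer(r); // non-blocking; reject (or retry) when full
    }

    // Single daemon thread writing to the centralized database.
    public void startConsumer() {
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    writeToDatabase(queue.take()); // blocks until work arrives
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }

    private void writeToDatabase(ReviewRecord r) {
        System.out.println("persisted " + r.id()); // placeholder for the SQL write
    }

    public static void main(String[] args) throws InterruptedException {
        ReviewIngest ingest = new ReviewIngest();
        ingest.startConsumer();
        ingest.enqueue(new ReviewRecord("r1", "approved"));
        Thread.sleep(100); // give the daemon consumer time to drain
    }
}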

Microsoft Logo

Microsoft Research


Research Intern (February 2021 - August 2022)
Bangalore, India

I led a project that automated the retinoscopy method by combining a smartphone with a retinoscope. I developed an Android app that recorded videos at 120 fps and integrated it with a cloud-deployed video analysis pipeline that estimated the refractive error of the eye. The result was a potential real-world medical tool that could simplify and enhance refractive error screening. In a clinical evaluation involving 128 patients, the system achieved a state-of-the-art mean absolute error of 0.75 ± 0.67D.

Tech Stack: Python, Android SDK, Javascript, React, Flask, SQL

Gojek Logo

Gojek


Product Engineer (June 2020 - January 2021)
Bangalore, India

I collaborated on a ride-hailing platform used by 4 million daily users, focusing on improving on-demand features and the booking process. I developed a scheduling service that allowed future ride bookings, raising the booking conversion rate from 90% to 95%. Additionally, I used the Firebase Remote Config APIs to enhance system extensibility and fault tolerance across the Android and iOS platforms without requiring app updates.

Tech Stack: Go, Ruby, Kotlin, SQL
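
The production integration used the Firebase Remote Config SDK inside the apps; the sketch below only illustrates the underlying pattern (hypothetical FeatureFlags class and flag name, not the Firebase API). Remotely fetched flags backed by hardcoded local defaults are what allow behavior to be toggled server-side, and to degrade gracefully when a fetch fails, without shipping a new app binary:

import java.util.Map;

// Generic sketch of the remote-config pattern (not the Firebase SDK):
// flags are fetched from a server, with local defaults as a fallback,
// so features can be toggled without an app update.
public class FeatureFlags {
    private static final Map<String, Boolean> DEFAULTS =
            Map.of("scheduled_rides_enabled", false); // hypothetical flag

    private volatile Map<String, Boolean> remote = Map.of();

    // In production this would be an SDK/HTTP fetch; on failure the previous
    // (or default) values stay in place, which is the fault-tolerance win.
    public void refresh(Map<String, Boolean> fetched) {
        if (fetched != null) remote = fetched;
    }

    public boolean isEnabled(String flag) {
        return remote.getOrDefault(flag, DEFAULTS.getOrDefault(flag, false));
    }

    public static void main(String[] args) {
        FeatureFlags flags = new FeatureFlags();
        System.out.println(flags.isEnabled("scheduled_rides_enabled")); // false (default)
        flags.refresh(Map.of("scheduled_rides_enabled", true));         // server-side toggle
        System.out.println(flags.isEnabled("scheduled_rides_enabled")); // true, no app update
    }
}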

Research

I'm interested in problems involving computer vision, machine learning, and robotics with real-world applications. My research vision is to enable embodied agents to perceive our dynamic world and make intelligent decisions.

project image

Towards Automating Retinoscopy for Refractive Error Diagnosis


Aditya Aggarwal, Siddhartha Gairola, Nipun Kwatra, Mohit Jain
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), Volume 6, Issue 3, 2022
arxiv / project page / code / virtual presentation / bibtex

Refractive error is the most common eye disorder and is the key cause behind correctable visual impairment. It can be diagnosed using multiple methods, including subjective refraction, retinoscopy, and autorefractors. Retinoscopy is an objective refraction method that does not require any input from the patient. In this work, we automate retinoscopy by attaching a smartphone to a retinoscope and recording retinoscopic videos with the patient wearing a custom pair of paper frames. The results from our video processing pipeline and mathematical model indicate that our approach has the potential to be used as a retinoscopy-based refractive error screening tool in real-world medical settings.

project image

Quo Vadis, Skeleton Action Recognition?


Pranay Gupta, Anirudh Thatipelli, Aditya Aggarwal, Shubh Maheshwari, Neel Trivedi, Sourav Das, Ravi Kiran Sarvadevabhatla
International Journal of Computer Vision (IJCV), Special Issue on Human Pose, Motion, Activities and Shape in 3D, 2021
arxiv / project page / code / video / bibtex

In this work, we study current and upcoming frontiers across the landscape of skeleton-based human action recognition. We benchmark state-of-the-art models on the NTU-120 dataset and provide a multi-layered assessment. To examine skeleton action recognition 'in the wild', we introduce the Skeletics-152 and Skeleton-Mimetics datasets. Our results reveal the challenges and domain gap induced by 'in the wild' action videos.

project image

Reconstruct, Rasterize and Backprop: Dense shape and pose estimation from a single image


Aniket Pokale*, Aditya Aggarwal*, KM Jatavallabhula, K Madhava Krishna
CVPR Workshop on Long Term Visual Localization, Visual Odometry, and Geometric and Learning-Based SLAM, 2020
arxiv / project page / code / virtual presentation / bibtex

In this work, we present a new system to obtain dense object reconstructions along with 6-DoF poses from a single image. We demonstrate that our approach—dubbed reconstruct, rasterize and backprop (RRB)—achieves significantly lower pose estimation errors compared to prior art, and is able to recover dense object shapes and poses from imagery. We further extend our results to an (offline) setup, where we demonstrate a dense monocular object-centric egomotion estimation system.

* Both authors contributed equally to this work.

project image

A principled formulation of integrating objects in Monocular SLAM


Aniket Pokale, Dipanjan Das, Aditya Aggarwal, Brojeshwar Bhowmick, K Madhava Krishna
Advances in Robotics (AIR), 2019
ACM Proceedings / video / bibtex

In this paper, we present a novel edge-based SLAM framework, along with category-level models, to localize objects in the scene as well as improve the camera trajectory. We integrate object category models into the core SLAM back-end to jointly optimize the camera trajectory and the object poses, shapes, and 3D structure. We show that our joint optimization recovers a better camera trajectory than Edge SLAM.

Open Source Contributions

project image

I have also contributed to the open-source organization Robocomp for the past three years. I have mentored multiple projects for Google Summer of Code (GSoC) 2020 and 2021, and my proposal was selected as a project for the GSoC 2019 program.

Mentoring and Reviewing
Sign Language Recognition - GSoC 2021
Hand Gesture Recognition - GSoC 2020
Human recognition (identification) using multi-modal perception system - GSoC 2020

Selected Proposal
Implemented a people identification component for an educational bot - GSoC 2019

Education

UCSD

UC San Diego

Master's in Computer Science and Engineering, CGPA: 3.9/4
San Diego, CA (2022 - Present)
Graduate Researcher at Cognitive Robotics Lab

IIIT Hyderabad

IIIT Hyderabad

B.Tech with Honors in Electronics and Communication Engineering, CGPA: 8.9/10
Hyderabad, India (2016 - 2020)
Undergraduate Researcher at CVIT Lab, Robotics Research Center

Projects

project image

Fault tolerant distributed system

Built a distributed file storage system using gRPC, with support for concurrent reads and writes. Implemented consistent hashing and the Raft consensus protocol to make the system scalable and resilient to the failure of up to half of the servers.

Check out the code here.
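
The repository contains the full system; as a minimal sketch of the consistent-hashing piece alone (hypothetical HashRing class with a simplified hash function), each key is stored on the first server clockwise from its position on a hash ring, so adding or removing a server remaps only the keys on that server's arc:

import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent-hashing ring: each server owns the arc of hash space
// preceding its positions, so removing one server remaps only its own keys.
public class HashRing {
    private final SortedMap<Integer, String> ring = new TreeMap<>();
    private static final int VNODES = 100; // virtual nodes smooth the load

    public void addServer(String server) {
        for (int i = 0; i < VNODES; i++)
            ring.put(hash(server + "#" + i), server);
    }

    public void removeServer(String server) {
        for (int i = 0; i < VNODES; i++)
            ring.remove(hash(server + "#" + i));
    }

    // First server at or after the key's position, wrapping around the ring.
    public String serverFor(String key) {
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
    }

    private int hash(String s) {
        // Placeholder hash; a real system would use e.g. MurmurHash or SHA-1.
        return Integer.MAX_VALUE & s.hashCode();
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing();
        ring.addServer("s1"); ring.addServer("s2"); ring.addServer("s3");
        System.out.println(ring.serverFor("/photos/cat.jpg")); // one of s1..s3
        ring.removeServer("s2"); // only keys owned by s2 move elsewhere
        System.out.println(ring.serverFor("/photos/cat.jpg"));
    }
}
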
project image

Mosquitoes vs Drones

Developed an end-to-end system that identified possible mosquito breeding grounds (waterlogged areas) in aerial images. I also implemented a path planning algorithm for a drone to reach the predicted locations and destroy the mosquito larvae.

Check out the entire workflow in this video.
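
The project's actual planner is shown in the video; as a simplified stand-in (hypothetical GridPlanner class on a 4-connected occupancy grid), a breadth-first search finds a shortest obstacle-free route from the drone to a predicted breeding site:

import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Illustrative shortest-path search on a 4-connected occupancy grid
// (1 = obstacle); returns the number of steps, or -1 if unreachable.
public class GridPlanner {
    public static int pathLength(int[][] grid, int sr, int sc, int tr, int tc) {
        int rows = grid.length, cols = grid[0].length;
        int[][] dist = new int[rows][cols];
        for (int[] row : dist) Arrays.fill(row, -1);
        Queue<int[]> frontier = new ArrayDeque<>();
        dist[sr][sc] = 0;
        frontier.add(new int[]{sr, sc});
        int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (!frontier.isEmpty()) {
            int[] cell = frontier.poll();
            if (cell[0] == tr && cell[1] == tc) return dist[tr][tc];
            for (int[] m : moves) {
                int r = cell[0] + m[0], c = cell[1] + m[1];
                if (r >= 0 && r < rows && c >= 0 && c < cols
                        && grid[r][c] == 0 && dist[r][c] == -1) {
                    dist[r][c] = dist[cell[0]][cell[1]] + 1; // expand frontier
                    frontier.add(new int[]{r, c});
                }
            }
        }
        return -1; // target unreachable
    }

    public static void main(String[] args) {
        int[][] grid = {
            {0, 0, 0},
            {1, 1, 0},
            {0, 0, 0},
        };
        // Fly from the top-left corner around the obstacles to the
        // predicted site at the bottom-left.
        System.out.println(pathLength(grid, 0, 0, 2, 0)); // 6
    }
}
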
project image

CANSAT 2018

Our team FalconX ranked 24th at the annual CANSAT competition organized by The American Astronautical Society (AAS) in Stephenville, Texas. The competition simulated a satellite launch in the form of a soft drink can with functioning power, sensors, and a communication system, alongside a large hen's egg representing a delicate instrument. Launched from an altitude of 700 m above the launch site, our probe made a series of manoeuvres to descend at a controlled rate and finally landed without breaking the egg.

Check out this blog for more details.

project image

ABU Robocon 2018

Our team built an autonomous bot that used a pneumatic system to throw a shuttlecock tied to a thread through a ring. It also coordinated with a manually operated bot, which picked up the shuttlecock and passed it to the automatic bot.
Check out the problem statement for the contest here.

Check this video for a demo.
project image

Game of Life

Implemented the popular zero-player simulation game demonstrating cellular automata. It takes an input configuration, and the user can observe how it evolves over time. I followed the MVC architecture and the TDD paradigm while working on this project.

Check out this repository for code implementation in Java and Ruby.
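
The linked repository has the full Java and Ruby implementations; the core update rule is small enough to sketch here as a stripped-down version without the MVC and view layers:

// One generation of Conway's Game of Life on a fixed (non-wrapping) grid.
public class Life {
    static boolean[][] step(boolean[][] cells) {
        int rows = cells.length, cols = cells[0].length;
        boolean[][] next = new boolean[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int n = liveNeighbors(cells, r, c);
                // Live cell survives with 2-3 neighbors; dead cell is born with 3.
                next[r][c] = cells[r][c] ? (n == 2 || n == 3) : (n == 3);
            }
        }
        return next;
    }

    static int liveNeighbors(boolean[][] cells, int r, int c) {
        int count = 0;
        for (int dr = -1; dr <= 1; dr++)
            for (int dc = -1; dc <= 1; dc++) {
                if (dr == 0 && dc == 0) continue;
                int nr = r + dr, nc = c + dc;
                if (nr >= 0 && nr < cells.length && nc >= 0 && nc < cells[0].length
                        && cells[nr][nc]) count++;
            }
        return count;
    }

    public static void main(String[] args) {
        // A "blinker": three live cells in a row oscillate between
        // horizontal and vertical orientations every generation.
        boolean[][] grid = new boolean[5][5];
        grid[2][1] = grid[2][2] = grid[2][3] = true;
        grid = step(grid); // now vertical: (1,2), (2,2), (3,2)
        System.out.println(grid[1][2] && grid[2][2] && grid[3][2]); // true
    }
}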



Design and source code from Jon Barron's website