I am a Senior Machine Learning Engineer at Apple working on multimodal LLMs and 3D scene understanding. Prior to
joining Apple, I was a Research Engineer at the
Honda Research Institute (HRI) USA, where I mainly worked on 3D scene understanding and
multi-agent interaction modeling for autonomous
driving. I also worked on indoor mobile robots and
manipulation.
I'm interested in robotics, computer vision, and machine
learning. Much of my research is about understanding the
surrounding environment of a robot or self-driving car from
multiple sensors (LiDAR, camera, GPS/IMU).
Propose SlowFast-LLaVA (or SF-LLaVA for short), a training-free video large language model (LLM) that can jointly capture the detailed spatial semantics and long-range temporal context without exceeding the token budget of commonly used LLMs.
Important Object Identification with Semi-Supervised Learning for Autonomous Driving
Jiachen Li*, Haiming Gang*, Hengbo Ma, Masayoshi Tomizuka, Chiho Choi
International Conference on Robotics and Automation (ICRA), 2022
arxiv /
Propose a novel approach for important object identification in egocentric driving scenarios with relational reasoning on the objects in the scene.
Semi-supervised 3D Object Detection via Temporal Graph Neural Networks
Jianren Wang*, Haiming Gang*, Siddharth Ancha, Yi-Ting Chen, David Held
International Conference on 3D Vision (3DV), 2021
arxiv /
Propose leveraging large amounts of unlabeled point cloud videos through semi-supervised learning of 3D object detectors via temporal graph neural networks.
LOKI: Long Term and Key Intentions for Trajectory Prediction
Propose LOKI (LOng term and Key Intentions), a novel large-scale dataset designed to tackle joint trajectory and intention prediction for heterogeneous traffic agents (pedestrians and vehicles) in an autonomous driving setting.
The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes
Abhishek Patil, Srikanth Malla, Haiming Gang, Yi-Ting Chen
International Conference on Robotics and Automation (ICRA), 2019
arxiv /
dataset /
Present the Honda Research Institute 3D Dataset (H3D), a large-scale full-surround 3D multi-object detection and tracking dataset collected using a 3D LiDAR scanner.
The Curious Minded Machine project seeks to develop intelligent systems capable of learning continuously with a human-like sense of curiosity.
Autonomous Domestic Assistant Robot
NYU : MS project
2017-05-08
The project developed a mobile robotic system that combines a manipulator, image processing, and motion planning with mobile devices to assist humans in an indoor environment. The project utilizes a Microsoft Kinect, three microcontrollers, a mobile phone, a mobile robot base, and a robot arm.
Multi-Manipulator Collaboration based on Object Detection
NYU : Robotic Gait and Manipulation
2017-05-05
Control the collaboration of multiple simple DOF manipulators for pick-and-place tasks based on object recognition, using Linemod provided by the ORK (Object Recognition Kitchen) library with ROS.
Haar Feature Object Recognition and Manipulation
NYU : MS project
2016-12-20
This project developed an image-processing-based object recognition and manipulation system with a 5-DOF smart robotic arm, operated through a smartphone interface that senses the human user’s intent.
Braille Display
NYU : Advanced Mechatronics
2016-12-20
This project developed a device that converts alphabetic characters to a braille display to help people who are visually impaired read text.
TOT BOT
NYU : Robots for Disability
2016-12-19
We developed the “Tot Bot” robot, which enables a child to see its surroundings on a tablet screen and then reach a selected point by touching the screen.
Smart Mirror: Automatic Defog and Display
NYU : Mechatronics
2016-05-01
This project developed a Smart Mirror that automatically defogs and wipes moisture from its surface and displays the date, time, and a news headline.