Wanshui Gan
I am a third-year PhD student in the Sugiyama-Yokoya-Ishida Lab at the University of Tokyo, Department of Complexity Science and Engineering, advised by Prof. Naoto YOKOYA. I am also a Junior Research Associate with the Geoinformatics Team at the RIKEN Center for Advanced Intelligence Project (AIP).
Prior to that, I received my B.S. degree from the Guangdong University of Technology, China, in 2018 and my M.S. degree from the University of Macau, China, in 2021. My previous research focused on 3D vision, large scene parsing, and reconstruction. I am also interested in exploring 3D foundation models and 3D generative models. You are welcome to contact me by email if you are interested in my work or in a potential collaboration.
Email / Google Scholar / GitHub / Twitter
Experiences
2022-04 --> Present: RIKEN AIP as Junior Research Associate, Topic: NeRF, 3D occupancy estimation.
2024-02 --> 2024-04: CyberAgent AI Lab as Research Intern, Topic: 4D Gaussian splatting.
2021-04 --> 2021-07: Tencent AI Lab as Research Intern, Topic: Facial landmark detection.
2020-06 --> 2022-02: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, as Visiting Student, Topic: 6D pose estimation, stereo matching, NeRF.
Selected Publications
* indicates equal contribution
GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting
Wanshui Gan*, Fang Liu*, Hongbin Xu, Ningkai Mo, Naoto Yokoya
In submission
[Project] [Code] [arXiv]
We introduce GaussianOcc, a systematic method that investigates two uses of Gaussian splatting for fully self-supervised and efficient 3D occupancy estimation in surround views. GaussianOcc enables fully self-supervised (no ground-truth pose) 3D occupancy estimation with competitive performance at low computational cost (2.7 times faster in training and 5 times faster in rendering).
A Comprehensive Framework for 3D Occupancy Estimation in Autonomous Driving
Wanshui Gan, Ningkai Mo, Hongbin Xu, Naoto Yokoya
IEEE Transactions on Intelligent Vehicles, 2024
[Paper] [Code] [arXiv]
We introduce a comprehensive framework for surround-view 3D occupancy estimation, 3D reconstruction, and depth estimation via volume rendering, covering network design, loss design, and an evaluation metric based on discrete point-level sampling.
V4D: Voxel for 4D Novel View Synthesis
Wanshui Gan, Hongbin Xu, Yi Huang, Shifeng Chen, Naoto Yokoya
IEEE Transactions on Visualization and Computer Graphics, 2023
[Paper] [Code] [arXiv]
We propose V4D, a simple yet effective and efficient framework for 4D novel view synthesis with 3D voxels, which directly models the 4D neural radiance field without the need for a canonical space.
ES6D: A Computation Efficient and Symmetry-Aware 6D Pose Regression Framework
Ningkai Mo*, Wanshui Gan*, Naoto Yokoya, Shifeng Chen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[Paper] [Code] [arXiv]
We introduce ES6D, a novel 6D pose estimation framework based on XYZNet and the A(M)GPD loss. XYZNet is designed for feature extraction from RGB-D data; it has a fully convolutional architecture and achieves an excellent trade-off between efficiency and effectiveness. The A(M)GPD loss is proposed to handle symmetric objects and performs better than the ADD(S) loss.
Light-weight network for real-time adaptive stereo depth estimation
Wanshui Gan, Pak Kin Wong, Guokuan Yu, Rongchen Zhao, Chi Man Vong
Neurocomputing, 2021
[Paper] [Code]
We propose a novel light-weight adaptive network (LWANet) for real-time stereo depth estimation. It achieves performance competitive with MADNet and StereoNet while requiring low computational cost and little GPU memory.
Academic Services
Conference Reviewer: CVPR
Journal Reviewer: IEEE TVCG, IEEE TIV, IEEE TCSVT
© Wanshui | Last updated: June 21, 2024