Wanshui Gan

I earned my Ph.D. from the Sugiyama-Yokoya-Ishida Lab in the Department of Complexity Science and Engineering at the University of Tokyo, advised by Prof. Naoto YOKOYA. I was also a Junior Research Associate in the Geoinformatics Team at the RIKEN Center for Advanced Intelligence Project (AIP). During my doctoral studies, I introduced voxel representations for 4D novel view synthesis, and volume rendering and Gaussian splatting for 3D occupancy estimation.

Prior to that, I received my B.S. degree from the Guangdong University of Technology and my M.S. degree from the University of Macau. My previous research interests lie in 3D vision, large scene parsing, and reconstruction. I am also interested in exploring 3D foundation models and 3D generative models. You are welcome to contact me by email if you are interested in my work or in a potential collaboration.

Email / Google Scholar / GitHub / Twitter

Experiences

2022-04 --> 2025-06: RIKEN AIP as Junior Research Associate, Topic: NeRF, 3D occupancy estimation.

2024-02 --> 2024-04: CyberAgent AI Lab as Research Intern, Topic: 4D Gaussian splatting.

2021-04 --> 2021-07: Tencent AI Lab as Research Intern, Topic: Facial landmark detection.

2020-06 --> 2022-02: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, as Visiting Student, Topic: 6D pose estimation, stereo matching, NeRF.

Selected Publications

* indicates equal contribution

GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting
Wanshui Gan*, Fang Liu*, Hongbin Xu, Ningkai Mo, Naoto Yokoya
IEEE/CVF International Conference on Computer Vision (ICCV), 2025
[Project] [Code] [arXiv]

We introduce GaussianOcc, a systematic method that investigates two uses of Gaussian splatting for fully self-supervised and efficient 3D occupancy estimation in surround views. GaussianOcc enables fully self-supervised 3D occupancy estimation (no ground-truth pose) with competitive performance and low computational cost (2.7 times faster in training and 5 times faster in rendering).

A Comprehensive Framework for 3D Occupancy Estimation in Autonomous Driving
Wanshui Gan, Ningkai Mo, Hongbin Xu, Naoto Yokoya
IEEE Transactions on Intelligent Vehicles, 2024
[Paper] [Code] [arXiv]

We introduce a comprehensive framework for surrounding-view 3D occupancy estimation, 3D reconstruction, and depth estimation via volume rendering, featuring network design, loss design, and an evaluation metric based on discrete point-level sampling.

V4D: Voxel for 4D Novel View Synthesis
Wanshui Gan, Hongbin Xu, Yi Huang, Shifeng Chen, Naoto Yokoya
IEEE Transactions on Visualization and Computer Graphics, 2023
[Paper] [Code] [arXiv]

We propose V4D, a simple yet effective and efficient framework for 4D novel view synthesis with 3D voxels, which directly models the 4D neural radiance field without the need for a canonical space.

ES6D: A Computation Efficient and Symmetry-Aware 6D Pose Regression Framework
Ningkai Mo, Wanshui Gan, Naoto Yokoya, Shifeng Chen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[Paper] [Code] [arXiv]

We introduce a novel 6D pose estimation framework, ES6D, based on the XYZNet and the A(M)GPD loss. The XYZNet is designed for feature extraction from RGB-D data. It has a fully convolutional architecture and achieves an excellent trade-off between efficiency and effectiveness. Additionally, the A(M)GPD loss is proposed to handle symmetric objects, and it performs better than the ADD(S) loss.

Light-weight network for real-time adaptive stereo depth estimation
Wanshui Gan, Pak Kin Wong, Guokuan Yu, Rongchen Zhao, Chi Man Vong
Neurocomputing, 2021
[Paper] [Code]

We propose a novel light-weight adaptive network (LWANet) for real-time stereo depth estimation. It achieves competitive performance compared with MADNet and StereoNet, with the advantages of low computational cost and low GPU memory usage.

Honors and Awards

First Prize in Formula Student China (FSAE 2017)

TIER IV Student scholarship (2022, 2023)

Academic Services

Conference Reviewer: CVPR

Journal Reviewer: IEEE TVCG, IEEE TIV, IEEE TCSVT, Neurocomputing
