We present HUP-3D, a 3D multi-view, multimodal synthetic dataset for hand-ultrasound (US)-probe pose estimation in the context of obstetric ultrasound. Egocentric markerless 3D joint pose estimation has potential applications in mixed-reality medical education, and the ability to understand hand and probe movements opens the door to tailored guidance and mentoring applications. Our dataset consists of over 31k sets of RGB, depth, and segmentation mask frames, including pose-related reference data, with an emphasis on image diversity and complexity. Adopting a camera viewpoint-based sphere concept allows us to capture a variety of views and generate multiple hand grasp poses using a pre-trained network. Additionally, our approach includes a software-based image rendering concept that enhances diversity through varied hand and arm textures, lighting conditions, and background images. We validated our proposed dataset with state-of-the-art learning models, obtaining the lowest hand-object keypoint errors. The supplementary material details the parameters for the sphere-based camera view angles and the configuration of the grasp generation and rendering pipeline. The source code for our grasp generation and rendering pipeline, along with the dataset, is publicly available at https://manuelbirlo.github.io/HUP-3D/.
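To make the sphere-based camera viewpoint concept concrete, the sketch below samples camera positions quasi-uniformly on a sphere around the grasp and orients each camera toward the sphere centre. This is a minimal illustration, not the paper's implementation: the Fibonacci-lattice sampling, the radius, the number of views, and the look-at construction are illustrative assumptions (Python with NumPy assumed).

import numpy as np

def fibonacci_sphere(n_views: int, radius: float = 0.5) -> np.ndarray:
    """Sample n_views camera positions roughly uniformly on a sphere (illustrative scheme)."""
    i = np.arange(n_views)
    golden = np.pi * (3.0 - np.sqrt(5.0))        # golden angle in radians
    z = 1.0 - 2.0 * (i + 0.5) / n_views          # evenly spaced heights in (-1, 1)
    r = np.sqrt(1.0 - z * z)                     # ring radius at each height
    theta = golden * i
    return radius * np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=1)

def look_at(cam_pos: np.ndarray, target: np.ndarray, up=(0.0, 0.0, 1.0)) -> np.ndarray:
    """Build a 4x4 camera-to-world matrix pointing the camera at the target."""
    forward = target - cam_pos
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, np.asarray(up, dtype=float))
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    pose = np.eye(4)
    pose[:3, 0], pose[:3, 1], pose[:3, 2], pose[:3, 3] = right, true_up, -forward, cam_pos
    return pose

# Hypothetical usage: 32 viewpoints looking at a grasp centred at the origin.
centre = np.zeros(3)
camera_poses = [look_at(p, centre) for p in fibonacci_sphere(32, radius=0.5)]

In the actual pipeline, each such camera pose would drive the renderer to produce the RGB, depth, and segmentation frames described above; the exact view-angle parameters are given in the supplementary material.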
@InProceedings{Bir_HUP3D_MICCAI2024,
author = { Birlo, Manuel and Caramalau, Razvan and Edwards, Philip J. "Eddie" and Dromey, Brian and Clarkson, Matthew J. and Stoyanov, Danail},
title = { { HUP-3D: A 3D multi-view synthetic dataset for assisted-egocentric hand-ultrasound-probe pose estimation } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
year = {2024},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15001},
month = {October},
pages = {pending}
}