Reconstruction procedure. To reconstruct hand-object interactions in the wild, we leverage all available related data (2D keypoints, 2D instance masks, 3D object models, 3D in-the-lab MoCap) through an optimization-based procedure that consists of four steps: (a) hand pose estimation by 2D keypoint fitting, (b) object pose estimation via differentiable rendering, (c) joint optimization of the spatial arrangement, and (d) pose refinement using 3D contact priors. |
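To make the idea behind step (a) concrete, here is a minimal toy sketch of fitting by 2D reprojection error. It is not the procedure used in the paper. It assumes a weak-perspective camera, keeps the 3D joints fixed, and optimizes only a scale and a 2D translation with plain gradient descent. All names (`project`, `reprojection_loss`, `fit_camera`) and all numbers are hypothetical and chosen only for illustration.

```python
# Toy illustration of fitting to 2D keypoints (step (a) in spirit only).
# Assumptions, not from the paper: weak-perspective camera, fixed 3D joints,
# plain gradient descent on the mean squared reprojection error.

def project(joints3d, scale, tx, ty):
    """Weak-perspective projection: drop depth, then scale and translate."""
    return [(scale * x + tx, scale * y + ty) for x, y, _ in joints3d]


def reprojection_loss(joints3d, keypoints2d, scale, tx, ty):
    """Mean squared distance between projected joints and 2D keypoints."""
    projected = project(joints3d, scale, tx, ty)
    total = sum((u - ku) ** 2 + (v - kv) ** 2
                for (u, v), (ku, kv) in zip(projected, keypoints2d))
    return total / len(joints3d)


def fit_camera(joints3d, keypoints2d, steps=1000, lr=0.1):
    """Gradient descent on (scale, tx, ty) using analytic gradients."""
    scale, tx, ty = 1.0, 0.0, 0.0
    n = len(joints3d)
    for _ in range(steps):
        g_scale = g_tx = g_ty = 0.0
        for (x, y, _), (ku, kv) in zip(joints3d, keypoints2d):
            ru = scale * x + tx - ku  # horizontal residual
            rv = scale * y + ty - kv  # vertical residual
            g_scale += 2.0 * (ru * x + rv * y) / n
            g_tx += 2.0 * ru / n
            g_ty += 2.0 * rv / n
        scale -= lr * g_scale
        tx -= lr * g_tx
        ty -= lr * g_ty
    return scale, tx, ty


# Hypothetical 3D joints and synthetic keypoints from a known camera.
joints = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
          (1.0, 1.0, 0.5), (0.5, 0.5, 0.2)]
keypoints = project(joints, 2.0, 0.5, -0.3)

fitted_scale, fitted_tx, fitted_ty = fit_camera(joints, keypoints)
final_loss = reprojection_loss(joints, keypoints,
                               fitted_scale, fitted_tx, fitted_ty)
print(fitted_scale, fitted_tx, fitted_ty, final_loss)
```

On this synthetic input the fit should recover the camera that generated the keypoints. A real hand fit would also optimize the articulated pose and shape parameters of a hand model, which this sketch deliberately leaves out.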
|
||
Intermediate results. Top row: input images. Second row: results from optimizing the hand and the object individually. Third row: results from joint optimization (two viewpoints per example). Bottom row: results after refinement. |
|
||
Qualitative results on images from the EPIC-Kitchens dataset (rows 1-2) and the 100 Days of Hands dataset (rows 3-4). Our method produces reconstructions of reasonably high quality across a range of viewpoints, activities, and objects. |
|
||
Additional qualitative results. Our procedure produces promising results across a range of scenarios and objects. |
|
||
Failure cases. We show representative failure cases of our reconstruction procedure. The failure modes we observe stem from failures of individual steps in the procedure: hand pose estimation (first column), object pose estimation (columns 2-4), and joint optimization (last column). |
|
||
Collected Models. We collected 120 object models with both within-category and across-category variation. |
|
||
MOW Dataset. We collected a 3D dataset of humans Manipulating Objects in-the-Wild (MOW). Follow the instructions in this GitHub repo to download and use the data. |
Cao*, Radosavovic*, Kanazawa, Malik. Reconstructing Hand-Object Interactions in the Wild. ICCV, 2021. [Paper] [Bibtex] |
Acknowledgements |