Yuyang Tu, Junnan Jiang, Shuang Li, Norman Hendrich, Miao Li and Jianwei Zhang
The overall setup for ObjectInHand dataset collection. An Intel RealSense SR300 tracks the human hand pose to control the Shadow hand for in-hand dexterous manipulation, while an Intel RealSense D435 collects RGB-D data for object pose estimation. The transmitter generates a magnetic field for the 6D pose sensor, which is used to label the ground truth of the object pose.
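In practice, this labeling reduces to a chain of rigid transforms: the magnetic tracker reports the sensor pose in the transmitter frame, and a one-time extrinsic calibration relates the transmitter to the D435 camera. The sketch below is a minimal illustration of that chain; the frame names, calibration values, and use of SciPy are our assumptions, not the released code.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def to_homogeneous(translation, quat_xyzw):
    """Build a 4x4 homogeneous transform from a translation and an xyzw quaternion."""
    T = np.eye(4)
    T[:3, :3] = R.from_quat(quat_xyzw).as_matrix()
    T[:3, 3] = translation
    return T

# Hypothetical one-time calibration: transmitter frame expressed in the camera frame.
T_cam_transmitter = to_homogeneous([0.30, 0.00, 0.55], [0, 0, 0, 1])

# Pose reported by the magnetic 6D sensor on the object, in the transmitter frame.
T_transmitter_obj = to_homogeneous([0.10, -0.02, 0.20], [0, 0, 0.3826834, 0.9238795])

# Ground-truth object pose in the camera frame, used to label each RGB-D frame.
T_cam_obj = T_cam_transmitter @ T_transmitter_obj
print(T_cam_obj)
```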
(a) The seven objects in the ObjectInHand dataset, including symmetric objects and hard-to-manipulate objects. (b) The different initial robot hand poses in the ObjectInHand dataset. The first column is the baseline scenario, with the object in front of the hand, followed by the derived scenarios: object on the hand, object under the hand, object behind the hand, object close to the camera, object far from the camera, and bad illumination.
We upload the full dataset to Google Drive, and example code for data extraction is available on GitHub.
A fine-tuning dataset with Shadow Hand masks and contacts is also available on Google Drive; the masks are generated in PyBullet.
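For reference, a hand mask of this kind can be rendered from PyBullet's segmentation buffer. The sketch below is our own illustration rather than the released script; the URDF path, camera placement, and image size are placeholders.

```python
import numpy as np
import pybullet as p

p.connect(p.DIRECT)  # headless physics and rendering
hand_id = p.loadURDF("shadow_hand.urdf")  # placeholder path to a Shadow Hand URDF

view = p.computeViewMatrix(cameraEyePosition=[0.5, 0, 0.3],
                           cameraTargetPosition=[0, 0, 0.1],
                           cameraUpVector=[0, 0, 1])
proj = p.computeProjectionMatrixFOV(fov=60, aspect=640 / 480,
                                    nearVal=0.01, farVal=2.0)

# getCameraImage returns (width, height, rgb, depth, segmentation).
w, h, rgb, depth, seg = p.getCameraImage(640, 480, view, proj)
seg = np.reshape(seg, (h, w))

# Binary Shadow Hand mask: pixels whose segmentation id matches the hand body.
mask = (seg == hand_id).astype(np.uint8) * 255
```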
The overall pipeline of PoseFusion. First, the tactile data (electrode readings and fingertip poses) and the vision data (RGB image and point cloud) are processed separately for feature extraction and pose estimation. Then the merged feature, combining the tactile and vision features, is also used for pose estimation. Finally, the three output poses and the merged feature are fed into SelectLSTM, which selects the best of the three poses in order to avoid errors caused by modal data collapse.
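The selection step can be pictured as a small classifier over the three candidate poses. The PyTorch sketch below is our reading of this idea; all layer sizes, tensor shapes, and the pose parameterization are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SelectLSTM(nn.Module):
    """Picks the most reliable of the three estimated poses (tactile-only,
    vision-only, merged) from a sequence of merged features, so that a
    collapsed modality does not corrupt the final output."""

    def __init__(self, feat_dim=256, hidden_dim=128, num_candidates=3):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_candidates)

    def forward(self, merged_feats, candidate_poses):
        # merged_feats: (batch, seq_len, feat_dim) merged tactile+vision features
        # candidate_poses: (batch, num_candidates, 7) poses as xyz + quaternion
        _, (h_n, _) = self.lstm(merged_feats)
        logits = self.classifier(h_n[-1])   # (batch, num_candidates)
        choice = logits.argmax(dim=-1)      # index of the selected pose
        best = candidate_poses[torch.arange(len(choice)), choice]
        return best, logits

# Usage: batch of 2 sequences of 10 merged features, 3 candidate poses each.
feats = torch.randn(2, 10, 256)
poses = torch.randn(2, 3, 7)
pose, logits = SelectLSTM()(feats, poses)
```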