# Wholebody

## COCO-WholeBody Dataset

### Associative Embedding + HRNet on COCO-WholeBody

Associative Embedding (NIPS'2017)
@inproceedings{newell2017associative,
title={Associative embedding: End-to-end learning for joint detection and grouping},
author={Newell, Alejandro and Huang, Zhiao and Deng, Jia},
booktitle={Advances in neural information processing systems},
pages={2277--2287},
year={2017}
}

HRNet (CVPR'2019)
@inproceedings{sun2019deep,
title={Deep high-resolution representation learning for human pose estimation},
author={Sun, Ke and Xiao, Bin and Liu, Dong and Wang, Jingdong},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={5693--5703},
year={2019}
}

COCO-WholeBody (ECCV'2020)
@inproceedings{jin2020whole,
title={Whole-Body Human Pose Estimation in the Wild},
author={Jin, Sheng and Xu, Lumin and Xu, Jin and Wang, Can and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
year={2020}
}


Results on COCO-WholeBody v1.0 val without multi-scale test

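| Arch | Input Size | Body AP | Body AR | Foot AP | Foot AR | Face AP | Face AR | Hand AP | Hand AR | Whole AP | Whole AR | ckpt | log |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HRNet-w32+ | 512x512 | 0.551 | 0.650 | 0.271 | 0.451 | 0.564 | 0.618 | 0.159 | 0.238 | 0.342 | 0.453 | ckpt | log |
| HRNet-w48+ | 512x512 | 0.592 | 0.686 | 0.443 | 0.595 | 0.619 | 0.674 | 0.347 | 0.438 | 0.422 | 0.532 | ckpt | log |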

Note: "+" means the model is first pre-trained on the original COCO dataset and then fine-tuned on the COCO-WholeBody dataset. We find that this leads to better performance.

### Associative Embedding + HigherHRNet on COCO-WholeBody

Associative Embedding (NIPS'2017)
@inproceedings{newell2017associative,
title={Associative embedding: End-to-end learning for joint detection and grouping},
author={Newell, Alejandro and Huang, Zhiao and Deng, Jia},
booktitle={Advances in neural information processing systems},
pages={2277--2287},
year={2017}
}

HigherHRNet (CVPR'2020)
@inproceedings{cheng2020higherhrnet,
title={HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation},
author={Cheng, Bowen and Xiao, Bin and Wang, Jingdong and Shi, Honghui and Huang, Thomas S and Zhang, Lei},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={5386--5395},
year={2020}
}

COCO-WholeBody (ECCV'2020)
@inproceedings{jin2020whole,
title={Whole-Body Human Pose Estimation in the Wild},
author={Jin, Sheng and Xu, Lumin and Xu, Jin and Wang, Can and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
year={2020}
}


Results on COCO-WholeBody v1.0 val without multi-scale test

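| Arch | Input Size | Body AP | Body AR | Foot AP | Foot AR | Face AP | Face AR | Hand AP | Hand AR | Whole AP | Whole AR | ckpt | log |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HigherHRNet-w32+ | 512x512 | 0.590 | 0.672 | 0.185 | 0.335 | 0.676 | 0.721 | 0.212 | 0.298 | 0.401 | 0.493 | ckpt | log |
| HigherHRNet-w48+ | 512x512 | 0.630 | 0.706 | 0.440 | 0.573 | 0.730 | 0.777 | 0.389 | 0.477 | 0.487 | 0.574 | ckpt | log |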

Note: "+" means the model is first pre-trained on the original COCO dataset and then fine-tuned on the COCO-WholeBody dataset. We find that this leads to better performance.

### Topdown Heatmap + HRNet + DARK on COCO-WholeBody

HRNet (CVPR'2019)
@inproceedings{sun2019deep,
title={Deep high-resolution representation learning for human pose estimation},
author={Sun, Ke and Xiao, Bin and Liu, Dong and Wang, Jingdong},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={5693--5703},
year={2019}
}

DarkPose (CVPR'2020)
@inproceedings{zhang2020distribution,
title={Distribution-aware coordinate representation for human pose estimation},
author={Zhang, Feng and Zhu, Xiatian and Dai, Hanbin and Ye, Mao and Zhu, Ce},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={7093--7102},
year={2020}
}

COCO-WholeBody (ECCV'2020)
@inproceedings{jin2020whole,
title={Whole-Body Human Pose Estimation in the Wild},
author={Jin, Sheng and Xu, Lumin and Xu, Jin and Wang, Can and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
year={2020}
}


Results on COCO-WholeBody v1.0 val, using a detector with a human AP of 56.4 on the COCO val2017 dataset

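| Arch | Input Size | Body AP | Body AR | Foot AP | Foot AR | Face AP | Face AR | Hand AP | Hand AR | Whole AP | Whole AR | ckpt | log |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pose_hrnet_w32_dark | 256x192 | 0.694 | 0.764 | 0.565 | 0.674 | 0.736 | 0.808 | 0.503 | 0.602 | 0.582 | 0.671 | ckpt | log |
| pose_hrnet_w48_dark+ | 384x288 | 0.742 | 0.807 | 0.705 | 0.804 | 0.840 | 0.892 | 0.602 | 0.694 | 0.661 | 0.743 | ckpt | log |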

Note: "+" means the model is first pre-trained on the original COCO dataset and then fine-tuned on the COCO-WholeBody dataset. We find that this leads to better performance.

### Topdown Heatmap + HRNet on COCO-WholeBody

HRNet (CVPR'2019)
@inproceedings{sun2019deep,
title={Deep high-resolution representation learning for human pose estimation},
author={Sun, Ke and Xiao, Bin and Liu, Dong and Wang, Jingdong},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={5693--5703},
year={2019}
}

COCO-WholeBody (ECCV'2020)
@inproceedings{jin2020whole,
title={Whole-Body Human Pose Estimation in the Wild},
author={Jin, Sheng and Xu, Lumin and Xu, Jin and Wang, Can and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
year={2020}
}


Results on COCO-WholeBody v1.0 val, using a detector with a human AP of 56.4 on the COCO val2017 dataset

| Arch | Input Size | Body AP | Body AR | Foot AP | Foot AR | Face AP | Face AR | Hand AP | Hand AR | Whole AP | Whole AR | ckpt | log |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pose_hrnet_w32 | 256x192 | 0.700 | 0.746 | 0.567 | 0.645 | 0.637 | 0.688 | 0.473 | 0.546 | 0.553 | 0.626 | ckpt | log |
| pose_hrnet_w32 | 384x288 | 0.701 | 0.773 | 0.586 | 0.692 | 0.727 | 0.783 | 0.516 | 0.604 | 0.586 | 0.674 | ckpt | log |
| pose_hrnet_w48 | 256x192 | 0.700 | 0.776 | 0.672 | 0.785 | 0.656 | 0.743 | 0.534 | 0.639 | 0.579 | 0.681 | ckpt | log |
| pose_hrnet_w48 | 384x288 | 0.722 | 0.790 | 0.694 | 0.799 | 0.777 | 0.834 | 0.587 | 0.679 | 0.631 | 0.716 | ckpt | log |
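
The effect of input size on these results can be read off with a little arithmetic. The following is a minimal sketch, not part of any released tooling: it copies the Whole AP values from the table above into a dictionary and computes the change when moving from 256x192 to 384x288 for each backbone. The names `whole_ap` and `resolution_gain` are hypothetical and exist only for this illustration.

```python
# Whole AP values copied from the table above
# (top-down HRNet, COCO-WholeBody v1.0 val).
whole_ap = {
    ("pose_hrnet_w32", "256x192"): 0.553,
    ("pose_hrnet_w32", "384x288"): 0.586,
    ("pose_hrnet_w48", "256x192"): 0.579,
    ("pose_hrnet_w48", "384x288"): 0.631,
}


def resolution_gain(arch, low="256x192", high="384x288"):
    """Return the Whole AP difference between two input sizes (3 decimals)."""
    return round(whole_ap[(arch, high)] - whole_ap[(arch, low)], 3)


# Hypothetical helper usage: one gain per backbone.
gains = {arch: resolution_gain(arch) for arch in ("pose_hrnet_w32", "pose_hrnet_w48")}

for arch, gain in gains.items():
    print(f"{arch}: Whole AP {gain:+.3f} from 256x192 to 384x288")
```

Under these numbers the larger input helps both backbones, and the wider w48 model gains more than w32. This compares single reported runs only, so small differences should be read with caution.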

### Topdown Heatmap + ResNet on COCO-WholeBody

SimpleBaseline2D (ECCV'2018)
@inproceedings{xiao2018simple,
title={Simple baselines for human pose estimation and tracking},
author={Xiao, Bin and Wu, Haiping and Wei, Yichen},
booktitle={Proceedings of the European conference on computer vision (ECCV)},
pages={466--481},
year={2018}
}

COCO-WholeBody (ECCV'2020)
@inproceedings{jin2020whole,
title={Whole-Body Human Pose Estimation in the Wild},
author={Jin, Sheng and Xu, Lumin and Xu, Jin and Wang, Can and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
year={2020}
}


Results on COCO-WholeBody v1.0 val, using a detector with a human AP of 56.4 on the COCO val2017 dataset

| Arch | Input Size | Body AP | Body AR | Foot AP | Foot AR | Face AP | Face AR | Hand AP | Hand AR | Whole AP | Whole AR | ckpt | log |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pose_resnet_50 | 256x192 | 0.652 | 0.739 | 0.614 | 0.746 | 0.608 | 0.716 | 0.460 | 0.584 | 0.457 | 0.578 | ckpt | log |
| pose_resnet_50 | 384x288 | 0.666 | 0.747 | 0.635 | 0.763 | 0.732 | 0.812 | 0.537 | 0.647 | 0.573 | 0.671 | ckpt | log |
| pose_resnet_101 | 256x192 | 0.670 | 0.754 | 0.640 | 0.767 | 0.611 | 0.723 | 0.463 | 0.589 | 0.533 | 0.647 | ckpt | log |
| pose_resnet_101 | 384x288 | 0.692 | 0.770 | 0.680 | 0.798 | 0.747 | 0.822 | 0.549 | 0.658 | 0.597 | 0.692 | ckpt | log |
| pose_resnet_152 | 256x192 | 0.682 | 0.764 | 0.662 | 0.788 | 0.624 | 0.728 | 0.482 | 0.606 | 0.548 | 0.661 | ckpt | log |
| pose_resnet_152 | 384x288 | 0.703 | 0.780 | 0.693 | 0.813 | 0.751 | 0.825 | 0.559 | 0.667 | 0.610 | 0.705 | ckpt | log |

## Halpe Dataset

### Topdown Heatmap + HRNet + DARK on Halpe

HRNet (CVPR'2019)
@inproceedings{sun2019deep,
title={Deep high-resolution representation learning for human pose estimation},
author={Sun, Ke and Xiao, Bin and Liu, Dong and Wang, Jingdong},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={5693--5703},
year={2019}
}

DarkPose (CVPR'2020)
@inproceedings{zhang2020distribution,
title={Distribution-aware coordinate representation for human pose estimation},
author={Zhang, Feng and Zhu, Xiatian and Dai, Hanbin and Ye, Mao and Zhu, Ce},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={7093--7102},
year={2020}
}

Halpe (CVPR'2020)
@inproceedings{li2020pastanet,
title={PaStaNet: Toward Human Activity Knowledge Engine},
author={Li, Yong-Lu and Xu, Liang and Liu, Xinpeng and Huang, Xijie and Xu, Yue and Wang, Shiyi and Fang, Hao-Shu and Ma, Ze and Chen, Mingyang and Lu, Cewu},
booktitle={CVPR},
year={2020}
}


Results on Halpe v1.0 val, using a detector with a human AP of 56.4 on the COCO val2017 dataset

| Arch | Input Size | Whole AP | Whole AR | ckpt | log |
| --- | --- | --- | --- | --- | --- |
| pose_hrnet_w48_dark+ | 384x288 | 0.531 | 0.642 | ckpt | log |

Note: "+" means the model is first pre-trained on the original COCO dataset and then fine-tuned on the Halpe dataset. We find that this leads to better performance.