Wholebody

COCO-WholeBody Dataset

Associative Embedding + HigherHRNet on COCO-WholeBody

Associative Embedding (NIPS'2017)

@inproceedings{newell2017associative,
  title={Associative embedding: End-to-end learning for joint detection and grouping},
  author={Newell, Alejandro and Huang, Zhiao and Deng, Jia},
  booktitle={Advances in neural information processing systems},
  pages={2277--2287},
  year={2017}
}

HigherHRNet (CVPR'2020)

@inproceedings{cheng2020higherhrnet,
  title={HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation},
  author={Cheng, Bowen and Xiao, Bin and Wang, Jingdong and Shi, Honghui and Huang, Thomas S and Zhang, Lei},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5386--5395},
  year={2020}
}

COCO-WholeBody (ECCV'2020)

@inproceedings{jin2020whole,
  title={Whole-Body Human Pose Estimation in the Wild},
  author={Jin, Sheng and Xu, Lumin and Xu, Jin and Wang, Can and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2020}
}

Results on COCO-WholeBody v1.0 val without multi-scale test

Arch	Input Size	Body AP	Body AR	Foot AP	Foot AR	Face AP	Face AR	Hand AP	Hand AR	Whole AP	Whole AR	ckpt	log
HigherHRNet-w32+	512x512	0.590	0.672	0.185	0.335	0.676	0.721	0.212	0.298	0.401	0.493	ckpt	log
HigherHRNet-w48+	512x512	0.630	0.706	0.440	0.573	0.730	0.777	0.389	0.477	0.487	0.574	ckpt	log

Note: "+" means the model is first pre-trained on the original COCO dataset and then fine-tuned on the COCO-WholeBody dataset. We find that this leads to better performance.

Associative Embedding + HRNet on COCO-WholeBody

Associative Embedding (NIPS'2017)

@inproceedings{newell2017associative,
  title={Associative embedding: End-to-end learning for joint detection and grouping},
  author={Newell, Alejandro and Huang, Zhiao and Deng, Jia},
  booktitle={Advances in neural information processing systems},
  pages={2277--2287},
  year={2017}
}

HRNet (CVPR'2019)

@inproceedings{sun2019deep,
  title={Deep high-resolution representation learning for human pose estimation},
  author={Sun, Ke and Xiao, Bin and Liu, Dong and Wang, Jingdong},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={5693--5703},
  year={2019}
}

COCO-WholeBody (ECCV'2020)

@inproceedings{jin2020whole,
  title={Whole-Body Human Pose Estimation in the Wild},
  author={Jin, Sheng and Xu, Lumin and Xu, Jin and Wang, Can and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2020}
}

Results on COCO-WholeBody v1.0 val without multi-scale test

Arch	Input Size	Body AP	Body AR	Foot AP	Foot AR	Face AP	Face AR	Hand AP	Hand AR	Whole AP	Whole AR	ckpt	log
HRNet-w32+	512x512	0.551	0.650	0.271	0.451	0.564	0.618	0.159	0.238	0.342	0.453	ckpt	log
HRNet-w48+	512x512	0.592	0.686	0.443	0.595	0.619	0.674	0.347	0.438	0.422	0.532	ckpt	log

Note: "+" means the model is first pre-trained on the original COCO dataset and then fine-tuned on the COCO-WholeBody dataset. We find that this leads to better performance.

Topdown Heatmap + ResNet on COCO-WholeBody

SimpleBaseline2D (ECCV'2018)

@inproceedings{xiao2018simple,
  title={Simple baselines for human pose estimation and tracking},
  author={Xiao, Bin and Wu, Haiping and Wei, Yichen},
  booktitle={Proceedings of the European conference on computer vision (ECCV)},
  pages={466--481},
  year={2018}
}

COCO-WholeBody (ECCV'2020)

@inproceedings{jin2020whole,
  title={Whole-Body Human Pose Estimation in the Wild},
  author={Jin, Sheng and Xu, Lumin and Xu, Jin and Wang, Can and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2020}
}

Results on COCO-WholeBody v1.0 val, using a person detector with human AP of 56.4 on the COCO val2017 dataset

Arch	Input Size	Body AP	Body AR	Foot AP	Foot AR	Face AP	Face AR	Hand AP	Hand AR	Whole AP	Whole AR	ckpt	log
pose_resnet_50	256x192	0.652	0.739	0.614	0.746	0.608	0.716	0.460	0.584	0.457	0.578	ckpt	log
pose_resnet_50	384x288	0.666	0.747	0.635	0.763	0.732	0.812	0.537	0.647	0.573	0.671	ckpt	log
pose_resnet_101	256x192	0.670	0.754	0.640	0.767	0.611	0.723	0.463	0.589	0.533	0.647	ckpt	log
pose_resnet_101	384x288	0.692	0.770	0.680	0.798	0.747	0.822	0.549	0.658	0.597	0.692	ckpt	log
pose_resnet_152	256x192	0.682	0.764	0.662	0.788	0.624	0.728	0.482	0.606	0.548	0.661	ckpt	log
pose_resnet_152	384x288	0.703	0.780	0.693	0.813	0.751	0.825	0.559	0.667	0.610	0.705	ckpt	log

Topdown Heatmap + HRNet + Dark on COCO-WholeBody

HRNet (CVPR'2019)

@inproceedings{sun2019deep,
  title={Deep high-resolution representation learning for human pose estimation},
  author={Sun, Ke and Xiao, Bin and Liu, Dong and Wang, Jingdong},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={5693--5703},
  year={2019}
}

DarkPose (CVPR'2020)

@inproceedings{zhang2020distribution,
  title={Distribution-aware coordinate representation for human pose estimation},
  author={Zhang, Feng and Zhu, Xiatian and Dai, Hanbin and Ye, Mao and Zhu, Ce},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7093--7102},
  year={2020}
}

COCO-WholeBody (ECCV'2020)

@inproceedings{jin2020whole,
  title={Whole-Body Human Pose Estimation in the Wild},
  author={Jin, Sheng and Xu, Lumin and Xu, Jin and Wang, Can and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2020}
}

Results on COCO-WholeBody v1.0 val, using a person detector with human AP of 56.4 on the COCO val2017 dataset

Arch	Input Size	Body AP	Body AR	Foot AP	Foot AR	Face AP	Face AR	Hand AP	Hand AR	Whole AP	Whole AR	ckpt	log
pose_hrnet_w32_dark	256x192	0.694	0.764	0.565	0.674	0.736	0.808	0.503	0.602	0.582	0.671	ckpt	log
pose_hrnet_w48_dark+	384x288	0.742	0.807	0.705	0.804	0.840	0.892	0.602	0.694	0.661	0.743	ckpt	log

Note: "+" means the model is first pre-trained on the original COCO dataset and then fine-tuned on the COCO-WholeBody dataset. We find that this leads to better performance.

Topdown Heatmap + HRNet on COCO-WholeBody

HRNet (CVPR'2019)

@inproceedings{sun2019deep,
  title={Deep high-resolution representation learning for human pose estimation},
  author={Sun, Ke and Xiao, Bin and Liu, Dong and Wang, Jingdong},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={5693--5703},
  year={2019}
}

COCO-WholeBody (ECCV'2020)

@inproceedings{jin2020whole,
  title={Whole-Body Human Pose Estimation in the Wild},
  author={Jin, Sheng and Xu, Lumin and Xu, Jin and Wang, Can and Liu, Wentao and Qian, Chen and Ouyang, Wanli and Luo, Ping},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2020}
}

Results on COCO-WholeBody v1.0 val, using a person detector with human AP of 56.4 on the COCO val2017 dataset

Arch	Input Size	Body AP	Body AR	Foot AP	Foot AR	Face AP	Face AR	Hand AP	Hand AR	Whole AP	Whole AR	ckpt	log
pose_hrnet_w32	256x192	0.700	0.746	0.567	0.645	0.637	0.688	0.473	0.546	0.553	0.626	ckpt	log
pose_hrnet_w32	384x288	0.701	0.773	0.586	0.692	0.727	0.783	0.516	0.604	0.586	0.674	ckpt	log
pose_hrnet_w48	256x192	0.700	0.776	0.672	0.785	0.656	0.743	0.534	0.639	0.579	0.681	ckpt	log
pose_hrnet_w48	384x288	0.722	0.790	0.694	0.799	0.777	0.834	0.587	0.679	0.631	0.716	ckpt	log
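
The result tables on this page are tab-separated, so their rows can be loaded directly for comparison. The following is a minimal sketch, assuming the header and rows of the top-down HRNet table above are copied as-is; the helper name `parse_results` is hypothetical and is not part of any library.

```python
# Header and rows copied from the top-down HRNet table above (tab-separated).
HEADER = (
    "Arch\tInput Size\tBody AP\tBody AR\tFoot AP\tFoot AR\t"
    "Face AP\tFace AR\tHand AP\tHand AR\tWhole AP\tWhole AR\tckpt\tlog"
)
ROWS = (
    "pose_hrnet_w32\t256x192\t0.700\t0.746\t0.567\t0.645\t0.637\t0.688\t0.473\t0.546\t0.553\t0.626\tckpt\tlog\n"
    "pose_hrnet_w32\t384x288\t0.701\t0.773\t0.586\t0.692\t0.727\t0.783\t0.516\t0.604\t0.586\t0.674\tckpt\tlog\n"
    "pose_hrnet_w48\t256x192\t0.700\t0.776\t0.672\t0.785\t0.656\t0.743\t0.534\t0.639\t0.579\t0.681\tckpt\tlog\n"
    "pose_hrnet_w48\t384x288\t0.722\t0.790\t0.694\t0.799\t0.777\t0.834\t0.587\t0.679\t0.631\t0.716\tckpt\tlog\n"
)


def parse_results(header, rows):
    """Parse tab-separated result rows into a list of dicts (hypothetical helper)."""
    names = header.split("\t")
    table = []
    for line in rows.strip().splitlines():
        record = dict(zip(names, line.split("\t")))
        # Convert the AP/AR metric columns to float; keep other columns as strings.
        for key in names:
            if key.endswith(" AP") or key.endswith(" AR"):
                record[key] = float(record[key])
        table.append(record)
    return table


results = parse_results(HEADER, ROWS)

# Pick the configuration with the highest whole-body AP.
best = max(results, key=lambda r: r["Whole AP"])
print(best["Arch"], best["Input Size"], best["Whole AP"])
# prints: pose_hrnet_w48 384x288 0.631
```

The same helper applies to the other tables on this page, since they share the same column layout.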
