Jingdong Wang (王井东), Fellow of CAE, IEEE and IAPR

Chief Scientist for Computer Vision
Baidu
Email_1: wangjingdong at baidu dot com
Email_2: wangjingdong at outlook dot com

CV    Google Scholar    DBLP    ORCID    Chinese homepage

Biography

Jingdong Wang is Chief Scientist for Computer Vision at Baidu. Before joining Baidu, he was a Senior Principal Researcher at Microsoft Research Asia from September 2007 to August 2021. His areas of interest include computer vision, deep learning, and multimedia search. His representative works include high-resolution network (HRNet) for generic visual recognition, transformer-based object-contextual representations (OCRNet) for semantic segmentation, discriminative regional feature integration (DRFI) for saliency detection, and neighborhood graph search (NGS, SPTAG) for vector search.

He has served as an Associate Editor of IEEE TPAMI, IJCV, ACM TOMM, IEEE TMM, and IEEE TCSVT, and as a (senior) area chair of leading conferences in vision, multimedia, and AI, such as CVPR, ICCV, ECCV, NeurIPS, ACM MM, IJCAI, and AAAI. He will be a Program Chair for ICCV 2025. He was elected an ACM Distinguished Member, a Fellow of IAPR, a Fellow of IEEE, and a Fellow of CAE for his contributions to visual content understanding and retrieval.


Recent news

34. Elected as Fellow of CAE 2024, link, 5/6/2024.
33. Will be a Program Chair for ICCV 2025, 03/2024.
32. Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment. [pdf] [code] Accepted by ICCV, 07/2023
31. Context Autoencoder for Self-Supervised Representation Learning. [IJCV] [pdf] [code] Superior to MAE and BEiT, 07/2022. Accepted by IJCV, 07/2023
30. Understanding Self-Supervised Pretraining with Part-Aware Representation Learning. [pdf] [code] Accepted by TMLR, 07/2023
29. Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining. [pdf] Group DETR v2 achieves 64.5 mAP on COCO test-dev and establishes a new SoTA on the COCO leaderboard. 11/8/2022
28. Transformer does not outperform CNN: On the Connection between Local Attention and Dynamic Depth-wise Convolution. [pdf] [code]. ICLR 2022 spotlight. 03/2022
27. People of ACM interview: URL 12/2021.
26. Elected as Fellow of IEEE, for his contributions to visual content understanding and retrieval, 11/2021.
25. Code released for our NeurIPS 2021 paper, HRFormer: High-Resolution Transformer for Dense Prediction. [pdf] code. 09/2021
24. Code released for our NeurIPS 2021 paper, SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search. [pdf] code. 09/2021
23. Code released for our ICCV 2021 paper, Conditional DETR for Fast Training Convergence. [pdf] code. 8/16/2021
22. Local Transformer attention is equivalent to inhomogeneous dynamic depth-wise convolution: Demystifying local attention. 7/2021
21. Welcome to the large-scale approximate nearest neighbor search challenge at NeurIPS 2021: Big ANN Benchmark. 5/2021
20. HRNet is shipped to Form Recognizer for Table Recognition. 5/2021
19. Updated object-contextual representations for semantic segmentation (ECCV 2020). We rephrase it as Segmentation Transformer. [pdf] code. 5/4/2021
18. Code released for our CVPR 2021 paper, Lite-HRNet: A Lightweight High-Resolution Network. [pdf] code. 4/12/2021
17. Code released for our CVPR 2021 paper, Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression. [pdf] code. 4/7/2021
16. HRNet: Deep High-Resolution Representation Learning for Visual Recognition. Accepted by TPAMI. [pdf] or [pdf at arXiv]. This is a longer version of the HRNet paper published in CVPR 2019. HRNet is a stronger backbone and achieves superior performance on human pose estimation, semantic segmentation, object detection, face alignment, and so on. Code is available: [HRNet for human pose estimation], [HRNet for segmentation, detection, face alignment].
15. HRNet + OCR + SegFix is ranked 1st on Cityscapes segmentation. Cityscapes segmentation leaderboard (January 2020). The implementation of HRNet + OCR is available: code
14. Invited as an area chair of CVPR 2020, ECCV 2020, and IJCAI 2020.
13. HRNet + OCR is ranked 1st on Cityscapes segmentation. Cityscapes segmentation leaderboard (July 2019).
12. High-Resolution Network (HRNet). A replacement for classification networks in visual recognition. Project page.
11. Fast neighborhood graph-based approximate nearest neighbor search: code. Bing vector search. TechCrunch.
10. Invited as an area chair of ICCV 2019 and IJCAI 2019.
9. Elected as an ACM Distinguished Member, 11/2018.
8. Gave a keynote talk about approximate nearest neighbor search on 9/29/2018 at JD.com.  slides
7. Second place entry, COCO keypoints detection challenge ECCV 2018.
6. Appointed as AE of TPAMI, 09/2018.
5. Elected as Fellow of IAPR 2018.
4. One paper is accepted by ECCV 2018.
3. Two papers are accepted by ACM MM 2018.
2. Three papers are accepted by CVPR 2018.
1. Appointed as AE of TCSVT, 01/2018.

Code and datasets

1. High-resolution networks (HRNet). A replacement for classification networks in computer vision problems: projects. Human pose estimation (CVPR 2019): code. Other applications pdf (short) pdf (long) code: semantic segmentation, object detection, facial landmark detection, and ImageNet classification.
2. Small convolutional neural networks. Interleaved group convolutions. IGCV1 (ICCV 2017):  pdf  code | IGCV2 (CVPR 2018):  pdf | IGCV3 (BMVC 2018):  pdf  code
3. Large-scale indexing for similarity search. Neighborhood graph search (ACM MM 2012):  pdf | Neighborhood graph construction (CVPR 2012):  pdf | Trinary-projection trees (TPAMI, CVPR 2010):  pdf | code
4. Hashing and quantization. A survey on learning to hash (TPAMI):  pdf v2  html v2  tex v2  pdf v1 | Composite quantization (TPAMI, ICML 2014):  pdf  code
5. Salient object detection. Discriminative Regional Feature Integration (IJCV, CVPR 2013):  pdf (CVPR)  pdf (IJCV)  C++ code  MATLAB code  project | Local context (BMVC 2011):  pdf  code | Learning to detect a salient object (TPAMI):  pdf

