Jingdong Wang (王井东), Fellow of CAE, IEEE and IAPR

Chief Scientist for Computer Vision, Baidu
Email: wangjingdong at outlook dot com

CV    Google Scholar    DBLP    ORCID    Chinese homepage

Biography

Jingdong Wang is Chief Scientist for Computer Vision at Baidu. Before joining Baidu, he was a Senior Principal Researcher at Microsoft Research Asia from September 2007 to August 2021. His areas of interest include computer vision, deep learning, multimedia search, and large models. His representative works include: high-resolution network (HRNet) for generic visual recognition; transformer query-based object-contextual representations (OCRNet) for semantic segmentation; discriminative regional feature integration (DRFI) for saliency detection, the first supervised saliency detection approach; and neighborhood graph search (NGS, SPTAG) for vector search, the first practical and successful neighbor graph-based vector search algorithm, which has been applied to hundreds of billions of vectors.

He served as a Program Chair for ICCV 2025. He has served or is serving as an Associate Editor of IEEE TPAMI, IJCV, ACM TOMM, IEEE TMM, and IEEE TCSVT, and as a (senior) area chair of leading conferences in vision, multimedia, and AI, such as CVPR, ICCV, ECCV, NeurIPS, ACM MM, ICLR, ICML, IJCAI, and AAAI. He was elected as an ACM Distinguished Member, a Fellow of IAPR, a Fellow of IEEE, and a Fellow of CAE, for his contributions to visual content understanding and retrieval.


Recent news

35. MixFlow Training: Alleviating Exposure Bias with Slowed Interpolation Mixture [project page] [pdf] [code] SoTA results, outperforming RAE, REPA, SiT and DiT for class-conditional image generation and improving T2I performance, 12/2025.
34. Elected as Fellow of CAE 2024, link, 5/6/2024.
33. Will be a Program Chair for ICCV 2025, 03/2024.
32. Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment. [pdf] [code] Accepted by ICCV, 07/2023
31. Context Autoencoder for Self-Supervised Representation Learning. [IJCV] [pdf] [code] Superior to MAE and BEiT, 07/2022. Accepted by IJCV, 07/2023
30. Understanding Self-Supervised Pretraining with Part-Aware Representation Learning. [pdf] [code] Accepted by TMLR, 07/2023
29. Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining. [pdf] Group DETR v2 achieves 64.5 mAP on COCO test-dev and establishes a new SoTA on the COCO leaderboard. 11/8/2022
28. Transformer does not outperform CNN: On the Connection between Local Attention and Dynamic Depth-wise Convolution. [pdf] [code]. ICLR 2022 spotlight. 03/2022
27. People of ACM interview: URL 12/2021.
26. Elected as Fellow of IEEE, for his contributions to visual content understanding and retrieval, 11/2021.
25. Code released for our NeurIPS 2021 paper, HRFormer: High-Resolution Transformer for Dense Prediction. [pdf] code. 09/2021
24. Code released for our NeurIPS 2021 paper, SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search. [pdf] code. 09/2021
23. Code released for our ICCV 2021 paper, Conditional DETR for Fast Training Convergence. [pdf] code. 8/16/2021
22. Local Transformer attention is equivalent to inhomogeneous dynamic depth-wise convolution: Demystifying local attention. 7/2021
21. Welcome to the large-scale approximate nearest neighbor search challenge at NeurIPS 2021: Big ANN Benchmark. 5/2021
20. HRNet is shipped to Form Recognizer for Table Recognition. 5/2021
19. Update object-contextual representation for semantic segmentation (ECCV 2020). We rephrase it as Segmentation Transformer. [pdf] code. 5/4/2021
18. Code released for our CVPR 2021 paper, Lite-HRNet: A Lightweight High-Resolution Network. [pdf] code. 4/12/2021
17. Code released for our CVPR 2021 paper, Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression. [pdf] code. 4/7/2021
16. HRNet: Deep High-Resolution Representation Learning for Visual Recognition. Accepted by TPAMI. [pdf] or [pdf at arXiv]. This is a longer version of the HRNet paper published in CVPR 2019. HRNet is a stronger backbone, and achieves superior performance on human pose estimation, semantic segmentation, object detection, face alignment, and so on. Code is available: [HRNet for human pose estimation], [HRNet for segmentation, detection, face alignment].
15. HRNet + OCR + SegFix is ranked 1 on Cityscapes segmentation. Cityscapes segmentation leaderboard (January 2020). The implementation of HRNet + OCR is available: code
14. Invited as an area chair of CVPR 2020, ECCV 2020, and IJCAI 2020.
13. HRNet + OCR is ranked 1 on Cityscapes segmentation. Cityscapes segmentation leaderboard (July 2019).
12. High-Resolution Network (HRNet). A replacement for classification networks in visual recognition. projects page.
11. Fast neighborhood graph-based approximate nearest neighbor search: code. Bing vector search. TechCrunch.
10. Invited as an area chair of ICCV 2019, and IJCAI 2019.
9. Elected as an ACM Distinguished Member, 11/2018.
8. Gave a keynote talk about approximate nearest neighbor search on 9/29/2018 at JD.com. slides
7. Second place entry, COCO keypoints detection challenge ECCV 2018.
6. Appointed as AE of TPAMI, 09/2018.
5. Elected as Fellow of IAPR 2018.
4. One paper is accepted by ECCV 2018.
3. Two papers are accepted by ACM MM 2018.
2. Three papers are accepted by CVPR 2018.
1. Appointed as AE of TCSVT, 01/2018.


