Jingdong Wang

Jingdong Wang (王井东), Fellow of CAE, IEEE and IAPR
Chief Scientist for Computer Vision, Baidu
Email_1: wangjingdong at baidu dot com
Email_2: wangjingdong at outlook dot com
CV | Google Scholar | DBLP | ORCID | Chinese homepage

Biography

Jingdong Wang is Chief Scientist for Computer Vision at Baidu. Before joining Baidu, he was a Senior Principal Researcher at Microsoft Research Asia from September 2007 to August 2021. His areas of interest include computer vision, deep learning, multimedia search, and large models. His representative works include: the high-resolution network (HRNet) for generic visual recognition; transformer query-based object-contextual representations (OCRNet) for semantic segmentation; discriminative regional feature integration (DRFI) for saliency detection, the first supervised saliency detection approach; and neighborhood graph search (NGS, SPTAG) for vector search, the first practical and successful neighborhood graph-based vector search algorithm, which has been applied to hundreds of billions of vectors.

He is serving as a Program Chair for ICCV 2025. He is serving or has served as an Associate Editor of IEEE TPAMI, IJCV, ACM TOMM, IEEE TMM, and IEEE TCSVT, and as a (senior) area chair of leading conferences in vision, multimedia, and AI, such as CVPR, ICCV, ECCV, NeurIPS, ACM MM, IJCAI, and AAAI. He was elected as an ACM Distinguished Member, a Fellow of IAPR, a Fellow of IEEE, and a Fellow of CAE, for his contributions to visual content understanding and retrieval.

Recent news

34. Elected as Fellow of CAE 2024, link, 5/6/2024.

33. Will be a Program Chair for ICCV 2025, 03/2024.

32. Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment. [pdf] [code] Accepted by ICCV, 07/2023

31. Context Autoencoder for Self-Supervised Representation Learning. [IJCV] [pdf] [code] Superior to MAE and BEiT, 07/2022. Accepted by IJCV, 07/2023

30. Understanding Self-Supervised Pretraining with Part-Aware Representation Learning. [pdf] [code] Accepted by TMLR, 07/2023

29. Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining. [pdf] Group DETR v2 achieves 64.5 mAP on COCO test-dev and establishes a new SoTA on the COCO leaderboard. 11/8/2022

28. Transformer does not outperform CNN: On the Connection between Local Attention and Dynamic Depth-wise Convolution. [pdf] [code]. ICLR 2022 spotlight. 03/2022

27. People of ACM interview: URL 12/2021.

26. Elected as Fellow of IEEE, for his contributions to visual content understanding and retrieval, 11/2021.

25. Code released for our NeurIPS 2021 paper, HRFormer: High-Resolution Transformer for Dense Prediction. [pdf] code. 09/2021

24. Code released for our NeurIPS 2021 paper, SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search. [pdf] code. 09/2021

23. Code released for our ICCV 2021 paper, Conditional DETR for Fast Training Convergence. [pdf] code. 8/16/2021

22. Local Transformer attention is equivalent to inhomogeneous dynamic depth-wise convolution: Demystifying local attention. 7/2021

21. Welcome to the large-scale approximate nearest neighbor search challenge at NeurIPS 2021: Big ANN Benchmark. 5/2021

20. HRNet is shipped to Form Recognizer for Table Recognition. 5/2021

19. Updated object-contextual representations for semantic segmentation (ECCV 2020). We rephrase it as Segmentation Transformer. [pdf] code. 5/4/2021

18. Code released for our CVPR 2021 paper, Lite-HRNet: A Lightweight High-Resolution Network. [pdf] code. 4/12/2021

17. Code released for our CVPR 2021 paper, Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression. [pdf] code. 4/7/2021

16. HRNet: Deep High-Resolution Representation Learning for Visual Recognition. Accepted by TPAMI. [pdf] or [pdf at arXiv]. This is a longer version of the HRNet paper published in CVPR 2019. HRNet is a stronger backbone and achieves superior performance on human pose estimation, semantic segmentation, object detection, face alignment, and so on. Code is available: [HRNet for human pose estimation], [HRNet for segmentation, detection, face alignment].

15. HRNet + OCR + SegFix is ranked 1st on Cityscapes segmentation. Cityscapes segmentation leaderboard (January 2020). The implementation of HRNet + OCR is available: code

14. Invited as an area chair of CVPR 2020, ECCV 2020, and IJCAI 2020.

13. HRNet + OCR is ranked 1st on Cityscapes segmentation. Cityscapes segmentation leaderboard (July 2019).

12. High-Resolution Network (HRNet). A replacement for classification networks in visual recognition. projects page.

11. Fast neighborhood graph-based approximate nearest neighbor search: code. Bing vector search. TechCrunch.

10. Invited as an area chair of ICCV 2019 and IJCAI 2019.

9. Elected as an ACM Distinguished Member, 11/2018.

8. Gave a keynote talk about approximate nearest neighbor search on 9/29/2018 at JD.com. slides

7. Second place entry, COCO keypoints detection challenge ECCV 2018.

6. Appointed as AE of TPAMI, 09/2018.

5. Elected as Fellow of IAPR 2018.

4. One paper is accepted by ECCV 2018.

3. Two papers are accepted by ACM MM 2018.

2. Three papers are accepted by CVPR 2018.

1. Appointed as AE of TCSVT, 01/2018.

Codes and datasets

1. High-resolution networks (HRNet). A replacement for classification networks in computer vision problems: projects. Human pose estimation (CVPR 2019): code. Other applications: pdf (short) pdf (long) code: semantic segmentation, object detection, facial landmark detection, and ImageNet classification.

2. Small convolutional neural networks. Interleaved group convolutions. IGCV1 (ICCV 2017): pdf code | IGCV2 (CVPR 2018): pdf | IGCV3 (BMVC 2018): pdf code

3. Large-scale indexing for similarity search. Neighborhood graph search (ACM MM 2012): pdf | Neighborhood graph construction (CVPR 2012): pdf | Trinary-projection trees (TPAMI, CVPR 2010): pdf | code

4. Hashing and quantization. A survey on learning to hash (TPAMI): pdf v2 html v2 tex v2 pdf v1 | Composite quantization (TPAMI, ICML 2014): pdf code

5. Salient object detection. Discriminative Regional Feature Integration (IJCV, CVPR 2013): pdf (CVPR) pdf (IJCV) c++ code matlab code project | Local context (BMVC 2011): pdf code | Learning to detect a salient object (TPAMI): pdf
