Dr. Wu Xinxiao

Full Professor

Xinxiao Wu received her Ph.D. from the School of Computer Science, Beijing Institute of Technology, in July 2010. From August 2010 to October 2011, she worked as a postdoctoral research fellow at Nanyang Technological University, Singapore. She joined the School of Computer Science, Beijing Institute of Technology, in 2012, where she is currently a Professor. She received the Excellent Doctoral Dissertation Award from the Chinese Association for Artificial Intelligence. She has published papers in top conferences and journals on computer vision and artificial intelligence, including ICCV, CVPR, ECCV, AAAI, IJCAI, ACM MM, IJCV, IEEE TIP, IEEE TMM, IEEE TNNLS, IEEE TCSVT, and IEEE TCYB. As principal investigator, she has been supported by research grants including the National Natural Science Foundation of China (NSFC), the Ministry of Education Doctoral Fund, and a number of school-enterprise projects. She also serves on the editorial board of IEEE Transactions on Multimedia. Her current research interests include machine learning, vision and language, and multimedia video understanding.

Students interested in vision and language, machine learning, and artificial intelligence are welcome to join us!

  • wuxinxiao.github.io

News

  • 2024-01-23

    Yuheng Shi and Hanxi Lin's paper “Commonsense Knowledge Prompting for Few-shot Action Recognition in Videos” was accepted by IEEE Transactions on Multimedia (TMM). Congratulations!
  • 2023-07-26

    Shuo Yang and Yongqi Wang's paper “Multi-modal Prompting for Open-vocabulary Video Visual Relationship Detection” was accepted by The 38th AAAI Conference on Artificial Intelligence (AAAI2024). Congratulations!
  • 2023-07-26

    Yayun Qi's paper “Relational Distant Supervision for Image Captioning without Image-text Pairs” was accepted by The 38th AAAI Conference on Artificial Intelligence (AAAI2024). Congratulations!
  • 2023-07-26

    Shuo Yang and Zirui Shang's paper “Probability Distribution Based Frame-supervised Language-driven Action Localization” was accepted by The 31st ACM International Conference on Multimedia (ACM MM2023). Congratulations!
  • 2023-07-17

    Wentian Zhao's paper “Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph” was accepted by IEEE Transactions on Multimedia (TMM). Congratulations!
  • 2023-04-20

    Shitong Shao and Huanran Chen's paper “Teaching What You Should Teach: A Data-Based Distillation Method” was accepted by International Joint Conference on Artificial Intelligence (IJCAI2023). Congratulations!
  • 2023-03-29

    Yubo Zhu's paper “Topic-aware Video Summarization using Multimodal Transformer” was accepted by Pattern Recognition (PR). Congratulations!
  • 2023-03-17

    Xiaofeng Ji's paper “Counterfactual Inference for Visual Relationship Detection in Videos” was accepted by IEEE International Conference on Multimedia and Expo (ICME2023). Congratulations!
  • 2023-02-28

    Jin Chen's paper “Meta-causal Learning for Single Domain Generalization” was accepted by The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR2023). Congratulations!
  • 2023-01-19

    Wentian Zhao and Yayun Qi won second prize in the “Ingenuity Cup” National Artificial Intelligence Innovation Application Competition. Congratulations!
  • 2023-01-06

    Tong Li's paper “Sentimental Visual Captioning using Multimodal Transformer” was accepted by International Journal of Computer Vision (IJCV). Congratulations!
  • 2022-12-05

    Mengxiao Tian's paper “Adaptive Latent Graph Representation Learning for Image-Text Matching” was accepted by IEEE Transactions on Image Processing (TIP). Congratulations!
  • 2022-05-29

    Wentian Zhao's paper “Learning Cooperative Neural Modules for Stylized Image Captioning” was accepted by International Journal of Computer Vision (IJCV). Congratulations!
  • 2022-04-21

    Shuo Yang's paper “Entity-Aware and Motion-Aware Transformers for Language-driven Action Localization” was accepted by International Joint Conference on Artificial Intelligence (IJCAI2022). Congratulations!
  • 2022-03-07

    Hanxi Lin's paper “Adaptive Recursive Circle Framework for Fine-grained Action Recognition” was accepted by IEEE International Conference on Multimedia and Expo (ICME2022). Congratulations!
  • 2021-12-01

    Jin Chen and Xiaofeng Ji's paper “Adaptive Image-to-video Scene Graph Generation via Knowledge Reasoning and Adversarial Learning” was accepted by the 36th AAAI Conference on Artificial Intelligence (AAAI2022). Congratulations!
  • 2021-09-29

    Wentian Zhao's paper “Multi-modal Dependency Tree for Video Captioning” was accepted by Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS2021). Congratulations!
  • 2021-03-13

    Jin Chen's paper “Sequential Instance Refinement for Cross-domain Object Detection in Images” was accepted by IEEE Transactions on Image Processing (TIP). Congratulations!
  • 2021-03-12

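    Jingyi Hou and Yayun Qi's paper “跨语言知识蒸馏的视频中文字幕生成” (Chinese Video Captioning via Cross-lingual Knowledge Distillation) was accepted by the Chinese Journal of Computers (《计算机学报》). Congratulations!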
  • 2021-03-07

    Tong Li's paper “Image Captioning with Inherent Sentiment” was accepted by IEEE International Conference on Multimedia and Expo (ICME2021 Oral). Congratulations!
  • 2020-12-02

    Jianwei Zhao and Ruiqi Wang's paper “Anticipating Future Relations via Graph Growing for Action Prediction” was accepted by the 35th AAAI Conference on Artificial Intelligence (AAAI2021). Congratulations!
  • 2020-12-02

    Jin Chen's paper “Spatial-temporal Causal Inference for Partial Image-to-video Adaptation” was accepted by the 35th AAAI Conference on Artificial Intelligence (AAAI2021). Congratulations!
  • 2020-11-24

    Wentian Zhao's paper “Cross-domain Image Captioning via Cross-modal Retrieval and Model Adaptation” was accepted by IEEE Transactions on Image Processing (TIP). Congratulations!
  • 2020-11-20

    Ruiqi Wang's paper “Spatial-Temporal Relation Reasoning for Action Prediction in Videos” was accepted by International Journal of Computer Vision (IJCV). Congratulations!
  • 2020-09-25

    Jin Chen's paper “Domain Adversarial Reinforcement Learning for Partial Domain Adaptation” was accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS). Congratulations!
  • 2020-07-26

    Jialu Chen's paper “Preserving Global and Local Temporal Consistency for Arbitrary Video Style Transfer” was accepted by ACM Multimedia 2020. Congratulations!

Research Interests

  • Artificial Intelligence
  • Computer Vision
  • Vision+Language
  • Transfer Learning
  • Multimedia Video Analysis and Understanding
  • visual captioning
  • video grounding
  • video style transfer
  • animal interaction analysis
  • human action recognition
  • domain adaptation
  • domain generalization
  • video summarization & visual storytelling
  • cross-domain object detection

Selected Publications

Commonsense Knowledge Prompting for Few-shot Action Recognition in Videos.

Yuheng Shi, Xinxiao Wu, Hanxi Lin.
IEEE Transactions on Multimedia (TMM), 2024

Multi-Modal Prompting for Open-Vocabulary Video Visual Relationship Detection.

Shuo Yang, Yongqi Wang, Xinxiao Wu.
AAAI Conference on Artificial Intelligence (AAAI), 2024

Probability Distribution Based Frame-supervised Language-driven Action Localization.

Shuo Yang, Zirui Shang, Xinxiao Wu.
The 31st ACM International Conference on Multimedia (ACM MM), 2023

Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph.

Wentian Zhao, Xinxiao Wu.
IEEE Transactions on Multimedia (TMM), 2023

Topic-aware Video Summarization using Multimodal Transformer.

Yubo Zhu, Wentian Zhao, Rui Hua, Xinxiao Wu.
Pattern Recognition (PR), 2023

Meta-causal Learning for Single Domain Generalization.

Jin Chen, Zhi Gao, Xinxiao Wu, Jiebo Luo.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

Sentimental Visual Captioning using Multimodal Transformer.

Xinxiao Wu, Tong Li.
International Journal of Computer Vision (IJCV), 2023

Adaptive Latent Graph Representation Learning for Image-Text Matching.

Mengxiao Tian, Xinxiao Wu, Yunde Jia.
IEEE Transactions on Image Processing (TIP), 2022

Learning Cooperative Neural Modules for Stylized Image Captioning.

Xinxiao Wu, Wentian Zhao, Jiebo Luo.
International Journal of Computer Vision (IJCV), 2022

Entity-aware and Motion-aware Transformers for Language-driven Action Localization in Videos.

Shuo Yang, Xinxiao Wu.
International Joint Conference on Artificial Intelligence (IJCAI), 2022

Adaptive Recursive Circle Framework for Fine-grained Action Recognition.

Hanxi Lin, Wentian Zhao, Xinxiao Wu.
IEEE International Conference on Multimedia and Expo (ICME), 2022

Adaptive Image-to-video Scene Graph Generation via Knowledge Reasoning and Adversarial Learning.

Jin Chen, Xiaofeng Ji, Xinxiao Wu.
AAAI Conference on Artificial Intelligence (AAAI), 2022

Multi-modal Dependency Tree for Video Captioning.

Wentian Zhao, Xinxiao Wu, Jiebo Luo.
Neural Information Processing Systems (NeurIPS), 2021

Spatial–Temporal Relation Reasoning for Action Prediction in Videos.

Xinxiao Wu, Ruiqi Wang, Jingyi Hou, Hanxi Lin, Jiebo Luo.
International Journal of Computer Vision (IJCV), 2021

Sequential Instance Refinement for Cross-Domain Object Detection in Images.

Jin Chen, Xinxiao Wu, Lixin Duan, Lin Chen.
IEEE Transactions on Image Processing (TIP), 2021

Image Captioning with Inherent Sentiment.

Tong Li, Yunhui Hu, Xinxiao Wu.
IEEE International Conference on Multimedia and Expo (ICME) oral, 2021

Cross-Domain Image Captioning via Cross-Modal Retrieval and Model Adaptation.

Wentian Zhao, Xinxiao Wu, Jiebo Luo.
IEEE Transactions on Image Processing (TIP), 2021

Spatial-temporal Causal Inference for Partial Image-to-video Adaptation.

Jin Chen, Xinxiao Wu, Yao Hu, Jiebo Luo.
AAAI Conference on Artificial Intelligence (AAAI), 2021

Anticipating Future Relations via Graph Growing for Action Prediction.

Xinxiao Wu, Jianwei Zhao, Ruiqi Wang.
AAAI Conference on Artificial Intelligence (AAAI), 2021

Exploiting Informative Video Segments for Temporal Action Localization.

Che Sun, Hao Song, Xinxiao Wu, Yunde Jia, Jiebo Luo.
IEEE Transactions on Multimedia (TMM), 2020

Domain Adversarial Reinforcement Learning for Partial Domain Adaptation.

Jin Chen, Xinxiao Wu, Lixin Duan, Shenghua Gao.
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2020

Preserving Global and Local Temporal Consistency for Arbitrary Video Style Transfer.

Xinxiao Wu, Jialu Chen.
ACM International Conference on Multimedia (ACM MM), 2020

Confidence-guided self refinement for action prediction in untrimmed videos.

Jingyi Hou, Xinxiao Wu, Ruiqi Wang, Jiebo Luo, Yunde Jia.
IEEE Transactions on Image Processing (TIP), 2020

Joint Learning of Multiple Latent Domains and Deep Representations for Domain Adaptation.

Xinxiao Wu, Jin Chen, Feiwu Yu, Mingyu Yao, Jiebo Luo.
IEEE Transactions on Cybernetics (T-CYB), 2020

Learning Normal Patterns via Adversarial Attention-Based Autoencoder for Abnormal Event Detection in Videos.

Hao Song, Che Sun, Xinxiao Wu, Mei Chen, Yunde Jia.
IEEE Transactions on Multimedia (TMM), 2020

Joint Commonsense and Relation Reasoning for Image and Video Captioning.

Jingyi Hou, Xinxiao Wu, Xiaoxun Zhang, Yayun Qi, Yunde Jia, Jiebo Luo.
AAAI Conference on Artificial Intelligence (AAAI), 2020

MemCap: Memorizing Style Knowledge for Image Captioning.

Wentian Zhao, Xinxiao Wu, Xiaoxun Zhang.
AAAI Conference on Artificial Intelligence (AAAI), 2020

Exploiting Images for Video Recognition: Heterogeneous Feature Augmentation via Symmetric Adversarial Learning.

Feiwu Yu, Xinxiao Wu, Jialu Chen, Lixin Duan.
IEEE Transactions on Image Processing (TIP), 2019

Temporal Action Localization in Untrimmed Videos using Action Pattern Trees.

Hao Song, Xinxiao Wu, Bing Zhu, Yuwei Wu, Mei Chen, Yunde Jia.
IEEE Transactions on Multimedia (TMM), 2019

Unsupervised Deep Learning of Mid-Level Video Representation for Action Recognition.

Jingyi Hou, Xinxiao Wu, Jin Chen, Jiebo Luo, Yunde Jia.
AAAI Conference on Artificial Intelligence (AAAI), 2018

Exploiting Images for Video Recognition with Hierarchical Generative Adversarial Networks.

Feiwu Yu, Xinxiao Wu, Yuchao Sun, Lixin Duan.
International Joint Conference on Artificial Intelligence (IJCAI), 2018

Extracting Key Segments of Videos for Event Detection by Learning From Web Sources.

Hao Song, Xinxiao Wu, Wennan Yu, Yunde Jia.
IEEE Transactions on Multimedia (TMM), 2018

Content-Attention Representation by Factorized Action-Scene Network for Action Recognition.

Jingyi Hou, Xinxiao Wu, Yuchao Sun, Yunde Jia.
IEEE Transactions on Multimedia (TMM), 2017

A Hierarchical Video Description for Complex Activity Understanding.

Cuiwei Liu, Xinxiao Wu, Yunde Jia.
International Journal of Computer Vision (IJCV), 2016

Cross-View Action Recognition Over Heterogeneous Feature Spaces.

Xinxiao Wu, Han Wang, Cuiwei Liu, Yunde Jia.
IEEE Transactions on Image Processing (TIP), 2015

Video Annotation via Image Groups from the Web.

Han Wang, Xinxiao Wu, Yunde Jia.
IEEE Transactions on Multimedia (TMM), 2014

Cross-View Action Recognition over Heterogeneous Feature Spaces.

Xinxiao Wu, Han Wang, Cuiwei Liu, Yunde Jia.
IEEE International Conference on Computer Vision (ICCV), 2013

View-Invariant Action Recognition Using Latent Kernelized Structural SVM.

Xinxiao Wu, Yunde Jia.
European Conference on Computer Vision (ECCV), 2012

Action recognition using context and appearance distribution features.

Xinxiao Wu, Dong Xu, Lixin Duan, Jiebo Luo.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011

Incremental discriminative-analysis of canonical correlations for action recognition.

Xinxiao Wu, Wei Liang, Yunde Jia.
IEEE International Conference on Computer Vision (ICCV), 2009

Group

Yang Shuo

PhD student
video grounding

Tian Mengxiao

PhD student
video understanding

Qi Yayun

MS student
video captioning

Zhu Yubo

MS student
style transfer

Li Hongxi

MS student
video summarization

Shi Yuheng

MS student
action recognition

Huang Xiqing

MS student
action segmentation

Wang Yongqi

MS student
video relation detection

Shang Zirui

MS student
video grounding

Wang Ziyi

MS student
domain generalization

Alumni

Wang Han

Beijing Forestry University Associate Professor

Song Hao

Tencent Researcher

Hou Jingyi

Beijing University of Science and Technology Postdoctoral Teaching Fellow

Liu Chao

Alibaba Senior Algorithm Engineer

Yu Feiwu

Alibaba DAMO Development Engineer

Zhu Bing

Alibaba Cloud Data Engineer

Sun Yuchao

Beijing Megvii Co., Ltd Algorithm Researcher

Wang Ruiqi

MI Product Manager

Hua Rui

AVIC Manufacturing Technology Institute Information Management

Chen Jialu

MI Algorithm Engineer

Li Tianyu

Beijing Infrastructure Investment Management Trainee

Lin Hanxi

ByteDance Algorithm Engineer

Li Tong

Alibaba Algorithm Engineer

Chen Jin

Aerospace Intelligence Research Institute R&D Engineer

Zhao Wentian

Beijing Institute of Technology Postdoc

Wen Zihan

China Academy of Space Technology Algorithm Engineer

Ji Xiaofeng

ByteDance Algorithm Engineer

Yi Jiacheng

Chinese People's Liberation Army Strategic Support Force, Assistant Engineer

Teaching

Artificial Intelligence

For undergraduate students

Image and Video Processing

For MS students

Computational Perception

For Ph.D. students