Dr. Wu Xinxiao

Full Professor

Xinxiao Wu received her Ph.D. from the School of Computer Science, Beijing Institute of Technology, in July 2010. From August 2010 to October 2011, she worked as a postdoctoral research fellow at Nanyang Technological University, Singapore. She joined the School of Computer Science, Beijing Institute of Technology, in 2012, where she is currently a Professor. She received the Excellent Doctoral Dissertation Award from the Chinese Association for Artificial Intelligence. She has published many papers in top conferences and journals on computer vision and artificial intelligence, including ICCV, CVPR, ECCV, AAAI, IJCAI, ACM MM, IJCV, IEEE TIP, IEEE TMM, IEEE TNNLS, IEEE TCSVT, and IEEE TCYB. As principal investigator, she has been supported by many research grants, including the National Natural Science Foundation of China (NSFC), the Ministry of Education Doctoral Fund, and many school-enterprise projects. She also serves on the editorial board of IEEE Transactions on Multimedia. Her current research interests include machine learning, vision and language, and multimedia video understanding.

Students interested in vision and language, machine learning, and artificial intelligence are welcome to join us!

  • wuxinxiao.github.io

News

  • 2024-12-17

    Hongxi Li, Jun Chen, Zirui Shang, and Ziyi Wang won the Excellence Award at the 13th China Innovation and Entrepreneurship Competition and the 8th Emerging Fields Special Competition of Zhongguancun! Congratulations!
  • 2024-12-10

    Zirui Shang, Yubo Zhu, and Hongxi Li's paper “Video Summarization using Denoising Diffusion Probabilistic Model” was accepted by The 39th AAAI Conference on Artificial Intelligence (AAAI2025). Congratulations!
  • 2024-08-13

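    Rongjiang Zhu and Yuheng Shi's paper “大语言模型引导的开放域多标签动作识别” (Large Language Model-Guided Open-Domain Multi-Label Action Recognition) was accepted by《计算机研究与发展》(Journal of Computer Research and Development). Congratulations!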
  • 2024-02-13

    Shuo Yang's paper “Dynamic Pathway for Query-Aware Feature Learning in Language-Driven Action Localization” was accepted by IEEE Transactions on Multimedia (TMM). Congratulations!
  • 2024-01-23

    Yuheng Shi and Hanxi Lin's paper “Commonsense Knowledge Prompting for Few-shot Action Recognition in Videos” was accepted by IEEE Transactions on Multimedia (TMM). Congratulations!
  • 2023-07-26

    Shuo Yang and Yongqi Wang's paper “Multi-modal Prompting for Open-vocabulary Video Visual Relationship Detection” was accepted by The 38th AAAI Conference on Artificial Intelligence (AAAI2024). Congratulations!
  • 2023-07-26

    Yayun Qi's paper “Relational Distant Supervision for Image Captioning without Image-text Pairs” was accepted by The 38th AAAI Conference on Artificial Intelligence (AAAI2024). Congratulations!
  • 2023-07-26

    Shuo Yang and Zirui Shang's paper “Probability Distribution Based Frame-supervised Language-driven Action Localization” was accepted by The 31st ACM International Conference on Multimedia (ACM MM2023). Congratulations!
  • 2023-07-17

    Wentian Zhao's paper “Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph” was accepted by IEEE Transactions on Multimedia (TMM). Congratulations!
  • 2023-04-20

    Shitong Shao and Huanran Chen's paper “Teaching What You Should Teach: A Data-Based Distillation Method” was accepted by International Joint Conference on Artificial Intelligence (IJCAI2023). Congratulations!
  • 2023-03-29

    Yubo Zhu's paper “Topic-aware Video Summarization using Multimodal Transformer” was accepted by Pattern Recognition (PR). Congratulations!
  • 2023-03-17

    Xiaofeng Ji's paper “Counterfactual Inference for Visual Relationship Detection in Videos” was accepted by IEEE International Conference on Multimedia and Expo (ICME2023). Congratulations!
  • 2023-02-28

    Jin Chen's paper “Meta-causal Learning for Single Domain Generalization” was accepted by The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR2023). Congratulations!
  • 2023-01-19

    Wentian Zhao and Yayun Qi won the second prize of “Ingenuity Cup” National Artificial Intelligence Innovation Application Competition! Congratulations!
  • 2023-01-06

    Tong Li's paper “Sentimental Visual Captioning using Multimodal Transformer” was accepted by International Journal of Computer Vision (IJCV). Congratulations!
  • 2022-12-05

    Mengxiao Tian's paper “Adaptive Latent Graph Representation Learning for Image-Text Matching” was accepted by IEEE Transactions on Image Processing (TIP). Congratulations!
  • 2022-05-29

    Wentian Zhao's paper “Learning Cooperative Neural Modules for Stylized Image Captioning” was accepted by International Journal of Computer Vision (IJCV). Congratulations!
  • 2022-04-21

    Shuo Yang's paper “Entity-Aware and Motion-Aware Transformers for Language-driven Action Localization” was accepted by International Joint Conference on Artificial Intelligence (IJCAI2022). Congratulations!
  • 2022-03-07

    Hanxi Lin's paper “Adaptive Recursive Circle Framework for Fine-grained Action Recognition” was accepted by IEEE International Conference on Multimedia and Expo (ICME2022). Congratulations!
  • 2021-12-01

    Jin Chen and Xiaofeng Ji's paper “Adaptive Image-to-video Scene Graph Generation via Knowledge Reasoning and Adversarial Learning” was accepted by the 36th AAAI Conference on Artificial Intelligence (AAAI2022). Congratulations!
  • 2021-09-29

    Wentian Zhao's paper “Multi-modal Dependency Tree for Video Captioning” was accepted by Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS2021). Congratulations!
  • 2021-03-13

    Jin Chen's paper “Sequential Instance Refinement for Cross-domain Object Detection in Images” was accepted by IEEE Transactions on Image Processing (TIP). Congratulations!
  • 2021-03-12

    Jingyi Hou and Yayun Qi's paper “跨语言知识蒸馏的视频中文字幕生成” (Chinese Video Captioning via Cross-lingual Knowledge Distillation) was accepted by《计算机学报》(Chinese Journal of Computers). Congratulations!
  • 2021-03-07

    Tong Li's paper “Image Captioning with Inherent Sentiment” was accepted by IEEE International Conference on Multimedia and Expo (ICME2021 Oral). Congratulations!
  • 2020-12-02

    Jianwei Zhao and Ruiqi Wang's paper “Anticipating Future Relations via Graph Growing for Action Prediction” was accepted by the 35th AAAI Conference on Artificial Intelligence (AAAI2021). Congratulations!
  • 2020-12-02

    Jin Chen's paper “Spatial-temporal Causal Inference for Partial Image-to-video Adaptation” was accepted by 35th AAAI Conference on Artificial Intelligence (AAAI2021). Congratulations!
  • 2020-11-24

    Wentian Zhao's paper “Cross-domain Image Captioning via Cross-modal Retrieval and Model Adaptation” was accepted by IEEE Transactions on Image Processing (TIP). Congratulations!
  • 2020-11-20

    Ruiqi Wang's paper “Spatial-Temporal Relation Reasoning for Action Prediction in Videos” was accepted by International Journal of Computer Vision (IJCV). Congratulations!
  • 2020-09-25

    Jin Chen's paper "Domain Adversarial Reinforcement Learning for Partial Domain Adaptation" was accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS). Congratulations!
  • 2020-07-26

    Jialu Chen's paper "Preserving Global and Local Temporal Consistency for Arbitrary Video Style Transfer" was accepted by ACM Multimedia 2020. Congratulations!

Research Interests

  • Artificial Intelligence
  • Computer Vision
  • Vision+Language
  • Transfer Learning
  • Multimedia Video Analysis and Understanding
  • visual captioning
  • video grounding
  • video style transfer
  • animal interaction analysis
  • human action recognition
  • domain adaptation
  • domain generalization
  • video summarization & visual storytelling
  • cross-domain object detection

Selected Publications

Journal

Dynamic Pathway for Query-Aware Feature Learning in Language-Driven Action Localization.

Shuo Yang, Xinxiao Wu, Zirui Shang, Jiebo Luo.
IEEE Transactions on Multimedia (TMM), 2024

Commonsense Knowledge Prompting for Few-shot Action Recognition in Videos.

Yuheng Shi, Xinxiao Wu, Hanxi Lin.
IEEE Transactions on Multimedia (TMM), 2024

Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph.

Wentian Zhao, Xinxiao Wu.
IEEE Transactions on Multimedia (TMM), 2023

Topic-aware Video Summarization using Multimodal Transformer.

Yubo Zhu, Wentian Zhao, Rui Hua, Xinxiao Wu.
Pattern Recognition (PR), 2023

Sentimental Visual Captioning using Multimodal Transformer.

Xinxiao Wu, Tong Li.
International Journal of Computer Vision (IJCV), 2023

Adaptive Latent Graph Representation Learning for Image-Text Matching.

Mengxiao Tian, Xinxiao Wu, Yunde Jia.
IEEE Transactions on Image Processing (TIP), 2022

Learning Cooperative Neural Modules for Stylized Image Captioning.

Xinxiao Wu, Wentian Zhao, Jiebo Luo.
International Journal of Computer Vision (IJCV), 2022

Spatial–Temporal Relation Reasoning for Action Prediction in Videos.

Xinxiao Wu, Ruiqi Wang, Jingyi Hou, Hanxi Lin, Jiebo Luo.
International Journal of Computer Vision (IJCV), 2021

Sequential Instance Refinement for Cross-Domain Object Detection in Images.

Jin Chen, Xinxiao Wu, Lixin Duan, Lin Chen.
IEEE Transactions on Image Processing (TIP), 2021

Cross-Domain Image Captioning via Cross-Modal Retrieval and Model Adaptation.

Wentian Zhao, Xinxiao Wu, Jiebo Luo.
IEEE Transactions on Image Processing (TIP), 2021

Exploiting Informative Video Segments for Temporal Action Localization.

Che Sun, Hao Song, Xinxiao Wu, Yunde Jia, Jiebo Luo.
IEEE Transactions on Multimedia (TMM), 2020

Domain Adversarial Reinforcement Learning for Partial Domain Adaptation.

Jin Chen, Xinxiao Wu, Lixin Duan, Shenghua Gao.
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2020

Confidence-guided self refinement for action prediction in untrimmed videos.

Jingyi Hou, Xinxiao Wu, Ruiqi Wang, Jiebo Luo, Yunde Jia.
IEEE Transactions on Image Processing (TIP), 2020

Joint Learning of Multiple Latent Domains and Deep Representations for Domain Adaptation.

Xinxiao Wu, Jin Chen, Feiwu Yu, Mingyu Yao, Jiebo Luo.
IEEE Transactions on Cybernetics (T-CYB), 2020

Learning Normal Patterns via Adversarial Attention-Based Autoencoder for Abnormal Event Detection in Videos.

Hao Song, Che Sun, Xinxiao Wu, Mei Chen, Yunde Jia.
IEEE Transactions on Multimedia (TMM), 2020

Exploiting Images for Video Recognition: Heterogeneous Feature Augmentation via Symmetric Adversarial Learning.

Feiwu Yu, Xinxiao Wu, Jialu Chen, Lixin Duan.
IEEE Transactions on Image Processing (TIP), 2019

Temporal Action Localization in Untrimmed Videos using Action Pattern Trees.

Hao Song, Xinxiao Wu, Bing Zhu, Yuwei Wu, Mei Chen, Yunde Jia.
IEEE Transactions on Multimedia (TMM), 2019

Extracting Key Segments of Videos for Event Detection by Learning From Web Sources.

Hao Song, Xinxiao Wu, Wennan Yu, Yunde Jia.
IEEE Transactions on Multimedia (TMM), 2018

Content-Attention Representation by Factorized Action-Scene Network for Action Recognition.

Jingyi Hou, Xinxiao Wu, Yuchao Sun, Yunde Jia.
IEEE Transactions on Multimedia (TMM), 2017

A Hierarchical Video Description for Complex Activity Understanding.

Cuiwei Liu, Xinxiao Wu, Yunde Jia.
International Journal of Computer Vision (IJCV), 2016

Cross-View Action Recognition Over Heterogeneous Feature Spaces.

Xinxiao Wu, Han Wang, Cuiwei Liu, Yunde Jia.
IEEE Transactions on Image Processing (TIP), 2015

Video Annotation via Image Groups from the Web.

Han Wang, Xinxiao Wu, and Yunde Jia.
IEEE Transactions on Multimedia (TMM), 2014

Conference

Relational Distant Supervision for Image Captioning without Image-text Pairs.

Yayun Qi, Wentian Zhao, Xinxiao Wu.
AAAI Conference on Artificial Intelligence (AAAI), 2024

Multi-Modal Prompting for Open-Vocabulary Video Visual Relationship Detection.

Shuo Yang, Yongqi Wang, Xinxiao Wu.
AAAI Conference on Artificial Intelligence (AAAI), 2024

Probability Distribution Based Frame-supervised Language-driven Action Localization.

Shuo Yang, Zirui Shang, Xinxiao Wu.
The 31st ACM International Conference on Multimedia (ACM MM), 2023

Meta-causal Learning for Single Domain Generalization.

Jin Chen, Zhi Gao, Xinxiao Wu, Jiebo Luo.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

Entity-aware and Motion-aware Transformers for Language-driven Action Localization in Videos.

Shuo Yang, Xinxiao Wu.
International Joint Conference on Artificial Intelligence (IJCAI), 2022

Adaptive Recursive Circle Framework for Fine-grained Action Recognition.

Hanxi Lin, Wentian Zhao, Xinxiao Wu.
IEEE International Conference on Multimedia and Expo (ICME), 2022

Adaptive Image-to-video Scene Graph Generation via Knowledge Reasoning and Adversarial Learning.

Jin Chen, Xiaofeng Ji, Xinxiao Wu.
AAAI Conference on Artificial Intelligence (AAAI), 2022

Multi-modal Dependency Tree for Video Captioning.

Wentian Zhao, Xinxiao Wu, Jiebo Luo.
Neural Information Processing Systems (NeurIPS), 2021

Image Captioning with Inherent Sentiment.

Tong Li, Yunhui Hu, Xinxiao Wu.
IEEE International Conference on Multimedia and Expo (ICME) oral, 2021

Spatial-temporal Causal Inference for Partial Image-to-video Adaptation.

Jin Chen, Xinxiao Wu, Yao Hu, Jiebo Luo.
AAAI Conference on Artificial Intelligence (AAAI), 2021

Anticipating Future Relations via Graph Growing for Action Prediction.

Xinxiao Wu, Jianwei Zhao, Ruiqi Wang.
AAAI Conference on Artificial Intelligence (AAAI), 2021

Preserving Global and Local Temporal Consistency for Arbitrary Video Style Transfer.

Xinxiao Wu, Jialu Chen.
ACM International Conference on Multimedia (ACM MM), 2020

Joint Commonsense and Relation Reasoning for Image and Video Captioning.

Jingyi Hou, Xinxiao Wu, Xiaoxun Zhang, Yayun Qi, Yunde Jia, Jiebo Luo.
AAAI Conference on Artificial Intelligence (AAAI), 2020

MemCap: Memorizing Style Knowledge for Image Captioning.

Wentian Zhao, Xinxiao Wu, Xiaoxun Zhang.
AAAI Conference on Artificial Intelligence (AAAI), 2020

Unsupervised Deep Learning of Mid-Level Video Representation for Action Recognition.

Jingyi Hou, Xinxiao Wu, Jin Chen, Jiebo Luo, Yunde Jia.
AAAI Conference on Artificial Intelligence (AAAI), 2018

Exploiting Images for Video Recognition with Hierarchical Generative Adversarial Networks.

Feiwu Yu, Xinxiao Wu, Yuchao Sun, Lixin Duan.
International Joint Conference on Artificial Intelligence (IJCAI), 2018

Cross-View Action Recognition over Heterogeneous Feature Spaces.

Xinxiao Wu, Han Wang, Cuiwei Liu, Yunde Jia.
IEEE International Conference on Computer Vision (ICCV), 2013

View-Invariant Action Recognition Using Latent Kernelized Structural SVM.

Xinxiao Wu, Yunde Jia.
European Conference on Computer Vision (ECCV), 2012

Action recognition using context and appearance distribution features.

Xinxiao Wu, Dong Xu, Lixin Duan, Jiebo Luo.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011

Incremental discriminative-analysis of canonical correlations for action recognition.

Xinxiao Wu, Wei Liang, Yunde Jia.
IEEE International Conference on Computer Vision (ICCV), 2009

Group

Tian Mengxiao

PhD student
video understanding

Qi Yayun

PhD student
video caption

Chen Jun

PhD student
incremental learning

Shang Zirui

PhD student
video grounding

Song Yiqi

PhD student
multimodal reasoning

Li Hongxi

MS student
video summarization

Huang Xiqing

MS student
action segmentation

Wang Yongqi

MS student
video relation detection

Wang Ziyi

MS student
domain generalization

Zhu Rongjiang

MS student
video action recognition

Tan Yunteng

MS student
LLM-based agents

Alumni

—— PhD graduates ——

Yang Shuo

Shenzhen MSU-BIT University Associate Professor

Chen Jin

Aerospace Intelligence Research Institute R&D Engineer

Zhao Wentian

Beijing Institute of Technology Postdoctoral Researcher

Wang Han

Beijing Forestry University Associate Professor

Song Hao

Tencent Researcher

Hou Jingyi

Beijing University of Science and Technology Postdoctoral Teaching Fellow

—— MS graduates ——

Zhu Yubo

Aerospace Information Research Institute Assistant Engineer

Shi Yuheng

Bank of China Information Management Staff

Wen Zihan

China Academy of Space Technology Algorithm Engineer

Ji Xiaofeng

ByteDance Algorithm Engineer

Yi Jiacheng

The PLA Strategic Support Force Assistant Engineer

Li Tong

Alibaba Algorithm Engineer

Lin Hanxi

ByteDance Algorithm Engineer

Liu Chao

Alibaba Senior Algorithm Engineer

Yu Feiwu

Alibaba DAMO Development Engineer

Zhu Bing

Alibaba Cloud Data Engineer

Sun Yuchao

Beijing Megvii Co., Ltd Algorithm Researcher

Wang Ruiqi

MI Product Manager

Hua Rui

AVIC Manufacturing Technology Institute Information Management

Chen Jialu

MI Algorithm Engineer

Li Tianyu

Beijing Infrastructure Investment Management Trainee

Teaching

Compiler Principle and Design

For Undergraduate students

Artificial Intelligence

For Undergraduate students

Image and Video Processing

For MS students

Computational Perception

For Ph.D. students