Xin (Eric) Wang

Assistant Professor, Computer Science and Engineering, UC Santa Cruz

Head of Research, Simular


Email: xwang366 [at] ucsc [dot] edu



Bio

Xin (Eric) Wang is an Assistant Professor of Computer Science and Engineering at UC Santa Cruz. His research interests include Natural Language Processing, Computer Vision, and Machine Learning, with an emphasis on Multimodal, Generative, and Embodied AI. He worked at Google Research, Facebook AI Research (FAIR), Microsoft Research, and Adobe Research.
Xin has served as Area Chair for conferences such as ACL, NAACL, EMNLP, ICLR, and NeurIPS, as well as a Senior Program Committee member for AAAI and IJCAI. He has organized workshops and tutorials at conferences such as ACL, NAACL, CVPR, and ICCV. He has received several awards and recognitions for his work, including the CVPR Best Student Paper Award, a Google Research Faculty Award, Amazon Alexa Prize Awards, and various gift awards from Adobe, Snap, eBay, and others.

Hiring

If you are interested in joining my lab, please read the information for prospective students and visitors and check out the most beautiful and unique campus of UCSC [YouTube video | bilibili video]. Due to the large volume of emails, I may not be able to respond to each one individually.

Teaching

Winter 2021 CSE 142: Machine Learning
Spring 2021 CSE 290C: Multimodal Deep Learning
Fall 2021 CSE 142: Machine Learning
Winter 2022 CSE 244B: Machine Learning for Natural Language Processing
Spring 2022 CSE 142: Machine Learning
Fall 2022 CSE 142: Machine Learning
Winter 2023 CSE 244B: Machine Learning for Natural Language Processing
Summer 2023 California State Summer School for Mathematics & Science: AI Cluster
Spring 2024 CSE 142: Machine Learning

News

[NEW!] Three papers accepted to EMNLP 2024!
[NEW!] Our Discffusion paper is accepted to TMLR 2024!
[NEW!] Two papers accepted to ECCV 2024!
[NEW!] Two papers accepted to ACL 2024!
[NEW!] Two papers accepted to NAACL 2024!
[NEW!] Serving as Area Chair for ICLR 2025, NeurIPS 2024, and COLM 2024.
[NEW!] Our lab received a research grant from Microsoft. Thanks Microsoft!
[NEW!] Our lab received a gift award from Adobe. Thanks Adobe!
[NEW!] Our lab received multiple gift awards from eBay and Snap. Thanks eBay and Snap!
[NEW!] Two workshops are accepted to ACL 2024! Will be co-organizing the 3rd Workshop on Advances in Language and Vision Research (ALVR 2024) and the Fourth International Combined Workshop on Spatial Language Understanding and Grounded Communication for Robotics (SpLU-RoboNLP 2024) in Bangkok, Thailand.
[NEW!] Invited talk at Yale University (10/2023).
[NEW!] Three papers accepted to NeurIPS 2023! Congratulations to all authors!
[NEW!] Three papers accepted to EMNLP 2023! Congratulations to all authors!
[NEW!] Our Athena team was awarded Second Place (Science Innovation Winner, $50,000) in the Alexa Prize SocialBot Grand Challenge 5!
[NEW!] Serving as Area Chair for ICLR 2024.
[NEW!] Our SlugJARVIS team won Third Place ($50,000) in the inaugural Alexa Prize SimBot Challenge! NEWS COVERAGE
[NEW!] Two papers on (1) Aerial Vision-and-Dialog Navigation and (2) Text-to-Image Association Test are accepted to ACL 2023!
[NEW!] Our ESC paper is accepted to ICML 2023!
[NEW!] Our SlugJARVIS team advances to the finals of the inaugural Alexa Prize SimBot Challenge! Check out Amazon News for more information about this and UCSC News for our three teams in all three Alexa Prize Challenges!
[NEW!] Co-organizing the 5th Workshop on Closing the Loop Between Vision and Language (CLVL) at ICCV 2023.
[NEW!] Serving as Area Chair for NeurIPS 2023.
[NEW!] Invited talk at the CVPR 2023 VizWiz Grand Challenge Workshop (06/2023).
[NEW!] Invited talk at Google Research and UCI (3/2023).
[NEW!] Invited talk at KAUST and USC (2/2023).
[NEW!] Two papers on (1) Training-Free Structured Diffusion Guidance and (2) Neuro-Symbolic Procedural Planning with Commonsense Prompting (Spotlight) are accepted to ICLR 2023!
[NEW!] Three papers on (1) Multimodal Graph Transformer, (2) Imagination-Based Automatic Evaluation, and (3) Imagination-Guided Open-Ended Text Generation are accepted to EACL 2023!
[NEW!] Co-organizing the Workshop on Spatial Language Understanding and Grounded Communication for Robotics (SpLU-RoboNLP) at EMNLP 2023.
[NEW!] Serving as Area Chair for ACL 2023, ICLR 2023, and EMNLP 2022.
[NEW!] Our Sage team received an Amazon Alexa Prize Award to work on Alexa Prize TaskBot Challenge 2. Thanks Amazon!
[NEW!] Our paper "Parameter-Efficient Model Adaptation for Vision Transformers" is accepted to AAAI 2023!
[NEW!] Our Athena team received an Amazon Alexa Prize Award to work on Alexa Prize SocialBot Grand Challenge 5. Thanks Amazon!
[NEW!] Our paper "CPL: Counterfactual Prompt Learning for Vision and Language Models" is accepted to EMNLP 2022!
[NEW!] Our VLMbench paper is accepted to NeurIPS 2022 (Datasets and Benchmarks)! Check out the new compositional benchmark for vision-and-language robotic manipulation HERE!
[NEW!] Invited talk at Adobe Research (08/2022).
[NEW!] Our papers on (1) Privacy-preserving Federated Vision-and-Language Navigation and (2) Language-guided Artistic Style Transfer are accepted to ECCV 2022!
[NEW!] Our SlugJARVIS team won the Alexa Prize SimBot Public Benchmark Challenge! link
[NEW!] Our paper on Understanding Instance-Level Impact of Fairness Constraints is accepted to ICML 2022!
[NEW!] Two papers accepted to NAACL 2022 as Oral presentations! Topics include (1) Imagination-Augmented Natural Language Understanding and (2) Diagnosing Vision-and-Language Navigation.
[NEW!] Invited talk at Fudan University (03/2022).
[NEW!] Two papers accepted to CVPR 2022! Topics include (1) Compositional Temporal Grounding and (2) Language-based Video Editing.
[NEW!] Three papers accepted to ACL 2022! Topics include (1) Vision-and-Language Navigation Survey, (2) Multilingual Fairness, and (3) Interpretable Research Replication Prediction.
[NEW!] We have received a Google Faculty Research Award. Thanks Google!
[NEW!] Invited speaker at the CVPR 2022 Workshop on Open-Domain Retrieval Under a Multi-Modal Setting.
[NEW!] Invited talk at USC ISI (02/2022).
[NEW!] Our SlugJARVIS team received an Amazon Alexa Prize Award to work on Alexa Prize SimBot Challenge. Thanks Amazon!
[NEW!] Serving as Area Chair for ACL 2022 and NAACL 2022.
[NEW!] We have received an AAII Interdisciplinary Research Award.
[NEW!] Invited talk at Microsoft Research (11/2021).
[NEW!] Our paper on Mitigating Gender Bias in Image Search is accepted to EMNLP 2021 as an Oral paper. Congratulations Jialu!
[NEW!] Invited talk at Stanford Vision Lab (10/2021).
[NEW!] Received Google Cloud Research Credits.
[NEW!] Our VALUE paper is accepted to NeurIPS 2021 (Datasets and Benchmarks). Congratulations to all the authors!
[NEW!] Serving as Senior Program Committee (SPC) member for AAAI 2022 and IJCAI-ECAI 2022.
[NEW!] Co-organizing the 4th Workshop on Closing the Loop Between Vision and Language (CLVL) at ICCV 2021!
[NEW!] I am giving a Tutorial on "From VQA to VLN: Recent Advances in Vision-and-Language Research" at CVPR 2021!
[NEW!] Co-organizing the Second Workshop on Advances in Language and Vision Research (ALVR) at NAACL 2021!
[NEW!] I am giving a keynote talk at the Third Workshop on Multimodal Artificial Intelligence at NAACL 2021 on June 6th!
[NEW!] Our paper on Visual Question Rewriting was accepted to SIGIR 2021!
[NEW!] I am serving as Area Chair for CoNLL 2021 and NLPCC 2021.
[NEW!] Invited talk at Arizona State University.
[NEW!] Two papers on multimodal style transfer learning for VLN and visual comparison were accepted to EACL 2021!
[NEW!] I am serving as Area Chair for NAACL 2021.
[NEW!] I am serving as Senior Program Committee (SPC) member for IJCAI 2021.
[NEW!] Three papers were accepted to EMNLP 2020 (two conference papers and one Findings paper)!
[NEW!] Two papers were accepted to ECCV 2020 (the adversarial path sampling paper was selected as Spotlight)!
[NEW!] I successfully defended my Ph.D. dissertation, "Closing the Loop Between Language and Vision for Embodied Agents". Thanks to the committee and everyone who has helped me along the Ph.D. journey!
[NEW!] I am serving as Area Chair and Session Chair for EMNLP 2020.
[NEW!] Co-organizing the workshop on Advances in Language and Vision Research (ALVR) at ACL 2020!
[NEW!] Two papers were accepted to CVPR 2020 (the REVERIE paper was selected as Oral)!
[03/2020] Invited panelist at the GPU Technology Conference (GTC) 2020.
[11/2019] Organizer of the workshop on Language & Vision with applications to Video Understanding at CVPR 2020.
[11/2019] Organizer of the tutorial on Self-Supervised Deep Learning for NLP at AACL-IJCNLP 2020.
[10/2019] Invited speaker at the ICCV 2019 Workshop on Person In Context.
[06/2019] Recipient of the CVPR 2019 Best Student Paper Award.
[06/2019] Co-Organizer of the workshop on Closing the Loop Between Vision and Language at ICCV 2019.
[06/2019] Invited talk at Facebook AI.
[01/2019] Session Chair for AAAI 2019 (natural language processing).


Selected Publications [Full Publications]

Preprint

Agent S: An Open Agentic Framework that Uses Computers Like a Human
Saaket Agashe*, Jiuzhou Han*, Shuyu Gan, Jiachen Yang, Ang Li, Xin Eric Wang
Technical report
[Paper] [Website] [Code]

Multimodal Situational Safety
Kaiwen Zhou*, Chengzhi Liu*, Xuandong Zhao, Anderson Compalas, Dawn Song, Xin Eric Wang
Technical report
[Paper] [Website] [Code] [Dataset]

VIA: Unified Spatiotemporal Video Adaptation for Global and Local Video Editing
Jing Gu, Yuwei Fang, Ivan Skorokhodov, Peter Wonka, Xinya Du, Sergey Tulyakov, Xin Eric Wang
Technical report
[Paper] [Website] [Code]

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
Xuehai He, Weixi Feng, Kaizhi Zheng, Yujie Lu, Wanrong Zhu, Jiachen Li, Yue Fan, Jianfeng Wang, Linjie Li, Zhengyuan Yang, Kevin Lin, William Yang Wang, Lijuan Wang, Xin Eric Wang
Technical report
[Paper] [Website] [Code] [Dataset]

Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA
Qianqi Yan, Xuehai He, Xiang Yue, Xin Eric Wang
Technical report
[Paper] [Website] [Code] [Dataset]

FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation
Xuehai He, Jian Zheng, Jacob Zhiyuan Fang, Robinson Piramuthu, Mohit Bansal, Vicente Ordonez, Gunnar A Sigurdsson, Nanyun Peng, Xin Eric Wang
Technical report
[Paper] [Website]

LLM-Coordination: Evaluating and Analyzing Multi-Agent Coordination Abilities in Large Language Models
Saaket Agashe, Yue Fan, Anthony Reyna, Xin Eric Wang
Technical report
[Paper] [Website] [Code]

MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens
Kaizhi Zheng*, Xuehai He*, Xin Eric Wang
Technical report
[Paper] [Website] [Code]

2024

Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding
Yue Fan, Lei Ding, Ching-Chen Kuo, Shan Jiang, Yang Zhao, Xinze Guan, Jie Yang, Yi Zhang, Xin Eric Wang
EMNLP 2024
[Paper] [Website] [Code] [Dataset]

Active Listening: Personalized Question Generation in Open-Domain Social Conversation with User Model Based Prompting
Kevin Bowden, Yue Fan, Winson Chen, Wen Cui, Davan Harrison, Marilyn Walker, Xin Eric Wang
Findings of EMNLP 2024

Multimodal Procedural Planning via Dual Text-Image Prompting
Yujie Lu, Pan Lu, Zhiyu Chen, Wanrong Zhu, Xin Eric Wang, William Yang Wang
Findings of EMNLP 2024
[Paper] [Code]

Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners
Xuehai He, Weixi Feng, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang
Transactions on Machine Learning Research (TMLR) 2024
[Paper] [Website] [Code]

SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing
Jing Gu, Yilin Wang, Nanxuan Zhao, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang
ECCV 2024
[Paper] [Website] [Code]

NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models
Gengze Zhou, Yicong Hong, Zun Wang, Xin Eric Wang, Qi Wu
ECCV 2024
[Paper] [Code]

Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA
Yue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Xinze Guan, Xin Eric Wang
ACL 2024
[Paper] [Website] [Code] [Data]

ViCor: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models
Kaiwen Zhou, Kwonjoon Lee, Teruhisa Misu, Xin Eric Wang
ACL 2024
[Paper] [Website] [Code]

Navigation as Attackers Wish? Towards Building Byzantine-Robust Embodied Agents under Federated Learning
Yunchao Zhang, Zonglin Di, Kaiwen Zhou, Cihang Xie, Xin Eric Wang
NAACL 2024
[Paper] [Website] [Code]

ComCLIP: Training-Free Compositional Image and Text Matching
Kenan Jiang*, Xuehai He*, Ruize Xu, Xin Eric Wang
NAACL 2024
[Paper] [Website] [Code]

2023

Photoswap: Personalized Subject Swapping in Images
Jing Gu, Yilin Wang, Nanxuan Zhao, Tsu-Jui Fu, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang
NeurIPS 2023
[Paper] [Website] [Code]

LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Weixi Feng, Wanrong Zhu, Tsu-jui Fu, Varun Jampani, Arjun Akula, Xuehai He, Sugato Basu, Xin Eric Wang, William Yang Wang
NeurIPS 2023
[Paper] [Website] [Code]

LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Yujie Lu, Xianjun Yang, Xiujun Li, Xin Eric Wang, William Yang Wang
NeurIPS 2023
[Paper] [Code]

R2H: Building Multimodal Navigation Helpers that Respond to Help Requests
Yue Fan, Jing Gu, Kaizhi Zheng, Xin Eric Wang
EMNLP 2023
[Paper] [Website] [Code]

Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation
Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
EMNLP 2023
[Paper]

Parameter-Efficient Cross-lingual Transfer of Vision and Language Models via Translation-based Alignment
Zhen Zhang, Jialu Wang, Xin Eric Wang
Findings of EMNLP 2023
[Paper] [Code]

ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation
Kaiwen Zhou, Kaizhi Zheng, Connor Pryor, Yilin Shen, Hongxia Jin, Lise Getoor, Xin Eric Wang
ICML 2023
[Paper] [Website]

Aerial Vision-and-Dialog Navigation
Yue Fan, Winson Chen, Tongzhou Jiang, Chun Zhou, Yi Zhang, Xin Eric Wang
Findings of ACL 2023
[Paper] [Website] [Code]

T2IAT: Measuring Valence and Stereotypical Biases in Text-to-Image Generation
Jialu Wang, Xinyue Gabby Liu, Zonglin Di, Yang Liu, Xin Eric Wang
Findings of ACL 2023
[Paper] [Code] [Demo]

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang
ICLR 2023
[Paper] [Website] [Code]

Neuro-Symbolic Procedural Planning with Commonsense Prompting
Yujie Lu, Weixi Feng, Wanrong Zhu, Wenda Xu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
ICLR 2023
Spotlight Presentation
[Paper] [Code]

Multimodal Graph Transformer for Multimodal Question Answering
Xuehai He, Xin Eric Wang
EACL 2023
[Paper]

Visualize Before You Write: Imagination-Guided Open-Ended Text Generation
Wanrong Zhu, An Yan, Yujie Lu, Wenda Xu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
EACL 2023
[Paper] [Code]

ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation
Wanrong Zhu, Xin Eric Wang, An Yan, Miguel Eckstein, William Yang Wang
EACL 2023
[Paper]

Parameter-efficient Model Adaptation for Vision Transformers
Xuehai He, Chunyuan Li, Pengchuan Zhang, Jianwei Yang, Xin Eric Wang
AAAI 2023
[Paper] [Website] [Code]

Athena 3.0: Personalized Multimodal Chatbot with Neuro-symbolic Dialogue Generators
Yue Fan, Kevin K. Bowden, Wen Cui, Winson Chen, Vrindavan Harrison, Angela Ramirez, Saaket Agashe, Xinyue Gabby Liu, Neha Pullabhotla, Nan Qiang, Jeshwanth Bheemanpally, Sugam Garg, Marilyn Walker, Xin Eric Wang
Alexa Prize SocialBot Grand Challenge 5 Proceedings 2023
[Paper]

Sage: A Multimodal Knowledge Graph-based Conversational Agent for Complex Task Guidance
Kaizhi Zheng, Jeshwanth Bheemanpally, Bhrigu Garg, Seongsil Heo, Dhananjay Sonawane, Winson Chen, Shree Vignesh S, Xin Eric Wang
Alexa Prize TaskBot Challenge 2 Proceedings 2023
[Paper]

SlugJARVIS: Multimodal Commonsense Knowledge-based Embodied AI for SimBot Challenge
Jing Gu, Kaizhi Zheng, Kaiwen Zhou, Yue Fan, Xuehai He, Jialu Wang, Zonglin Di, Xin Eric Wang
Alexa Prize SimBot Challenge Proceedings 2023
[Paper]

2022

CPL: Counterfactual Prompt Learning for Vision and Language Models
Xuehai He, Diji Yang, Weixi Feng, Tsu-Jui Fu, Arjun Akula, Varun Jampani, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang
EMNLP 2022
[Paper] [Website] [Code]

VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation
Kaizhi Zheng, Xiaotong Chen, Odest Chad Jenkins, Xin Eric Wang
NeurIPS 2022
[Paper] [Website] [Code]

FedVLN: Privacy-preserving Federated Vision-and-Language Navigation
Kaiwen Zhou, Xin Eric Wang
ECCV 2022
[Paper] [Code]

Language-Driven Artistic Style Transfer
Tsu-Jui Fu, Xin Eric Wang, William Yang Wang
ECCV 2022
[Paper] [Code]

Understanding Instance-Level Impact of Fairness Constraints
Jialu Wang, Xin Eric Wang, Yang Liu
ICML 2022
[Paper] [Code]

Imagination-Augmented Natural Language Understanding
Yujie Lu, Wanrong Zhu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
NAACL 2022
Oral presentation
[Paper] [Code]

Diagnosing Vision-and-Language Navigation: What Really Matters
Wanrong Zhu, Yuankai Qi, Pradyumna Narayana, Kazoo Sone, Sugato Basu, Xin Eric Wang, Qi Wu, Miguel Eckstein, William Yang Wang
NAACL 2022
Oral presentation
[Paper] [Code]

JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents
Kaizhi Zheng, Kaiwen Zhou, Jing Gu, Yue Fan, Jialu Wang, Zonglin Di, Xuehai He, Xin Eric Wang
SoCal NLP 2022
Winner Model of the Alexa Prize SimBot Public Benchmark Challenge LINK
[Paper]

Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
Juncheng Li, Junlin Xie, Long Qian, Linchao Zhu, Siliang Tang, Fei Wu, Yi Yang, Yueting Zhuang, Xin Eric Wang
CVPR 2022
[Paper] [Code]

M3L: Language-based Video Editing via Multi-Modal Multi-Level Transformer
Tsu-Jui Fu, Xin Eric Wang, Scott Grafton, Miguel Eckstein, William Yang Wang
CVPR 2022
[Paper] [Dataset] [Video]

Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions
Jing Gu, Eliana Stefani, Qi Wu, Jesse Thomason, Xin Eric Wang
ACL 2022
[Paper] [Code]

Assessing Multilingual Fairness in Pretrained Multimodal Representations
Jialu Wang, Yang Liu, Xin Eric Wang
Findings of ACL 2022
[Paper]

Interpretable Research Replication Prediction via Variational Contextual Consistency Sentence Masking
Tianyi Luo, Rui Meng, Xin Eric Wang, Yang Liu
Findings of ACL 2022
[Paper]

2021

Are Gender-Neutral Queries Really Gender-Neutral? Mitigating Gender Bias in Image Search
Jialu Wang, Yang Liu, Xin Eric Wang
EMNLP 2021
Oral presentation
[Paper] [Code]

VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li, Jie Lei, Zhe Gan, Licheng Yu, Yen-Chun Chen, Rohit Pillai, Yu Cheng, Luowei Zhou, Xin Eric Wang, William Yang Wang, Tamara Lee Berg, Mohit Bansal, Jingjing Liu, Lijuan Wang, Zicheng Liu
NeurIPS 2021
[Paper] [Website] [Code] [Data]

Visual Question Rewriting for Increasing Response Rate
Jiayi Wei, Xilian Li, Yi Zhang, Xin Eric Wang
SIGIR 2021
[Paper] [Code]

Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
Wanrong Zhu, Xin Eric Wang, Tsu-Jui Fu, An Yan, Pradyumna Narayana, Kazoo Sone, Sugato Basu, William Yang Wang
EACL 2021
[Paper] [Code]

L2C: Describing Visual Differences Needs Semantic Understanding of Individuals
An Yan, Xin Eric Wang, Tsu-Jui Fu, William Yang Wang
EACL 2021
[Paper]

2020

Closing the Loop Between Language and Vision for Embodied Agents
Xin Wang
UC Santa Barbara
[PhD Dissertation]

SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning
Tsu-Jui Fu, Xin Eric Wang, Scott Grafton, Miguel Eckstein, William Yang Wang
EMNLP 2020
Oral presentation
[Paper] [Code] [Slides]

Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations
Wanrong Zhu, Xin Eric Wang, Pradyumna Narayana, Kazoo Sone, Sugato Basu, William Yang Wang
EMNLP 2020
[Paper]

Learning to Stop: A Simple yet Effective Approach to Urban Vision-Language Navigation
Jiannan Xiang, Xin Eric Wang, William Yang Wang
Findings of EMNLP 2020
[Paper]

Environment-agnostic Multitask Learning for Natural Language Grounded Navigation
Xin Eric Wang*, Vihan Jain*, Eugene Ie, William Yang Wang, Zornitsa Kozareva, Sujith Ravi
ECCV 2020
Ranking 1st on the CVDN leaderboard
[Paper] [Code] [Video] [Slides]

Counterfactual Vision-and-Language Navigation via Adversarial Path Sampling
Tsu-Jui Fu, Xin Eric Wang, Matthew Peterson, Scott Grafton, Miguel Eckstein, William Yang Wang
ECCV 2020
Spotlight presentation
[Paper] [Video] [Slides]

Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation
Juncheng Li, Xin Wang, Siliang Tang, Haizhou Shi, Fei Wu, Yueting Zhuang, William Yang Wang
CVPR 2020
[Paper] [Video]

REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
Yuankai Qi, Qi Wu, Peter Anderson, Xin Wang, William Yang Wang, Chunhua Shen, Anton van den Hengel
CVPR 2020
Oral presentation
[Paper] [Code] [Video]

Vision-Language Navigation Policy Learning and Adaptation
Xin Wang, Qiuyuan Huang, Asli Celikyilmaz, Jianfeng Gao, Dinghan Shen, Yuan-Fang Wang, William Yang Wang, Lei Zhang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Journal version of the CVPR 2019 Best Student Paper

Generative Adversarial Zero-Shot Relational Learning for Knowledge Graphs
Pengda Qin, Xin Wang, Wenhu Chen, Chunyun Zhang, Weiran Xu, William Yang Wang
AAAI 2020
Oral presentation
[Paper] [Code]

2019

TIGEr: Text-to-Image Grounding for Image Caption Evaluation
Ming Jiang, Qiuyuan Huang, Lei Zhang, Xin Wang, Pengchuan Zhang, Zhe Gan, Jana Diesner, Jianfeng Gao
EMNLP-IJCNLP 2019
[Paper] [Code] [bibtex]

VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research
Xin Wang*, Jiawei Wu*, Junkun Chen, Lei Li, Yuan-Fang Wang, William Yang Wang
ICCV 2019
Oral presentation
[Paper] [Website] [Video] [bibtex]

Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
Xin Wang, Qiuyuan Huang, Asli Celikyilmaz, Jianfeng Gao, Dinghan Shen, Yuan-Fang Wang, William Yang Wang, Lei Zhang
CVPR 2019
Best Student Paper (1/5160=0.02%)
[Paper] [Video] [bibtex]

MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment
Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang, Larry S. Davis
CVPR 2019
[Paper] [bibtex]

Self-Supervised Dialogue Learning
Jiawei Wu, Xin Wang, William Yang Wang
ACL 2019
[Paper] [bibtex]

Self-Supervised Learning for Contextualized Extractive Summarization
Hong Wang, Xin Wang, Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Shiyu Chang, William Yang Wang
ACL 2019
[Paper] [Code] [bibtex]

Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models
Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Jianfeng Gao, Lawrence Carin
ACL 2019
[Paper] [bibtex]

Extract and Edit: An Alternative to Back-Translation for Unsupervised Neural Machine Translation
Jiawei Wu, Xin Wang, William Yang Wang
NAACL 2019
Oral presentation
[Paper] [bibtex]

Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning
Xin Wang, Jiawei Wu, Da Zhang, Yu Su, William Yang Wang
AAAI 2019
Oral presentation
[Paper] [Code] [bibtex]

2018 and before

Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation
Xin Wang*, Wenhan Xiong*, Hongmin Wang, William Yang Wang
ECCV 2018
[Paper] [bibtex]

XL-NBT: A Cross-lingual Neural Belief Tracking Framework
Wenhu Chen, Jianshu Chen, Yu Su, Xin Wang, Dong Yu, Xifeng Yan, William Yang Wang
EMNLP 2018
[Paper] [Code] [bibtex]

No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling
Xin Wang*, Wenhu Chen*, Yuan-Fang Wang, William Yang Wang
ACL 2018
Oral presentation
[Paper] [Code] [Video] [Slides (pptx)] [Slides (pdf)] [bibtex]

S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Network
Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang
BMVC 2018
Oral presentation
[Paper] [Code] [Video] [Slides] [bibtex]

Video Captioning via Hierarchical Reinforcement Learning
Xin Wang, Wenhu Chen, Jiawei Wu, Yuan-Fang Wang, William Yang Wang
CVPR 2018
[Paper] [Supp] [Dataset] [bibtex]

Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning
Xin Wang, Yuan-Fang Wang, William Yang Wang
NAACL-HLT 2018
[Paper] [Paper] [bibtex]

Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer
Xin Wang, Geoffrey Oxholm, Da Zhang, Yuan-Fang Wang
CVPR 2017
[Paper] [Supp] [Images] [Code (Third-Party)] [bibtex]

Deep Reinforcement Learning for Visual Object Tracking in Videos
Da Zhang, Hamid Maei, Xin Wang, Yuan-Fang Wang
Tech report 2017
[Paper] [bibtex]



Products

ProjectCloak: Remove Unwanted Objects in Video
In collaboration with Geoffrey Oxholm, Oliver Wang, Eli Shechtman, Mike Lukac
2017 MAX Sneak
2018 MAX Keynote
Featured in Adobe After Effects! Link

[Project] [Blog] [Video]

ArtisticEye: A Real-time Application for High-resolution Artistic Style Transfer
In collaboration with Geoffrey Oxholm
Applied to the de Young Museum in San Francisco
Honored to present the product prototype to the Adobe CEO Shantanu Narayen face to face

[Blog] [Video]



Experience

Google AI, Mountain View, US

       Research Intern,   Summer 2019

       Mentors: Sujith Ravi, Zornitsa Kozareva

Facebook AI Research (FAIR), Menlo Park, US

       Graduate Researcher,   Spring 2019

       Mentors: Xinlei Chen, Marcus Rohrbach, Dhruv Batra

Microsoft Research AI, Redmond, US

       Research Intern,   Summer 2018

       Mentors: Lei Zhang, Asli Celikyilmaz, Jianfeng Gao

Adobe Research, San Francisco, US

       Research Intern,   Summer 2017

       Mentors: Geoffrey Oxholm, Oliver Wang, Eli Shechtman, Mike Lukac

Adobe Research, San Francisco, US

       Research Intern,   Summer 2016

       Mentor: Geoffrey Oxholm

Exacloud Inc., Hangzhou, China

       Software Engineer Intern,   12. 2014 - 03. 2015

HCI, Graphics and Computer Vision Group, HKU

       Research Assistant, Summer 2014

       Advisor: Yizhou Yu





Service

  • Organizer:
  • Area Chair (or Senior Program Committee):
  • Session Chair:
  • Program Committee: ACL, NAACL, EMNLP, CVPR, ICCV, ECCV, NeurIPS, ICLR, AAAI, IJCAI, CoRL

  • Journal Reviewer: TPAMI, IJCV