- 1650. Xiangyu Yue, Zangwei Zheng, Shanghang Zhang, Yang Gao, Trevor Darrell, Kurt Keutzer, Alberto Sangiovanni-Vincentelli. Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation. CVPR 2021.
- 1649. Xuanlin Li, Brandon Trabucco, Dong Huk Park, Michael Luo, Sheng Shen, Trevor Darrell, Yang Gao. Discovering Autoregressive Orderings with Variational Inference. ICLR 2021.
- 1648. Sicheng Zhao, Yezhen Wang, Bo Li, Bichen Wu, Yang Gao, Pengfei Xu, Trevor Darrell, Kurt Keutzer. ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation. AAAI 2021.
- 1647. Rui Zhao, Yang Gao, Pieter Abbeel, Volker Tresp, Wei Xu. Mutual Information State Intrinsic Control. ICLR 2021.
- 1646. Shusheng Xu, Yichen Liu, Xiaoyu Yi, Siyuan Zhou, Huizi Li, Yi Wu. Native Chinese Reader: A Dataset Towards Native-Level Chinese Machine Reading Comprehension. NeurIPS 2021.
- 1645. Huazhe Xu, Boyuan Chen, Yang Gao, Trevor Darrell. Zero-shot Policy Learning with Spatial Temporal Reward Decomposition on Contingency-aware Observation. ICRA 2021.
- 1644. Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian. NovelD: A Simple yet Effective Exploration Criterion. NeurIPS 2021.
- 1643. Jiayu Chen, Yuanxin Zhang, Yuanfan Xu, Huimin Ma, Huazhong Yang, Jiaming Song, Yu Wang, Yi Wu. Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems. NeurIPS 2021.
- 1642. Yunfei Li, Yilin Wu, Huazhe Xu, Xiaolong Wang, Yi Wu. Solving Compositional Reinforcement Learning Problems via Task Reduction. ICLR 2021.
- 1641. Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Shaolei Du, Yu Wang, Yi Wu. Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization. ICLR 2021.
- 1640. Weizhe Chen, Zihan Zhou, Yi Wu, Fei Fang. Temporal Induced Self-Play for Stochastic Bayesian Games. IJCAI 2021.
- 1639. S. Wang, H. Wang, L. Huang. Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback. AAAI 2021.
- 1638. Y. Du, S. Wang, L. Huang. A One-Size-Fits-All Solution to Conservative Bandit Problems. AAAI 2021.
- 1637. Tiancheng Jin, Longbo Huang, Haipeng Luo. The Best of Both Worlds: Stochastic and Adversarial Episodic MDPs with Unknown Transition. NeurIPS 2021.
- 1636. Xinran Gu, Kaixuan Huang, Jingzhao Zhang, Longbo Huang. Fast Federated Learning in the Presence of Arbitrary Device Unavailability. NeurIPS 2021.
- 1635. Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang. Continuous Mean-Covariance Bandits. NeurIPS 2021.
- 1634. Yiheng Lin, Guannan Qu, Longbo Huang, Adam Wierman. Multi-Agent Reinforcement Learning in Time-varying Networked Systems. NeurIPS 2021.
- 1633. Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson. Regularized Softmax Deep Multi-Agent Q-Learning. NeurIPS 2021.