- 1640. Weizhe Chen, Zihan Zhou, Yi Wu, Fei Fang. Temporal Induced Self-Play for Stochastic Bayesian Games. IJCAI 2021.
- 1639. S. Wang, H. Wang, L. Huang. Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback. AAAI 2021.
- 1638. Y. Du, S. Wang, L. Huang. A One-Size-Fits-All Solution to Conservative Bandit Problems. AAAI 2021.
- 1637. Tiancheng Jin, Longbo Huang, Haipeng Luo. The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition. NeurIPS 2021.
- 1636. Xinran Gu, Kaixuan Huang, Jingzhao Zhang, Longbo Huang. Fast Federated Learning in the Presence of Arbitrary Device Unavailability. NeurIPS 2021.
- 1635. Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang. Continuous Mean-Covariance Bandits. NeurIPS 2021.
- 1634. Yiheng Lin, Guannan Qu, Longbo Huang, Adam Wierman. Multi-Agent Reinforcement Learning in Time-varying Networked Systems. NeurIPS 2021.
- 1633. Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson. Regularized Softmax Deep Multi-Agent Q-Learning. NeurIPS 2021.
- 1632. Yu Huang, Chenzhuang Du, Zihui Xue, Xuanyao Chen, Hang Zhao, Longbo Huang. What Makes Multimodal Learning Better than Single (Provably). NeurIPS 2021.
- 1631. Jin Zhang, Jianhao Wang, Hao Hu, Tong Chen, Yingfeng Chen, Changjie Fan and Chongjie Zhang. MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration. ICML 2021.
- 1630. Hao Hu, Jianing Ye, Zhizhou Ren, Guangxiang Zhu, and Chongjie Zhang. Generalizable Episodic Memory for Deep Reinforcement Learning. ICML 2021.
- 1629. Zhaorong Wang, Meng Wang, Jingqi Zhang, Yingfeng Chen, Chongjie Zhang. Reward-Constrained Behavior Cloning. IJCAI 2021.
- 1628. Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson and Chongjie Zhang. RODE: Learning Roles to Decompose Multi-Agent Tasks. ICLR 2021.
- 1627. Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu and Chongjie Zhang. QPLEX: Duplex Dueling Multi-Agent Q-Learning. ICLR 2021.
- 1626. Yihan Wang, Beining Han, Tonghan Wang, Heng Dong and Chongjie Zhang. DOP: Off-Policy Multi-Agent Decomposed Policy Gradients. ICLR 2021.
- 1625. Siyuan Li, Lulu Zheng, Jianhao Wang and Chongjie Zhang. Learning Subgoal Representations with Slow Dynamics. ICLR 2021.
- 1624. Yao Mu, Yuzheng Zhuang, Bin Wang, Guangxiang Zhu, Wulong Liu, Jianyu Chen, Ping Luo, Shengbo Eben Li, Chongjie Zhang, Jianye Hao. Model-Based Reinforcement Learning via Imagination with Derived Memory. NeurIPS 2021.
- 1623. Zhizhou Ren, Guangxiang Zhu, Hao Hu, Beining Han, Jianglun Chen, Chongjie Zhang. On the Estimation Bias in Double Q-Learning. NeurIPS 2021.