Google at ICLR 2021

Posted by Jaqui Herman, Research Specialist and Tim Herrmann, Program Manager

The 9th International Conference on Learning Representations (ICLR 2021), a virtual conference focused on deep learning, kicked off this week, offering conference and workshop tracks that present some of the latest research in deep learning and its applications to areas such as computer vision, computational biology, speech recognition, text understanding, and more.

As a Platinum Sponsor of ICLR 2021, Google will have a strong presence with over 100 accepted publications and participation on organizing committees and in workshops. If you have registered for ICLR 2021, we hope you’ll watch our talks and learn about the work at Google that goes into solving interesting problems for billions of people. Learn more about our research being presented in the list below (Googlers in bold).

Officers and Board Members
Includes: Hugo Larochelle, Tara Sainath

Organizing Committee
Includes: Sanmi Koyejo, Chelsea Finn

Area Chairs
Includes: Abhishek Kumar, Aditya Menon, Aleksandra Faust, Alexey Dosovitskiy, Andrew Cotter, Andrew Dai, Augustus Odena, Been Kim, Behnam Neyshabur, Ben Poole, Bo Dai, Bo Li, Branislav Kveton, Ce Liu, Claudio Gentile, Colin Raffel, Danny Tarlow, David Ha, Dengyong Zhou, Dumitru Erhan, Dustin Tran, Felix Hill, George Tucker, Hanie Sedghi, Heinrich Jiang, Hossein Mobahi, Izhak Shafran, Jascha Sohl-Dickstein, Jasper Snoek, Jean-Philippe Vert, Jeffrey Pennington, Justin Gilmer, Kevin Swersky, Marco Cuturi, Mario Lucic, Marlos C. Machado, Mathieu Blondel, Matt Johnson, Matthieu Geist, Mohammad Norouzi, Naman Agarwal, Navdeep Jaitly, Nicolas Le Roux, Niki Parmar, Olivier Bachem, Olivier Pietquin, Philip Long, Quentin Berthet, Razvan Pascanu, Rodolphe Jenatton, Samy Bengio*, Sebastian Nowozin, Silvio Lattanzi, Slav Petrov, Srinadh Bhojanapalli, Suman Ravuri, Tim Salimans, Vitaly Kuznetsov, William Cohen, Yann Dauphin, Yujia Li

Publications
Scalable Learning and MAP Inference for Nonsymmetric Determinantal Point Processes
Mike Gartrell, Insu Han, Elvis Dohmatob, Jennifer Gillenwater, Victor-Emmanuel Brunel

An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale (see the blog post)
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby

Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation
Biao Zhang*, Ankur Bapna, Rico Sennrich, Orhan Firat

Evolving Reinforcement Learning Algorithms (see the blog post)
John D Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Quoc V Le, Sergey Levine, Honglak Lee, Aleksandra Faust

Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song*, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole

What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study
Marcin Andrychowicz, Anton Raichuk, Piotr Stańczyk, Manu Orsini, Sertan Girgin, Raphaël Marinier, Leonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem

When Do Curricula Work?
Xiaoxia Wu, Ethan Dyer, Behnam Neyshabur

Sharpness-Aware Minimization for Efficiently Improving Generalization
Pierre Foret*, Ariel Kleiner, Hossein Mobahi, Behnam Neyshabur

Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models
Zirui Wang*, Yulia Tsvetkov, Orhan Firat, Yuan Cao

Mathematical Reasoning via Self-supervised Skip-tree Training
Markus Norman Rabe, Dennis Lee, Kshitij Bansal, Christian Szegedy

Long-Tail Learning via Logit Adjustment
Aditya Krishna Menon, Sadeep Jayasumana, Ankit Singh Rawat, Himanshu Jain, Andreas Veit, Sanjiv Kumar

Are Neural Rankers Still Outperformed by Gradient Boosted Decision Trees?
Zhen Qin, Le Yan, Honglei Zhuang, Yi Tay, Rama Kumar Pasumarthi, Xuanhui Wang, Michael Bendersky, Marc Najork

LambdaNetworks: Modeling Long-Range Interactions without Attention
Irwan Bello

Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning
Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G Bellemare

BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration
Augustus Odena, Kensen Shi, David Bieber, Rishabh Singh, Charles Sutton, Hanjun Dai

Practical Real Time Recurrent Learning with a Sparse Approximation
Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves

LEAF: A Learnable Frontend for Audio Classification (see the blog post)
Neil Zeghidour, Olivier Teboul, Félix de Chaumont Quitry, Marco Tagliasacchi

Batch Reinforcement Learning Through Continuation Method
Yijie Guo, Shengyu Feng, Nicolas Le Roux, Ed Chi, Honglak Lee, Minmin Chen

Scalable Transfer Learning with Expert Models
Joan Puigcerver, Carlos Riquelme Ruiz, Basil Mustafa, Cedric Renggli*, André Susano Pinto, Sylvain Gelly, Daniel Keysers, Neil Houlsby

Scaling Symbolic Methods Using Gradients for Neural Model Explanation
Subham Sekhar Sahoo, Subhashini Venugopalan, Li Li, Rishabh Singh, Patrick Riley

Primal Wasserstein Imitation Learning (see the blog post)
Robert Dadashi, Leonard Hussenot, Matthieu Geist, Olivier Pietquin

Reset-Free Lifelong Learning with Skill-Space Planning
Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch

Teaching Temporal Logics to Neural Networks
Christopher Hahn, Frederik Schmitt, Jens U. Kreber, Markus Norman Rabe, Bernd Finkbeiner

Shape-Texture Debiased Neural Network Training
Yingwei Li, Qihang Yu, Mingxing Tan, Jieru Mei, Peng Tang, Wei Shen, Alan Yuille, Cihang Xie

Rethinking Embedding Coupling in Pre-trained Language Models
Hyung Won Chung, Thibault Fevry*, Henry Tsai, Melvin Johnson, Sebastian Ruder

Overparameterisation and Worst-Case Generalisation: Friend or Foe?
Aditya Krishna Menon, Ankit Singh Rawat, Sanjiv Kumar

Single-Photon Image Classification
Thomas Fischbacher, Luciano Sbaiz

Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds
Efthymios Tzinis*, Scott Wisdom, Aren Jansen, Shawn Hershey, Tal Remez, Daniel P. W. Ellis, John R. Hershey

Adaptive Federated Optimization
Sashank J. Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, Hugh Brendan McMahan

Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers
Benjamin Eysenbach, Shreyas Chaudhari, Swapnil Asawa, Sergey Levine, Ruslan Salakhutdinov

Open Question Answering over Tables and Text
Wenhu Chen*, Ming-Wei Chang, Eva Schlinger, William Yang Wang, William W. Cohen

IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression
Rianne van den Berg, Alexey A. Gritsenko, Mostafa Dehghani, Casper Kaae Sønderby, Tim Salimans

A Universal Representation Transformer Layer for Few-Shot Image Classification
Lu Liu, William L. Hamilton, Guodong Long, Jing Jiang, Hugo Larochelle

Tradeoffs in Data Augmentation: An Empirical Study
Raphael Gontijo-Lopes, Sylvia Smullin, Ekin Dogus Cubuk, Ethan Dyer

Coping with Label Shift via Distributionally Robust Optimisation
Jingzhao Zhang, Aditya Krishna Menon, Andreas Veit, Srinadh Bhojanapalli, Sanjiv Kumar, Suvrit Sra

Rethinking Attention with Performers (see the blog post)
Krzysztof Marcin Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Quincy Davis, Afroz Mohiuddin, Lukasz Kaiser, David Benjamin Belanger, Lucy J Colwell, Adrian Weller

Teaching with Commentaries
Aniruddh Raghu*, Maithra Raghu, Simon Kornblith, David Duvenaud, Geoffrey Hinton

Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics
Vinay Venkatesh Ramasesh, Ethan Dyer, Maithra Raghu

Model-Based Offline Planning
Arthur Argenson, Gabriel Dulac-Arnold

The Geometry of Integration in Text Classification RNNs
Kyle Aitken*, Vinay Venkatesh Ramasesh, Ankush Garg, Yuan Cao, David Sussillo, Niru Maheswaranathan

On the Origin of Implicit Regularization in Stochastic Gradient Descent
Samuel L Smith, Benoit Dherin, David Barrett, Soham De

The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers (see the blog post)
Preetum Nakkiran*, Behnam Neyshabur, Hanie Sedghi

Learning Energy-Based Models by Diffusion Recovery Likelihood
Ruiqi Gao, Yang Song, Ben Poole, Ying Nian Wu, Diederik P Kingma

Latent Skill Planning for Exploration and Transfer
Kevin Xie, Homanga Bharadhwaj, Danijar Hafner, Animesh Garg, Florian Shkurti

PseudoSeg: Designing Pseudo Labels for Semantic Segmentation
Yuliang Zou*, Zizhao Zhang, Han Zhang, Chun-Liang Li, Xiao Bian, Jia-Bin Huang, Tomas Pfister

WaveGrad: Estimating Gradients for Waveform Generation
Nanxin Chen*, Yu Zhang, Heiga Zen, Ron J Weiss, Mohammad Norouzi, William Chan

One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks
Atish Agarwala, Abhimanyu Das, Brendan Juba*, Rina Panigrahy, Vatsal Sharan*, Xin Wang, Qiuyi Zhang

Long Range Arena: A Benchmark for Efficient Transformers
Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler

Explainable Deep One-Class Classification
Philipp Liznerski, Lukas Ruff, Robert A. Vandermeulen, Billy Joe Franks, Marius Kloft, Klaus-Robert Müller

Net-DNF: Effective Deep Modeling of Tabular Data
Liran Katzir, Gal Elidan, Ran El-Yaniv

Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
Tatsuya Matsushima, Hiroki Furuta, Yutaka Matsuo, Ofir Nachum, Shixiang Gu

Auxiliary Task Update Decomposition: The Good, the Bad and the Neutral
Lucio M. Dery, Yann Dauphin, David Grangier

Average-Case Acceleration for Bilinear Games and Normal Matrices
Carles Domingo-Enrich, Fabian Pedregosa, Damien Scieur

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning
Anurag Ajay*, Aviral Kumar, Pulkit Agrawal, Sergey Levine, Ofir Nachum

Training Independent Subnetworks for Robust Prediction
Marton Havasi*, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew Mingbo Dai, Dustin Tran

Benchmarks for Deep Off-Policy Evaluation
Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Thomas Paine

TropEx: An Algorithm for Extracting Linear Terms in Deep Neural Networks
Martin Trimmel, Henning Petzka, Cristian Sminchisescu

Mastering Atari with Discrete World Models (see the blog post)
Danijar Hafner, Timothy P Lillicrap, Mohammad Norouzi, Jimmy Ba

Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit
Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek

Graph Traversal with Tensor Functionals: A Meta-Algorithm for Scalable Learning
Elan Markowitz, Keshav Balasubramanian, Mehrnoosh Mirtaheri, Sami Abu-El-Haija, Bryan Perozzi, Greg Ver Steeg, Aram Galstyan

Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies
Paul Pu Liang*, Manzil Zaheer, Yuan Wang, Amr Ahmed

HyperGrid Transformers: Towards A Single Model for Multiple Tasks
Yi Tay, Zhe Zhao, Dara Bahri, Donald Metzler, Da-Cheng Juan

Federated Learning via Posterior Averaging: A New Perspective and Practical Algorithms
Maruan Al-Shedivat*, Jennifer Gillenwater, Eric Xing, Afshin Rostamizadeh

Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth
Thao Nguyen, Maithra Raghu, Simon Kornblith

A Unifying View on Implicit Bias in Training Linear Neural Networks
Chulhee Yun*, Shankar Krishnan, Hossein Mobahi

Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning
Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine

Lipschitz Recurrent Neural Networks
N. Benjamin Erichson, Omri Azencot, Alejandro Queiruga, Liam Hodgkinson, Michael W. Mahoney

Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization
Michael R Zhang*, Thomas Paine, Ofir Nachum, Cosmin Paduraru, George Tucker, Ziyu Wang, Mohammad Norouzi

The Importance of Pessimism in Fixed-Dataset Policy Optimization
Jacob Buckman, Carles Gelada, Marc G Bellemare

Monotonic Kronecker-Factored Lattice
William Taylor Bakst, Nobuyuki Morioka, Erez Louidor

Adversarially Guided Actor-Critic
Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, Zhifeng Chen

Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction
Wonkwang Lee, Whie Jung, Han Zhang, Ting Chen, Jing Yu Koh, Thomas Huang, Hyungsuk Yoon, Honglak Lee*, Seunghoon Hong

Dataset Meta-Learning from Kernel Ridge-Regression
Timothy Nguyen, Zhourong Chen, Jaehoon Lee

Dual-Mode ASR: Unify and Improve Streaming ASR with Full-Context Modeling
Jiahui Yu, Wei Han, Anmol Gulati, Chung-Cheng Chiu, Bo Li, Tara N Sainath, Yonghui Wu, Ruoming Pang

Implicit Gradient Regularization
David Barrett, Benoit Dherin

Deconstructing the Regularization of BatchNorm
Yann Dauphin, Ekin Dogus Cubuk

C-Learning: Learning to Achieve Goals via Recursive Classification
Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

Colorization Transformer
Manoj Kumar, Dirk Weissenborn, Nal Kalchbrenner

Control-Aware Representations for Model-based Reinforcement Learning
Brandon Cui, Yinlam Chow, Mohammad Ghavamzadeh

Evaluations and Methods for Explanation through Robustness Analysis
Cheng-Yu Hsieh, Chih-Kuan Yeh, Xuanqing Liu, Pradeep Kumar Ravikumar, Seungyeon Kim, Sanjiv Kumar, Cho-Jui Hsieh

Learning and Evaluating Representations for Deep One-Class Classification
Kihyuk Sohn, Chun-Liang Li, Jinsung Yoon, Minho Jin, Tomas Pfister

No MCMC for Me: Amortized Sampling for Fast and Stable Training of Energy-Based Models
Will Sussman Grathwohl, Jacob Jin Kelly, Milad Hashemi, Mohammad Norouzi, Kevin Swersky, David Duvenaud

Neural Thompson Sampling
Weitong Zhang, Dongruo Zhou, Lihong Li, Quanquan Gu

A Design Space Study for LISTA and Beyond
Tianjian Meng, Xiaohan Chen, Yifan Jiang, Zhangyang Wang

i-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning
Kibok Lee, Yian Zhu, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin, Honglak Lee

Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments
Anirudh Goyal, Alex Lamb, Phanideep Gampa, Philippe Beaudoin, Charles Blundell, Sergey Levine, Yoshua Bengio, Michael Curtis Mozer

Calibration of Neural Networks using Splines
Kartik Gupta, Amir Rahimi, Thalaiyasingam Ajanthan, Thomas Mensink, Cristian Sminchisescu, Richard Hartley

Extreme Memorization via Scale of Initialization
Harsh Mehta, Ashok Cutkosky, Behnam Neyshabur

Molecule Optimization by Explainable Evolution
Binghong Chen, Tianzhe Wang, Chengtao Li, Hanjun Dai, Le Song

Combining Ensembles and Data Augmentation Can Harm Your Calibration
Yeming Wen, Ghassen Jerfel, Rafael Muller, Michael W Dusenberry, Jasper Snoek, Balaji Lakshminarayanan, Dustin Tran

Workshops
Science and Engineering of Deep Learning
Speakers and Panelists include: Alex Hanna
Moderator and Advisors include: Emily Denton
Organizers include: Negar Rostamzadeh, Samy Bengio*

Synthetic Data Generation: Quality, Privacy, Bias
Speakers include: Jinsung Yoon, Emily Denton
Program Committee includes: Syed Ashrafulla

Enormous Language Models: Perspectives and Benchmarks
Speakers and Panelists include: Noam Shazeer, Natalie Schluter
Organizers include: Colin Raffel, Adam Roberts, Jascha Sohl-Dickstein, Katherine Lee, William Fedus, Aitor Lewkowycz

The Role of Mathematical Reasoning in General Artificial Intelligence
Speakers and Panelists include: Markus Rabe, Christian Szegedy

Weakly Supervised Learning
Invited Speakers include: Lu Jiang

Learning to Learn
Organizers include: Yevgen Chebotar

Embodied Multimodal Learning (EML)
Invited Speakers include: Sergey Levine

Distributed and Private Machine Learning
Program Committee includes: Peter Kairouz, Ananda Theertha Suresh

S2D-OLAD: From Shallow to Deep, Overcoming Limited and Adverse Data
Invited Speakers include: Alex Hanna, Hugo Larochelle
Organizers include: Vincent Dumoulin

Responsible AI (RAI)
Speakers include: Been Kim

Energy-Based Models: Current Perspectives, Challenges, and Opportunities
Organizers include: Adji Bousso Dieng, Igor Mordatch

A Roadmap to Never-Ending RL
Invited Session Panelists include: Aleksandra Faust
Program Committee includes: Coline Devin, Karol Hausman, Ben Eysenbach, Ofir Nachum, Ryan Julian, Tianhe Yu, Dumitru Erhan, Marc Pickett, Shixiang Gu

2nd Workshop on Practical ML for Developing Countries: Learning Under Limited/low Resource Scenarios
Program Committee includes: Pablo Samuel Castro

Beyond Static Papers: Rethinking How We Share Scientific Understanding in ML
Speakers include: David Ha, Hugo Larochelle
Organizers include: Sara Hooker

* Indicates work done while at Google
