Sergey Levine   @svlevine

Associate Professor at UC Berkeley





  Tweets by Sergey Levine  

Sergey Levine    @svlevine     6/15/2021
Prior to this, I didn't think that goal-conditioned RL could handle such diverse real-world scenes. But what's neat here is not just the visual goal results, but that it provides pretraining & joint training to accelerate downstream tasks, a kind of unsupervised pretraining for RL.
27 Retweets · 113 Likes
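A note on the mechanism: the standard way goal-conditioned RL turns unlabeled interaction into pretraining is hindsight goal relabeling, where states the robot actually reached are treated as goals. A minimal sketch, with `reward_fn` and the data layout as illustrative assumptions rather than the paper's code:

```python
import random

def relabel_with_hindsight(trajectory, reward_fn):
    """HER-style relabeling: treat states the agent actually reached as goals,
    turning unlabeled experience into goal-reaching supervision.
    `trajectory` is a list of (state, action, next_state) tuples."""
    relabeled = []
    for t, (state, action, next_state) in enumerate(trajectory):
        # Pick a state from later in the same trajectory as the hindsight goal.
        goal = random.choice(trajectory[t:])[2]
        reward = reward_fn(next_state, goal)  # e.g., 1.0 if close enough to goal
        relabeled.append((state, action, next_state, goal, reward))
    return relabeled
```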




Sergey Levine    @svlevine     7/12/2021
Always makes my day to come in to lab and see a robot playing with an object in the corner, without adult supervision 🙂 Though I think this one needs a friend, it's been going at it for some time on its own...
12 Retweets · 164 Likes




Sergey Levine    @svlevine     6/1/2021
Can robots learn to drive on Berkeley sidewalks, entirely from images? Also at #ICRA2021, Tue 18:30 GMT+1 / 10:30 am PT, Greg Kahn will present LaND, which can learn to navigate kms of Berkeley sidewalks! In the Field Robotics: Control session. https://t.co/ZujiRF6PGt
15 Retweets · 110 Likes




Sergey Levine    @svlevine     7/20/2021
Adversarial surprise: an "intrinsic motivation arms race" where Agent A tries to surprise Agent B, and Agent B wants to avoid getting surprised. It's both a way to get emergent complexity and a way to provide some appealing coverage guarantees for exploration. Check out the paper below:
18 Retweets · 69 Likes
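A rough sketch of the arms race, assuming Agent B maintains some density model over observations; the `density_model` interface below is a hypothetical stand-in, not the paper's implementation:

```python
class SurpriseGame:
    """Toy intrinsic-motivation arms race: A is rewarded for B's surprise,
    B is rewarded for avoiding surprise. `density_model` must expose
    log_prob(obs) and update(obs); both are assumed interfaces."""

    def __init__(self, density_model):
        self.density_model = density_model

    def step_rewards(self, obs):
        surprise = -self.density_model.log_prob(obs)  # B's surprisal at obs
        self.density_model.update(obs)                # B keeps adapting to reduce it
        return surprise, -surprise                    # (reward for A, reward for B)
```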




Sergey Levine    @svlevine     7/25/2021
An "RL" take on compression: "super-lossy" compression that changes the image, but preserves its downstream effect (i.e., the user should take the same action seeing the "compressed" image as when they saw original) https://t.co/gMEncxzxzA w @sidgreddy & @ancadianadragan 馃У>
17 Retweets · 68 Likes
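One way to picture the "same downstream action" objective, as a hedged torch sketch; `compressor` and `action_model` are stand-in modules, not the paper's architecture:

```python
import torch
import torch.nn.functional as F

def action_consistency_loss(compressor, action_model, images):
    """Penalize the compressor only when a (frozen) model of the user's
    behavior acts differently on the compressed image than on the original."""
    compressed = compressor(images)
    with torch.no_grad():
        target = action_model(images).softmax(dim=-1)    # actions on originals
    pred = action_model(compressed).log_softmax(dim=-1)  # actions on compressions
    # KL divergence between the two action distributions.
    return F.kl_div(pred, target, reduction="batchmean")
```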




Sergey Levine    @svlevine     7/21/2021
Want to learn how algorithmic causality leads modular RL methods to transfer better? Check out @mmmbchang's long talk at #ICML2021 tomorrow 7/22 at 5:00 am PT, or watch it right here: https://t.co/fz69J6KMWg https://t.co/aNM5T37e1E https://t.co/lmyhZTgBs3 Short thread: https://t.co/Aqdh1hjktU
13 Retweets · 60 Likes




Sergey Levine    @svlevine     10/7/2021
Can we devise model-based RL methods that use the *same* objective for the model and the policy? The idea in Mismatched no More (MnM 🙂) is to devise a single objective that can be optimized by both model and policy and that provably lower bounds reward. A thread:
8 Retweets · 15 Likes
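For intuition about what a single objective that "provably lower bounds reward" can look like, here is the generic model-based bound in the style of MBPO (Janner et al.), shown for flavor only; MnM derives its own, different objective:

```latex
% True return is at least the return under the learned model \hat{p},
% minus a penalty proportional to the model's error:
\eta(\pi) \;\ge\; \hat{\eta}_{\hat{p}}(\pi)
\;-\; C \cdot \mathbb{E}_{(s,a)}\!\left[ D_{\mathrm{TV}}\!\big(p(\cdot \mid s,a),\, \hat{p}(\cdot \mid s,a)\big) \right]
```

Maximizing the right-hand side with respect to the policy and shrinking the error term with respect to the model both tighten the same bound, which is the property the tweet is after.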




Sergey Levine    @svlevine     9/23/2021
Offline RL lets us run RL without active online interaction, but tuning hyperparameters, model capacity, etc. still requires rollouts or validation tasks. In our new paper, we propose guidelines for *fully offline* tuning for algorithms like CQL https://t.co/Yg9YDMrIA0 A thread:
6 Retweets · 21 Likes
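As a toy illustration of what fully offline selection could mean (not the paper's actual guidelines, which are in the linked thread), one could rank trained candidates by their own conservative value estimates on held-out offline data, never running a rollout; `agent.q_value` and `agent.act` are assumed interfaces:

```python
def select_agent_offline(candidates, heldout_transitions):
    """Rank candidate agents (e.g., different hyperparameters) by their own
    conservative Q-estimates on held-out offline data. Purely illustrative:
    the paper's workflow is more nuanced than a single scalar score."""
    def score(agent):
        return sum(agent.q_value(s, agent.act(s)) for s, *_ in heldout_transitions)
    return max(candidates, key=score)
```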




Sergey Levine    @svlevine     10/8/2021
Is there a principled way to adapt a model to distributional shift without labels? In our new paper, "Training on Test Data", we propose a Bayesian adaptation strategy based on BNNs and entropy minimization. w/ Aurick Zhou: https://t.co/juFpiE6mb3 A thread:
6 Retweets · 10 Likes




Sergey Levine    @svlevine     10/7/2021
Generative models can be used for compression. But if we combine them with an "RL-like" GAN, we can make it so that, when a user sees the compressed image, they do the same thing as if they saw the original. This is the idea behind pragmatic compression: https://t.co/N2wFFyV0uw
4 Retweets · 15 Likes




Sergey Levine    @svlevine     10/7/2021
For a while, @ben_eysenbach, @rsalakhu, and I have been trying to understand unsupervised RL methods such as DIAYN https://t.co/ZH51PsSkqj Which skills do they learn? Can we develop a theory of unsupervised RL? Our paper: https://t.co/zst5S8K3CC Thread on what we found (and that gif):
 
3 Retweets · 13 Likes
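For context, DIAYN's intrinsic reward (from the paper linked above) trains skills to be distinguishable from the states they visit: r = log q(z|s) - log p(z). A minimal sketch, where `discriminator` stands in for any state-to-skill classifier:

```python
import torch

def diayn_reward(discriminator, state, skill_id, num_skills):
    """DIAYN intrinsic reward: log q(z|s) - log p(z), with a uniform
    prior p(z) over skills. `discriminator` maps states to skill logits."""
    log_q_z = discriminator(state).log_softmax(dim=-1)[..., skill_id]
    log_p_z = -torch.log(torch.tensor(float(num_skills)))  # uniform prior
    return log_q_z - log_p_z
```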




Sergey Levine    @svlevine     6/3/2021
This will be in the session "Reinforcement Learning for Robotics I" at 19:30 GMT+1/11:30 am PT Thu 6/3. You can find the paper here: https://t.co/SDEtxYcxgd Code: https://t.co/uPVACeN9cI Video: https://t.co/I1iyXQYSXt
2 Retweets · 11 Likes




Sergey Levine    @svlevine     6/2/2021
Here are a few animations of new tasks that VAL masters in unseen test environments: for each task, the robot was first allowed to interact without any supervision, using its offline-trained policy + generative model to propose & practice goals. Then the user gave it a goal.
2 Retweets · 11 Likes




Sergey Levine    @svlevine     10/13/2021
Offline RL + meta-learning enables industrial robots to learn new insertion tasks with near-perfect success rates, using AWAC + PEARL + finetuning! w/ @tonyzzhao, @jianlanluo, @DeepMind, Intrinsic https://t.co/XIjDBtfhwx Short summary below:
 
1 Retweet · 5 Likes
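The AWAC ingredient named here has a simple core: behavioral cloning weighted by exponentiated advantage, so the policy imitates the better actions in the offline data. A hedged sketch; the module interfaces are assumptions, not the paper's code:

```python
import torch

def awac_policy_loss(policy, q_fn, value_fn, states, actions, lam=1.0):
    """Advantage-weighted actor update: weight the log-likelihood of offline
    actions by exp(advantage / lambda). `policy.log_prob`, `q_fn`, and
    `value_fn` are assumed interfaces."""
    with torch.no_grad():
        advantage = q_fn(states, actions) - value_fn(states)
        weights = torch.exp(advantage / lam).clamp(max=20.0)  # clamp for stability
    return -(weights * policy.log_prob(states, actions)).mean()
```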




Sergey Levine    @svlevine     10/11/2021
When training your robot, don't start from scratch! In "Bridge Data", we study how big robotic datasets (71 tasks, 7.2k demos!) can bridge the generalization gap. For new tasks, use bridge data along with task data to boost generalization: https://t.co/fGG35ikRus A thread:
 
1 Retweet · 4 Likes
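The joint-training recipe amounts to mixing bridge data with the new task's data in every batch. A toy sampler; the 50/50 ratio is an assumption, not the paper's reported value:

```python
import random

def mixed_batch(bridge_data, task_data, batch_size, task_fraction=0.5):
    """Sample a training batch that mixes the shared bridge dataset with
    the (much smaller) target-task dataset."""
    n_task = int(batch_size * task_fraction)
    batch = random.sample(task_data, n_task)
    batch += random.sample(bridge_data, batch_size - n_task)
    random.shuffle(batch)
    return batch
```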




Sergey Levine    @svlevine     10/8/2021
To avoid needing to store all training data, we can learn the posterior q(theta) using any BNN approach, and then incorporate it as a regularizer when minimizing entropy at test time on unlabeled data.
 
1 Retweet · 3 Likes
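A minimal sketch of that recipe, assuming a factorized Gaussian posterior over parameters; this is illustrative, not the paper's exact objective:

```python
import torch

def adaptation_loss(model, unlabeled_x, post_mean, post_var):
    """Adapt to test data without labels: minimize predictive entropy while
    a quadratic penalty keeps parameters close to the BNN posterior q(theta),
    so the original training data never needs to be stored."""
    probs = model(unlabeled_x).softmax(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()
    posterior_reg = sum(((p - m) ** 2 / v).sum()
                        for p, m, v in zip(model.parameters(), post_mean, post_var))
    return entropy + 0.5 * posterior_reg
```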




Sergey Levine    @svlevine     10/7/2021
Turns out they don't. This means that no matter how many skills you try to learn, you will not capture all possible optimal policies! But there is some good news: what you do learn is a group of skills whose average minimizes the distance to the "hardest" (furthest) policy.
 
1 Retweet · 3 Likes
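Read literally, the claim can be written as a minimax property of the skills' average state distribution (notation ours, not necessarily the paper's):

```latex
% With K skills \pi_1,\dots,\pi_K and state distributions \rho_{\pi_z},
% the average \bar{\rho} minimizes the distance to the worst-case policy:
\bar{\rho} \;=\; \tfrac{1}{K}\sum_{z=1}^{K} \rho_{\pi_z},
\qquad
\{\pi_z\} \;\in\; \arg\min_{\{\pi_z\}} \; \max_{\pi} \; D\!\big(\bar{\rho},\, \rho_{\pi}\big)
```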




Sergey Levine    @svlevine     10/8/2021
We need a better graphical model. What if we assume new datapoints are not arbitrary: if we are asked to classify a new OOD input, it likely belongs to *one of* the known classes, we just don't know which one! Now there is a relationship between theta and phi for each distribution!
 
1 Retweet · 3 Likes




Sergey Levine    @svlevine     10/11/2021
In robotic learning, we start from scratch for every experiment, with custom per-task data. Generalization is poor, because the dataset is narrow. In supervised learning domains, there are large reusable datasets (e.g., ImageNet, MS-COCO). What would be "ImageNet" in robotics?
 
1 Retweet · 2 Likes




Sergey Levine    @svlevine     10/7/2021
Regular model-based RL methods train the model with MLE, but better models don't necessarily translate into better policies, so understanding convergence is difficult. Can we devise an MBRL method where the model optimizes an objective that makes it better *for the policy*?
 
1 Retweet · 2 Likes









