Robocrunch        AI

Samarth Sinha    @_sam_sinha_   ·   9/14/2021
Our paper on “Surprisingly Simple Self-Supervision for Offline RL” got accepted at CoRL 2021! Find out how simple data augmentations from states can help Q-learning algorithms on a variety of robotics tasks! J/w: @AjayMandlekar @animesh_garg
Retweets 12 · Likes 84
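To make the idea in the tweet above concrete, here is a minimal sketch of state-based augmentation inside a Q-learning target. The tweet does not say which augmentations or loss the paper uses, so everything here is an assumption for illustration: Gaussian noise as the augmentation, averaging the target over augmented copies, and the hypothetical names `gaussian_augment`, `augmented_q_target`, and `q_fn`.

```python
import random
from typing import Callable, List, Optional, Sequence


def gaussian_augment(state: Sequence[float], sigma: float, rng: random.Random) -> List[float]:
    """Return a copy of the state with zero-mean Gaussian noise added to each dimension."""
    return [s + rng.gauss(0.0, sigma) for s in state]


def augmented_q_target(
    q_fn: Callable[[Sequence[float], int], float],  # hypothetical Q-function: (state, action) -> value
    reward: float,
    next_state: Sequence[float],
    actions: Sequence[int],
    gamma: float = 0.99,
    n_aug: int = 4,
    sigma: float = 0.01,
    rng: Optional[random.Random] = None,
) -> float:
    """Average the one-step Q-learning target over several augmented copies of the next state."""
    rng = rng or random.Random(0)
    targets = []
    for _ in range(n_aug):
        s_aug = gaussian_augment(next_state, sigma, rng)
        best_next = max(q_fn(s_aug, a) for a in actions)  # greedy value at the augmented state
        targets.append(reward + gamma * best_next)
    return sum(targets) / len(targets)


# Toy usage with a hand-written Q-function (assumption: two discrete actions).
toy_q = lambda s, a: sum(s) + a
target = augmented_q_target(toy_q, reward=1.0, next_state=[1.0, 2.0], actions=[0, 1])
```

Averaging over a few noisy copies is only one way to use augmented states; the point is that the dataset itself is left untouched and only the states fed to the Q-function are perturbed.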

  Similar Tweets  

DeepMind    @DeepMind   ·   9/16/2021
In part two of the model-free lecture, Hado explains how to use prediction algorithms for policy improvement, leading to algorithms - like Q-learning - that can learn good behaviour policies from sampled experience: #DeepMindxUCL @ai_ucl
Retweets 2 · Likes 7

Sergey Levine    @svlevine   ·   7/16/2021
Data-driven design is a lot like offline RL. Want to design a drug molecule, protein, or robot? Offline model-based optimization (MBO) tackles this, and our new algorithm, conservative objective models (COMs), provides a simple approach: A thread:
Retweets 45 · Likes 204

Animesh Garg    @animesh_garg   ·   9/15/2021
Data augmentation improves performance in offline RL across the board, with most envs and most algorithms! A surprising finding, and hence the name!
Retweets 2 · Likes 8

Sergey Levine    @svlevine   ·   6/3/2021
1/ The basic idea is simple: take all dimensions of states, actions, and rewards, discretize them, and model them as a long sequence of discrete tokens. Then use a language model. This can be used to predict, and also to control.
Retweets 2 · Likes 28
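The recipe in the tweet above (discretize every dimension, then flatten into one token sequence) can be sketched as follows. This is only an illustration under stated assumptions: uniform binning over a shared `[low, high]` range, a single shared vocabulary for all dimensions, and the hypothetical helper names `discretize` and `trajectory_to_tokens`. The language model that would consume the tokens is omitted.

```python
from typing import List, Sequence, Tuple

# One step of a trajectory: (state dims, action dims, scalar reward).
Step = Tuple[Sequence[float], Sequence[float], float]


def discretize(value: float, low: float, high: float, n_bins: int) -> int:
    """Map a continuous value to a bin index in [0, n_bins - 1] using uniform bins."""
    clipped = min(max(value, low), high)
    frac = (clipped - low) / (high - low)
    return min(int(frac * n_bins), n_bins - 1)  # the top edge falls into the last bin


def trajectory_to_tokens(
    trajectory: Sequence[Step], low: float = -1.0, high: float = 1.0, n_bins: int = 4
) -> List[int]:
    """Flatten a trajectory into one long sequence of discrete tokens."""
    tokens: List[int] = []
    for state, action, reward in trajectory:
        # Order within a step: all state dims, then all action dims, then the reward.
        for x in (*state, *action, reward):
            tokens.append(discretize(x, low, high, n_bins))
    return tokens


traj = [((0.0, 0.5), (-1.0,), 1.0), ((0.25, -0.75), (0.75,), 0.0)]
tokens = trajectory_to_tokens(traj)
print(tokens)
```

A sequence model trained on such tokens can then be asked to continue a partial sequence, which corresponds to the prediction use; how the actual method turns that into control is not described in the tweet.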

Antonin Raffin    @araffin2   ·   9/15/2021
"Smooth Exploration for Robotic Reinforcement Learning" was accepted at @corl_conf 2021 =)! In this joint work w/ Jens Kober and Freek Stulp, we present a simple alternative to Gaussian noise exploration for using RL on real robots. Paper:
Retweets 1 · Likes 3

Sergey Levine    @svlevine   ·   7 hours
Multi-task RL is hard. Multi-task offline RL is also hard. Weirdly, sharing data for all tasks (and relabeling) can actually make it harder. In conservative data sharing (CDS), we use conservative Q-learning principles to address this. arXiv: A thread:
Retweets 12 · Likes 59

Matt Fontaine    @tehqin17   ·   9/14/2021
Amazed at how such complex levels emerge from such small and simple NCAs trained via CMA-ME, where the constraints and variation are defined from very simple functions. No target data required!
Retweets 1 · Likes 4

Xiaolong Wang    @xiaolonw   ·   7/2/2021
Introducing 𝗦𝗩𝗘𝗔, Stabilizing Value Estimation under Augmentation. While RAD / DrQ provided great studies on how data aug affects RL, most augmentations make training unstable. SVEA stabilizes training across various augmentations and generalizes RL w/ Vision Transformers.
Retweets 2 · Likes 24

Ross Wightman    @wightmanr   ·   7/12/2021
Another issue here, augmentations aside: some of the models were pretrained on ImageNet-21k and that's not mentioned...
Retweets 1 · Likes 12

Sergey Levine    @svlevine   ·   7/27/2021
Offline RL+meta-learning is a great combo: take data from prior tasks, use it to meta-train a good RL method, then quickly adapt to new tasks. But it's really hard. With SMAC, we use online self-supervised finetuning to make offline RL work: A thread:
Retweets 43 · Likes 162

Ettore Randazzo    @RandazzoEttore   ·   9/14/2021
Very cool work! It's great to see that having compact and powerful models such as NCA can allow for quality diversity algorithms to be used efficiently.
Retweets 2 · Likes 1

Simone Scardapane    @s_scardapane   ·   7/7/2021
Quick reminder that we are live at 18:30 if you are interested in speech processing and self-supervision. We are hosting on Google Meet, so register on Meetup or Eventbrite to access the link: @SpeechBrain1 @ParcolletT @mcaron31
Retweets 2 · Likes 5

Andrei Bursuc    @abursuc   ·   7/28/2021
So happy about this fun and simple #iccv2021 project on OOD detection in semantic segmentation with Victor Besnier and @david_picard. We have been eager to share it for quite a while. We'll release the paper and code soon.
Retweets 4 · Likes 26

Sheng Zhang    @sheng_zh   ·   9/14/2021
How do you extract relations whose args never co-occur in a paragraph when distant supervision is very noisy and SOTA models are less effective? Check out our #EMNLP2021 paper "Modular Self-Supervision for Document-Level Relation Extraction"
Retweets 1 · Likes 10

  Relevant People  

Samarth Sinha
Researching ML @FacebookAI & @VectorInst Prev: Undergrad @UofT, Visitor @Mila_Quebec @Berkeley_AI
Samarth Sinha 21.4

Sergey Levine
Associate Professor at UC Berkeley
Sergey Levine 43.3

DeepMind
Our team research and build safe AI systems. We're committed to solving intelligence, to advance science and humanity.
DeepMind 60.0

Andrew Gordon Wilson
Machine Learning Professor
Andrew Gordon Wilson 39.3

Matt Fontaine
roboticist in training @ USC, Algorithms Live! host, ICPC judge, USACO coach
Matt Fontaine 7.63

Antonin Raffin
Researcher in robotics and machine learning (Reinforcement Learning). Member of Stable-Baselines team:
Antonin Raffin 23.6

PyTorch Lightning
The lightweight PyTorch AI research framework. Scale your models, not the boilerplate! Use our platform @gridai_ to scale models from your laptop to the cloud.
PyTorch Lightning 37.0