Tweets by Eric Jang (@ericjang11), VP of AI, Halodi Robotics
patterns of information trying to preserve themselves exist at all space and time scales.
Shared by Eric Jang at 5/14/2022
if a career in crypto isn't high-risk enough for you and you'd like to join a humanoid robotics startup, hmu! I'm looking to grow my team soon
Shared by Eric Jang at 5/13/2022
If DQN and GATO are overhyped (implying they're somewhat obvious / easy to do?), can you suggest some similar will-be-overhyped ideas for the "deep learning junkies" to try? It's easy to trivialize other people's work in hindsight; I was wondering if you could predict the future instead.
Shared by Eric Jang at 5/13/2022
I've really come around on the utility of training large-scale models since 2017. These efforts offer a glimpse into what "everyday programmer's compute" will look like in 1-2 decades. It's a good use of ad revenue, as opposed to spending ad revenue to make more ad revenue.
Shared by Eric Jang at 5/13/2022
Despite not being SOTA on every "niche", model consolidation may yet win because 1) it still improves with compute and 2) it saves dev time. That's the entire point of generality: trade off some specialized efficiency for the ability to do many more things.
Shared by Eric Jang at 5/13/2022
On the other hand, nature still occasionally evolves generalized things (opposable thumbs, neocortex), which raises the question of what constraints allow generality to prosper.
Shared by Eric Jang at 5/13/2022
Drawing an analogy to ML models, perhaps there *always* exist specialized models that outperform some general-purpose model, which would explain why multi-domain models tend not to outperform heavily tuned SOTA on popular "niches".
Shared by Eric Jang at 5/13/2022
Even if this sort of model never beats SOTA, the glass-half-full view is that this is a small victory for "model consolidation". https://t.co/ZdpoVq6aRM You can use the GATO hparams + net and get within 70-80% of SOTA for pretty much any research benchmark.
Shared by Eric Jang at 5/13/2022
It remains an open question whether this "one model to learn them all" concept will eventually be SOTA (via cross-domain generalization / using language as a substrate of generalization), and if so, at what scale.
Shared by Eric Jang at 5/13/2022
The GATO paper buries this result a bit, but training on everything doesn't "just work" (yet). Zamir et al. 2018 (Taskonomy) https://t.co/dRoG0Rs0PD suggests that, in the presence of finite model capacity, whether tasks transfer positively or negatively is non-trivial.
Shared by Eric Jang at 5/13/2022
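To make the finite-capacity point above concrete, here is a minimal sketch, my own illustration rather than anything from GATO or Taskonomy, of one way to probe transfer: two synthetic regression tasks share a deliberately tiny trunk, and comparing the resulting per-task errors against runs with a larger or separate trunk gives a crude read on whether sharing helped or hurt each task. All names, dimensions, and hyperparameters below are assumptions.

```python
# Minimal sketch (illustrative only): two regression tasks trained jointly
# through a shared, deliberately small trunk, so they must compete for capacity.
import jax
import jax.numpy as jnp

k_data, k_wa, k_wb, k_init = jax.random.split(jax.random.PRNGKey(0), 4)

# Two synthetic tasks over the same 8-d inputs; task B is a perturbed copy of
# task A, so the tasks overlap partially but not completely.
X = jax.random.normal(k_data, (256, 8))
w_a = jax.random.normal(k_wa, (8,))
w_b = w_a + 0.5 * jax.random.normal(k_wb, (8,))
y_a, y_b = X @ w_a, X @ w_b

def init_params(key, d_in=8, d_hidden=2):
    k1, k2, k3 = jax.random.split(key, 3)
    return {
        "trunk": 0.1 * jax.random.normal(k1, (d_in, d_hidden)),  # shared, finite capacity
        "head_a": 0.1 * jax.random.normal(k2, (d_hidden,)),
        "head_b": 0.1 * jax.random.normal(k3, (d_hidden,)),
    }

def loss(params):
    h = jnp.tanh(X @ params["trunk"])  # shared representation for both tasks
    err_a = jnp.mean((h @ params["head_a"] - y_a) ** 2)
    err_b = jnp.mean((h @ params["head_b"] - y_b) ** 2)
    return err_a + err_b, (err_a, err_b)

LR = 0.05

@jax.jit
def step(params):
    (_, per_task), grads = jax.value_and_grad(loss, has_aux=True)(params)
    new_params = jax.tree_util.tree_map(lambda p, g: p - LR * g, params, grads)
    return new_params, per_task

params = init_params(k_init)
for _ in range(500):
    params, (err_a, err_b) = step(params)
print("per-task MSE with a shared 2-d trunk:", float(err_a), float(err_b))
# Re-running with d_hidden=8 (ample capacity) or with separate trunks per task
# gives a crude read on whether sharing transferred positively or negatively.
```

Taskonomy-style analyses do this kind of comparison systematically across many task pairs, which is why the transfer structure turns out to be non-trivial.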
Kaiser 2017's "multimodel" got about 70-80% of the single-domain SOTA; I'm unsure how GATO performs, but I suspect it's similar if not slightly better.
Shared by Eric Jang at 5/13/2022
GATO cites @lukaszkaiser 2017 ("one model to learn them all"), which had essentially the same high-level idea and empirical findings: "kind of works, but not yet a slam dunk for the narrative of cross-domain generalization" https://t.co/nHmXyAioHk
Shared by Eric Jang at 5/13/2022
I love how DeepMind's large-scale papers now include author contributions; you can see that a lot of the work went into scalable data loading.
Shared by Eric Jang at 5/13/2022
A short thread about DeepMind's recent GATO paper. It trains a basic transformer on an impressive number of datasets.
Shared by Eric Jang at 5/13/2022
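To make "trains a basic transformer on an impressive number of datasets" concrete, here is a minimal sketch of the serialization idea behind that kind of model. The vocabulary layout, bin count, example values, and helper names (`tokenize_continuous`, `serialize_timestep`) are my own assumptions for illustration, not the paper's exact tokenization scheme.

```python
# Minimal sketch (illustrative only) of a GATO-style recipe: flatten data from
# very different domains into one integer token stream over a shared vocabulary,
# so a single autoregressive transformer can be trained on all of it.
import jax.numpy as jnp

TEXT_VOCAB = 32000   # assumed size of a text tokenizer's vocabulary
NUM_BINS = 1024      # continuous values are discretized into this many bins

def tokenize_continuous(x, low=-1.0, high=1.0):
    """Discretize continuous observations/actions into bins offset past the text vocab."""
    x = jnp.clip(jnp.asarray(x, dtype=jnp.float32), low, high)
    bins = ((x - low) / (high - low) * (NUM_BINS - 1)).astype(jnp.int32)
    return bins + TEXT_VOCAB  # keeps continuous tokens from colliding with text tokens

def serialize_timestep(text_ids, observation, action):
    """Flatten one timestep from any domain into a single integer token sequence."""
    return jnp.concatenate([
        jnp.asarray(text_ids, dtype=jnp.int32),  # e.g. a task instruction
        tokenize_continuous(observation),        # e.g. proprioceptive state
        tokenize_continuous(action),             # e.g. motor commands
    ])

# Example: a 3-token instruction, a 4-d observation, and a 2-d action all land
# in one sequence that a decoder-only transformer could model next-token style.
tokens = serialize_timestep(text_ids=[17, 93, 2048],
                            observation=[0.1, -0.4, 0.9, 0.0],
                            action=[0.25, -0.75])
print(tokens)
```

Once every domain is reduced to a single integer sequence over a shared vocabulary, the same next-token objective applies to Atari, robot trajectories, and text alike, which is the sense in which one model can be trained on all of them.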