Robocrunch        AI

Anthony Goldbloom    @antgoldbloom   ·   9/14/2021
Kaggle's dataset platform passed 100K datasets last month. If you're looking for datasets for your ML and data science projects, this is a great place to look.
Retweets: 41 · Likes: 209

  Similar Tweets  

Yee Whye Teh    @yeewhye   ·   9/6/2021
The Turing-JBC-RSS Lab led by @cholmesuk, Peter Diggle & Sylvia Richardson is looking for researchers to be seconded to join the various interesting projects on using statistics, data science and ML to help with understanding the pandemic!
Retweets: 12 · Likes: 15

Weights & Biases    @weights_biases   ·   9/8/2021
See how quickly you can view and interactively explore language datasets with W&B Tables! In this video, @metaphdor demos some queries on the GoEmotions dataset 🎭
Retweets: 1 · Likes: 5

Thomas Wolf    @Thom_Wolf   ·   9/15/2021
Starting a big project takes a lot more time & energy than people expect. I've been pushing mostly one project per year: 2019 🤗Transformers, 2020 🤗Datasets, 2021 @BigscienceW. I used to find it frustratingly slow; now I accept it. Give your projects the time they need to grow.
Likes: 19

Michal Wolski    @michalwols   ·   8/27/2021
With all of this #FoundationModels talk, is anyone working on foundation datasets? A shared, continuously growing public dataset to rival Google's JFT would be much more valuable to the field than gigantic transformers trained on random text from Reddit.
Retweets: 19 · Likes: 98

Sasha Rush    @srush_nlp   ·   9/8/2021
New preprint: Datasets documents the Hugging Face Datasets project, now containing more than 700 NLP datasets from over 300 contributors. NLP models haven't changed much recently, but datasets, and how we use and document them, have changed a lot ...
Retweets: 1 · Likes: 7

Hugging Face    @huggingface   ·   8/27/2021
The 🤗Hugging Face Hub Python wrapper makes it super easy to work with 🤗 Hub model & dataset repos. Search for, upload, download models & datasets without leaving your python runtime! We just released v0.0.16 w/ huge QOL upgrades 👏🎉 Come contribute!
Retweets: 49 · Likes: 199

Nandan Thakur    @Nthakur20   ·   9/9/2021
🚨New 🍻BEIR preprint on Arxiv🚨 What’s 🆕? Evaluated latest SOTA🚀 reranking (MiniLM), dense (TAS-B), and sparse (DeepCT, docT5query) models. Added a 🆕 dataset: Robust04. Using hole@10, found a few datasets with annotation biases. w/ @Nils_Reimers. pdf:
Likes: 13

Y Combinator    @ycombinator   ·   8/17/2021
Welcome to S21, Unbox! Unbox is building a collaborative QA platform for the next generation of machine learning models. They make it easy to track and version all your models and datasets, allowing the team to focus on building production-ready models.
Retweets: 8 · Likes: 35

Anna Rogers    @annargrs   ·   7/28/2021
#NLPaperAlert: QA Dataset Explosion!🔥 A survey of 200+ QA/RC datasets proposing a taxonomy of formats & reasoning skills. Also in the bag: modalities, conversational QA, domains & beyond-English data. Honored to work on this with @nlpmattg & @IAugenstein
Retweets: 61 · Likes: 240

Nils Reimers    @Nils_Reimers   ·   9/8/2021
Easy access to datasets is a big game changer: It allows anyone to train on many datasets, leading to strong & robust models. Would be great to see more people contributing datasets, as this really is the limiting factor for many Machine Learning applications.
Likes: 14

Michael Poli    @MichaelPoli6   ·   7/26/2021
Took a while. Get in touch if you're interested in contributing to open-source for neural diff eqs and implicit models! We have a lot of other interesting projects and collaborations underway.
Retweets: 2 · Likes: 19

Benedict Evans    @benedictevans   ·   8/28/2021
I've said this before, but if you write about Theranos as a 'Silicon Valley startup that raised from VCs' you will not understand what you're looking at. The fact that it didn't follow the SV model and didn't raise from VCs was an important part of the story.
Retweets: 137 · Likes: 1342

Rob Salomone    @SalomoneRob   ·   7/7/2021
New version of @PyroAi just came out, exciting! Been looking forward to teaching classes using it for implementing Normalizing Flows and Deep Latent Variable models (just a small amount of what Pyro can do!) at the @DiscoverAMSI Data Science Winter School starting next week!
Retweets: 5 · Likes: 20

Nils Reimers    @Nils_Reimers   ·   9/8/2021
📂1.2 Billion Training Pairs: Training on large datasets is essential to generalize well across domains and tasks. Previous models were trained on rather small datasets of a few 100k train pairs and had issues on specialized topics. We collected 1.2B training pairs from ...
Retweets: 1 · Likes: 3

  Relevant People  

Anthony Goldbloom
CEO of @kaggle (a @google company)
Anthony Goldbloom 47.9

Weights & Biases
Developer tools for machine learning. Build better models faster with experiment tracking, dataset versioning, and model management.
Weights & Biases 38.3

Thomas Wolf
Co-founder & Chief Scientist at @HuggingFace – I lead the Open-Source & Science teams – 🤗Transformers & 🤗Datasets libraries – @BigScienceW research workshop
Thomas Wolf 44.4

Yee Whye Teh
Professor of Statistical Machine Learning at @OxCSML, @oxfordstats and Research Scientist at @DeepMindAI. All opinions are my own.
Yee Whye Teh 40.2

Hugging Face
The AI community building the future. #BlackLivesMatter #stopasianhate
Hugging Face 47.4

Michal Wolski
You might have seen me at @columbia @clarifai @nyufuturelabs
Michal Wolski 17.8

Anna Rogers
NLP researcher working on BERTology & data for teaching/testing NLU. Organiser of Insights from Negative Results @ EMNLP2021. Post-doc at @CPH_SODAS.
Anna Rogers 31.9