Peng Qi    @qi2peng2    11/23/2021      

Crowdsourcing is hard, technical work. I wish one day I get to write a paper on a crowdsourced dataset that mainly focuses on the idea behind the data, the dataset itself, and the 1000000s of considerations and technical hurdles on the way. Not the boilerplate baseline models.

TheSequence    @TheSequenceAI    12/4/2021      

🔥 2 New Super Models to Handle Any Type of Dataset
We build models optimized for a specific type of dataset, like:
- text
- audio
- computer vision
- etc.
Is it possible to create a general model? @DeepMind unveils the answer. 1/7

Stefan    @stefan_it_    12/6/2021      

📚 Update: we further release smaller multilingual Historic Language models, ranging from 2 - 8 layers (and different hidden sizes)🤗 📈Evaluation on NewsEye NER dataset shows a nice performance overview. 💡All models are available on the @huggingface Model Hub now!

Andrej Karpathy    @karpathy    11/30/2021      

1/3 Some panoptic segmentation eye candy 🌈🤩 from a new project we are bringing up. These are too raw to run in the car, but feed into auto labelers. Collaboration of data labeling a large (100K+), clean, diverse, multicam+video dataset and engineers who train the models

Jigsaw    @Jigsaw    11/24/2021      

Today, we release Sentence Templates, a new dataset for Perspective API, that allows researchers and developers of other machine learning models to test for biases that may be undermining their own work. Read more about our work on Medium.