Naveen Rao    @NaveenGRao    10/8/2021      

In the context of neural network training, we can usually find ways to split computation over a desired number of devices. So, we don’t need to think monolithically and jam everything in one mega processor.




Jürgen Schmidhuber    @SchmidhuberAI    12/6/2021      

25th anniversary of the LSTM at #NeurIPS2021. reVIeWeR 2 - who rejected it from NeurIPS1995 - was thankfully MIA. The subsequent journal publication in Neural Computation has become the most cited neural network paper of the 20th century:
    16         120

MosaicML    @MosaicML    12/4/2021      

Transfer learning has become a key tool in efficiently training deep neural networks. Typically, a network is pretrained using a large amount of data on a related supervised or self-supervised task. Then, the final few layers (the "head") are removed and a new head is added.(2/8)

Reza Zadeh    @Reza_Zadeh    9/27/2018      

When you use a 10 layer Deep Neural Network where Logistic Regression would suffice
    2877         8859

Omar Sanseviero    @osanseviero    11/29/2021      

I usually don't play poker, but when I do I play it the Jax style.
    1         34

TheSequence    @TheSequenceAI    11/30/2021      

Modern deep neural networks are large and require incredibly large training datasets. The traditional sequential approach is simply impractical. But we can use parallel training. The idea of parallelizable training is intuitive but incredibly hard to achieve. 1/2
    1         7