Related  





Jack Clark    @jackclarkSF    11/29/2021      

2 years means frontier of dense generative models goes from 1.6billion parameters (Salesforce, following GPT-2), to 530billion parameters (Microsoft Megatron-Turing https://t.co/md03Qz8K90). 330X increase in model size, not too shabby.
  
    7         57