Tweets by Simone Scardapane (@s_scardapane)
I fall in love with a new #machinelearning topic every month 🙄 Researcher @SapienzaRoma | Chairman @iaml_it | Co-host @SmarterPodcast | @GoogleDevExpert
*Masked Siamese Networks for Label-Efficient Learning* by @mido_assran @mcaron31 @imisra_ @p_bojanowski @armandjoulin et al. Neat new self-supervised method that combines "patch masking" with a siamese architecture and DINO-like training. https://t.co/iFiZgVhJhv
Shared by Simone Scardapane at 5/2/2022
*PolyLoss: A Polynomial Expansion Perspective of Classification Loss* #ICLR by @tanmingxing @chenxi116 @ekindogus @AnguelovDrago et al. A one-line change can improve your loss, by perturbing the leading coefficient of the corresponding Taylor expansion. https://t.co/xd5J6FKKli
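Roughly, the Poly-1 variant boils down to something like this (a minimal PyTorch sketch of my reading of the paper, not the authors' code; `epsilon` is a tunable hyperparameter, not their exact setting):

```python
# Poly-1 sketch: cross-entropy plus an extra epsilon * (1 - p_t) term,
# where p_t is the predicted probability of the target class.
import torch
import torch.nn.functional as F

def poly1_cross_entropy(logits, targets, epsilon=1.0):
    ce = F.cross_entropy(logits, targets, reduction="none")
    pt = torch.softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
    return (ce + epsilon * (1.0 - pt)).mean()
```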
Shared by Simone Scardapane at 4/30/2022
*Planting Undetectable Backdoors in Machine Learning Models* by Goldwasser @mikekimbackward @Vinod_MIT Zamir. Fascinating theoretical paper on using cryptography (in particular, digital signatures) to plant "undetectable" backdoors in a classifier. https://t.co/dr52Os9elw
Shared by Simone Scardapane at 4/28/2022
Thrilled to announce the 1st workshop on Dynamic Neural Networks @icmlconf 2022! Perfect place to explore novel ideas to make neural networks more "dynamic" in terms of architectures, algorithms, reasoning, etc. We invite contributed talks and posters: https://t.co/bmgvTArM9f
Shared by Simone Scardapane at 4/21/2022
*Machine learning for medical imaging: methodological failures and recommendations for the future* by @DrVeronikaCH @GaelVaroquaux Nice paper describing common issues and missteps in applying ML models for medical imaging and diagnostics. https://t.co/Zsw1QZpEbL
Shared by Simone Scardapane at 4/20/2022
If you are interested in graph neural networks, I made a quick Colab tutorial on performing graph classification with #PyTorch Geometric & @PyTorchLightnin, and then explaining the predictions with GNNExplainer and/or Captum. https://t.co/Bsf4IsYUOz
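The core of the tutorial looks roughly like this (a minimal sketch, not the actual notebook; layer sizes are placeholders): message passing over each graph, global pooling into one embedding per graph, then a classifier.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class GraphClassifier(torch.nn.Module):
    def __init__(self, num_features, num_classes, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(num_features, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.lin = torch.nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        x = F.relu(self.conv1(x, edge_index))
        x = F.relu(self.conv2(x, edge_index))
        x = global_mean_pool(x, batch)   # one vector per graph
        return self.lin(x)
```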
Shared by Simone Scardapane at 4/15/2022
*Object-Region Video Transformers* by @roeiherzig @Karttikeya_m @_amirbar @GalChechik @trevordarrell et al. Integrating an object detector in a Video Transformer provides significant improvements in accuracy and interpretability for action recognition. https://t.co/GO9cuptoZS
Shared by Simone Scardapane at 4/1/2022
There are a few additional interesting bits, including how to simplify / approximate the frame averaging whenever needed, and a new class of graph models based on this framework. Paper is here: https://t.co/BJTOVlLBR0
Shared by Simone Scardapane at 3/30/2022
*Block-Recurrent Transformers* by @Yuhu_ai_ @ethansdyer @bneyshabur et al. I am always up for new recurrent models! 👀 Here, the state and readout updates are replaced with parallel transformer models, and the recurrence operates on blocks. https://t.co/QQrNF7DskM
Shared by Simone Scardapane at 3/29/2022
*Symmetry-Based Representations for Artificial and Biological General Intelligence* by Higgins Racanière @DaniloJRezende Nice introductory paper on the role that symmetries are gaining in learning representations both in AI and neuroscience. https://t.co/bcBXJjzC8y
Shared by Simone Scardapane at 3/25/2022
*Transformer Memory as a Differentiable Search Index* by @YiTayML @vqctran @m__dehghani et al. Intriguing paper that proposes to replace an entire information retrieval system with a single transformer model trained to predict the document id (docid). https://t.co/BTYAzIxg7U
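To give a flavour of the idea (a toy sketch under my own assumptions, not the paper's setup): fine-tune a seq2seq transformer so that the "index" is the model itself, mapping a query string directly to a document identifier string. The docid below is hypothetical.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("query: effects of caffeine on sleep", return_tensors="pt")
labels = tokenizer("doc_4281", return_tensors="pt").input_ids  # hypothetical docid string

loss = model(**inputs, labels=labels).loss  # standard seq2seq training loss
```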
Shared by Simone Scardapane at 3/23/2022
Recently, I have been playing with @TensorFlow Probability. Since the documentation is a little bit scattered, I put together a small Colab tutorial. 👀 It goes from using distributions to performing VI on Bayesian neural nets. https://t.co/4c20zab8BK
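A taste of where the tutorial starts (a minimal sketch, not the notebook itself): distributions are first-class objects you can sample and score directly.

```python
import tensorflow_probability as tfp

tfd = tfp.distributions

prior = tfd.Normal(loc=0.0, scale=1.0)
samples = prior.sample(5)              # draw 5 samples
log_probs = prior.log_prob(samples)    # evaluate their log-densities

# The tutorial then builds up to variational inference on Bayesian neural
# networks, where the layer weights themselves become distributions like this one.
```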
Shared by Simone Scardapane at 3/22/2022
*GFlowNets for Discrete Probabilistic Modeling* @alex_volokhova The basic GFlowNet assumes your reward function is given, but you can also train it jointly using ideas from energy-based modelling. In this work, they use it to generate images. 11/n https://t.co/Kl8bAB0K0v
Shared by Simone Scardapane at 3/10/2022
*Biological Sequence Design with GFlowNets* @bonadossou @JainMoksh @alexhdezgcia @folinoid @Mila_Quebec Another cool application: the design of biological sequences with specific characteristics (I admit I am a little bit out of my depth here). 10/n https://t.co/ipbTYfYFdI
Shared by Simone Scardapane at 3/10/2022
*Bayesian Structure Learning with Generative Flow Networks* by @TristanDeleu @AntGois @ChrisEmezue @SimonLacosteJ Moving to applications, here they leverage GFlowNets to get state-of-the-art results in learning the structure of Bayesian networks. 9/n https://t.co/yBuvxPrJ1q
Shared by Simone Scardapane at 3/10/2022
*Trajectory Balance: Improved Credit Assignment in GFlowNets* Building on it, @JainMoksh @folinoid @ChenSun92 et al. show a much better training criterion by sampling entire trajectories, making training significantly faster. 8/n https://t.co/5m8eq6UVfB
Shared by Simone Scardapane at 3/10/2022
Under this interpretation, you train a neural network to predict how the flow goes through the graph, by imposing that the incoming and outgoing flows at each node are conserved. With this, you get one consistency equation per node that you can enforce with a loss function. 5/n
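A rough sketch of what that per-node consistency loss can look like (my own paraphrase in PyTorch, working with log-flows for numerical stability; not the authors' code):

```python
import torch

def flow_matching_loss(log_inflow_edges, log_outflow_edges, log_reward):
    # log_inflow_edges:  predicted log-flows F(parent -> s) for all edges into s
    # log_outflow_edges: predicted log-flows F(s -> child) for all edges out of s
    # log_reward:        log R(s) (very negative if s is not terminal)
    log_in = torch.logsumexp(log_inflow_edges, dim=-1)
    log_out = torch.logsumexp(
        torch.cat([log_outflow_edges, log_reward.unsqueeze(-1)], dim=-1), dim=-1
    )
    # Incoming and outgoing flow must match at every node:
    # penalize the squared log-mismatch.
    return (log_in - log_out) ** 2
```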
Shared by Simone Scardapane at 3/10/2022
*Generative Flow Networks* A new method to sample structured objects (e.g., graphs, sets) with a formulation inspired by the state space of reinforcement learning. I have collected a few key ideas and pointers below if you are interested. 👀 1/n 👇
Shared by Simone Scardapane at 3/10/2022
Interestingly, this was concurrently proposed in an #ICLR submission by @anirudhg9119 @matteohessel et al. *Learning by Directional Gradient Descent* Their DODGE estimator is equivalent, and they also investigate different types of perturbations. https://t.co/ABidyZ3Nmn
Shared by Simone Scardapane at 2/22/2022
*Gradients without Backpropagation* by @atilimgunes @BAPearlmutter et al. Since everyone in my small bubble is talking about this paper, let me do a quick dive. 👀 🧵 https://t.co/UP0aqG9Yaa
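The core trick, as far as I understand it, fits in a few lines (a sketch in JAX, not the authors' code): pick a random direction v, compute the directional derivative with a single forward-mode JVP, and use (∇f·v)·v as an unbiased estimate of the gradient, with no backward pass at all.

```python
import jax
import jax.numpy as jnp

def forward_gradient(f, theta, key):
    v = jax.random.normal(key, theta.shape)       # random tangent direction
    loss, dir_deriv = jax.jvp(f, (theta,), (v,))  # f(theta) and the directional derivative, forward mode only
    return loss, dir_deriv * v                    # unbiased estimate of the gradient of f at theta

# usage: loss, g = forward_gradient(lambda t: jnp.sum(t**2), jnp.ones(3), jax.random.PRNGKey(0))
```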
Shared by Simone Scardapane at 2/22/2022
*Structured Denoising Diffusion Models in Discrete State-Spaces* by @jacobaustin132 @_ddjohnson @hojonathanho @dtarlow2 @vdbergrianne I am slacking off a bit, but always find time for another cool paper on diffusion models. 👀 (Hint by @EmilianPostola1) https://t.co/9NMC0wjpRC
Shared by Simone Scardapane at 2/21/2022
*Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for ML* #AISTATS @james_y_zou Cool application of Shapley values to determine the importance of each point in the training of a neural network. https://t.co/KbltFK1xO6
Shared by Simone Scardapane at 2/15/2022
The tutorial also has a very short Colab notebook attached: https://t.co/mDoXC7wDfn I love playing at the edge of OO and FP programming in deep learning! Very complex things become easy, and vice versa.
Shared by Simone Scardapane at 1/19/2022
*PiRank: Scalable Learning To Rank via Differentiable Sorting* by @conv3xhull @adityagrover_ Charron @StefanoErmon Nice #NeurIPS2021 paper on learning-to-rank via a simple relaxation of permutation matrices. https://t.co/wnW8uSPPmK
Shared by Simone Scardapane at 11/29/2021
Has anyone tested this? It looks like a pretty big deal for any #PyTorch user. 👀
Shared by Simone Scardapane at 11/12/2021
*Adaptive Machine Unlearning* by Gupta @crispy_jung @sethvneel @Aaroth Sharifi-Malvajerdi @ChrisWaites (Algorithms to handle unlearning when the probability of a certain request depends on the model itself) https://t.co/QayRVReCgR 2/3
Shared by Simone Scardapane at 11/4/2021
Time to go through #NeurIPS2021 submissions! Today: two awesome papers studying the link between machine unlearning & differential privacy 👇 1/3
Shared by Simone Scardapane at 11/4/2021
Several ideas expand on Chapter 7 from @D2L_ai. The geometric deep learning bits are instead from the fantastic GDL overview by @mmbronstein @joanbruna @TacoCohen @PetarV_93 : https://t.co/lVM7tAwOrY /n
Shared by Simone Scardapane at 11/3/2021
*Neural networks for data science* lecture 4 is out! 👇 aka "here I am talking about convolutional neural networks while everyone asks me about transformers" /n
Shared by Simone Scardapane at 11/3/2021
*VQ-GNN: A Framework to Scale up GNNs using Vector Quantization* by @KezhiKong @johnpdickerson @furongh @tomgoldsteincs et al. #NeurIPS Elegant way to do mini-batching of nodes in a GNN, by approximating the remaining graph with a vector quantization. https://t.co/wz8NpA6H6b
Shared by Simone Scardapane at 11/2/2021
You can also find a set of exercises to investigate https://t.co/WudvfEYBra, training models from scratch, using TensorFlow metrics, etc. /n https://t.co/6IS5DL5AAz
Shared by Simone Scardapane at 10/21/2021
Neural networks for data science: lecture 2 is out! Learn linear regression, and you are halfway there to deep learning. 🙃 👇 /n
Shared by Simone Scardapane at 10/6/2021
I find historical perspectives on deep learning that center on data quite refreshing. Modern books on ML (e.g. "Patterns, Predictions, and Actions" by @beenwrekt @mrtz) are making great strides in popularizing them. 5/x
Shared by Simone Scardapane at 10/5/2021
*On the genealogy of ML datasets: A critical history of ImageNet* by @cephaloponderer @alexhanna @amironesei @andrewthesmart Nicole Intriguing article exploring some norms & values implicit in datasets such as ImageNet, and the outsized influence they end up having in ML. 1/x
Shared by Simone Scardapane at 10/5/2021
*Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking* by @michael_sejr @nicola_decao @iatitov Cool #ICLR2021 spotlight paper on finding explanations for graph neural networks that are as faithful as possible. https://t.co/pPRJov5yCa
Shared by Simone Scardapane at 9/30/2021
*Meta-Learning for Training Explainable Graph Neural Networks* Can we train a model to be more explainable? In our new preprint, we experiment with meta-learning and graph networks by optimizing on-the-fly explanations. cc @IndroSpinelli @IspammL https://t.co/1iVOhrb8gB
Shared by Simone Scardapane at 9/28/2021
FastFormer: yet another attention model that might be everything you need. 🤞 It simplifies the transformer by summarizing the queries, and then the query-key interactions, into global vectors with two learnable sets of attention coefficients. Sort of unintuitive, but it seems to work very well. https://t.co/NTRf6RkhHD
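My rough reading of the block, as a simplified single-head PyTorch sketch (no residuals or output projections, and not the authors' code): queries are pooled into one global query with learned weights, mixed into the keys, and the result is pooled again into a global key and mixed into the values.

```python
import torch
import torch.nn as nn

class FastAttentionSketch(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q, self.k, self.v = nn.Linear(dim, dim), nn.Linear(dim, dim), nn.Linear(dim, dim)
        self.wq = nn.Linear(dim, 1)  # coefficients to pool the queries
        self.wk = nn.Linear(dim, 1)  # coefficients to pool the query-key interactions

    def forward(self, x):                                  # x: (batch, seq, dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        alpha = torch.softmax(self.wq(q), dim=1)           # (batch, seq, 1)
        global_q = (alpha * q).sum(dim=1, keepdim=True)    # global query, (batch, 1, dim)
        p = k * global_q                                   # query-key interaction
        beta = torch.softmax(self.wk(p), dim=1)
        global_k = (beta * p).sum(dim=1, keepdim=True)     # global key
        return v * global_k                                # (batch, seq, dim)
```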
Shared by Simone Scardapane at 9/13/2021
*Towards Domain-Agnostic Contrastive Learning* #ICML 2021 by @vikasverma1077 @lmthang Kawaguchi @hieupham789 @quocleix Here's an interesting question: can we do self-supervised learning if we know *nothing* about the domain we are operating on? /n
Shared by Simone Scardapane at 7/26/2021
Quick reminder that we are live at 18:30 if you are interested in speech processing and self-supervision. We are hosting on Google Meet, so register on Meetup or Eventbrite to access the link: @SpeechBrain1 @ParcolletT @mcaron31 https://t.co/cXqinP1jAe
Shared by Simone Scardapane at 7/7/2021
Can't wait for the code release in #JAX! They have a lot of nice experiments including dataset distillation and molecular dynamics. The paper is here: https://t.co/OWVmuhefAz Thanks to @unsorsodicorda for the paper suggestion. 🙃
Shared by Simone Scardapane at 7/5/2021
⚡️Organized by the Rome Machine Learning / Data Science Meetup; ⚡️Supported by @iaml_it; ⚡️Free registration on Eventbrite; ⚡️English (exceptionally this month); ⚡️ July 7th at 18:30 CEST: https://t.co/cXqinP1jAe
Shared by Simone Scardapane at 6/21/2021
Twitter friends, time for suggestions! I have a neural network course that I teach with slides (Beamer) & notebooks. I'd like to innovate a little, but I am unsure about nice teaching tools that have math & code support & are collaborative. Any ideas? Am I too old?
Shared by Simone Scardapane at 6/18/2021
Naive score-based models are rarely used in practice, because sampling starts in regions where the score is poorly approximated. The solution is noise-conditional score models, which perturb the original input at multiple noise scales and generate data using "annealed" Langevin dynamics. https://t.co/68WHvyWc3H
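For context, one step of Langevin dynamics with a learned score looks like this (a sketch; annealing just repeats it while decreasing the noise level sigma from large to small):

```python
import torch

def langevin_step(x, score_fn, sigma, step_size):
    # One Langevin update with a learned, noise-conditional score s_theta(x, sigma):
    # x <- x + (step_size / 2) * score + sqrt(step_size) * gaussian noise.
    noise = torch.randn_like(x)
    return x + 0.5 * step_size * score_fn(x, sigma) + (step_size ** 0.5) * noise
```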
Shared by Simone Scardapane at 6/16/2021
Denoising diffusion turns out to be similar to "score-based" models, pioneered by @YSongStanford and @StefanoErmon. @YSongStanford has written an outstanding blog post on these ideas, so I'll just skim some of the most interesting connections: https://t.co/Xf4ZduMSAd /n
Shared by Simone Scardapane at 6/16/2021
A cornerstone in diffusion models is the introduction of "denoising" versions by @hojonathanho @ajayj_ @pabbeel They showed how to make diffusion models perform close to the state-of-the-art using a suitable reformulation of their training objective. /n
Shared by Simone Scardapane at 6/16/2021
The starting point for diffusion models is probably "Deep Unsupervised Learning using Nonequilibrium Thermodynamics" by @jaschasd Weiss @niru_m @SuryaGanguli Classic paper, definitely worth reading: https://t.co/ytpVNw2RHV /n
Shared by Simone Scardapane at 6/16/2021
*Score-based diffusion models* An emerging approach in generative modelling that is gathering more and more attention. If you are interested, I collected some introductory material and thoughts in a small thread. 👇 Feel free to weigh in with additional material! /n
Shared by Simone Scardapane at 6/16/2021
What is intriguing is that the framework interpolates between multiple scenarios: the first solution step recovers the original gradient descent update, while the closed-form solution (in one case) is similar to a pre-conditioned gradient descent step. Optimization is "local" in the sense that it decouples across layers. /n
Shared by Simone Scardapane at 6/14/2021
*LocoProp: Enhancing BackProp via Local Loss Optimization* by @esiamid @_arohan_ & Warmuth Interesting approach to bridge the gap between first-order, second-order, and "local" optimization approaches. 👇 /n
Shared by Simone Scardapane at 6/14/2021
*How Attentive are Graph Attention Networks?* If you have ever used GATs, unmissable paper by Brody @urialon1 @yahave 👇 They show the standard formulation of GAT suffers from a significant limitation, solved easily by modifying the attention mechanism. 1/2
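In a nutshell (my paraphrase of the unnormalized attention scores only, not the authors' code): GAT applies the LeakyReLU after the learned attention vector, GATv2 applies it before, which is what makes the attention genuinely query-dependent.

```python
import torch
import torch.nn.functional as F

def gat_score(a, Whi, Whj):
    # Original GAT: LeakyReLU is applied *after* the attention vector a,
    # so the ranking of neighbours ends up the same for every query node ("static").
    return F.leaky_relu(torch.dot(a, torch.cat([Whi, Whj])))

def gatv2_score(a, W, hi, hj):
    # GATv2 fix: apply W and the LeakyReLU *before* a, making attention dynamic.
    return torch.dot(a, F.leaky_relu(W @ torch.cat([hi, hj])))
```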
Shared by Simone Scardapane at 6/3/2021
Because I got asked by so many people, I have open-sourced all videos for my Reproducible Deep Learning course! 🥳 ~ 16 hours to combine with 13 code branches and quite a lot of slides. Videos are far from polished but I hope you enjoy them. 🙃 All here: https://t.co/wwKkJDhM8Z
Shared by Simone Scardapane at 6/1/2021
*Learning Hydra for configuring ML experiments* I wrote a lengthy tutorial on using @Hydra_Framework in #machinelearning experiments. I go from basic configuration to advanced use-cases, including instantiating a model directly and validating the schema. 👇 1/3
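To give a flavour (a toy sketch with made-up config values, not the tutorial's actual files): a `_target_` field in the YAML lets Hydra instantiate the model for you.

```python
# conf/config.yaml (toy example):
#   model:
#     _target_: torch.nn.Linear
#     in_features: 32
#     out_features: 10

import hydra
from hydra.utils import instantiate
from omegaconf import DictConfig

@hydra.main(config_path="conf", config_name="config")
def main(cfg: DictConfig):
    model = instantiate(cfg.model)  # builds torch.nn.Linear(32, 10) from the config
    print(model)

if __name__ == "__main__":
    main()
```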
Shared by Simone Scardapane at 5/26/2021
We are running all these experiments, but are we even sure the code works as intended? In exercise 6, we use a function from @PyTorchLightnin to overfit on a random batch of data and validate the training loop in a unit test. Inspired by @karpathy: https://t.co/uZzjDGrRNp /n
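A minimal sketch of the idea (I believe the flag in question is the Trainer's `overfit_batches`; the metric name below is hypothetical): train on a single batch and check that the loss collapses, as a sanity check of the whole training loop.

```python
from pytorch_lightning import Trainer

def test_can_overfit_single_batch(model, datamodule):
    trainer = Trainer(overfit_batches=1, max_epochs=50, logger=False)
    trainer.fit(model, datamodule=datamodule)
    # "train_loss" is a hypothetical metric name logged by the LightningModule.
    assert trainer.callback_metrics["train_loss"] < 0.05
```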
Shared by Simone Scardapane at 5/20/2021
*Reproducible Deep Learning* Lectures 5 and 6 are out! Now that our code, data, and environment are all safely versioned, what to do? Run a lot of experiments! We use @weights_biases to manage the runs, and @github Actions to sparkle some CI. Small thread below. 👇 /n
Shared by Simone Scardapane at 5/20/2021