Computational neuroscience has lately had great success at modeling perception with ANNs – but it has been unclear if this approach translates to higher cognitive systems. We made some exciting progress in modeling human language processing https://www.biorxiv.org/content/10.1101/2020.06.26.174482v1.
This work is the result of a terrific collaboration with Idan A. Blank, Greta Tuckute, Carina Kauf, Eghbal A. Hosseini, Nancy Kanwisher, Josh Tenenbaum and Ev Fedorenko.
Work by Ev Fedorenko and others has localized the language network as a set of regions that support high-level language processing (e.g. https://www.sciencedirect.com/science/article/pii/S136466131300288X) BUT the actual mechanisms underlying human language processing have remained unknown.
To evaluate model candidates of mechanisms, we use previously published human recordings: fMRI activations to short passages (Pereira et al., 2018), ECoG recordings to single words in diverse sentences (Fedorenko et al., 2016), fMRI to story fragments (Blank et al. 2014). More specifically, we present the same stimuli to models that were presented to humans and “record” model activations. We then compute a correlation score of how well the model recordings can predict human recordings with a regression fit on a subset of the stimuli.
Since we also want to figure out how close model predictions are to the internal reliability of the data, we extrapolate a ceiling of how well an “infinite number of subjects” could predict individual subjects in the data. Scores are normalized by this estimated ceiling.
So how well do models actually predict our recordings? We tested 43 diverse language models, incl. embedding, recurrent, and transformer models. Specific models (GPT2-xl) predict some of the data near perfectly, and consistently across datasets. Embeddings like GloVe do not.
The scores of models are further predicted by the task performance of models to predict the next word on the WikiText-2 language modeling dataset (evaluated as perplexity, lower is better) – but NOT by task performance on any of the GLUE benchmarks.
Since we only care about neurons because they support interesting behaviors, we tested how well models predict human reading times: specific models again do well and their success correlates with 1) their neural scores, and 2) their performance on the next-word prediction task.
We also explored the relative contributions to brain predictivity of two different aspects of model design: network architecture and training experience, ~akin to evolutionary and learning-based optimization. (see also this recent work). Intrinsic architectural properties (like size and directionality) in some models already yield representational spaces that – without any training – reliably predict brain activity. These untrained scores predict scores after training. While deep learning is mostly focused on the learning part, architecture alone works surprisingly well even on the next-word prediction task. Critically for the brain datasets, a random embedding with the same number of features as GPT2-xl does not yield reliable predictions.
Summary: 1) specific models accurately predict human language data; 2) their neural predictivity is correlated with task performance to predict the next word, 3) and with their ability to predict human reading times; 4) architecture alone already yields reasonable scores. These results suggest that predicting future inputs may shape human language processing, and they enable using ANNs as embodied hypotheses of brain mechanisms. To fuel future generations of neurally plausible models, we will soon release all our code and data.
Certain ANNs are surprisingly good models of primate vision, but require millions of supervised synaptic updates — this unbiological development has been the recent focus of many discussions in neuroscience. Is all this training really necessary? We approach this in new work https://www.biorxiv.org/content/10.1101/2020.06.08.140111v1.
Neuroscientists have argued for innate structure with only thin learning on top, i.e. where structure the genome dictates brain connectivity and is leveraged for rapid experience-dependent development. We took first steps at this with more brain-like neural networks.
We started from CORnet-S, the current top model on neural and behavioral benchmarks in Brain-Score.org. We first found that variants of this model which are trained for only 2% of supervised updates (epochs x images) already achieve 80% of the trained model’s score.
Even without any updates, the models’ brain predictivities are well above chance. Examining this “at-birth” synaptic connectivity and improving it with a new method “Weight Compression”, we can reach 54% without any training at all
However, to be more brain-like we require at least some training — but ideally this would not change millions of synapses requiring precise machinery to coordinate the updates. By training only critical down-sampling layers, we achieve 80% when updating only 5% of synapses.
Applying these three strategies in combination (reducing supervised epochs x images + improved at-birth connectivity + reducing synaptic updates), we achieve ~80% of a fully trained model’s brain predictivity with two orders of magnitude fewer supervised synaptic updates.
Taking a step back, we think these are first steps to model not just primate adult visual processing during inference, but also how the system is wired up from an evolutionary birth state encoded in the genome and by developmental update rules. Lots more work to do!
Very excited to share that I was awarded the Takeda Fellowship on research at the intersection of AI and Health!
The work lead by Jonas Kubilius and me on “Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs” was accepted to NeurIPS as an oral! Only 36 (or 0.5%) out of the 6743 submissions were selected as an oral, so we’re very excited to present our ideas how Machine Learning and Neuroscience can interact again in the form of models of the brain.
The McGovern institute and Robin.ly have also reported about our findings.
The field of Machine Learning is doing pretty well at quantifying its goals and progress, yet Neuroscience is lagging behind in that regard — current claims are often qualitative and not rigorously compared with other models across a wider spectrum of tasks.
Brain-Score is our attempt to speed up progress in Neuroscience by providing a platform where models and data can compete against each other: https://www.biorxiv.org/content/early/2018/09/05/407007
Deep neural networks trained on ImageNet classification do the best on our current set of benchmarks and there is a lot of criticism about the mis-alignment between these networks and the primate ventral stream: mapping between the many layers and brain regions is unclear, the models are too large and are just static feed-forward processors.
We thus created a more brain-like model, “CORnet”, which does well on Brain-Score with only four areas and recurrent processing: https://www.biorxiv.org/content/early/2018/09/04/408385
EDIT: Science Magazine wrote a news piece about the use of deep neural networks as models of the brain with the final paragraphs devoted to Brain-Score: http://sciencemag.org/news/2018/09/smarter-ais-could-help-us-understand-how-our-brains-interpret-world
Finally out (in PNAS)! Our paper on recurrent computations for the recognition of occluded objects, in humans as well as models. Feed-forward alone doesn’t seem to cut it, but attractor dynamics help; similarly the brain requires recurrent processing to untangle highly occluded images.
We have some pretty visualization gifs in the github, along with the code: https://github.com/kreimanlab/occlusion-classification
EDIT: MIT News covered our work, along with a video of us giving the intuition behind it: http://news.mit.edu/2018/mit-martin-schrimpf-advancing-machine-ability-recognize-partially-seen-objects-0920
Integreat has been selected as one of 10 finalists at Google.org’s impact challenge in the category “lighthouse projects”! This secures 250,000€ of funding that we will use to expand on the job market capabilities.
[Google.org page] [selected press coverage]
Summer Internship work is out in ICLR! Automatic architecture search finds non-intuitive (at least to me) architecture including sine curves and division.
I’m really glad to have worked with a fantastic team at Salesforce Research, most closely with Stephen Merity and Richard Socher.
Paper + Reviews: https://openreview.net/forum?id=SkOb1Fl0Z
It’s done! I finished my Master’s Thesis which focused on the idea and implementation of recurrent neural networks in computer vision, inspired by findings in neuroscience. The two main applications of this technique shown here are the recognition of partially occluded objects and the integration of context cues.
Here’s the link: Brain-inspired Recurrent Neural Algorithms for Advanced Object Recognition – Martin Schrimpf
There is a new project we are beginning to look into which analyzes today’s neural networks in terms of stability and plasticity.
More explicitly, we evaluate how well these networks can cope with changes to their weights and how well they can adapt to new information. Some preliminary results suggest that if weights in lower layers are perturbed, this has a more severe effect on performance than if higher layers are perturbed. This has a nice correlation to neuroscience where it is assumed that our hierarchically lower cortical layers in the visual cortex remain rather fixed over the years.
Update: we just uploaded a version to arXiv (https://arxiv.org/abs/1703.08245) which is currently under review at ICML.