Please also see my Google Scholar and Semantic Scholar profiles for a list of all publications.
2022
Geiger*, Franziska; Schrimpf*, Martin; Marques, Tiago; DiCarlo, James J.
Wiring Up Vision: Minimizing Supervised Synaptic Updates Needed to Produce a Primate Ventral Stream Inproceedings
In: International Conference on Learning Representations (ICLR) Spotlight, 2022.
@inproceedings{GeigerSchrimpf2022WiringUp,
title = {Wiring Up Vision: Minimizing Supervised Synaptic Updates Needed to Produce a Primate Ventral Stream},
author = {Franziska Geiger* and Martin Schrimpf* and Tiago Marques and James J. DiCarlo},
url = {https://openreview.net/forum?id=g1SzIRLQXMM},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {International Conference on Learning Representations (ICLR) Spotlight},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2021
Schrimpf, Martin; Blank, Idan A; Tuckute, Greta; Kauf, Carina; Hosseini, Eghbal A; Kanwisher, Nancy G; Tenenbaum, Joshua B; Fedorenko, Evelina
The neural architecture of language: Integrative modeling converges on predictive processing Journal Article
In: Proceedings of the National Academy of Sciences (PNAS), 2021.
@article{Schrimpf2021language,
title = {The neural architecture of language: Integrative modeling converges on predictive processing},
author = {Martin Schrimpf and Idan A Blank and Greta Tuckute and Carina Kauf and Eghbal A Hosseini and Nancy G Kanwisher and Joshua B Tenenbaum and Evelina Fedorenko},
url = {https://www.pnas.org/content/118/45/e2105646118},
doi = {10.1073/pnas.2105646118},
year = {2021},
date = {2021-11-09},
urldate = {2021-11-20},
journal = {Proceedings of the National Academy of Sciences (PNAS)},
publisher = {National Academy of Sciences},
abstract = {The ability to share ideas through language is our species' signature cognitive skill, but how this feat is achieved by the brain remains unknown. Inspired by the success of artificial neural networks (ANNs) in explaining neural responses in perceptual tasks (Kell et al., 2018; Khaligh-Razavi & Kriegeskorte, 2014; Schrimpf et al., 2018; Yamins et al., 2014; Zhuang et al., 2017), we here investigated whether state-of-the-art ANN language models (e.g. Devlin et al., 2018; Pennington et al., 2014; Radford et al., 2019) capture human brain activity elicited during language comprehension. We tested 43 language models spanning major current model classes on three neural datasets (including neuroimaging and intracranial recordings) and found that the most powerful generative transformer models (Radford et al., 2019) accurately predict neural responses, in some cases achieving near-perfect predictivity relative to the noise ceiling. In contrast, simpler word-based embedding models (e.g. Pennington et al., 2014) only poorly predict neural responses (<10% predictivity). Models' predictivities are consistent across neural datasets, and also correlate with their success on a next-word-prediction task (but not other language tasks) and ability to explain human comprehension difficulty in an independent behavioral dataset. Intriguingly, model architecture alone drives a large portion of brain predictivity, with each model's untrained score predictive of its trained score. These results support the hypothesis that a drive to predict future inputs may shape human language processing, and perhaps the way knowledge of language is learned and organized in the brain. In addition, the finding of strong correspondences between ANNs and human representations opens the door to using the growing suite of tools for neural network interpretation to test hypotheses about the human mind.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Schrimpf, Martin; Mc Grath, Paul; DiCarlo, James J.
Topographic artificial neural networks predict the behavioral effects of causal perturbations in primate visual ventral stream IT Inproceedings
In: Champalimaud Research Symposium (CRS), 2021.
@inproceedings{Schrimpf2021topographic,
title = {Topographic artificial neural networks predict the behavioral effects of causal perturbations in primate visual ventral stream IT},
author = {Martin Schrimpf and Paul Mc Grath and James J. DiCarlo},
url = {https://mschrimpf.altervista.org/wp-content/uploads/2021/11/20211006-CRS21-poster-1.pdf},
year = {2021},
date = {2021-10-15},
urldate = {2021-10-15},
booktitle = {Champalimaud Research Symposium (CRS)},
abstract = {Particular artificial neural networks have recently been shown to functionally and behaviorally correspond to the primate ventral visual stream. From external visual inputs, these models are able to predict the neural activity in early through high-level visual cortex IT as well as object recognition behaviors supported by the hierarchy of cortical regions. A critical link however is missing: how do causal perturbations of internal neural activity affect behavioral outputs such as object and face categorization?
Here, we test model ability to predict three different behavioral effects of experimental perturbation studies (Rajalingham & DiCarlo 2019, Afraz et al. 2006, Afraz et al. 2015), covering changes to neural activity in inferotemporal cortex IT via pharmacological and optogenetic suppression as well as micro-stimulation and their effects on object, face, and gender categorization.
We model neural perturbations with a relative change of internal activity scaled by a Gaussian that matches the millimeter-scale spatial spread expected experimentally (Arikan et al. 2002).
Models with a random spatial layout of neurons are unable to predict any of these behavioral effects of causal perturbations.
Guiding model neuronal spatial organization by wiring cost minimization (Lee et al. 2020) on the other hand leads to topographic networks that qualitatively reproduce the experimentally observed behavioral outcomes following causal perturbations.
These results open the door to modeling causal experiments that contemporary models fail to capture. Moreover, they are a crucial step towards precise predictions of behavioral differences elicited via diverse perturbation patterns which could enable the next generation of brain machine interfaces.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
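As a rough illustration of the perturbation model described in this abstract, the following minimal Python sketch scales each unit's activity by a Gaussian suppression profile centered on a virtual injection site. All numbers (unit count, cortical coordinates, suppression strength, the 1 mm spread) are illustrative assumptions, not the parameters used in the poster.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tissue map: 1000 units at random cortical positions (in mm).
positions = rng.uniform(0, 8, size=(1000, 2))
activity = rng.gamma(shape=2.0, scale=1.0, size=1000)  # baseline firing rates

def perturb(activity, positions, site, strength=0.8, sigma_mm=1.0):
    """Scale each unit's activity down by a Gaussian centered at `site`,
    mimicking the millimeter-scale spatial spread of a suppression experiment."""
    d2 = ((positions - site) ** 2).sum(axis=1)
    suppression = strength * np.exp(-d2 / (2 * sigma_mm ** 2))
    return activity * (1 - suppression)

perturbed = perturb(activity, positions, site=np.array([4.0, 4.0]))
print(f"mean activity: {activity.mean():.2f} -> {perturbed.mean():.2f}")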
Kar, Kohitij; Schrimpf, Martin; Schmidt, Kailyn; DiCarlo, James J
Chemogenetic suppression of macaque V4 neurons produces retinotopically specific deficits in downstream IT neural activity patterns and core object recognition behavior Inproceedings
In: Journal of Vision, vol. 21, no. 9, pp. 2489, The Association for Research in Vision and Ophthalmology, 2021, ISSN: 1534-7362.
@inproceedings{Kar2021,
title = {Chemogenetic suppression of macaque V4 neurons produces retinotopically specific deficits in downstream IT neural activity patterns and core object recognition behavior},
author = {Kohitij Kar and Martin Schrimpf and Kailyn Schmidt and James J DiCarlo},
doi = {10.1167/jov.21.9.2489},
issn = {1534-7362},
year = {2021},
date = {2021-09-01},
booktitle = {Journal of Vision},
volume = {21},
number = {9},
pages = {2489},
publisher = {The Association for Research in Vision and Ophthalmology},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Marques, Tiago; Schrimpf, Martin; DiCarlo, James J
Multi-scale hierarchical neural network models that bridge from single neurons in the primate primary visual cortex to object recognition behavior Technical Report
2021.
@techreport{Marques2021,
title = {Multi-scale hierarchical neural network models that bridge from single neurons in the primate primary visual cortex to object recognition behavior},
author = {Tiago Marques and Martin Schrimpf and James J DiCarlo},
url = {https://www.biorxiv.org/content/10.1101/2021.03.01.433495v2 https://www.biorxiv.org/content/10.1101/2021.03.01.433495v2.abstract https://www.biorxiv.org/content/10.1101/2021.03.01.433495v1},
doi = {10.1101/2021.03.01.433495},
year = {2021},
date = {2021-08-01},
urldate = {2021-08-01},
journal = {bioRxiv preprint},
publisher = {Cold Spring Harbor Laboratory},
abstract = {Object recognition relies on inferior temporal (IT) cortical neural population representations that are themselves computed by a hierarchical network of feedforward and recurrently connected neural populations called the ventral visual stream (areas V1, V2, V4 and IT). While recent work has created some reasonably accurate image-computable hierarchical neural network models of those neural stages, those models do not yet bridge between the properties of individual neurons and the overall emergent behavior of the ventral stream. For example, current leading ventral stream models do not allow us to ask questions such as: How does the surround suppression behavior of individual V1 neurons ultimately relate to IT neural representation and to behavior?; or How would deactivation of a particular sub-population of V1 neurons specifically alter object recognition behavior? One reason we cannot yet do this is that individual V1 artificial neurons in multi-stage models have not been shown to be functionally similar with individual biological V1 neurons. Here, we took an important first step towards this direction by building and evaluating hundreds of hierarchical neural network models in how well their artificial single neurons approximate macaque primary visual cortical (V1) neurons. We found that single neurons in some models are surprisingly similar to their biological counterparts and that the distributions of single neuron properties, such as those related to orientation and spatial frequency tuning, approximately match those in macaque V1. Crucially, we also observed that hierarchical models with V1-layers that better match macaque V1 at the single neuron level are also more aligned with human object recognition behavior. These results provide the first multi-stage, multi-scale models that allow our field to ask precisely how the specific properties of individual V1 neurons relate to recognition behavior. Finally, we here show that an optimized classical neuroscientific model of V1 is still more functionally similar to primate V1 than all of the tested multi-stage models, suggesting that further model improvements are possible, and that those improvements would likely have tangible payoffs in terms of behavioral prediction accuracy and behavioral robustness.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Gan, Chuang; Schwartz, Jeremy; Alter, Seth; Schrimpf, Martin; Traer, James; De Freitas, Julian; Kubilius, Jonas; Bhandwaldar, Abhishek; Haber, Nick; Sano, Megumi; Kim, Kuno; Wang, Elias; Mrowca, Damian; Lingelbach, Michael; Curtis, Aidan; Feigelis, Kevin; Bear, Daniel M; Gutfreund, Dan; Cox, David; DiCarlo, James J; McDermott, Josh; Tenenbaum, Joshua B; Yamins, Daniel L K
ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation Inproceedings
In: Neural Information Processing Systems (NeurIPS; oral), 2021.
@inproceedings{Gan2021,
title = {ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation},
author = {Chuang Gan and Jeremy Schwartz and Seth Alter and Martin Schrimpf and James Traer and Julian {De Freitas} and Jonas Kubilius and Abhishek Bhandwaldar and Nick Haber and Megumi Sano and Kuno Kim and Elias Wang and Damian Mrowca and Michael Lingelbach and Aidan Curtis and Kevin Feigelis and Daniel M Bear and Dan Gutfreund and David Cox and James J DiCarlo and Josh McDermott and Joshua B Tenenbaum and Daniel L K Yamins},
url = {http://arxiv.org/abs/2007.04954},
year = {2021},
date = {2021-07-01},
urldate = {2021-07-01},
booktitle = {Neural Information Processing Systems (NeurIPS; oral)},
journal = {arXiv preprint},
abstract = {We introduce ThreeDWorld (TDW), a platform for interactive multi-modal physical simulation. With TDW, users can simulate high-fidelity sensory data and physical interactions between mobile agents and objects in a wide variety of rich 3D environments. TDW has several unique properties: 1) realtime near photo-realistic image rendering quality; 2) a library of objects and environments with materials for high-quality rendering, and routines enabling user customization of the asset library; 3) generative procedures for efficiently building classes of new environments; 4) high-fidelity audio rendering; 5) believable and realistic physical interactions for a wide variety of material types, including cloths, liquid, and deformable objects; 6) a range of "avatar" types that serve as embodiments of AI agents, with the option for user avatar customization; and 7) support for human interactions with VR devices. TDW also provides a rich API enabling multiple agents to interact within a simulation and return a range of sensor and physics data representing the state of the world. We present initial experiments enabled by the platform around emerging research directions in computer vision, machine learning, and cognitive science, including multi-modal physical scene understanding, multi-agent interactions, models that "learn like a child", and attention studies in humans and neural networks. The simulation platform will be made publicly available.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Zhuang, Chengxu; Yan, Siming; Nayebi, Aran; Schrimpf, Martin; Frank, Michael C; DiCarlo, James J; Yamins, Daniel L K
Unsupervised Neural Network Models of the Ventral Visual Stream Journal Article
In: Proceedings of the National Academy of Sciences (PNAS), 2021.
@article{Zhuang2021,
title = {Unsupervised Neural Network Models of the Ventral Visual Stream},
author = {Chengxu Zhuang and Siming Yan and Aran Nayebi and Martin Schrimpf and Michael C Frank and James J DiCarlo and Daniel L K Yamins},
url = {https://www.pnas.org/content/118/3/e2014196118.short https://www.biorxiv.org/content/10.1101/2020.06.16.155556v1},
doi = {10.1073/pnas.2014196118},
year = {2021},
date = {2021-06-01},
urldate = {2020-06-01},
journal = {Proceedings of the National Academy of Sciences (PNAS)},
publisher = {National Academy of Sciences},
abstract = {Deep neural networks currently provide the best quantitative models of the response patterns of neurons throughout the primate ventral visual stream. However, such networks have remained implausible as a model of the development of the ventral stream, in part because they are trained with supervised methods requiring many more labels than are accessible to infants during development. Here, we report that recent rapid progress in unsupervised learning has largely closed this gap. We find that neural network models learned with deep unsupervised contrastive embedding methods achieve neural prediction accuracy in multiple ventral visual cortical areas that equals or exceeds that of models derived using today's best supervised methods, and that the mapping of these neural network models' hidden layers is neuroanatomically consistent across the ventral stream. Moreover, we find that these methods produce brain-like representations even when trained on noisy and limited data measured from real children's developmental experience. We also find that semi-supervised deep contrastive embeddings can leverage small numbers of labelled examples to produce representations with substantially improved error-pattern consistency to human behavior. Taken together, these results suggest that deep contrastive embedding objectives may be a biologically-plausible computational theory of primate visual development.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
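The deep contrastive embedding objectives discussed in this abstract share a common core that is easy to sketch: an InfoNCE-style loss that pulls embeddings of two augmented views of the same image together and pushes all other pairings apart. The toy below is a generic illustration of that idea (as in methods like SimCLR), not the specific objectives benchmarked in the paper; all sizes and data are made up.

import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """Toy InfoNCE loss: rows of z1 and z2 are embeddings of two views of
    the same items; matched pairs sit on the diagonal of the logits."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature             # (n, n) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))        # positives on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(32, 128))
noise = 0.05 * rng.normal(size=z.shape)
print(f"loss for matched views: {info_nce(z, z + noise):.3f}")
print(f"loss for random views:  {info_nce(z, rng.normal(size=z.shape)):.3f}")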
2020
Dapello, Joel; Marques, Tiago; Schrimpf, Martin; Geiger, Franziska; Cox, David D; DiCarlo, James J
Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations Inproceedings
In: Neural Information Processing Systems (NeurIPS; spotlight), 2020.
@inproceedings{DapelloMarques2020,
title = {Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations},
author = {Joel Dapello and Tiago Marques and Martin Schrimpf and Franziska Geiger and David D Cox and James J DiCarlo},
url = {https://www.biorxiv.org/content/10.1101/2020.06.16.154542v1},
doi = {10.1101/2020.06.16.154542},
year = {2020},
date = {2020-06-01},
urldate = {2020-06-01},
booktitle = {Neural Information Processing Systems (NeurIPS; spotlight)},
journal = {bioRxiv preprint},
abstract = {Current state-of-the-art object recognition models are largely based on convolutional neural network (CNN) architectures, which are loosely inspired by the primate visual system. However, these CNNs can be fooled by imperceptibly small, explicitly crafted perturbations, and struggle to recognize objects in corrupted images that are easily recognized by humans. Here, by making comparisons with primate neural data, we first observed that CNN models with a neural hidden layer that better matches primate primary visual cortex (V1) are also more robust to adversarial attacks. Inspired by this observation, we developed VOneNets, a new class of hybrid CNN vision models. Each VOneNet contains a fixed weight neural network front-end that simulates primate V1, called the VOneBlock, followed by a neural network back-end adapted from current CNN vision models. The VOneBlock is based on a classical neuroscientific model of V1: the linear-nonlinear-Poisson model, consisting of a biologically-constrained Gabor filter bank, simple and complex cell nonlinearities, and a V1 neuronal stochasticity generator. After training, VOneNets retain high ImageNet performance, but each is substantially more robust, outperforming the base CNNs and state-of-the-art methods by 18% and 3%, respectively, on a conglomerate benchmark of perturbations comprised of white box adversarial attacks and common image corruptions. Finally, we show that all components of the VOneBlock work in synergy to improve robustness. While current CNN architectures are arguably brain-inspired, the results presented here demonstrate that more precisely mimicking just one stage of the primate visual system leads to new gains in ImageNet-level computer vision applications.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
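A heavily simplified sketch of the VOneBlock idea described above: a fixed Gabor filter bank, a simple-cell rectifying nonlinearity, and Poisson-like stochasticity. The filter parameters and the single nonlinearity here are placeholders; the actual VOneBlock uses a biologically constrained distribution of Gabor parameters plus both simple- and complex-cell nonlinearities.

import numpy as np
from scipy.signal import convolve2d

def gabor(size=15, theta=0.0, freq=0.2, sigma=3.0):
    """One Gabor filter; parameter values are arbitrary placeholders."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

def v1_front_end(image, n_orientations=8, seed=0):
    """Fixed Gabor bank -> rectification -> Poisson-like stochasticity."""
    rng = np.random.default_rng(seed)
    responses = []
    for i in range(n_orientations):
        r = convolve2d(image, gabor(theta=i * np.pi / n_orientations), mode="valid")
        r = np.maximum(r, 0)              # simple-cell nonlinearity
        responses.append(rng.poisson(r))  # stochastic spike counts
    return np.stack(responses)

image = np.random.default_rng(1).random((64, 64))
print(v1_front_end(image).shape)  # -> (8, 50, 50)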
Geiger, Franziska; Schrimpf, Martin; Marques, Tiago; DiCarlo, James J
Wiring Up Vision: Minimizing Supervised Synaptic Updates Needed to Produce a Primate Ventral Stream Technical Report
2020.
@techreport{GeigerSchrimpf2020,
title = {Wiring Up Vision: Minimizing Supervised Synaptic Updates Needed to Produce a Primate Ventral Stream},
author = {Franziska Geiger and Martin Schrimpf and Tiago Marques and James J DiCarlo},
url = {https://www.biorxiv.org/content/10.1101/2020.06.08.140111v1},
doi = {10.1101/2020.06.08.140111},
year = {2020},
date = {2020-06-01},
journal = {bioRxiv preprint},
publisher = {Cold Spring Harbor Laboratory},
abstract = {After training on large datasets, certain deep neural networks are surprisingly good models of the neural mechanisms of adult primate visual object recognition. Nevertheless, these models are poor models of the development of the visual system because they posit millions of sequential, precisely coordinated synaptic updates, each based on a labeled image. While ongoing research is pursuing the use of unsupervised proxies for labels, we here explore a complementary strategy of reducing the required number of supervised synaptic updates to produce an adult-like ventral visual stream (as judged by the match to V1, V2, V4, IT, and behavior). Such models might require less precise machinery and energy expenditure to coordinate these updates and would thus move us closer to viable neuroscientific hypotheses about how the visual system wires itself up. Relative to the current leading model of the adult ventral stream, we here demonstrate that the total number of supervised weight updates can be substantially reduced using three complementary strategies: First, we find that only 2% of supervised updates (epochs and images) are needed to achieve ~80% of the match to adult ventral stream. Second, by improving the random distribution of synaptic connectivity, we find that 54% of the brain match can already be achieved “at birth” (i.e. no training at all). Third, we find that, by training only ~5% of model synapses, we can still achieve nearly 80% of the match to the ventral stream. When these three strategies are applied in combination, we find that these new models achieve ~80% of a fully trained model's match to the brain, while using two orders of magnitude fewer supervised synaptic updates. These results reflect first steps in modeling not just primate adult visual processing during inference, but also how the ventral visual stream might be “wired up” by evolution (a model's “birth” state) and by developmental learning (a model's updates based on visual experience).},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
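The "train only a small fraction of synapses" strategy has a direct software analogue: freeze most parameters and optimize only a chosen subset. Below is a minimal PyTorch sketch with an arbitrary toy architecture rather than the full ventral-stream models used in the paper; which layers to leave trainable is an illustrative choice, not the paper's.

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3), nn.ReLU(),
    nn.Conv2d(16, 32, 3), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)

for p in model.parameters():
    p.requires_grad = False          # "birth state": no supervised updates
for p in model[-1].parameters():
    p.requires_grad = True           # train only the readout synapses

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-2)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"updating {trainable}/{total} synapses ({trainable / total:.1%})")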
Schrimpf, Martin; Kubilius, Jonas; Lee, Michael J; Murty, N. Apurva Ratan; Ajemian, Robert; DiCarlo, James J.
Integrative Benchmarking to Advance Neurally Mechanistic Models of Human Intelligence Journal Article
In: Neuron, 2020, ISSN: 0896-6273.
@article{Schrimpf2020Integrative,
title = {Integrative Benchmarking to Advance Neurally Mechanistic Models of Human Intelligence},
author = {Martin Schrimpf and Jonas Kubilius and Michael J Lee and N. Apurva Ratan Murty and Robert Ajemian and James J. DiCarlo},
url = {https://doi.org/10.1016/j.neuron.2020.07.040},
doi = {10.1016/j.neuron.2020.07.040},
issn = {0896-6273},
year = {2020},
date = {2020-01-01},
journal = {Neuron},
publisher = {Elsevier Inc.},
abstract = {A potentially organizing goal of the brain and cognitive sciences is to accurately explain domains of human intelligence as executable, neurally mechanistic models. Years of research have led to models that capture experimental results in individual behavioral tasks and individual brain regions. We here advocate for taking the next step: integrating experimental results from many laboratories into suites of benchmarks that, when considered together, push mechanistic models toward explaining entire domains of intelligence, such as vision, language, and motor control. Given recent successes of neurally mechanistic models and the surging availability of neural, anatomical, and behavioral data, we believe that now is the time to create integrative benchmarking platforms that incentivize ambitious, unified models. This perspective discusses the advantages and the challenges of this approach and proposes specific steps to achieve this goal in the domain of visual intelligence with the case study of an integrative benchmarking platform called Brain-Score.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2019
Kubilius*, Jonas; Schrimpf*, Martin; Kar, Kohitij; Rajalingham, Rishi; Hong, Ha; Majaj, Najib J.; Issa, Elias B.; Bashivan, Pouya; Prescott-Roy, Jonathan; Schmidt, Kailyn; Nayebi, Aran; Bear, Daniel; Yamins, Daniel L. K.; DiCarlo, James J.
Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs Inproceedings
In: Neural Information Processing Systems (NeurIPS; oral), 2019.
@inproceedings{KubiliusSchrimpf2019,
title = {Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs},
author = {Jonas Kubilius* and Martin Schrimpf* and Kohitij Kar and Rishi Rajalingham and Ha Hong and Najib J. Majaj and Elias B. Issa and Pouya Bashivan and Jonathan Prescott-Roy and Kailyn Schmidt and Aran Nayebi and Daniel Bear and Daniel L. K. Yamins and James J. DiCarlo},
year = {2019},
date = {2019-12-09},
booktitle = {Neural Information Processing Systems (NeurIPS; oral)},
abstract = {Deep convolutional artificial neural networks (ANNs) are the leading class of candidate models of the mechanisms of visual processing in the primate ventral stream. While initially inspired by brain anatomy, over the past years, these ANNs have evolved from a simple eight-layer architecture in AlexNet to extremely deep and branching architectures, demonstrating increasingly better object categorization performance, yet bringing into question how brain-like they still are. In particular, typical deep models from the machine learning community are often hard to map onto the brain's anatomy due to their vast number of layers and missing biologically-important connections, such as recurrence. Here we demonstrate that better anatomical alignment to the brain and high performance on machine learning as well as neuroscience measures do not have to be in contradiction. We developed CORnet-S, a shallow ANN with four anatomically mapped areas and recurrent connectivity, guided by Brain-Score, a new large-scale composite of neural and behavioral benchmarks for quantifying the functional fidelity of models of the primate ventral visual stream. Despite being significantly shallower than most models, CORnet-S is the top model on Brain-Score and outperforms similarly compact models on ImageNet. Moreover, our extensive analyses of CORnet-S circuitry variants reveal that recurrence is the main predictive factor of both Brain-Score and ImageNet top-1 performance. Finally, we report that the temporal evolution of the CORnet-S "IT" neural population resembles the actual monkey IT population dynamics. Taken together, these results establish CORnet-S, a compact, recurrent ANN, as the current best model of the primate ventral visual stream.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
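The recurrence highlighted in this abstract boils down to reusing a convolutional area's weights over several time steps. The block below is a loose, generic sketch of that idea, not the actual CORnet-S circuit (which adds skip connections, gating, and batch normalization per time step).

import torch
import torch.nn as nn

class RecurrentArea(nn.Module):
    """A conv area unrolled for a few time steps with shared weights,
    in the spirit of CORnet's recurrent areas (illustrative only)."""
    def __init__(self, channels, steps=3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.steps = steps

    def forward(self, x):
        state = torch.zeros_like(x)
        for _ in range(self.steps):          # same weights reused over time
            state = torch.relu(self.conv(x + state))
        return state

area = RecurrentArea(channels=16)
print(area(torch.randn(1, 16, 28, 28)).shape)  # torch.Size([1, 16, 28, 28])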
Jozwik, Kamila Maria; Schrimpf, Martin; Kanwisher, Nancy; DiCarlo, James J.
To find better neural network models of human vision, find better neural network models of primate vision Technical Report
2019.
@techreport{Jozwik2019primatehuman,
title = {To find better neural network models of human vision, find better neural network models of primate vision},
author = {Kamila Maria Jozwik and Martin Schrimpf and Nancy Kanwisher and James J. DiCarlo},
url = {https://www.biorxiv.org/content/10.1101/688390v1.full https://www.biorxiv.org/content/10.1101/688390v1},
doi = {10.1101/688390},
year = {2019},
date = {2019-07-01},
journal = {bioRxiv},
publisher = {Cold Spring Harbor Laboratory},
abstract = {Specific deep artificial neural networks (ANNs) are the current best models of ventral visual processing and object recognition behavior in monkeys. We here explore whether models of non-human primate vision generalize to visual processing in the human primate brain. Specifically, we asked if model match to monkey IT is a predictor of model match to human IT, even when scoring those matches on different images. We found that the model match to monkey IT is a positive predictor of the model match to human IT (R = 0.36), and that this approach outperforms the current standard predictor of model accuracy on ImageNet. This suggests a more powerful approach for pre-selecting models as hypotheses of human brain processing.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Schrimpf*, Martin; Kubilius*, Jonas; Hong, Ha; Majaj, Najib J; Rajalingham, Rishi; Issa, Elias B; Kar, Kohitij; Ziemba, Corey; Bashivan, Pouya; Prescott-Roy, Jonathan; Schmidt, Kailyn; Yamins, Daniel L K; DiCarlo, James J.
Using Brain-Score to Evaluate and Build Neural Networks for Brain-Like Object Recognition Inproceedings
In: Computational and Systems Neuroscience (Cosyne), 2019.
@inproceedings{schrimpf2019cosyne,
title = {Using Brain-Score to Evaluate and Build Neural Networks for Brain-Like Object Recognition},
author = {Martin Schrimpf* and Jonas Kubilius* and Ha Hong and Najib J Majaj and Rishi Rajalingham and Elias B Issa and Kohitij Kar and Corey Ziemba and Pouya Bashivan and Jonathan Prescott-Roy and Kailyn Schmidt and Daniel L K Yamins and James J. DiCarlo},
year = {2019},
date = {2019-01-01},
booktitle = {Computational and Systems Neuroscience (Cosyne)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2018
Bashivan, Pouya; Schrimpf, Martin; Ajemian, Robert; Rish, Irina; Riemer, Matthew; Tu, Yuhai
Continual Learning with Self-Organizing Maps Inproceedings
In: Neural Information Processing Systems (NeurIPS) Continual Learning Workshop, 2018.
@inproceedings{bashivan2018continual,
title = {Continual Learning with Self-Organizing Maps},
author = {Pouya Bashivan and Martin Schrimpf and Robert Ajemian and Irina Rish and Matthew Riemer and Yuhai Tu},
year = {2018},
date = {2018-12-01},
booktitle = {Neural Information Processing Systems (NeurIPS) Continual Learning Workshop},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Kubilius, Jonas; Schrimpf, Martin; Nayebi, Aran; Bear, Daniel; Yamins, Daniel L K; DiCarlo, James J
CORnet: Modeling the Neural Mechanisms of Core Object Recognition Technical Report
2018.
@techreport{KubiliusSchrimpf2018CORnet,
title = {CORnet: Modeling the Neural Mechanisms of Core Object Recognition},
author = {Jonas Kubilius and Martin Schrimpf and Aran Nayebi and Daniel Bear and Daniel L K Yamins and James J DiCarlo},
url = {https://www.biorxiv.org/content/early/2018/09/04/408385 https://www.biorxiv.org/content/biorxiv/early/2018/09/04/408385.full.pdf},
doi = {10.1101/408385},
year = {2018},
date = {2018-09-01},
journal = {bioRxiv},
publisher = {Cold Spring Harbor Laboratory},
abstract = {Deep artificial neural networks with spatially repeated processing (a.k.a., deep convolutional ANNs) have been established as the best class of candidate models of visual processing in primate ventral visual processing stream. Over the past five years, these ANNs have evolved from a simple feedforward eight-layer architecture in AlexNet to extremely deep and branching NAS-Net architectures, demonstrating increasingly better object categorization performance and increasingly better explanatory power of both neural and behavioral responses. However, from the neuroscientist's point of view, the relationship between such very deep architectures and the ventral visual pathway is incomplete in at least two ways. On the one hand, current state-of-the-art ANNs appear to be too complex (e.g., now over 100 levels) compared with the relatively shallow cortical hierarchy (4-8 levels), which makes it difficult to map their elements to those in the ventral visual stream and to understand what they are doing. On the other hand, current state-of-the-art ANNs appear to be not complex enough in that they lack recurrent connections and the resulting neural response dynamics that are commonplace in the ventral visual stream. Here we describe our ongoing efforts to resolve both of these issues by developing a "CORnet" family of deep neural network architectures. Rather than just seeking high object recognition performance (as the state-of-the-art ANNs above), we instead try to reduce the model family to its most important elements and then gradually build new ANNs with recurrent and skip connections while monitoring both performance and the match between each new CORnet model and a large body of primate brain and behavioral data. We report here that our current best ANN model derived from this approach (CORnet-S) is among the top models on Brain-Score, a composite benchmark for comparing models to the brain, but is simpler than other deep ANNs in terms of the number of convolutions performed along the longest path of information processing in the model. All CORnet models are available at github.com/dicarlolab/CORnet, and we plan to update this manuscript and the available models in this family as they are produced.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Schrimpf, Martin; Kubilius, Jonas; Hong, Ha; Majaj, Najib J; Rajalingham, Rishi; Issa, Elias B; Kar, Kohitij; Bashivan, Pouya; Prescott-Roy, Jonathan; Schmidt, Kailyn; Yamins, Daniel L K; DiCarlo, James J
Brain-Score: Which Artificial Neural Network for Object Recognition is most Brain-Like? Technical Report
2018.
@techreport{SchrimpfKubilius2018BrainScore,
title = {Brain-Score: Which Artificial Neural Network for Object Recognition is most Brain-Like?},
author = {Martin Schrimpf and Jonas Kubilius and Ha Hong and Najib J Majaj and Rishi Rajalingham and Elias B Issa and Kohitij Kar and Pouya Bashivan and Jonathan Prescott-Roy and Kailyn Schmidt and Daniel L K Yamins and James J DiCarlo},
url = {https://www.biorxiv.org/content/early/2018/09/05/407007 http://dx.doi.org/10.1101/407007},
doi = {10.1101/407007},
year = {2018},
date = {2018-09-01},
journal = {bioRxiv},
publisher = {Cold Spring Harbor Laboratory},
abstract = {The internal representations of early deep artificial neural networks (ANNs) were found to be remarkably similar to the internal neural representations measured experimentally in the primate brain. Here we ask, as deep ANNs have continued to evolve, are they becoming more or less brain-like? ANNs that are most functionally similar to the brain will contain mechanisms that are most like those used by the brain. We therefore developed Brain-Score, a composite of multiple neural and behavioral benchmarks that score any ANN on how similar it is to the brain's mechanisms for core object recognition, and we deployed it to evaluate a wide range of state-of-the-art deep ANNs. Using this scoring system, we here report that: (1) DenseNet-169, CORnet-S and ResNet-101 are the most brain-like ANNs. (2) There remains considerable variability in neural and behavioral responses that is not predicted by any ANN, suggesting that no ANN model has yet captured all the relevant mechanisms. (3) Extending prior work, we found that gains in ANN ImageNet performance led to gains on Brain-Score. However, correlation weakened at ≥ 70% top-1 ImageNet performance, suggesting that additional guidance from neuroscience is needed to make further advances in capturing brain mechanisms. (4) We uncovered smaller (i.e. less complex) ANNs that are more brain-like than many of the best-performing ImageNet models, which suggests the opportunity to simplify ANNs to better understand the ventral stream. The scoring system used here is far from complete. However, we propose that evaluating and tracking model-benchmark correspondences through a Brain-Score that is regularly updated with new brain data is an exciting opportunity: experimental benchmarks can be used to guide machine network evolution, and machine networks are mechanistic hypotheses of the brain's network and thus drive next experiments. To facilitate both of these, we release Brain-Score.org: a platform that hosts the neural and behavioral benchmarks, where ANNs for visual processing can be submitted to receive a Brain-Score and their rank relative to other models, and where new experimental data can be naturally incorporated.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
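The neural benchmarks behind a score like this typically measure how well a cross-validated linear mapping from model features predicts held-out neural responses. Here is a simplified sketch on synthetic data; the actual Brain-Score metrics involve, e.g., partial least squares regression and normalization by the neural noise ceiling, neither of which is shown, so treat this as an illustration of the scoring logic rather than the real pipeline.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def neural_predictivity(features, responses, n_splits=5):
    """Cross-validated linear mapping from model features to neural
    responses, scored as the median Pearson r across neurons."""
    scores = []
    for train, test in KFold(n_splits, shuffle=True, random_state=0).split(features):
        pred = Ridge(alpha=1.0).fit(features[train], responses[train]).predict(features[test])
        r = [np.corrcoef(pred[:, i], responses[test][:, i])[0, 1]
             for i in range(responses.shape[1])]
        scores.append(np.median(r))
    return float(np.mean(scores))

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 50))        # 200 images x 50 model units
weights = rng.normal(size=(50, 30))
responses = features @ weights + rng.normal(size=(200, 30))  # 30 "neurons"
print(f"predictivity: {neural_predictivity(features, responses):.2f}")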
Tang, Hanlin; Schrimpf, Martin; Lotter, William; Moerman, Charlotte; Paredes, Ana; Caro, Josue Ortega; Hardesty, Walter; Cox, David; Kreiman, Gabriel
Recurrent computations for visual pattern completion Journal Article
In: Proceedings of the National Academy of Sciences, vol. 115, no. 35, 2018, ISSN: 0027-8424.
@article{TangSchrimpfLotter2018Recurrent,
title = {Recurrent computations for visual pattern completion},
author = {Hanlin Tang and Martin Schrimpf and William Lotter and Charlotte Moerman and Ana Paredes and Josue {Ortega Caro} and Walter Hardesty and David Cox and Gabriel Kreiman},
url = {http://www.ncbi.nlm.nih.gov/pubmed/30104363 http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC6126774 http://www.pnas.org/lookup/doi/10.1073/pnas.1719397115},
doi = {10.1073/pnas.1719397115},
issn = {0027-8424},
year = {2018},
date = {2018-08-01},
journal = {Proceedings of the National Academy of Sciences},
volume = {115},
number = {35},
publisher = {National Academy of Sciences},
abstract = {Making inferences from partial information constitutes a critical aspect of cognition. During visual perception, pattern completion enables recognition of poorly visible or occluded objects. We combined psychophysics, physiology, and computational models to test the hypothesis that pattern completion is implemented by recurrent computations and present three pieces of evidence that are consistent with this hypothesis. First, subjects robustly recognized objects even when they were rendered <15% visible, but recognition was largely impaired when processing was interrupted by backward masking. Second, invasive physiological responses along the human ventral cortex exhibited visually selective responses to partially visible objects that were delayed compared with whole objects, suggesting the need for additional computations. These physiological delays were correlated with the effects of backward masking. Third, state-of-the-art feed-forward computational architectures were not robust to partial visibility. However, recognition performance was recovered when the model was augmented with attractor-based recurrent connectivity. The recurrent model was able to predict which images of heavily occluded objects were easier or harder for humans to recognize, could capture the effect of introducing a backward mask on recognition behavior, and was consistent with the physiological delays along the human ventral visual stream. These results provide a strong argument of plausibility for the role of recurrent computations in making visual inferences from partial information.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
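The attractor-based recurrence invoked in this abstract can be illustrated with a classic Hopfield-style network, in which recurrent updates pull an occluded pattern back toward a stored memory. This is a textbook toy on random binary patterns, not the model from the paper.

import numpy as np

rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(3, 100))     # stored "objects"
W = (patterns.T @ patterns) / patterns.shape[1]   # Hebbian weights
np.fill_diagonal(W, 0)

probe = patterns[0].astype(float)
probe[:60] = 0                                    # occlude 60% of the pattern

state = probe
for _ in range(10):                               # recurrent settling
    state = np.sign(W @ state)

overlap = (state * patterns[0]).mean()
print(f"overlap with original after recurrence: {overlap:.2f}")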
Schrimpf*, Martin; Kubilius*, Jonas; Hong, Ha; Majaj, Najib J; Rajalingham, Rishi; Issa, Elias B; Kar, Kohitij; Bashivan, Pouya; Prescott-Roy, Jonathan; Schmidt, Kailyn; Yamins, Daniel L K; DiCarlo, James J.
Brain-Score: Which Artificial Neural Network Best Emulates the Brain’s Neural Network? Inproceedings
In: Cognitive Computational Neuroscience (CCN), 2018.
@inproceedings{schrimpf2018ccn,
title = {Brain-Score: Which Artificial Neural Network Best Emulates the Brain’s Neural Network?},
author = {Martin Schrimpf* and Jonas Kubilius* and Ha Hong and Najib J Majaj and Rishi Rajalingham and Elias B Issa and Kohitij Kar and Pouya Bashivan and Jonathan Prescott-Roy and Kailyn Schmidt and Daniel L K Yamins and James J. DiCarlo},
year = {2018},
date = {2018-01-01},
booktitle = {Cognitive Computational Neuroscience (CCN)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Arend, Luke; Han, Yena; Schrimpf, Martin; Bashivan, Pouya; Kar, Kohitij; Poggio, Tomaso; DiCarlo, James J; Boix, Xavier
Single units in a deep neural network functionally correspond with neurons in the brain: preliminary results Technical Report
2018.
@techreport{arend2018single,
title = {Single units in a deep neural network functionally correspond with neurons in the brain: preliminary results},
author = {Luke Arend and Yena Han and Martin Schrimpf and Pouya Bashivan and Kohitij Kar and Tomaso Poggio and James J DiCarlo and Xavier Boix},
year = {2018},
date = {2018-01-01},
journal = {CBMM Memo},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
2017
Schrimpf*, Martin; Merity*, Stephen; Bradbury, James; Socher, Richard
A Flexible Approach to Automated RNN Architecture Generation Inproceedings
In: International Conference on Learning Representations (ICLR) Workshop Track, 2017.
@inproceedings{schrimpf2017b,
title = {A Flexible Approach to Automated RNN Architecture Generation},
author = {Martin Schrimpf* and Stephen Merity* and James Bradbury and Richard Socher},
url = {https://arxiv.org/abs/1712.07316 https://openreview.net/forum?id=BJDCPSJPM},
doi = {10.48550/arXiv.1712.07316},
year = {2017},
date = {2017-12-20},
booktitle = {International Conference on Learning Representations (ICLR) Workshop Track},
abstract = {The process of designing neural architectures requires expert knowledge and extensive trial and error. While automated architecture search may simplify these requirements, the recurrent neural network (RNN) architectures generated by existing methods are limited in both flexibility and components. We propose a domain-specific language (DSL) for use in automated architecture search which can produce novel RNNs of arbitrary depth and width. The DSL is flexible enough to define standard architectures such as the Gated Recurrent Unit and Long Short Term Memory and allows the introduction of non-standard RNN components such as trigonometric curves and layer normalization. Using two different candidate generation techniques, random search with a ranking function and reinforcement learning, we explore the novel architectures produced by the RNN DSL for language modeling and machine translation domains. The resulting architectures do not follow human intuition yet perform well on their targeted tasks, suggesting the space of usable RNN architectures is far larger than previously assumed.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
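To give the flavor of such a DSL, here is a toy Python sketch in which an RNN cell is an expression tree over a few operators and "compiling" it yields a step function. The operator set and the example cell are invented for illustration and do not reproduce the paper's DSL.

import numpy as np

OPS = {
    "tanh": np.tanh,
    "sigmoid": lambda x: 1 / (1 + np.exp(-x)),
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}

def compile_cell(expr, params):
    """Turn an expression tree into a step function h' = f(x, h)."""
    def step(x, h):
        def ev(node):
            if node == "x":
                return params["Wx"] @ x   # leaf: projected input
            if node == "h":
                return params["Wh"] @ h   # leaf: projected hidden state
            op, *args = node              # interior node: apply an operator
            return OPS[op](*map(ev, args))
        return ev(expr)
    return step

# A GRU-flavored candidate cell written in the toy DSL:
expr = ("add",
        ("mul", ("sigmoid", ("add", "x", "h")), "h"),
        ("mul", ("sigmoid", ("add", "x", "h")), ("tanh", ("add", "x", "h"))))

rng = np.random.default_rng(0)
params = {"Wx": 0.1 * rng.normal(size=(8, 4)), "Wh": 0.1 * rng.normal(size=(8, 8))}
step = compile_cell(expr, params)
h = np.zeros(8)
for t in range(5):                        # unroll over a toy sequence
    h = step(rng.normal(size=4), h)
print(h.shape)  # (8,)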
Schrimpf, Martin
Brain-inspired Recurrent Neural Algorithms for Advanced Object Recognition Masters Thesis
Technical University Munich, LMU Munich, University of Augsburg, Universitätsstraße 6a, 86135 Augsburg, 2017.
@mastersthesis{Schrimpf2017,
title = {Brain-inspired Recurrent Neural Algorithms for Advanced Object Recognition},
author = {Martin Schrimpf},
url = {https://mschrimpf.altervista.org/wp-content/uploads/2017/09/Brain-inspired-Recurrent-Neural-Algorithms-for-Advanced-Object-Recognition-Martin-Schrimpf-1.pdf},
year = {2017},
date = {2017-05-13},
address = {Universitätsstraße 6a, 86135 Augsburg},
school = {Technical University Munich, LMU Munich, University of Augsburg},
abstract = {Deep learning has enabled breakthroughs in machine learning, resulting in performance levels seeming on par with humans. This is particularly the case in computer vision where models can learn to classify objects in the images of a dataset. These datasets however often feature perfect information, whereas objects in the real world are frequently cluttered with other objects and thereby occluded. In this thesis, we argue for the usefulness of recurrence for solving these questions of partial information and visual context. First, we show that humans robustly recognize partial objects even at low visibilities while today’s feed-forward models in computer vision are not robust to occlusion, with classification performance lagging far behind the human baseline. We argue that recurrent computations in the visual cortex are the crucial piece, evident through performance deficits with backward masking and neurophysiological delays for partial images. By extending the neural network Alexnet with recurrent connections at the last feature layer, we are able to outperform feed-forward models and even human subjects. These recurrent models also correlate with human behavior and capture the effects of backward masking. Second, we empirically demonstrate that human subjects benefit from visual context in recognizing difficult images. Building on top of feed-forward Alexnet, we add scene information and recurrent connections to the object prediction layer to define a simple model capable of context integration. Through the use of semantic relationships between objects and scenes derived from a lexical corpus, we can define the recurrent weights without training on large image datasets. The results of this work suggest that recurrent connections are a powerful tool for integrating spatiotemporal information, allowing for the robust recognition of even complex images.},
type = {Master's Thesis},
keywords = {},
pubstate = {published},
tppubtype = {mastersthesis}
}
Cheney*, Nicholas; Schrimpf*, Martin; Kreiman, Gabriel
On the Robustness of Convolutional Neural Networks to Internal Architecture and Weight Perturbations Technical Report
2017.
@techreport{cheney2017,
title = {On the Robustness of Convolutional Neural Networks to Internal Architecture and Weight Perturbations},
author = {Nicholas Cheney* and Martin Schrimpf* and Gabriel Kreiman},
url = {https://arxiv.org/abs/1703.08245},
year = {2017},
date = {2017-03-23},
journal = {arXiv preprint},
abstract = {Deep convolutional neural networks are generally regarded as robust function approximators. So far, this intuition is based on perturbations to external stimuli such as the images to be classified. Here we explore the robustness of convolutional neural networks to perturbations to the internal weights and architecture of the network itself. We show that convolutional networks are surprisingly robust to a number of internal perturbations in the higher convolutional layers but the bottom convolutional layers are much more fragile. For instance, Alexnet shows less than a 30% decrease in classification performance when randomly removing over 70% of weight connections in the top convolutional or dense layers but performance is almost at chance with the same perturbation in the first convolutional layer. Finally, we suggest further investigations which could continue to inform the robustness of convolutional networks to internal perturbations.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
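The perturbation protocol from this report is easy to sketch at toy scale: knock out a random fraction of the weights in one layer at a time and measure the accuracy drop. The snippet below uses a small sklearn MLP on synthetic data in place of the Alexnet-scale networks actually studied; the layer sizes, ablation fraction, and data are all illustrative.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, :5].sum(axis=1) > 0).astype(int)

clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                    random_state=0).fit(X, y)
baseline = clf.score(X, y)

for layer, W in enumerate(clf.coefs_):
    original = W.copy()
    W[rng.random(W.shape) < 0.7] = 0.0     # remove 70% of this layer's connections
    print(f"layer {layer}: accuracy {baseline:.2f} -> {clf.score(X, y):.2f}")
    clf.coefs_[layer][:] = original        # restore before ablating the next layer

In the paper's CNN experiments the early layers were far more fragile than the later ones; a toy fully-connected model like this will not necessarily reproduce that gradient, which is exactly the kind of question the protocol is designed to probe.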
2016
Schrimpf, Martin
Should I use TensorFlow Technical Manual
University of Augsburg, 2016 (Seminar Paper).
@manual{should_i_use_tensorflowb,
title = {Should I use TensorFlow},
author = {Martin Schrimpf},
url = {https://arxiv.org/abs/1611.08903},
year = {2016},
date = {2016-01-01},
organization = {University of Augsburg},
abstract = {Google's Machine Learning framework TensorFlow was open-sourced in November 2015 and has since built a growing community around it. TensorFlow is supposed to be flexible for research purposes while also allowing its models to be deployed productively. This work is aimed towards people with experience in Machine Learning considering whether they should use TensorFlow in their environment. Several aspects of the framework important for such a decision are examined, such as the heterogeneity, extensibility and its computation graph. A pure Python implementation of linear classification is compared with an implementation utilizing TensorFlow. I also contrast TensorFlow to other popular frameworks with respect to modeling capability, deployment and performance and give a brief description of the current adoption of the framework.},
note = {Seminar Paper},
keywords = {},
pubstate = {published},
tppubtype = {manual}
}
2014
Schrimpf, Martin
Scalable Database Concurrency Control using Transactional Memory Technical Report
Technical University Munich, Bachelor Thesis, 2014.
@techreport{bachelors_thesisb,
title = {Scalable Database Concurrency Control using Transactional Memory},
author = {Martin Schrimpf},
url = {http://mschrimpf.com/wp-content/uploads/2016/11/Scalable-Database-Concurrency-Control-using-Transactional-Memory-Martin-Schrimpf-TextSigned.pdf},
year = {2014},
date = {2014-01-01},
institution = {Technical University Munich},
abstract = {Intel recently made available the optimistic synchronization technique Hardware Transactional Memory (HTM) in their mainstream Haswell processor microarchitecture. The first part of this work evaluates the core performance characteristics of the two programming interfaces within Intel’s Transactional Synchronization Extensions (TSX), Hardware Lock Elision (HLE) and Restricted Transactional Memory (RTM). Therein, a scope of application is defined regarding inter alia the transaction size which is limited to the L1 DCache or even less with wrongly aligned data due to cache associativity, the transaction duration restricted by hardware interrupts and a limit to the nesting of transactions. By comparing common data structures and analyzing the behavior of HTM using hardware counters, the Hashmap is identified as a suitable structure with a 134% speedup compared to classical POSIX mutexes. In the second part, several latching mechanisms of MySQL InnoDB’s Concurrency Control are selected and modified with different implementations of HTM to achieve increased scalability. We find that it does not suffice to apply HTM naively to all mutex calls by using either HLE prefixes or an HTM-enabled glibc. Furthermore, many transactional cycles often come at the price of frequent aborted cycles which inhibits performance increases when measuring MySQL with the tx-bench, and too many aborts can even decrease the throughput to 29% of the unmodified version.},
type = {Bachelor Thesis},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}