Center for Cognitive Brain Imaging

at Carnegie Mellon University


The Neural Basis of Semantic Representation

Marcel Just and Tom Mitchell
Emerging brain imaging techniques have now made it possible for scientists to investigate the neural representations of individual concepts in the human brain. We are mapping the neural basis of this semantic space using a novel fusion of fMRI and machine learning techniques, in collaboration with Dr. Tom Mitchell and his colleagues in the Machine Learning Department at CMU.

Our approach applies machine learning algorithms to fMRI data to identify neural patterns associated with individual concepts. The computer program learns a representation from a subset of our fMRI data and can then identify a novel brain image it has not yet seen. We have also developed a computational model that predicts the fMRI activation for words for which fMRI data are not yet available (Mitchell et al., 2008). By leaving out different subsets of images to test the program, we obtain a measure of how well our algorithms can generalize to new data, or to an independent subset of the same participant's data (Just et al., 2010). The predictions consistently achieve a high level of accuracy, and generalize in interesting ways: we can obtain successful predictions when training on a different language in bilingual participants, when training on one modality to predict another (words and pictures), and when training on one study to predict another study with different stimuli.
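The leave-out testing procedure described above can be illustrated with a minimal sketch. It is not the classifier used in the cited studies: the data are synthetic, the correlation-based nearest-centroid classifier is a simple stand-in, and the function names `make_synthetic_data` and `leave_one_out_accuracy` and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def make_synthetic_data(n_concepts=5, n_reps=6, n_voxels=50, noise=0.5, seed=0):
    """Simulate voxel patterns: each concept has a fixed pattern plus Gaussian noise.

    These are made-up numbers, not fMRI data.
    """
    rng = np.random.default_rng(seed)
    prototypes = rng.normal(size=(n_concepts, n_voxels))
    X = np.repeat(prototypes, n_reps, axis=0)
    X = X + noise * rng.normal(size=X.shape)
    y = np.repeat(np.arange(n_concepts), n_reps)
    return X, y

def leave_one_out_accuracy(X, y):
    """Hold out each image in turn, train on the rest, classify the held-out image."""
    labels = np.unique(y)
    correct = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        # "Training": average the remaining images of each concept into a centroid.
        centroids = np.stack([X[train & (y == c)].mean(axis=0) for c in labels])
        # Classify by the highest Pearson correlation with the held-out image.
        corr = [np.corrcoef(X[i], c)[0, 1] for c in centroids]
        correct += int(labels[int(np.argmax(corr))] == y[i])
    return correct / len(y)

X, y = make_synthetic_data()
accuracy = leave_one_out_accuracy(X, y)
chance = 1 / len(np.unique(y))
print(f"leave-one-out accuracy: {accuracy:.2f} (chance = {chance:.2f})")
```

Because the held-out image never contributes to the centroids it is compared against, the resulting accuracy estimates how well the learned patterns generalize to unseen images, and it can be compared directly with the chance level.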

Brain imaging and machine learning thought reading video.

Locations of the voxel clusters (spheres) associated with the 4 factors
One of the most novel findings is the similarity of concepts across people. A recent study (Just et al., 2010) found that the neural representations of concrete nouns could be characterized from fMRI data using factor analysis for dimension reduction. Factor analysis produced three main semantic factors underpinning the neural representation of physical objects, which we label shelter, manipulation, and eating, and a fourth factor, word length, that affects activation in visual areas. Each factor was common to all participants and represented in three to five brain locations, suggesting that a factor corresponds to a cortical network of areas that co-activate to represent a particular type of interaction with an object. A classifier based on these semantic factors produced high identification accuracies for the individual nouns.
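The idea of reducing many voxels to a few underlying factors can be sketched as follows. This is not the factor analysis procedure of the cited study: it uses principal component analysis (computed with an SVD) as a simpler stand-in, runs on synthetic data, and all sizes and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nouns, n_voxels, n_factors = 60, 200, 4  # arbitrary illustrative sizes

# Synthetic data: each noun has a score on 4 hidden factors,
# and each factor has its own loading pattern across voxels.
true_scores = rng.normal(size=(n_nouns, n_factors))
true_loadings = rng.normal(size=(n_factors, n_voxels))
noise = 0.3 * rng.normal(size=(n_nouns, n_voxels))
activation = true_scores @ true_loadings + noise

# PCA via SVD of the mean-centered noun-by-voxel matrix.
centered = activation - activation.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
explained = S**2 / np.sum(S**2)

scores = U[:, :n_factors] * S[:n_factors]  # noun-by-component scores
loadings = Vt[:n_factors]                  # component-by-voxel loadings

top_share = float(explained[:n_factors].sum())
print(f"variance explained by the first {n_factors} components: {top_share:.3f}")
```

In this sketch the `loadings` rows play the role of the voxel patterns associated with each factor, and the low-dimensional `scores` are the kind of representation a classifier could be trained on instead of the raw voxel values.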

The continued success across these domains indicates that we have uncovered a robust neural representation of several dimensions of the semantic space. Current work includes mapping along other dimensions, investigating how individual words combine into sentences, and integrating these findings into computational models of language comprehension.