Abstract
Previous studies have shown that the semantic category of
an object can be decoded from the fMRI signal in
different modalities of object presentation. Furthermore,
by generalizing a classifier across different modalities
(for instance, from pictures to written words), cortical
structures that process semantic information in an
amodal fashion have been identified. In this study we
employ high-resolution fMRI in combination with
surface-based searchlight mapping to further explore
the architecture of modality-independent responses.
Stimuli from 2 semantic categories (animals and tools)
were presented in 2 modalities: photographs and
written words. Stimuli were presented in 40-second
blocks with 10-second intervals. Subjects (N=3) were
instructed to judge whether each stimulus within a
block was semantically consistent with the others. The
experiment also included 8 free recall blocks, in which the
name of a category appeared on the screen for 2 seconds,
followed by 40 seconds of a blank screen. In these blocks,
subjects were instructed to covertly recall all entities
from the probed category that they had seen during the
experiment. Subjects were scanned with a 7 Tesla MRI scanner,
using a 3D EPI sequence with an isotropic resolution
of 1.5 mm. In each subject, the cortical surface was
reconstructed. Then, for each vertex on the surface, a set
of adjacent voxels in the functional volume was assigned.
Subsequently, a linear support vector machine classifier
was used to decode object category in each surface-based
patch, as sketched below.
Generalization analysis across picture and written word
presentation was performed, where the classifier was
trained on the fMRI data from blocks of written words and
tested on the data from picture blocks, and vice versa.
A second analysis was performed on the free recall blocks,
where the classifier was trained on merged data from the
picture and written word blocks and tested on the free
recall blocks.
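Both generalization schemes reduce to a few lines; a sketch, assuming
the block data have already been split by modality (all variable names
hypothetical):

    import numpy as np
    from sklearn.svm import LinearSVC

    def cross_modal_accuracy(X_words, y_words, X_pics, y_pics):
        # Train on one modality, test on the other; average both directions.
        acc = []
        for X_tr, y_tr, X_te, y_te in [(X_words, y_words, X_pics, y_pics),
                                       (X_pics, y_pics, X_words, y_words)]:
            clf = LinearSVC(C=1.0).fit(X_tr, y_tr)
            acc.append(clf.score(X_te, y_te))
        return np.mean(acc)

    def recall_accuracy(X_words, y_words, X_pics, y_pics,
                        X_recall, y_recall):
        # Train on merged picture and word blocks, test on recall blocks.
        X_tr = np.vstack([X_words, X_pics])
        y_tr = np.concatenate([y_words, y_pics])
        clf = LinearSVC(C=1.0).fit(X_tr, y_tr)
        return clf.score(X_recall, y_recall)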
Further, we explored how the decoding accuracy in the
inferior temporal cortex changes with the diameter of the
searchlight patch. Since surface-based voxel grouping
takes into account the cortical folding and ensures that
voxels belonging to different gyri do not fall into the same
searchlight group, it allows us to ask at what spatial
scale the modality-independent information is represented.
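This analysis amounts to re-running the decoder with growing patch
sizes; a sketch, under the assumption of a helper grow_patch(v,
radius_mm) that collects the voxels within a geodesic radius of a
vertex (the helper and all other names are hypothetical):

    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score

    def accuracy_by_diameter(X, y, seed_vertices, grow_patch,
                             diameters_mm=(3, 6, 9, 12, 15)):
        # Mean decoding accuracy in a region per patch diameter.
        curve = {}
        for d in diameters_mm:
            scores = []
            for v in seed_vertices:       # e.g. vertices in IT cortex
                voxels = grow_patch(v, radius_mm=d / 2)
                clf = LinearSVC(C=1.0)
                scores.append(
                    cross_val_score(clf, X[:, voxels], y, cv=6).mean())
            curve[d] = np.mean(scores)
        return curve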
The cross-modal analysis in all three subjects revealed a
cluster of voxels in the inferior temporal cortex (lateral
fusiform and inferotemporal gyri) and the posterior middle
temporal gyrus. The topography
of significant clusters also suggested involvement of
the inferior frontal gyrus, lateral prefrontal cortex, and
medial prefrontal cortex. Interestingly, these areas were
most evident in the free recall test, although the
searchlight maps of the three subjects showed substantial
individual differences in this analysis. Overall, the data
yield a picture similar to previous research, highlighting
the role of the IT/pMTG and prefrontal cortex in cross-modal
semantic representation. We further extended previous
work by showing that classification accuracy in these
areas decreases as the searchlight patch size increases.
These results indicate that the
modality-independent categorical activations in the IT
cortex are represented on the spatial scale of millimetres.