Abstract
“Hierarchies are sets or sequences of elements connected in the form of a rooted tree. They possess the key properties: (1) all elements are combined into one structure; (2) one element is superior to all others; and (3) no element is superior to itself (that is, there are no cycles, direct or indirect)” (Fitch & Martins, 2014).
Defined as such, hierarchies exist in multiple domains. Linguistic syntax, tonal sequences and action sequences all display a multi-layered set-of-sets organization. Moreover, social hierarchies (e.g. family and company structures) and spatial hierarchies (e.g. landmark-based navigation) also display asymmetrical and multi-layered relations between different elements and sets of elements.
Humans can represent the hierarchical structure in all these domains and extend its depth when necessary. In the same way that we can extend any arbitrarily long sentence, we can also join any two arbitrarily complex social groups, such as the armies of two countries, to form a joint inter-national army (or inter-continental, inter-planetary, inter-galactic, etc.).
Humans are especially capable of generating hierarchies. While we are able to assemble these kinds of structures in language, music and complex action (Fitch & Martins, 2014), analogous capacities are missing in other species (Fitch & Friederici, 2012), even though they can process simpler structures to some extent (Wilson, Marslen-Wilson, & Petkov, 2017).
The cognitive and neural substrata supporting this capacity are a matter of active research and discussion. In neurolinguistics, this capacity is usually mapped to the ventral portions of Brodmann’s area 44 (BA44), and its interactions with the posterior Superior Temporal Sulcus (Fitch, 2017; Friederici, 2017; Milne et al., 2016). Interestingly, these two regions are connected by a fiber tract, called the Arcuate Fasciculus (AF), which is exceptionally well-developed in humans (Rilling et al., 2008).
The available data suggest the hypothesis that the human ability to represent linguistic hierarchy evolved on top of a general sequence-processing machinery already available in the primate brain, to which a highly developed AF was added (Wilson et al., 2017). Some have extended this framework to music and action, where hierarchical processing also recruits regions within the Inferior Frontal Gyrus (Fadiga, Craighero, & D’Ausilio, 2009; Fitch & Martins, 2014).
Here, we present a critical challenge to this hypothesis. Consider that there are two groups of domains in which humans can represent hierarchies. In the first, signals are composed of ordered sequences, and the serial order of the physical stimuli determines the perceived content or meaning (‘Mary likes John’ vs. ‘John Mary likes’). Even though linguistic hierarchies are not serial themselves, the signal through which they are communicated and decoded is. In the second group, the presentation order of the elements within the set does not necessarily determine the final structure (think of visual or spatial landscapes, or social structures). While the exact serial input order is crucial for determining the structure of ordered sequences, the same is not true for other hierarchical sets.
This taxonomy is important because, while BA44 and the AF seem important for processing hierarchies within the first group, their involvement is mostly absent in the second (Kumaran, Melo, & Düzel, 2012; Ligneul, Obeso, Ruff, & Dreher, 2016; Martins et al., 2014). The human ability to represent hierarchies in the visual, spatial and social domains is not supported by these mechanisms but rather by the hippocampus, the medial prefrontal cortex, and other structures. The same has been demonstrated for semantic hierarchies (Neville et al., 2017).
Taken together, these observations yield a logical puzzle:
1. Primates have a general system to process non-hierarchical sequences.
2. The emergence of the human BA44 and AF allowed for the capacity to represent hierarchies to evolve in language.
3. The human ability to represent hierarchies in some domains does not activate the brain areas connected via the AF.
There are two ways to solve this puzzle. The first is to assume that the capacity to represent hierarchies evolved several times: once within language, and separately for other domains at other times. The second entails that the capacity to process hierarchies was first present in the visual, spatial and social domains, and that specific changes in BA44 and the AF then made this capacity available for language (or, more generally, for domains hinging on the specific serial order of the input).
In either case, BA44 and the AF seem to be important for processing complex structured sequences, but not hierarchies in general. On the one hand, this neural system might be involved in the core generative capacity for hierarchical processing, but only in language. On the other hand, it might connect a previously available capacity to represent sets of sets with a robust capacity to parse sequential information. The latter would be especially important when sequences contain hierarchical relations between elements that are distant in the serial order.