Abstract:
The brain solves complex tasks with intricate temporal dependencies by maintaining memories of previous inputs over long periods. The long timescales required for such tasks may arise from the biophysical properties of individual neurons (the single-neuron timescale, e.g., the membrane time constant) or from recurrent interactions among them. While both mechanisms operate in brain networks, their interplay and individual contributions to optimally solving memory-dependent tasks remain poorly understood. We investigate the roles of these mechanisms by training recurrent neural networks (RNNs) to solve N-parity and N-delayed match-to-sample (N-DMS) tasks with increasing memory requirements controlled by N. Networks are trained using two distinct curricula with gradually increasing N: (i) in the single-N curriculum, networks learn a new N at each curriculum step; (ii) in the multi-N curriculum, they learn a new N while maintaining the solutions for all previous Ns, similar to biological learning. Each neuron has a leak parameter that sets its single-neuron timescale and is optimized alongside the recurrent weights. We estimate the network-mediated timescales from the autocorrelation decay of each neuron’s activity. We find that in both curricula, RNNs develop longer timescales with increasing N, but via distinct mechanisms. Single-N RNNs operate in a strongly inhibitory state and rely mainly on increasing their single-neuron timescales with N. In contrast, multi-N RNNs operate closer to a balanced state and use only their recurrent connectivity to develop long timescales, while keeping their single-neuron timescales constant. The latter mechanism is consistent with findings in the primate cortex. We show that developing long timescales through network-mediated mechanisms, as in multi-N RNNs, increases training speed and stability to perturbations, and enables generalization to tasks beyond the training set. Our results suggest that adapting timescales to task requirements via recurrent connectivity enables learning more complex objectives (holding multiple concurrent memories) and improves computational robustness, which can be a beneficial strategy for implementing brain computations.
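To make the task setup concrete, below is a minimal sketch of N-parity target generation and of how the two curricula could assemble their training targets. This is an illustration under our own assumptions (a Python/NumPy implementation, one target stream per N); the function and variable names are hypothetical, not the paper's code.

```python
import numpy as np

def n_parity_targets(inputs, N):
    """Target at time t: parity (sum mod 2) of the last N binary inputs.
    Early steps use the shorter available history; the paper may instead
    mask or discard them."""
    return np.array([inputs[max(0, t - N + 1): t + 1].sum() % 2
                     for t in range(len(inputs))])

rng = np.random.default_rng(0)
inputs = rng.integers(0, 2, size=200)  # binary input stream

# Single-N curriculum: each curriculum step trains only the current N.
step_N = 5
single_n_targets = {step_N: n_parity_targets(inputs, step_N)}

# Multi-N curriculum: each step adds a new N while the targets for all
# previous Ns remain part of the training objective.
multi_n_targets = {N: n_parity_targets(inputs, N) for N in range(2, step_N + 1)}
```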
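The per-neuron leak parameter mentioned in the abstract can be illustrated with a leaky RNN update, sketched below under the common parameterization where the inverse of the leak sets the single-neuron timescale (alpha near 1 gives a fast, nearly memoryless unit; alpha near 0 a slowly leaking one). The exact discretization, nonlinearity, and network sizes used in the paper may differ.

```python
import numpy as np

def leaky_rnn_step(r, x, W_rec, W_in, b, alpha):
    """One update of a leaky RNN; alpha is a per-neuron leak in (0, 1],
    trained alongside the recurrent weights in the paper's setup."""
    drive = np.tanh(W_rec @ r + W_in @ x + b)
    return (1.0 - alpha) * r + alpha * drive

# Toy dimensions (hypothetical)
n_neurons, n_inputs = 64, 1
rng = np.random.default_rng(0)
W_rec = rng.normal(0, 1.0 / np.sqrt(n_neurons), (n_neurons, n_neurons))
W_in = rng.normal(0, 1.0, (n_neurons, n_inputs))
b = np.zeros(n_neurons)
alpha = np.full(n_neurons, 0.5)  # one leak (single-neuron timescale) per neuron

r = np.zeros(n_neurons)
for t in range(100):
    x = rng.integers(0, 2, n_inputs).astype(float)  # binary input stream
    r = leaky_rnn_step(r, x, W_rec, W_in, b, alpha)
```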
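Finally, the network-mediated timescale estimate from autocorrelation decay can be sketched as follows: compute a neuron's activity autocorrelation and fit an exponential decay AC(lag) ~ exp(-lag/tau). This log-linear fit is a simple stand-in; the paper's fitting procedure may be more elaborate.

```python
import numpy as np

def estimate_timescale(activity, max_lag=50):
    """Estimate a timescale tau from the autocorrelation decay of one
    activity trace, via a log-linear fit up to the first zero crossing."""
    a = activity - activity.mean()
    ac = np.correlate(a, a, mode="full")[len(a) - 1:]  # non-negative lags
    ac = ac[:min(max_lag, len(ac))] / ac[0]            # normalize: AC(0) = 1
    cut = np.argmax(ac <= 0)                           # first zero crossing (0 = none)
    if cut:
        ac = ac[:cut]
    if len(ac) < 2:
        return np.nan                                  # too little signal to fit
    slope = np.polyfit(np.arange(len(ac)), np.log(ac), 1)[0]
    return -1.0 / slope if slope < 0 else np.inf

# Sanity check on an AR(1) process with a known timescale of 10 steps
rng = np.random.default_rng(0)
rho = np.exp(-1.0 / 10.0)
x = np.zeros(5000)
for t in range(1, len(x)):
    x[t] = rho * x[t - 1] + rng.normal()
print(estimate_timescale(x))  # should be close to 10
```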