Abstract:
Understanding the circuit mechanisms of brain computation and behavior is one of the central questions in neuroscience. Recent advances in task-optimized recurrent neural networks (RNNs) have opened a new avenue to directly link neural dynamics to behavior and to uncover the underlying mechanisms by reverse engineering the trained RNNs. Previous studies leveraged fixed points to compare RNNs across a limited set of tasks and found that, while the geometry of neural representations varies, fixed-point topology is consistent across RNNs despite differences in model architecture [1]. However, linearizing dynamics only around fixed points falls short of describing global dynamics. To address this problem, we applied Dynamical Similarity Analysis (DSA) [2], which uses Koopman operators to build a more comprehensive linear representation of nonlinear dynamics and thereby captures broader dynamical patterns.
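For concreteness, the sketch below illustrates the general flavor of this kind of comparison: delay-embed hidden-state trajectories, fit a reduced-rank linear (Koopman/DMD-style) operator to each, and compare the resulting operators. This is a simplified, hypothetical sketch, not the published DSA implementation; in particular, DSA compares fitted operators by optimizing over similarity transforms, whereas here a coarse eigenvalue-spectrum proxy is used, and all function names and parameters are illustrative.

import numpy as np

def delay_embed(traj, n_delays):
    """Stack n_delays time-shifted copies of a (time x units) trajectory."""
    T = traj.shape[0]
    return np.hstack([traj[i:T - n_delays + 1 + i] for i in range(n_delays)])

def fit_linear_operator(traj, n_delays=5, rank=20):
    """Fit a reduced-rank linear map A with z[t+1] ~= z[t] @ A (DMD-style sketch)."""
    H = delay_embed(traj, n_delays)
    H = H - H.mean(axis=0)
    # project onto the top principal components so the operator has fixed rank
    _, _, Vt = np.linalg.svd(H, full_matrices=False)
    Z = H @ Vt[:rank].T
    X, Y = Z[:-1], Z[1:]
    A, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return A

def spectral_distance(A, B):
    """Coarse proxy for dynamical dissimilarity: compare sorted eigenvalue spectra,
    which are invariant to changes of basis (the full DSA metric instead optimizes
    over similarity transforms of the operators)."""
    ea = np.sort_complex(np.linalg.eigvals(A))
    eb = np.sort_complex(np.linalg.eigvals(B))
    return np.linalg.norm(ea - eb)

# usage with two placeholder hidden-state trajectories of shape (time, units)
rng = np.random.default_rng(0)
traj_a, traj_b = rng.standard_normal((200, 64)), rng.standard_normal((200, 64))
A, B = fit_linear_operator(traj_a), fit_linear_operator(traj_b)
print(spectral_distance(A, B))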
We compared five network architectures (vanilla continuous-time RNNs (CTRNNs) with tanh and ReLU activation functions, a scaled vanilla CTRNN, Long Short-Term Memory, and Gated Recurrent Unit) across six tasks [3] with perceptual, memory, timing, and context-dependent components. We found that task structure, training, and weight initialization can independently shape the representations of RNNs across tasks. Before training, most pairwise task comparisons yield linearly separable dynamics, indicating that distinct input structures alone induce separable dynamics (Fig 1a). In some cases (e.g., DC-CDM), training reduces separability, while in others (e.g., CDM-ID, Fig 1b) it increases separability, suggesting that input structure and learned dynamics may have differential effects. RNNs trained on tasks without shared components consistently exhibit linearly separable dynamics regardless of weight initialization (e.g., PDM-MSI, Fig 1c). When tasks share components, however, separability depends largely on initialization, with only some initializations leading to highly separable dynamics (e.g., PDM-CDM, Fig 1a). Overall, our results suggest that the initial state of the network, the input structure, and training jointly determine how RNNs adapt to various tasks and should all be considered when reverse engineering RNNs.
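The abstract does not specify how linear separability was quantified. One plausible reading, sketched below purely as an assumption, is to summarize each trained network by a dynamical feature vector (e.g., a flattened fitted operator or a DSA embedding) and take the cross-validated accuracy of a linear classifier distinguishing two task conditions as the separability score; the classifier choice, feature construction, and all names here are illustrative.

import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

def pairwise_separability(feats_task1, feats_task2, n_folds=5):
    """Cross-validated accuracy of a linear classifier separating two tasks.
    ~0.5 suggests non-separable dynamics; ~1.0 suggests linearly separable dynamics.
    (Hypothetical stand-in for however separability was actually measured.)"""
    X = np.vstack([feats_task1, feats_task2])
    y = np.array([0] * len(feats_task1) + [1] * len(feats_task2))
    clf = LinearSVC(C=1.0, max_iter=10000)
    return cross_val_score(clf, X, y, cv=n_folds).mean()

# usage: 20 networks per task, each summarized by a 400-dim feature vector
# (e.g., a flattened 20x20 operator); random placeholders shown here
rng = np.random.default_rng(1)
acc = pairwise_separability(rng.standard_normal((20, 400)),
                            rng.standard_normal((20, 400)))
print(f"pairwise decoding accuracy: {acc:.2f}")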