Abstract:
Learning representations that facilitate generalisation underlies efficient decision making in machines and animals alike. However, the representation at each point during the acquisition of competent behaviour must support not only the decisions at hand but also the continuing improvement of the agent’s policy. The normative solution to the representation learning problem therefore requires planning. We formalise this as meta-cognitive decision making in a model-based reinforcement learning agent that maintains its beliefs about the environment in an approximately Bayesian way. Using simple contextual bandits, we show that representational planning confers an advantage in environments where generalisation across contexts is possible and over suitable temporal horizons. Our model allows detailed analyses of the relationships between a learning agent’s inductive biases, the characteristics of the environment, and the dynamics of representational updates. This work contributes to a greater understanding of heuristic solutions in machine learning and of the construction of abstractions by humans.
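The setting described above can be illustrated with a minimal sketch. The code below is a hypothetical toy, not the paper's model: a two-context, two-arm Bernoulli bandit in which an approximately Bayesian agent (Thompson sampling with Beta beliefs) either pools its beliefs across contexts (one representation shared by both contexts) or keeps them separate. The `pool_contexts` flag, the arm probabilities, and the horizon are all illustrative assumptions; in an environment where the two contexts share the same reward structure, the pooled representation generalises and typically learns faster.

```python
import random

def run_bandit(pool_contexts, horizon=2000, seed=0):
    """Average reward of Thompson sampling on a 2-context, 2-arm bandit.

    Hypothetical illustration: with pool_contexts=True the agent keeps
    one shared Beta belief per arm (generalising across contexts);
    otherwise it keeps a separate belief per (context, arm) pair.
    """
    rng = random.Random(seed)
    # Both contexts share the same arm means, so generalisation is possible.
    p = {0: [0.3, 0.7], 1: [0.3, 0.7]}
    n_beliefs = 2 if pool_contexts else 4
    alpha = [1.0] * n_beliefs  # Beta prior counts (successes)
    beta = [1.0] * n_beliefs   # Beta prior counts (failures)
    total = 0
    for _ in range(horizon):
        c = rng.randrange(2)  # context drawn uniformly at random

        def key(arm):
            # Index into the belief table, depending on the representation.
            return arm if pool_contexts else 2 * c + arm

        # Thompson sampling: sample each arm's posterior mean, pick the max.
        samples = [rng.betavariate(alpha[key(a)], beta[key(a)]) for a in range(2)]
        a = max(range(2), key=lambda i: samples[i])
        r = 1 if rng.random() < p[c][a] else 0
        alpha[key(a)] += r
        beta[key(a)] += 1 - r
        total += r
    return total / horizon
```

Comparing `run_bandit(True)` with `run_bandit(False)` over varying horizons gives a rough feel for the trade-off the abstract describes: the pooled representation aggregates evidence across contexts, which pays off exactly when generalisation is valid and the horizon is long enough for the faster learning to compound.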