Proceedings:
Book One
Volume
Issue:
Proceedings of the AAAI Conference on Artificial Intelligence, 20
Track:
Markov Decision Processes and Uncertainty
Abstract:
Most work on value function approximation adheres to Samuel’s original design: agents learn a task-specific value function using parameter estimation, where the approximation architecture (e.g., polynomials) is specified by a human designer. This paper proposes a novel framework generalizing Samuel’s paradigm using a coordinate-free approach to value function approximation. Agents learn both representations and value functions by constructing geometrically customized task-independent basis functions that form an orthonormal set for the Hilbert space of smooth functions on the underlying state space manifold. The approach rests on a technical result showing that the space of smooth functions on a (compact) Riemannian manifold has a discrete spectrum associated with the Laplace-Beltrami operator. In the discrete setting, spectral analysis of the graph Laplacian yields a set of geometrically customized basis functions for approximating and decomposing value functions. The proposed framework generalizes Samuel’s value function approximation paradigm by combining it with a formalization of Saul Amarel’s paradigm of representation learning through global state space analysis.
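The discrete construction described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the 10-state chain graph, the discounted target value function, the plain Jacobi eigensolver, and the helper names `chain_laplacian`, `jacobi_eigh`, `project`, and `rms_error` are all assumptions chosen for illustration. The sketch builds the combinatorial graph Laplacian L = D - A, takes its eigenvectors as an orthonormal basis, and approximates a value function by projecting it onto the k smoothest eigenvectors.

```python
import math

def chain_laplacian(n):
    """Combinatorial Laplacian L = D - A of an n-state chain (path graph)."""
    L = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        # Undirected edge between neighbouring states i and i + 1.
        L[i][i] += 1.0
        L[i + 1][i + 1] += 1.0
        L[i][i + 1] -= 1.0
        L[i + 1][i] -= 1.0
    return L

def jacobi_eigh(A, sweeps=100, tol=1e-20):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, basis), sorted by ascending eigenvalue, where
    basis[k] is the k-th unit-norm eigenvector as a list of floats.
    """
    n = len(A)
    A = [row[:] for row in A]  # work on a copy
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                b = A[p][q]
                if abs(b) < 1e-15:
                    continue
                diff = A[q][q] - A[p][p]
                # Rotation angle that zeroes A[p][q], kept within [-pi/4, pi/4].
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * b / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p], A[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J accumulates the eigenvectors
                    vkp, vkq = V[k][p], V[k][q]
                    V[k][p], V[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    order = sorted(range(n), key=lambda j: A[j][j])
    eigvals = [A[j][j] for j in order]
    basis = [[V[i][j] for i in range(n)] for j in order]
    return eigvals, basis

def project(values, basis, k):
    """Least-squares fit of `values` in the span of the first k basis vectors.

    The basis is orthonormal, so each coefficient is a plain inner product.
    """
    approx = [0.0] * len(values)
    for phi in basis[:k]:
        coef = sum(p * v for p, v in zip(phi, values))
        for i, p in enumerate(phi):
            approx[i] += coef * p
    return approx

def rms_error(a, b):
    """Root-mean-square difference between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

if __name__ == "__main__":
    n, gamma = 10, 0.9  # hypothetical chain size and discount factor
    # Hypothetical target: discounted value of reaching the last state.
    target = [gamma ** (n - 1 - s) for s in range(n)]
    eigvals, basis = jacobi_eigh(chain_laplacian(n))
    for k in (1, 3, 5, n):
        print(k, rms_error(target, project(target, basis, k)))
```

The basis depends only on the graph, not on the reward, which is the sense in which the abstract calls the basis functions task-independent. Because the eigenvectors are orthonormal, the approximation error cannot increase as k grows, and it vanishes when all n eigenvectors are used.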