https://papers.nips.cc/paper/1143-improving-policies-without-measuring-merits.pdf (Publisher version)
Dayan, P., & Singh, S. (1996). Improving Policies without Measuring Merits. In D. Touretzky, M. Mozer, & M. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8 (pp. 1059-1065). Cambridge, MA, USA: MIT Press.