Travelled to:
1 × Canada
1 × Finland
1 × Germany
1 × Israel
1 × United Kingdom
10 × USA
2 × China
Collaborated with:
D.Precup H.v.Seijen D.Silver S.P.Singh B.Tanner P.Stone S.D.Whitehead C.J.Matheus H.R.Maei C.Szepesvári S.Bhatnagar T.Degris M.White M.Müller A.Koop S.Dasgupta A.R.Mahmood H.v.Hasselt M.Cutumisu D.Szafron M.H.Bowling E.Wiewiora
Talks about:
learn (12) polici (5) function (4) approxim (4) tempor (4) off (4) differ (3) plan (3) reinforc (2) gradient (2)

Person: Richard S. Sutton

DBLP: Sutton:Richard_S=

Contributed to:

ICML c1 2014
ICML c2 2014
ICML c3 2013
ICML 2012
ICML 2010
ICML 2009
ICML 2008
ICML 2007
ICML 2005
ICML 2001
ICML 2000
ICML 1998
ICML 1997
ICML 1995
ICML 1993
ML 1991
ML 1990
AIIDE 2008

Wrote 20 papers:

ICML-c1-2014-SeijenS #online
True Online TD(λ) (HvS, RSS), pp. 692–700.
ICML-c2-2014-SuttonMPH #equivalence #monte carlo
A new Q(λ) with interim forward view and Monte Carlo equivalence (RSS, ARM, DP, HvH), pp. 568–576.
Planning by Prioritized Sweeping with Small Backups (HvS, RSS), pp. 361–369.
ICML-2012-DegrisWS #linear
Linear Off-Policy Actor-Critic (TD, MW, RSS), p. 28.
ICML-2010-MaeiSBS #approximate #learning #towards
Toward Off-Policy Learning Control with Function Approximation (HRM, CS, SB, RSS), pp. 719–726.
ICML-2009-SuttonMPBSSW #approximate #learning #linear #performance
Fast gradient-descent methods for temporal-difference learning with linear function approximation (RSS, HRM, DP, SB, DS, CS, EW), pp. 993–1000.
ICML-2008-SilverSM #learning
Sample-based learning and search with permanent and transient memories (DS, RSS, MM), pp. 968–975.
ICML-2007-SuttonKS #on the
On the role of tracking in stationary environments (RSS, AK, DS), pp. 871–878.
ICML-2005-TannerS #network
TD(λ) networks: temporal-difference networks with eligibility traces (BT, RSS), pp. 888–895.
ICML-2001-PrecupSD #approximate #difference #learning
Off-Policy Temporal Difference Learning with Function Approximation (DP, RSS, SD), pp. 417–424.
ICML-2001-StoneS #learning #scalability #towards
Scaling Reinforcement Learning toward RoboCup Soccer (PS, RSS), pp. 537–544.
ICML-2000-PrecupSS #evaluation #policy
Eligibility Traces for Off-Policy Policy Evaluation (DP, RSS, SPS), pp. 759–766.
ICML-1998-SuttonPS #learning
Intra-Option Learning about Temporally Abstract Actions (RSS, DP, SPS), pp. 556–564.
ICML-1997-PrecupS #learning
Exponentiated Gradient Methods for Reinforcement Learning (DP, RSS), pp. 272–277.
ICML-1995-Sutton #modelling
TD Models: Modeling the World at a Mixture of Time Scales (RSS), pp. 531–539.
ICML-1993-SuttonW #learning #online #random
Online Learning with Random Representations (RSS, SDW), pp. 314–321.
ML-1991-Sutton #incremental #programming
Planning by Incremental Dynamic Programming (RSS), pp. 353–357.
ML-1991-SuttonM #learning #polynomial
Learning Polynomial Functions by Feature Construction (RSS, CJM), pp. 208–212.
ML-1990-Sutton #approximate #architecture #learning #programming
Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming (RSS), pp. 216–224.
AIIDE-2008-CutumisuSBS #game studies #learning #using
Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games (MC, DS, MHB, RSS).

Bibliography of Software Language Engineering in Generated Hypertext (BibSLEIGH) is created and maintained by Dr. Vadim Zaytsev.
Hosted as a part of SLEBOK on GitHub.