Travelled to:
1 × Finland
1 × France
1 × Germany
1 × Israel
1 × Italy
1 × United Kingdom
2 × Canada
2 × China
4 × USA
Collaborated with:
A.György B.A.Pires R.Greiner Y.Yu J.Audibert D.Schuurmans I.Szita R.Munos W.D.Smart M.L.Littman B.Póczos H.R.Maei S.Bhatnagar R.S.Sutton R.Huang Y.Wu T.Dick P.Joulani M.Ghavamzadeh G.Bartók N.Zolghadr A.Farhangfar V.Mnih A.M.Farahmand Z.Gábor Z.Kalmár B.Kveton Z.Wen A.Ashkan J.Neufeld A.Afkanpour M.Bowling H.Cheng L.Li Y.Abbasi-Yadkori N.R.Sturtevant D.Precup D.Silver E.Wiewiora
Talks about:
learn (14) reinforc (4) under (3) model (3) finit (3) estim (3) bound (3) adapt (3) base (3) algorithm (2)

Person: Csaba Szepesvári

DBLP: Szepesv=aacute=ri:Csaba

Contributed to:

ICML 2015
ICML c1 2014
ICML c2 2014
ICML c1 2013
ICML c3 2013
ICML 2012
ICML 2010
ICML 2009
ICML 2008
ICML 2007
ICML 2005
ICML 2004
ICML 1998
ICML 1996

Wrote 24 papers:

ICML-2015-HuangGS #analysis #component #independence
Deterministic Independent Component Analysis (RH, AG, CS), pp. 2521–2530.
ICML-2015-KvetonSWA #learning #rank
Cascading Bandits: Learning to Rank in the Cascade Model (BK, CS, ZW, AA), pp. 767–776.
ICML-2015-WuGS #combinator #feedback #finite #identification #on the
On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments (YW, AG, CS), pp. 1283–1291.
ICML-c1-2014-DickGS #learning #markov #online #process #sequence
Online Learning in Markov Decision Processes with Changing Cost Sequences (TD, AG, CS), pp. 512–520.
ICML-c2-2014-NeufeldGSS #adaptation #monte carlo
Adaptive Monte Carlo via Bandit Allocation (JN, AG, CS, DS), pp. 1944–1952.
ICML-c1-2013-AfkanpourGSB #algorithm #kernel #learning #multi #random #scalability
A Randomized Mirror Descent Algorithm for Large Scale Multiple Kernel Learning (AA, AG, CS, MB), pp. 374–382.
ICML-c1-2013-YuCSS #theorem
Characterizing the Representer Theorem (YY, HC, DS, CS), pp. 570–578.
ICML-c3-2013-JoulaniGS #feedback #learning #online
Online Learning under Delayed Feedback (PJ, AG, CS), pp. 1453–1461.
ICML-c3-2013-PiresSG #bound #classification #multi
Cost-sensitive Multiclass Classification Risk Bounds (BAP, CS, MG), pp. 1391–1399.
ICML-2012-BartokZS #adaptation #algorithm #finite #monitoring #probability
An adaptive algorithm for finite stochastic partial monitoring (GB, NZ, CS), p. 231.
ICML-2012-PiresS #estimation #learning #linear #statistics
Statistical linear estimation with penalized estimators: an application to reinforcement learning (BAP, CS), p. 228.
ICML-2012-YuS #analysis #kernel
Analysis of Kernel Mean Matching under Covariate Shift (YY, CS), p. 150.
ICML-2010-LiPSG #learning #parametricity
Budgeted Distribution Learning of Belief Net Parameters (LL, BP, CS, RG), pp. 879–886.
ICML-2010-MaeiSBS #approximate #learning #towards
Toward Off-Policy Learning Control with Function Approximation (HRM, CS, SB, RSS), pp. 719–726.
ICML-2010-SzitaS #bound #complexity #learning #modelling
Model-based reinforcement learning with nearly tight exploration complexity bounds (IS, CS), pp. 1031–1038.
ICML-2009-FarhangfarGS #image #learning
Learning to segment from a few well-selected training images (AF, RG, CS), pp. 305–312.
ICML-2009-PoczosASGS #exclamation #learning
Learning when to stop thinking and do something! (BP, YAY, CS, RG, NRS), pp. 825–832.
ICML-2009-SuttonMPBSSW #approximate #learning #linear #performance
Fast gradient-descent methods for temporal-difference learning with linear function approximation (RSS, HRM, DP, SB, DS, CS, EW), pp. 993–1000.
ICML-2008-MnihSA #empirical
Empirical Bernstein stopping (VM, CS, JYA), pp. 672–679.
ICML-2007-FarahmandSA #adaptation #estimation
Manifold-adaptive dimension estimation (AMF, CS, JYA), pp. 265–272.
ICML-2005-SzepesvariM #bound #finite
Finite time bounds for sampling based fitted value iteration (CS, RM), pp. 880–887.
ICML-2004-SzepesvariS
Interpolation-based Q-learning (CS, WDS).
ICML-1998-GaborKS #learning #multi
Multi-criteria Reinforcement Learning (ZG, ZK, CS), pp. 197–205.
ICML-1996-LittmanS #convergence
A Generalized Reinforcement-Learning Model: Convergence and Applications (MLL, CS), pp. 310–318.

Bibliography of Software Language Engineering in Generated Hypertext (BibSLEIGH) is created and maintained by Dr. Vadim Zaytsev.
Hosted as a part of SLEBOK on GitHub.