Travelled to:
1 × Finland
1 × France
1 × Germany
1 × Israel
1 × Italy
1 × United Kingdom
2 × Canada
2 × China
4 × USA
Collaborated with:
A.György B.A.Pires R.Greiner Y.Yu J.Audibert D.Schuurmans I.Szita R.Munos W.D.Smart M.L.Littman B.Póczos H.R.Maei S.Bhatnagar R.S.Sutton R.Huang Y.Wu T.Dick P.Joulani M.Ghavamzadeh G.Bartók N.Zolghadr A.Farhangfar V.Mnih A.M.Farahmand Z.Gábor Z.Kalmár B.Kveton Z.Wen A.Ashkan J.Neufeld A.Afkanpour M.Bowling H.Cheng L.Li Y.Abbasi-Yadkori N.R.Sturtevant D.Precup D.Silver E.Wiewiora
Talks about:
learn (14) reinforc (4) under (3) model (3) finit (3) estim (3) bound (3) adapt (3) base (3) algorithm (2)
Person: Csaba Szepesvári
DBLP: Szepesv=aacute=ri:Csaba
Wrote 24 papers:
- ICML-2015-HuangGS #analysis #component #independence
- Deterministic Independent Component Analysis (RH, AG, CS), pp. 2521–2530.
- ICML-2015-KvetonSWA #learning #rank
- Cascading Bandits: Learning to Rank in the Cascade Model (BK, CS, ZW, AA), pp. 767–776.
- ICML-2015-WuGS #combinator #feedback #finite #identification #on the
- On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments (YW, AG, CS), pp. 1283–1291.
- ICML-c1-2014-DickGS #learning #markov #online #process #sequence
- Online Learning in Markov Decision Processes with Changing Cost Sequences (TD, AG, CS), pp. 512–520.
- ICML-c2-2014-NeufeldGSS #adaptation #monte carlo
- Adaptive Monte Carlo via Bandit Allocation (JN, AG, CS, DS), pp. 1944–1952.
- ICML-c1-2013-AfkanpourGSB #algorithm #kernel #learning #multi #random #scalability
- A Randomized Mirror Descent Algorithm for Large Scale Multiple Kernel Learning (AA, AG, CS, MB), pp. 374–382.
- ICML-c1-2013-YuCSS #theorem
- Characterizing the Representer Theorem (YY, HC, DS, CS), pp. 570–578.
- ICML-c3-2013-JoulaniGS #feedback #learning #online
- Online Learning under Delayed Feedback (PJ, AG, CS), pp. 1453–1461.
- ICML-c3-2013-PiresSG #bound #classification #multi
- Cost-sensitive Multiclass Classification Risk Bounds (BAP, CS, MG), pp. 1391–1399.
- ICML-2012-BartokZS #adaptation #algorithm #finite #monitoring #probability
- An adaptive algorithm for finite stochastic partial monitoring (GB, NZ, CS), p. 231.
- ICML-2012-PiresS #estimation #learning #linear #statistics
- Statistical linear estimation with penalized estimators: an application to reinforcement learning (BAP, CS), p. 228.
- ICML-2012-YuS #analysis #kernel
- Analysis of Kernel Mean Matching under Covariate Shift (YY, CS), p. 150.
- ICML-2010-LiPSG #learning #parametricity
- Budgeted Distribution Learning of Belief Net Parameters (LL, BP, CS, RG), pp. 879–886.
- ICML-2010-MaeiSBS #approximate #learning #towards
- Toward Off-Policy Learning Control with Function Approximation (HRM, CS, SB, RSS), pp. 719–726.
- ICML-2010-SzitaS #bound #complexity #learning #modelling
- Model-based reinforcement learning with nearly tight exploration complexity bounds (IS, CS), pp. 1031–1038.
- ICML-2009-FarhangfarGS #image #learning
- Learning to segment from a few well-selected training images (AF, RG, CS), pp. 305–312.
- ICML-2009-PoczosASGS #exclamation #learning
- Learning when to stop thinking and do something! (BP, YAY, CS, RG, NRS), pp. 825–832.
- ICML-2009-SuttonMPBSSW #approximate #learning #linear #performance
- Fast gradient-descent methods for temporal-difference learning with linear function approximation (RSS, HRM, DP, SB, DS, CS, EW), pp. 993–1000.
- ICML-2008-MnihSA #empirical
- Empirical Bernstein stopping (VM, CS, JYA), pp. 672–679.
- ICML-2007-FarahmandSA #adaptation #estimation
- Manifold-adaptive dimension estimation (AMF, CS, JYA), pp. 265–272.
- ICML-2005-SzepesvariM #bound #finite
- Finite time bounds for sampling based fitted value iteration (CS, RM), pp. 880–887.
- ICML-2004-SzepesvariS
- Interpolation-based Q-learning (CS, WDS).
- ICML-1998-GaborKS #learning #multi
- Multi-criteria Reinforcement Learning (ZG, ZK, CS), pp. 197–205.
- ICML-1996-LittmanS #convergence
- A Generalized Reinforcement-Learning Model: Convergence and Applications (MLL, CS), pp. 310–318.