62 papers:
CASE-2015-LaskeyMMPPBKAG #2d #modelling #multi #nondeterminism- Multi-armed bandit models for 2D grasp planning with uncertainty (ML, JM, ZM, FTP, SP, JPvdB, DK, PA, KG), pp. 572–579.
ICEIS-v1-2015-BurtiniLL #multi #online- Improving Online Marketing Experiments with Drifting Multi-armed Bandits (GB, JL, RL), pp. 630–636.
ICML-2015-CarpentierV #infinity- Simple regret for infinitely many armed bandits (AC, MV), pp. 1133–1141.
ICML-2015-GajaneUC #algorithm #exponential- A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits (PG, TU, FC), pp. 218–227.
ICML-2015-HanawalSVM- Cheap Bandits (MKH, VS, MV, RM), pp. 2133–2142.
ICML-2015-KandasamySP #modelling #optimisation- High Dimensional Bayesian Optimisation and Bandits via Additive Models (KK, JGS, BP), pp. 295–304.
ICML-2015-KomiyamaHN #analysis #multi #probability #problem- Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays (JK, JH, HN), pp. 1152–1161.
ICML-2015-KvetonSWA #learning #rank- Cascading Bandits: Learning to Rank in the Cascade Model (BK, CS, ZW, AA), pp. 767–776.
ICML-2015-SwaminathanJ #feedback #learning- Counterfactual Risk Minimization: Learning from Logged Bandit Feedback (AS, TJ), pp. 814–823.
ICML-2015-SzorenyiBWH #approach #multi- Qualitative Multi-Armed Bandits: A Quantile-Based Approach (BS, RBF, PW, EH), pp. 1660–1668.
ICML-2015-WenKA #combinator #learning #performance #scalability- Efficient Learning in Large-Scale Combinatorial Semi-Bandits (ZW, BK, AA), pp. 1113–1122.
SIGIR-2015-TangJLZL #personalisation #recommendation- Personalized Recommendation via Parameter-Free Contextual Bandits (LT, YJ, LL, CZ, TL), pp. 323–332.
STOC-2014-DekelDKP- Bandits with switching costs: T^{2/3} regret (OD, JD, TK, YP), pp. 459–467.
CIKM-2014-NguyenL #clustering #multi- Dynamic Clustering of Contextual Multi-Armed Bandits (TTN, HWL), pp. 1959–1962.
ICML-c1-2014-ChenLL #multi #online #problem- Boosting with Online Binary Learners for the Multiclass Bandit Problem (STC, HTL, CJL), pp. 342–350.
ICML-c1-2014-CombesP #algorithm #bound- Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms (RC, AP), pp. 521–529.
ICML-c1-2014-MaillardM- Latent Bandits (OAM, SM), pp. 136–144.
ICML-c1-2014-SeldinBCA #multi #predict- Prediction with Limited Advice and Multiarmed Bandits with Paid Observations (YS, PLB, KC, YAY), pp. 280–287.
ICML-c2-2014-AgarwalHKLLS #algorithm #performance- Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits (AA, DH, SK, JL, LL, RES), pp. 1638–1646.
ICML-c2-2014-AilonKJ- Reducing Dueling Bandits to Cardinal Bandits (NA, ZSK, TJ), pp. 856–864.
ICML-c2-2014-AzarLB #correlation #feedback #online #optimisation #probability- Online Stochastic Optimization under Correlated Bandit Feedback (MGA, AL, EB), pp. 1557–1565.
ICML-c2-2014-GentileLZ #clustering #online- Online Clustering of Bandits (CG, SL, GZ), pp. 757–765.
ICML-c2-2014-MaryPN #algorithm #evaluation- Improving offline evaluation of contextual bandit algorithms via bootstrapping techniques (JM, PP, ON), pp. 172–180.
ICML-c2-2014-NeufeldGSS #adaptation #monte carlo- Adaptive Monte Carlo via Bandit Allocation (JN, AG, CS, DS), pp. 1944–1952.
ICML-c2-2014-SeldinS #algorithm #probability- One Practical Algorithm for Both Stochastic and Adversarial Bandits (YS, AS), pp. 1287–1295.
ICML-c2-2014-ValkoMKK #graph- Spectral Bandits for Smooth Graph Functions (MV, RM, BK, TK), pp. 46–54.
ICML-c2-2014-ZoghiWMR #bound #problem- Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem (MZ, SW, RM, MdR), pp. 10–18.
KDD-2014-FangT #linear- Networked bandits with disjoint linear payoffs (MF, DT), pp. 1106–1115.
RecSys-2014-TangJLL #personalisation #recommendation- Ensemble contextual bandits for personalized recommendation (LT, YJ, LL, TL), pp. 73–80.
ICML-c1-2013-AbernethyAKD #learning #problem #scalability- Large-Scale Bandit Problems and KWIK Learning (JA, KA, MK, MD), pp. 588–596.
ICML-c1-2013-BubeckWV #identification #multi- Multiple Identifications in Multi-Armed Bandits (SB, TW, NV), pp. 258–265.
ICML-c1-2013-ChenWY #combinator #framework #multi- Combinatorial Multi-Armed Bandit: General Framework and Applications (WC, YW, YY), pp. 151–159.
ICML-c2-2013-UrvoyCFN- Generic Exploration and K-armed Voting Bandits (TU, FC, RF, SN), pp. 91–99.
ICML-c3-2013-AgrawalG #linear- Thompson Sampling for Contextual Bandits with Linear Payoffs (SA, NG), pp. 127–135.
ICML-c3-2013-KarninKS #multi- Almost Optimal Exploration in Multi-Armed Bandits (ZSK, TK, OS), pp. 1238–1246.
ICML-c3-2013-SzorenyiBHOJK #algorithm #distributed #probability- Gossip-based distributed stochastic bandit algorithms (BS, RBF, IH, RO, MJ, BK), pp. 19–27.
CGO-2013-EklovNBH #memory management- Bandwidth Bandit: Quantitative characterization of memory contention (DE, NN, DBS, EH), p. 10.
ICML-2012-AvnerMS #multi- Decoupling Exploration and Exploitation in Multi-Armed Bandits (OA, SM, OS), p. 145.
ICML-2012-DekelTA #adaptation #learning #online #policy- Online Bandit Learning against an Adaptive Adversary: from Regret to Policy Regret (OD, AT, RA), p. 227.
ICML-2012-DesautelsKB #optimisation #process #trade-off- Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization (TD, AK, JWB), p. 109.
ICML-2012-FreitasSZ #bound #exponential #process- Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations (NdF, AJS, MZ), p. 125.
ICML-2012-KalyanakrishnanTAS #multi #probability #set- PAC Subset Selection in Stochastic Multi-armed Bandits (SK, AT, PA, PS), p. 34.
ICML-2012-YueHG- Hierarchical Exploration for Accelerating Contextual Bandits (YY, SAH, CG), p. 128.
ICML-2011-CrammerG #adaptation #classification #feedback #multi #using- Multiclass Classification with Bandit Feedback using Adaptive Regularization (KC, CG), pp. 273–280.
ICML-2011-YueJ- Beat the Mean Bandit (YY, TJ), pp. 241–248.
ICML-2011-YuM- Unimodal Bandits (JYY, SM), pp. 41–48.
KDD-2011-ValizadeganJW #learning #multi #predict- Learning to trade off between exploration and exploitation in multiclass bandit prediction (HV, RJ, SW), pp. 204–212.
ICML-2010-Busa-FeketeK #performance #using- Fast boosting using adversarial bandits (RBF, BK), pp. 143–150.
ICML-2010-KalyanakrishnanS #multi #performance #theory and practice- Efficient Selection of Multiple Bandit Arms: Theory and Practice (SK, PS), pp. 511–518.
ICML-2010-SrinivasKKS #design #optimisation #process- Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design (NS, AK, SK, MWS), pp. 1015–1022.
ICALP-v2-2009-GuhaM #metric #multi- Multi-armed Bandits with Metric Switching Costs (SG, KM), pp. 496–507.
ICML-2009-MesmayRVP #graph #library #optimisation #performance- Bandit-based optimization on graphs with application to library performance tuning (FdM, AR, YV, MP), pp. 729–736.
ICML-2009-YueJ #information retrieval #optimisation #problem- Interactively optimizing information retrieval systems as a dueling bandits problem (YY, TJ), pp. 1201–1208.
ICML-2009-YuM #problem- Piecewise-stationary bandit problems with side observations (JYY, SM), pp. 1177–1184.
STOC-2008-KleinbergSU #metric #multi- Multi-armed bandits in metric spaces (RK, AS, EU), pp. 681–690.
ICML-2008-KakadeST #algorithm #multi #online #performance #predict- Efficient bandit algorithms for online multiclass prediction (SMK, SSS, AT), pp. 440–447.
ICML-2008-RadlinskiKJ #learning #multi #ranking- Learning diverse rankings with multi-armed bandits (FR, RK, TJ), pp. 784–791.
ICML-2007-PandeyCA #multi #problem- Multi-armed bandit problems with dependent arms (SP, DC, DA), pp. 721–728.
ICML-2006-StrehlMLH #learning #problem- Experience-efficient learning in associative bandit problems (ALS, CM, MLL, HH), pp. 889–896.
ICML-1998-Cesa-BianchiF #bound #finite #multi #problem- Finite-Time Regret Bounds for the Multiarmed Bandit Problem (NCB, PF), pp. 100–108.
ICML-1995-Duff #problem- Q-Learning for Bandit Problems (MOD), pp. 209–217.
ICML-1995-SalganicoffU #learning #multi #using- Active Exploration and Learning in real-Valued Spaces using Multi-Armed Bandit Allocation Indices (MS, LHU), pp. 480–487.
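
For context on the multi-armed bandit setting that runs through the entries above, here is a minimal illustrative sketch of the classic UCB1 index policy (Auer, Cesa-Bianchi & Fischer); it is not drawn from any specific paper listed here, and the `pull` callback and the two-armed Bernoulli instance are hypothetical examples.

```python
import math
import random

def ucb1(pull, n_arms, horizon, seed=0):
    """Run the UCB1 index policy for `horizon` rounds.

    `pull(arm)` returns a reward in [0, 1]. After playing each arm
    once, the policy picks the arm maximizing the empirical mean
    plus an exploration bonus sqrt(2 ln t / n_i).
    """
    random.seed(seed)
    counts = [0] * n_arms          # pulls per arm
    sums = [0.0] * n_arms          # cumulative reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1            # play each arm once to initialize
        else:
            arm = max(range(n_arms),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        r = pull(arm)
        counts[arm] += 1
        sums[arm] += r
        total += r
    return counts, total

# Hypothetical two-armed Bernoulli instance: arm 1 pays off more often,
# so UCB1 should concentrate most of its pulls on it.
means = [0.3, 0.7]
counts, total = ucb1(lambda a: 1.0 if random.random() < means[a] else 0.0,
                     n_arms=2, horizon=2000)
```

The same `pull` interface accommodates the contextual, dueling, and combinatorial variants studied in the papers above by enriching what the learner observes per round.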