== Linearly Parameterized Bandits ==

Paat Rusmevichientong (USC)

Wed April 4, 2:00pm, EEB 248

Abstract: Motivated by applications in e-commerce and revenue management, we consider multiarmed bandit problems involving a large (possibly infinite) collection of arms, in which the expected reward of each arm is a linear function of an unknown univariate or multivariate random variable. The objective is to choose a sequence of arms to minimize the cumulative regret and Bayes risk. For the univariate case, we find that a greedy policy suffices. For the multivariate case, when the set of arms is strongly convex, we provide a lower bound and a policy that achieves a matching upper bound. In the absence of strong convexity, we describe a policy whose performance is within a polylogarithmic factor of the optimal. This is joint work with John Tsitsiklis and Adam Mersereau.
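
To make the univariate model concrete, here is a minimal Python sketch of a greedy policy: each arm u earns expected reward u·z for an unknown scalar z, the policy keeps a least-squares estimate of z from past plays, and always pulls the arm that looks best under the current estimate. This is an illustration under simplifying assumptions (a finite set of scalar arms, Gaussian noise, plain least squares), not the exact policy analyzed in the talk; all names and parameter values are illustrative.

<code python>
import numpy as np

def greedy_univariate_bandit(arms, z_true, horizon, noise_std=1.0, seed=0):
    """Greedy policy sketch for a univariate linearly parameterized bandit.

    Each arm u has expected reward u * z for an unknown scalar z. The policy
    maintains a least-squares estimate z_hat of z and always plays the arm
    that looks best under the current estimate. Illustrative sketch only,
    not the authors' exact policy.
    """
    rng = np.random.default_rng(seed)
    arms = np.asarray(arms, dtype=float)
    sum_uy = 0.0   # running sum of u_t * y_t (for least squares)
    sum_uu = 0.0   # running sum of u_t ** 2
    regret = 0.0
    best_mean = np.max(arms * z_true)   # mean reward of the best fixed arm

    for _ in range(horizon):
        if sum_uu == 0.0:
            u = rng.choice(arms)          # no data yet: pick an arm at random
        else:
            z_hat = sum_uy / sum_uu       # least-squares estimate of z
            u = arms[np.argmax(arms * z_hat)]
        y = u * z_true + noise_std * rng.normal()   # observe noisy reward
        sum_uy += u * y
        sum_uu += u * u
        regret += best_mean - u * z_true
    return regret

# Example: 50 arms in [-1, 1], true parameter z = 0.7 (hypothetical values)
arms = np.linspace(-1.0, 1.0, 50)
print(greedy_univariate_bandit(arms, z_true=0.7, horizon=10_000))
</code>

Because every pull with a nonzero arm adds information about the single unknown z, the estimate z_hat converges even without explicit exploration, which is the intuition behind the claim that a greedy policy suffices in the univariate case.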

Back to [[start | CommNetS Seminar Page ]]