am

May 13, 2018 05:01

Max-entropy
and risk-aware
Inverse RL:

B.D. Ziebart, A.L. Maas, J.A. Bagnell, A.K. Dey,
Maximum entropy inverse reinforcement learning
in Proc. AAAI Conf. on A.I., 2008, 1433-1438.

A. Boularias, J. Kober, J.R. Peters (2011)
Relative entropy inverse reinforcement learning
in Proc. Int. Conf. A.I. Stat., 2011, 182-189.

N. Aghasadeghi, T. Bretl (2011) “Maximum entropy
inverse reinforcement learning in continuous state
spaces with path integrals,” in Proc. IEEE/RSJ Int.
Conf. Intell. Robots Syst., 2011, pp. 1561-1566.

T. Park, S. Levine (2013) “Inverse optimal control
for humanoid locomotion,” in Robot. Sci. Syst. WS
Inverse Opt. Contr. Robot. Learn. Demonstr., 2013.

M. Kalakrishnan, P. Pastor, L. Righetti, S. Schaal
(2013) “Learning objective functions for manipulation,”
in Proc. IEEE Int. Conf. Robot. Autom., 1331-1336.

J. Mainprice, D. Berenson (2014) "Learning Cost
Functions for Motion Planning of Human-Robot
Collaborative Manipulation Tasks from Human-Human
Demonstration." 2014 AAAI Fall Symposium.

Previously: 1, 2, 3.

Related:

N. Sugimoto, J. Morimoto (2011) "Phase-dependent
trajectory optimization for CPG-based biped walking
using path integral reinforcement learning," in Proc.
11th Int. Conf. on Humanoid Robots, IEEE-RAS, 255-260.

M.B. Horowitz, A. Damle, J.W. Burdick (2014)
"Linear Hamilton Jacobi Bellman equations in
high dimensions," in Proc. 53rd Ann. IEEE Conf.
on Decision and Control (CDC), 2014, 5880-5887.

M.B. Horowitz, J.W. Burdick (2014) "Optimal
navigation functions for nonlinear stochastic
systems." Intel. Robots and Syst. IROS 2014.

Links from Horowitz,
Damle, Burdick (2014):
Fast methods for linear HJB
(for linearly solvable MDPs):

G. Beylkin, M.J. Mohlenkamp (2005) "Algorithms
for Numerical Analysis in High Dimensions."
SIAM J. on Sci. Comp., 26(6):2133-2159.

M.B. Horowitz, J.W. Burdick (2014) "Semidefinite
relaxations for stochastic optimal control policies."
In Am. Controls Conf. (ACC, 2014), 3006-3012.

Y.P. Leong, M.B. Horowitz, J.W. Burdick (2016)
"Linearly Solvable Stochastic Control Lyapunov Functions"

I.M. Mitchell, C.J. Tomlin (2003)
"Overapproximating Reachable Sets
by Hamilton-Jacobi Projections."
J. of Sci. Comp., 19(1-3):323-346.

W.M. McEneaney (2007) "A curse-of-dimensionality-free
numerical method for solution of certain HJB PDEs."
SIAM J. on Control and Optim., 46(4):1239-1276.

J.B. Lasserre (2001) "Global Optimization with
Polynomials and the Problem of Moments."
SIAM J. on Optimization, 11(3):796-817.

J.B. Lasserre, D. Henrion, C. Prieur, E. Trélat
(2008) "Nonlinear Optimal Control via Occupation
Measures and LMI-Relaxations." SIAM J. on Control
and Optim., 47(4):1643-1666. Erratum.

A. Majumdar, A.A. Ahmadi, R. Tedrake (2013)
"Control design along trajectories with sums
of squares programming." In IEEE Int. Conf.
on Robotics and Autom. (ICRA), pp. 4054-4061.
Cf.

Previously: Tensors, Lasserre,
Fienup, Transitive Closure,
DP speedups, & Doubling,
Schur-Nevanlinna-Pick.

Links from Mitchell & Tomlin (2003):
Mitchell, I., Bayen, A., Tomlin, C.J. (2001)
"Validating a Hamilton-Jacobi approximation to
hybrid system reachable sets," in Benedetto, M.D.D.,
et al. (eds.), "Hybrid Systems: Computation and
Control," L.N.C.S. 2034, Springer-Verlag, pp. 418-432.

Mitchell, I., Tomlin, C. (2002) "Level set methods
for computation in hybrid systems," in Krogh, B., et al.
(eds.), "Hybrid Systems: Computation and Control,"
L.N.C.S. 1790, Springer-Verlag, pp. 310-323.

Mitchell, I., Bayen, A., Tomlin, C.J. (2005)
"A Time-Dependent Hamilton-Jacobi Formulation of
Reachable Sets for Continuous Dynamic Games."
IEEE Trans. on Autom. and Contr., 50(7), 974.

Osher, S., Sethian, J.A. (1988) "Fronts propagating
with curvature-dependent speed: Algorithms based on
Hamilton-Jacobi formulations." J. Comput. Phys. 79, 12-49.
Canonical transform.: 1, 2.

stat, pde, gm, bss, slam, dsp, mdp, me, imit, math, lp, ai, dc, sdp, mds, em, ta, optics, dbn, tl, vlsn, cs, regr, ml, rl, bp, ct, pomdp, nlogn, mt, mc, qp, cyb, dp, pca
