Solver APIs

iLQG/iLEQG Solver

RATiLQR.ILEQGSolverType
ILEQGSolver(problem::FiniteHorizonRiskSensitiveOptimalControlProblems, kwargs...)

iLQG and iLEQG Solver for problem.

Optional Keyword Arguments

  • μ_min::Float64 – minimum value for Hessian regularization parameter μ (> 0). Default: 1e-6.
  • Δ_0::Float64 – minimum multiplicative modification factor (> 0) for μ. Default: 2.0.
  • λ::Float64 – multiplicative modification factor in (0, 1) for line search step size ϵ. Default: 0.5.
  • d::Float64 – convergence error norm threshold (> 0). If the maximum l2 norm of the change in nominal control over the horizon is less than d, the solver is considered to be converged. Default: 1e-2.
  • iter_max::Int64 – maximum iteration number. Default: 100.
  • ϵ_init::Float64 – initial step size in (ϵ_min, 1] to start the backtracking line search with. If adaptive_ϵ_init is true, then this value is overridden by the solver's adaptive initialization functionality after the first iLEQG iteration. If adaptive_ϵ_init is false, the specified value of ϵ_init is used across all the iterations as the initial step size. Default:1.0.
  • adaptive_ϵ_init::Bool – if true, ϵ_init is adaptively changed based on the last step size ϵ of the previous iLEQG iteration. Default: false.
    • If the first line search iterate ϵ_init_prev in the previous iLEQG iteration is successful, then ϵ_init for the next iLEQG iteration is set to ϵ_init = ϵ_init_prev / λ so that the initial line search step increases.
    • Otherwise ϵ_init = ϵ_last where ϵ_last is the line search step accepted in the previous iLEQG iteration.
  • ϵ_min::Float64 – minimum value of step size ϵ to terminate the line search. When ϵ_min is reached, the last candidate nominal trajectory is accepted regardless of the Armijo condition and the current iLEQG iteration is finished. Default: 1e-6.
  • f_returns_jacobian::Bool – if true, Jacobian matrices of the dynamics function are user-provided. This can reduce computation time since automatic differentiation is not used. Default: false.
source

RAT iLQR Solver

RATiLQR.CrossEntropyBilevelOptimizationSolverType
CrossEntropyBilevelOptimizationSolver(kwargs...)

RAT iLQR (i.e. Cross Entropy Method + iLEQG) Solver.

Optional Keyword Arguments

iLEQG Solver Parameters

  • μ_min_ileqg::Float64 – minimum value for Hessian regularization parameter μ (> 0). Default: 1e-6.
  • Δ_0_ileqg::Float64 – minimum multiplicative modification factor (> 0) for μ. Default: 2.0.
  • λ_ileqg::Float64 – multiplicative modification factor in (0, 1) for line search step size ϵ. Default: 0.5.
  • d_ileqg::Float64 – convergence error norm threshold (> 0). If the maximum l2 norm of the change in nominal control over the horizon is less than d, the solver is considered to be converged. Default: 1e-2.
  • iter_max_ileqg::Int64 – maximum iteration number. Default: 100
  • ϵ_init_ileqg::Float64 – initial step size in (ϵ_min, 1] to start the backtracking line search with. If adaptive_ϵ_init is true, then this value is overridden by the solver's adaptive initialization functionality after the first iLEQG iteration. If adaptive_ϵ_init is false, the specified value of ϵ_init is used across all the iterations as the initial step size. Default:1.0.
  • adaptive_ϵ_init_ileqg::Bool – if true, ϵ_init is adaptively changed based on the last step size ϵ of the previous iLEQG iteration. Default: false.
    • If the first line search iterate ϵ_init_prev in the previous iLEQG iteration is successful, then ϵ_init for the next iLEQG iteration is set to ϵ_init = ϵ_init_prev / λ so that the initial line search step increases.
    • Otherwise ϵ_init = ϵ_last where ϵ_last is the line search step accepted in the previous iLEQG iteration.
  • ϵ_min_ileqg::Float64 – minimum value of step size ϵ to terminate the line search. When ϵ_min is reached, the last candidate nominal trajectory is accepted regardless of the Armijo condition and the current iLEQG iteration is finished. Default: 1e-6.
  • f_returns_jacobian::Bool – if true, Jacobian matrices of the dynamics function are user-provided. This can reduce computation time since automatic differentiation is not used. Default: false.

Cross Entropy Solver Parameters

  • μ_init::Float64 – initial value of the mean parameter μ used in the first Cross Entropy iteration. Default: 1.0.
  • σ_init::Float64 – initial value of the standard deviation parameter σ used in the first Cross Entropy iteration. Default: 2.0.
  • num_samples::Int64 – number of Monte Carlo samples for the risk-sensitivity parameter θ. Default: 10.
  • num_elite::Int64 – number of elite samples. Default: 3.
  • iter_max::Int64 – maximum iteration number. Default: 5.
  • λ::Float64 – multiplicative modification factor in (0, 1) for `μ_init and σ_init. Default: 0.5.
  • use_θ_max::Bool – if true, the maximum feasible θ found is used to perform the final iLEQG optimization instead of the optimal one. Default: false.

Notes

  • The values of μ_init and σ_init, which may be modified during optimization, are stored internally in the solver and carried over to the next call to solve!.
source

RAT iLQR++ Solver

RATiLQR.NelderMeadBilevelOptimizationSolverType
NelderMeadBilevelOptimizationSolver(kwargs...)

RAT iLQR++ (i.e. Nelder-Mead Simplex Method + iLEQG) Solver.

Optional Keyword Arguments

iLEQG Solver Parameters

  • μ_min_ileqg::Float64 – minimum value for Hessian regularization parameter μ (> 0). Default: 1e-6.
  • Δ_0_ileqg::Float64 – minimum multiplicative modification factor (> 0) for μ. Default: 2.0.
  • λ_ileqg::Float64 – multiplicative modification factor in (0, 1) for line search step size ϵ. Default: 0.5.
  • d_ileqg::Float64 – convergence error norm threshold (> 0). If the maximum l2 norm of the change in nominal control over the horizon is less than d, the solver is considered to be converged. Default: 1e-2.
  • iter_max_ileqg::Int64 – maximum iteration number. Default: 100.
  • ϵ_init_ileqg::Float64 – initial step size in (ϵ_min, 1] to start the backtracking line search with. If adaptive_ϵ_init is true, then this value is overridden by the solver's adaptive initialization functionality after the first iLEQG iteration. If adaptive_ϵ_init is false, the specified value of ϵ_init is used across all the iterations as the initial step size. Default:1.0.
  • adaptive_ϵ_init_ileqg::Bool – if true, ϵ_init is adaptively changed based on the last step size ϵ of the previous iLEQG iteration. Default: false.
    • If the first line search iterate ϵ_init_prev in the previous iLEQG iteration is successful, then ϵ_init for the next iLEQG iteration is set to ϵ_init = ϵ_init_prev / λ so that the initial line search step increases.
    • Otherwise ϵ_init = ϵ_last where ϵ_last is the line search step accepted in the previous iLEQG iteration.
  • ϵ_min_ileqg::Float64 – minimum value of step size ϵ to terminate the line search. When ϵ_min is reached, the last candidate nominal trajectory is accepted regardless of the Armijo condition and the current iLEQG iteration is finished. Default: 1e-6.
  • f_returns_jacobian::Bool – if true, Jacobian matrices of the dynamics function are user-provided. This can reduce computation time since automatic differentiation is not used. Default: false.

Nelder-Mead Simplex Solver Parameters

  • α::Float64 – reflection parameter. Default: 1.0.
  • β::Float64 – expansion parameter. Default: 2.0.
  • γ::Float64 – contraction parameter. Default: 0.5.
  • ϵ::Float64 – convergence parameter. The algorithm is said to have convergeced if the standard deviation of the objective values at the vertices of the simplex is below ϵ. Default: 1e-2.
  • λ::Float64 – multiplicative modification factor in (0, 1) for θ_high_init and θ_low_init, which is repeatedly applied in case the objective value is infinity until a feasible region is find. Default: 0.5.
  • θ_high_init::Float64 – Initial guess for θ_high. Default: 3.0.
  • θ_low_init::Float64 – Initial guess for θ_low. Default: 1e-8.
  • iter_max::Int64 – maximum iteration number. Default: 100.

Notes

  • The Nelder-Mead Simplex method maintains a 1D simplex (i.e. a line segment that consists of 2 points, θ_high and θ_low) to search for the optimal risk-sensitivity parameter θ. θ_high and θ_low refer to the verteces of the simplex with the highest and the lowest objective values, respectively.
  • The initial guesses θ_high_init and θ_low_init, which may be modified during optimization, are stored internally in the solver and carried over to the next call to solve!.
source

PETS Solver

RATiLQR.CrossEntropyDirectOptimizationSolverType
CrossEntropyDirectOptimizationSolver(μ_init_array::Vector{Vector{Float64}},
Σ_init_array::Vector{Matrix{Float64}}; kwargs...)

PETS Solver initialized with μ_init_array = [μ_0,...,μ_{N-1}] and Σ_init_array = [Σ_0,...,Σ_{N-1}], where the initial control distribution at time k is a Gaussian distribution Distributions.MvNormal(μ_k, Σ_k).

Optional Keyword Arguments

  • num_control_samples::Int64 – number of Monte Carlo samples for the control trajectory. Default: 10.
  • num_trajectory_samples::Int64 – number of Monte Carlo samples for the state trajectory. Default: 10.
  • num_elite::Int64 – number of elite samples. Default: 3.
  • iter_max::Int64 – maximum iteration number. Default: 5.
  • smoothing_factor::Float64 – smoothing factor in (0, 1), used to update the mean and the variance of the Cross Entropy distribution for the next iteration. If smoothing_factor is 0.0, the updated distribution is independent of the previous iteration. If it is 1.0, the updated distribution is the same as the previous iteration. Default. 0.1.
source

The solve! Function

Once a problem is defined and a solver is instantiated, you can call solve! with appropriate arguments to perform optimization.

RATiLQR.solve!Function
solve!(ileqg::ILEQGSolver, problem::FiniteHorizonRiskSensitiveOptimalControlProblem,
x_0::Vector{Float64}, u_array::Vector{Vector{Float64}}; θ::Float64, verbose=true)

Given problem, and ileqg solver, solve iLQG (if θ == 0) or iLEQG (if θ > 0) with current state x_0 and nominal control schedule u_array = [u_0, ..., u_{N-1}].

Return Values (Ordered)

  • x_array::Vector{Vector{Float64}} – nominal state trajectory [x_0,...,x_N].
  • l_array::Vector{Vector{Float64}} – nominal control schedule [l_0,...,l_{N-1}].
  • L_array::Vector{Matrix{Float64}} – feedback gain schedule [L_0,...,L_{N-1}].
  • value::Float64 – optimal cost-to-go (i.e. value) found by the solver.
  • ϵ_history::Vector{Float64} – history of line search step sizes used during the iLEQG iteration. Mainly for debugging purposes.

Notes

  • Returns a time-varying affine state-feedback policy π_k of the form π_k(x) = L_k(x - x_k) + l_k.
source
solve!(ce_solver::CrossEntropyBilevelOptimizationSolver,
problem::FiniteHorizonRiskSensitiveOptimalControlProblem,
x_0::Vector{Float64}, u_array::Vector{Vector{Float64}}, rng::AbstractRNG;
kl_bound::Float64, verbose=true, serial=false)

Given problem and ce_solver (i.e. a RAT iLQR Solver), solve distributionally robust control with current state x_0 and nominal control schedule u_array = [u_0, ..., u_{N-1}] under the KL divergence bound of kl_bound (>= 0).

Return Values (Ordered)

  • θ_opt::Float64 – optimal risk-sensitivity parameter.
  • x_array::Vector{Vector{Float64}} – nominal state trajectory [x_0,...,x_N].
  • l_array::Vector{Vector{Float64}} – nominal control schedule [l_0,...,l_{N-1}].
  • L_array::Vector{Matrix{Float64}} – feedback gain schedule [L_0,...,L_{N-1}].
  • value::Float64 – optimal cost-to-go (i.e. objective value) found by the solver.
  • θ_min::Float64 – minimum feasible risk-sensitivity parameter found.
  • θ_max::Float64 – maximum feasible risk-sensitivity parameter found.

Notes

  • Returns a time-varying affine state-feedback policy π_k of the form π_k(x) = L_k(x - x_k) + l_k.
  • If kl_bound is 0.0, the solver reduces to iLQG.
  • If serial is true, Monte Carlo sampling of the Cross Entropy method is serialized on a single process. If false it is distributed on all the available worker processes.
source
solve!(nm_solver::NelderMeadBilevelOptimizationSolver,
problem::FiniteHorizonRiskSensitiveOptimalControlProblem, x_0::Vector{Float64},
u_array::Vector{Vector{Float64}}; kl_bound::Float64, verbose=true)

Given problem and nm_solver (i.e. a RAT iLQR++ Solver), solve distributionally robust control with current state x_0 and nominal control schedule u_array = [u_0, ..., u_{N-1}] under the KL divergence bound of kl_bound (>= 0).

Return Values (Ordered)

  • θ_opt::Float64 – optimal risk-sensitivity parameter.
  • x_array::Vector{Vector{Float64}} – nominal state trajectory [x_0,...,x_N].
  • l_array::Vector{Vector{Float64}} – nominal control schedule [l_0,...,l_{N-1}].
  • L_array::Vector{Matrix{Float64}} – feedback gain schedule [L_0,...,L_{N-1}].
  • value::Float64 – optimal cost-to-go (i.e. objective value) found by the solver.

Notes

  • Returns a time-varying affine state-feedback policy π_k of the form π_k(x) = L_k(x - x_k) + l_k.
  • If kl_bound is 0.0, the solver reduces to iLQG.
source
solve!(direct_solver::CrossEntropyDirectOptimizationSolver,
problem::FiniteHorizonGenerativeOptimalControlProblem, x_0::Vector{Float64},
rng::AbstractRNG; use_true_model=false, verbose=true, serial=true)

Given problem and direct_solver (i.e. a PETS Solver), solve stochastic optimal control with current state x_0.

Return Values (Ordered)

  • μ_array::Vector{Vector{Float64}} – array of means [μ_0,...,μ_{N-1}] for the final Cross Entropy distribution for the control schedule.
  • Σ_array::Vector{Matrix{Float64}} – array of covariance matrices [Σ_0,...,Σ_{N-1}] for the final Cross Entropy distribution for the control schedule.

Notes

  • Returns an open-loop control policy.
  • If use_true_model is true, the solver uses the true stochastic dynamics model defined in problem.f_stochastic.
  • If serial is true, Monte Carlo sampling of the Cross Entropy method is serialized on a single process. If false it is distributed on all the available worker processes. We recommend to leave this to true as distributed processing can be slower for this algorithm.
source