Solver APIs
iLQG/iLEQG Solver
RATiLQR.ILEQGSolver — Type
ILEQGSolver(problem::FiniteHorizonRiskSensitiveOptimalControlProblem, kwargs...)
iLQG and iLEQG Solver for problem.
Optional Keyword Arguments
- μ_min::Float64 – minimum value for Hessian regularization parameter μ (> 0). Default: 1e-6.
- Δ_0::Float64 – minimum multiplicative modification factor (> 0) for μ. Default: 2.0.
- λ::Float64 – multiplicative modification factor in (0, 1) for line search step size ϵ. Default: 0.5.
- d::Float64 – convergence error norm threshold (> 0). If the maximum l2 norm of the change in nominal control over the horizon is less than d, the solver is considered to be converged. Default: 1e-2.
- iter_max::Int64 – maximum iteration number. Default: 100.
- ϵ_init::Float64 – initial step size in (ϵ_min, 1] to start the backtracking line search with. If adaptive_ϵ_init is true, then this value is overridden by the solver's adaptive initialization functionality after the first iLEQG iteration. If adaptive_ϵ_init is false, the specified value of ϵ_init is used across all the iterations as the initial step size. Default: 1.0.
- adaptive_ϵ_init::Bool – if true, ϵ_init is adaptively changed based on the last step size ϵ of the previous iLEQG iteration. Default: false.
  - If the first line search iterate ϵ_init_prev in the previous iLEQG iteration is successful, then ϵ_init for the next iLEQG iteration is set to ϵ_init = ϵ_init_prev / λ so that the initial line search step increases.
  - Otherwise ϵ_init = ϵ_last, where ϵ_last is the line search step accepted in the previous iLEQG iteration.
- ϵ_min::Float64 – minimum value of step size ϵ to terminate the line search. When ϵ_min is reached, the last candidate nominal trajectory is accepted regardless of the Armijo condition and the current iLEQG iteration is finished. Default: 1e-6.
- f_returns_jacobian::Bool – if true, Jacobian matrices of the dynamics function are user-provided. This can reduce computation time since automatic differentiation is not used. Default: false.
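The adaptive initialization rule described for adaptive_ϵ_init can be sketched in a few lines of plain Julia. This is only an illustration of the rule as stated above, not the package's implementation: the function name next_ϵ_init and the Boolean flag first_iterate_succeeded are hypothetical, and the sketch does not clamp the result, so whether the solver caps the grown step at 1 is left open here.

```julia
# Hypothetical helper (not part of RATiLQR): chooses the initial line search
# step for the next iLEQG iteration from the outcome of the previous one.
function next_ϵ_init(ϵ_init_prev::Float64, ϵ_last::Float64,
                     first_iterate_succeeded::Bool, λ::Float64)
    @assert 0.0 < λ < 1.0 "λ must lie in (0, 1)"
    if first_iterate_succeeded
        # The first candidate step was accepted: try a larger step next time.
        return ϵ_init_prev / λ
    else
        # Backtracking was needed: restart from the step that was accepted.
        return ϵ_last
    end
end

grown = next_ϵ_init(0.25, 0.25, true, 0.5)    # 0.25 / 0.5
reused = next_ϵ_init(1.0, 0.125, false, 0.5)  # falls back to ϵ_last
```

With λ = 0.5, a successful first iterate doubles the starting step, while a backtracked iteration simply reuses the last accepted step.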
RAT iLQR Solver
RATiLQR.CrossEntropyBilevelOptimizationSolver — Type
CrossEntropyBilevelOptimizationSolver(kwargs...)
RAT iLQR (i.e. Cross Entropy Method + iLEQG) Solver.
Optional Keyword Arguments
iLEQG Solver Parameters
- μ_min_ileqg::Float64 – minimum value for Hessian regularization parameter μ (> 0). Default: 1e-6.
- Δ_0_ileqg::Float64 – minimum multiplicative modification factor (> 0) for μ. Default: 2.0.
- λ_ileqg::Float64 – multiplicative modification factor in (0, 1) for line search step size ϵ. Default: 0.5.
- d_ileqg::Float64 – convergence error norm threshold (> 0). If the maximum l2 norm of the change in nominal control over the horizon is less than d_ileqg, the solver is considered to be converged. Default: 1e-2.
- iter_max_ileqg::Int64 – maximum iteration number. Default: 100.
- ϵ_init_ileqg::Float64 – initial step size in (ϵ_min_ileqg, 1] to start the backtracking line search with. If adaptive_ϵ_init_ileqg is true, then this value is overridden by the solver's adaptive initialization functionality after the first iLEQG iteration. If adaptive_ϵ_init_ileqg is false, the specified value of ϵ_init_ileqg is used across all the iterations as the initial step size. Default: 1.0.
- adaptive_ϵ_init_ileqg::Bool – if true, ϵ_init is adaptively changed based on the last step size ϵ of the previous iLEQG iteration. Default: false.
  - If the first line search iterate ϵ_init_prev in the previous iLEQG iteration is successful, then ϵ_init for the next iLEQG iteration is set to ϵ_init = ϵ_init_prev / λ_ileqg so that the initial line search step increases.
  - Otherwise ϵ_init = ϵ_last, where ϵ_last is the line search step accepted in the previous iLEQG iteration.
- ϵ_min_ileqg::Float64 – minimum value of step size ϵ to terminate the line search. When ϵ_min_ileqg is reached, the last candidate nominal trajectory is accepted regardless of the Armijo condition and the current iLEQG iteration is finished. Default: 1e-6.
- f_returns_jacobian::Bool – if true, Jacobian matrices of the dynamics function are user-provided. This can reduce computation time since automatic differentiation is not used. Default: false.
Cross Entropy Solver Parameters
- μ_init::Float64 – initial value of the mean parameter μ used in the first Cross Entropy iteration. Default: 1.0.
- σ_init::Float64 – initial value of the standard deviation parameter σ used in the first Cross Entropy iteration. Default: 2.0.
- num_samples::Int64 – number of Monte Carlo samples for the risk-sensitivity parameter θ. Default: 10.
- num_elite::Int64 – number of elite samples. Default: 3.
- iter_max::Int64 – maximum iteration number. Default: 5.
- λ::Float64 – multiplicative modification factor in (0, 1) for μ_init and σ_init. Default: 0.5.
- use_θ_max::Bool – if true, the maximum feasible θ found is used to perform the final iLEQG optimization instead of the optimal one. Default: false.
Notes
- The values of μ_init and σ_init, which may be modified during optimization, are stored internally in the solver and carried over to the next call to solve!.
RAT iLQR++ Solver
RATiLQR.NelderMeadBilevelOptimizationSolver — Type
NelderMeadBilevelOptimizationSolver(kwargs...)
RAT iLQR++ (i.e. Nelder-Mead Simplex Method + iLEQG) Solver.
Optional Keyword Arguments
iLEQG Solver Parameters
- μ_min_ileqg::Float64 – minimum value for Hessian regularization parameter μ (> 0). Default: 1e-6.
- Δ_0_ileqg::Float64 – minimum multiplicative modification factor (> 0) for μ. Default: 2.0.
- λ_ileqg::Float64 – multiplicative modification factor in (0, 1) for line search step size ϵ. Default: 0.5.
- d_ileqg::Float64 – convergence error norm threshold (> 0). If the maximum l2 norm of the change in nominal control over the horizon is less than d_ileqg, the solver is considered to be converged. Default: 1e-2.
- iter_max_ileqg::Int64 – maximum iteration number. Default: 100.
- ϵ_init_ileqg::Float64 – initial step size in (ϵ_min_ileqg, 1] to start the backtracking line search with. If adaptive_ϵ_init_ileqg is true, then this value is overridden by the solver's adaptive initialization functionality after the first iLEQG iteration. If adaptive_ϵ_init_ileqg is false, the specified value of ϵ_init_ileqg is used across all the iterations as the initial step size. Default: 1.0.
- adaptive_ϵ_init_ileqg::Bool – if true, ϵ_init is adaptively changed based on the last step size ϵ of the previous iLEQG iteration. Default: false.
  - If the first line search iterate ϵ_init_prev in the previous iLEQG iteration is successful, then ϵ_init for the next iLEQG iteration is set to ϵ_init = ϵ_init_prev / λ_ileqg so that the initial line search step increases.
  - Otherwise ϵ_init = ϵ_last, where ϵ_last is the line search step accepted in the previous iLEQG iteration.
- ϵ_min_ileqg::Float64 – minimum value of step size ϵ to terminate the line search. When ϵ_min_ileqg is reached, the last candidate nominal trajectory is accepted regardless of the Armijo condition and the current iLEQG iteration is finished. Default: 1e-6.
- f_returns_jacobian::Bool – if true, Jacobian matrices of the dynamics function are user-provided. This can reduce computation time since automatic differentiation is not used. Default: false.
Nelder-Mead Simplex Solver Parameters
- α::Float64 – reflection parameter. Default: 1.0.
- β::Float64 – expansion parameter. Default: 2.0.
- γ::Float64 – contraction parameter. Default: 0.5.
- ϵ::Float64 – convergence parameter. The algorithm is said to have converged if the standard deviation of the objective values at the vertices of the simplex is below ϵ. Default: 1e-2.
- λ::Float64 – multiplicative modification factor in (0, 1) for θ_high_init and θ_low_init, which is repeatedly applied in case the objective value is infinity until a feasible region is found. Default: 0.5.
- θ_high_init::Float64 – initial guess for θ_high. Default: 3.0.
- θ_low_init::Float64 – initial guess for θ_low. Default: 1e-8.
- iter_max::Int64 – maximum iteration number. Default: 100.
Notes
- The Nelder-Mead Simplex method maintains a 1D simplex (i.e. a line segment that consists of 2 points, θ_high and θ_low) to search for the optimal risk-sensitivity parameter θ. θ_high and θ_low refer to the vertices of the simplex with the highest and the lowest objective values, respectively.
- The initial guesses θ_high_init and θ_low_init, which may be modified during optimization, are stored internally in the solver and carried over to the next call to solve!.
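Two of the behaviours described for the Nelder-Mead parameters, shrinking an initial guess by λ until the objective is finite and declaring convergence from the spread of the vertex values, can be illustrated with a small standalone sketch. The names shrink_until_feasible, has_converged, toy_objective and the max_shrinks cap are hypothetical, and the use of the sample standard deviation from Julia's Statistics standard library is an assumption; the package may compute these differently.

```julia
using Statistics: std

# Hypothetical helper: multiply an initial guess for θ by λ in (0, 1)
# until the objective evaluated at θ is finite (i.e. θ is feasible).
function shrink_until_feasible(objective, θ_init::Float64, λ::Float64;
                               max_shrinks::Int=50)
    θ = θ_init
    for _ in 0:max_shrinks
        isfinite(objective(θ)) && return θ
        θ *= λ
    end
    error("no feasible θ found after $max_shrinks shrinks")
end

# Hypothetical convergence test: the standard deviation of the objective
# values at the simplex vertices must fall below the threshold ϵ.
has_converged(vertex_values::Vector{Float64}, ϵ::Float64) = std(vertex_values) < ϵ

# Toy objective that is infinite (infeasible) for θ >= 1.
toy_objective(θ) = θ < 1.0 ? (θ - 0.5)^2 : Inf

θ_high = shrink_until_feasible(toy_objective, 3.0, 0.5)  # 3.0 -> 1.5 -> 0.75
θ_low = shrink_until_feasible(toy_objective, 1e-8, 0.5)  # already feasible
done = has_converged([toy_objective(θ_high), toy_objective(θ_low)], 1e-2)
```

In this toy setting the two vertices still have clearly different objective values, so the convergence test fails and a real solver would continue with reflection, expansion or contraction steps.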
PETS Solver
RATiLQR.CrossEntropyDirectOptimizationSolver — Type
CrossEntropyDirectOptimizationSolver(μ_init_array::Vector{Vector{Float64}},
                                     Σ_init_array::Vector{Matrix{Float64}}; kwargs...)
PETS Solver initialized with μ_init_array = [μ_0,...,μ_{N-1}] and Σ_init_array = [Σ_0,...,Σ_{N-1}], where the initial control distribution at time k is a Gaussian distribution Distributions.MvNormal(μ_k, Σ_k).
Optional Keyword Arguments
- num_control_samples::Int64 – number of Monte Carlo samples for the control trajectory. Default: 10.
- num_trajectory_samples::Int64 – number of Monte Carlo samples for the state trajectory. Default: 10.
- num_elite::Int64 – number of elite samples. Default: 3.
- iter_max::Int64 – maximum iteration number. Default: 5.
- smoothing_factor::Float64 – smoothing factor in (0, 1), used to update the mean and the variance of the Cross Entropy distribution for the next iteration. If smoothing_factor is 0.0, the updated distribution is independent of the previous iteration. If it is 1.0, the updated distribution is the same as the previous iteration. Default: 0.1.
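One update that matches the two limiting cases described for smoothing_factor is a convex blend of the previous parameter and the parameter fitted to the elite samples. The sketch below assumes that form; the helper name smooth_update is hypothetical and the package's exact update may differ.

```julia
# Hypothetical helper: blend the previous distribution parameter with the one
# fitted to the elite samples. s = 0.0 keeps only the elite fit, s = 1.0 keeps
# only the previous value. Works elementwise on vectors and matrices.
smooth_update(prev, elite, s::Float64) = s .* prev .+ (1 - s) .* elite

μ_prev = [0.0, 2.0]    # mean from the previous Cross Entropy iteration
μ_elite = [1.0, 4.0]   # mean fitted to the current elite samples

μ_independent = smooth_update(μ_prev, μ_elite, 0.0)  # equals μ_elite
μ_unchanged = smooth_update(μ_prev, μ_elite, 1.0)    # equals μ_prev
μ_halfway = smooth_update(μ_prev, μ_elite, 0.5)      # midpoint of the two
```

The same blend can be applied to each covariance matrix Σ_k, since the broadcasted arithmetic does not depend on the array shape.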
The solve! Function
Once a problem is defined and a solver is instantiated, you can call solve! with appropriate arguments to perform optimization.
RATiLQR.solve! — Function
solve!(ileqg::ILEQGSolver, problem::FiniteHorizonRiskSensitiveOptimalControlProblem,
       x_0::Vector{Float64}, u_array::Vector{Vector{Float64}}; θ::Float64, verbose=true)
Given problem and ileqg solver, solve iLQG (if θ == 0) or iLEQG (if θ > 0) with current state x_0 and nominal control schedule u_array = [u_0, ..., u_{N-1}].
Return Values (Ordered)
- x_array::Vector{Vector{Float64}} – nominal state trajectory [x_0,...,x_N].
- l_array::Vector{Vector{Float64}} – nominal control schedule [l_0,...,l_{N-1}].
- L_array::Vector{Matrix{Float64}} – feedback gain schedule [L_0,...,L_{N-1}].
- value::Float64 – optimal cost-to-go (i.e. value) found by the solver.
- ϵ_history::Vector{Float64} – history of line search step sizes used during the iLEQG iteration. Mainly for debugging purposes.
Notes
- Returns a time-varying affine state-feedback policy π_k of the form π_k(x) = L_k(x - x_k) + l_k.
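To make the policy form concrete, the returned arrays can be combined as in the sketch below. The helper affine_policy and the numerical values are hypothetical and only illustrate the formula π_k(x) = L_k(x - x_k) + l_k; since Julia arrays are 1-based, time step k is stored at index k + 1.

```julia
# Hypothetical helper: evaluate π_k(x) = L_k (x - x_k) + l_k at time step k,
# where k counts from 0 as in the notation above.
function affine_policy(x_array::Vector{Vector{Float64}},
                       l_array::Vector{Vector{Float64}},
                       L_array::Vector{Matrix{Float64}},
                       k::Int, x::Vector{Float64})
    i = k + 1  # shift to Julia's 1-based indexing
    return L_array[i] * (x - x_array[i]) + l_array[i]
end

# Made-up data for a system with a 2D state, a 1D control and horizon N = 1.
x_array = [[0.0, 0.0], [1.0, 0.0]]   # [x_0, x_1]
l_array = [[0.5]]                    # [l_0]
L_array = [[-1.0 -2.0]]              # [L_0], a 1×2 gain matrix

u_nominal = affine_policy(x_array, l_array, L_array, 0, [0.0, 0.0])
u_perturbed = affine_policy(x_array, l_array, L_array, 0, [0.1, 0.0])
```

On the nominal state the policy returns the nominal control l_0 unchanged; away from it, the gain L_0 adds a correction proportional to the deviation x - x_0.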
solve!(ce_solver::CrossEntropyBilevelOptimizationSolver,
       problem::FiniteHorizonRiskSensitiveOptimalControlProblem,
       x_0::Vector{Float64}, u_array::Vector{Vector{Float64}}, rng::AbstractRNG;
       kl_bound::Float64, verbose=true, serial=false)
Given problem and ce_solver (i.e. a RAT iLQR Solver), solve distributionally robust control with current state x_0 and nominal control schedule u_array = [u_0, ..., u_{N-1}] under the KL divergence bound of kl_bound (>= 0).
Return Values (Ordered)
- θ_opt::Float64 – optimal risk-sensitivity parameter.
- x_array::Vector{Vector{Float64}} – nominal state trajectory [x_0,...,x_N].
- l_array::Vector{Vector{Float64}} – nominal control schedule [l_0,...,l_{N-1}].
- L_array::Vector{Matrix{Float64}} – feedback gain schedule [L_0,...,L_{N-1}].
- value::Float64 – optimal cost-to-go (i.e. objective value) found by the solver.
- θ_min::Float64 – minimum feasible risk-sensitivity parameter found.
- θ_max::Float64 – maximum feasible risk-sensitivity parameter found.
Notes
- Returns a time-varying affine state-feedback policy π_k of the form π_k(x) = L_k(x - x_k) + l_k.
- If kl_bound is 0.0, the solver reduces to iLQG.
- If serial is true, Monte Carlo sampling of the Cross Entropy method is serialized on a single process. If false, it is distributed across all the available worker processes.
solve!(nm_solver::NelderMeadBilevelOptimizationSolver,
       problem::FiniteHorizonRiskSensitiveOptimalControlProblem, x_0::Vector{Float64},
       u_array::Vector{Vector{Float64}}; kl_bound::Float64, verbose=true)
Given problem and nm_solver (i.e. a RAT iLQR++ Solver), solve distributionally robust control with current state x_0 and nominal control schedule u_array = [u_0, ..., u_{N-1}] under the KL divergence bound of kl_bound (>= 0).
Return Values (Ordered)
- θ_opt::Float64 – optimal risk-sensitivity parameter.
- x_array::Vector{Vector{Float64}} – nominal state trajectory [x_0,...,x_N].
- l_array::Vector{Vector{Float64}} – nominal control schedule [l_0,...,l_{N-1}].
- L_array::Vector{Matrix{Float64}} – feedback gain schedule [L_0,...,L_{N-1}].
- value::Float64 – optimal cost-to-go (i.e. objective value) found by the solver.
Notes
- Returns a time-varying affine state-feedback policy π_k of the form π_k(x) = L_k(x - x_k) + l_k.
- If kl_bound is 0.0, the solver reduces to iLQG.
solve!(direct_solver::CrossEntropyDirectOptimizationSolver,
       problem::FiniteHorizonGenerativeOptimalControlProblem, x_0::Vector{Float64},
       rng::AbstractRNG; use_true_model=false, verbose=true, serial=true)
Given problem and direct_solver (i.e. a PETS Solver), solve stochastic optimal control with current state x_0.
Return Values (Ordered)
- μ_array::Vector{Vector{Float64}} – array of means [μ_0,...,μ_{N-1}] for the final Cross Entropy distribution for the control schedule.
- Σ_array::Vector{Matrix{Float64}} – array of covariance matrices [Σ_0,...,Σ_{N-1}] for the final Cross Entropy distribution for the control schedule.
Notes
- Returns an open-loop control policy.
- If use_true_model is true, the solver uses the true stochastic dynamics model defined in problem.f_stochastic.
- If serial is true, Monte Carlo sampling of the Cross Entropy method is serialized on a single process. If false, it is distributed across all the available worker processes. We recommend leaving this set to true, as distributed processing can be slower for this algorithm.