ElegantRL_Solver: High-performance RL solvers
For nonconvex optimization (continuous variables) and combinatorial optimization (discrete variables), we aim to find high-quality solutions, ideally global or near-global optima.
For combinatorial optimization problems, we compare with Benchmark.
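To make the combinatorial setting concrete, here is a minimal sketch of the graph max-cut objective together with an exhaustive baseline for very small graphs. The function names `cut_value` and `brute_force_maxcut` and the edge-list format are assumptions made for this illustration; they are not taken from this repository.

```python
from itertools import product


def cut_value(edges, assignment):
    """Sum the weights of edges whose endpoints are on opposite sides."""
    return sum(w for u, v, w in edges if assignment[u] != assignment[v])


def brute_force_maxcut(num_nodes, edges):
    """Enumerate all 2^n partitions; only feasible for tiny graphs."""
    best_value, best_assignment = float("-inf"), None
    for bits in product((0, 1), repeat=num_nodes):
        value = cut_value(edges, bits)
        if value > best_value:
            best_value, best_assignment = value, bits
    return best_value, best_assignment


# Hypothetical instance: a 4-node cycle with unit edge weights,
# given as (node_u, node_v, weight) triples.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
best_value, best_assignment = brute_force_maxcut(4, edges)
print(best_value, best_assignment)
```

An exhaustive search like this scales as 2^n, so it is useful only as a ground-truth check on small instances when evaluating a learned solver.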
This project is built on ElegantRL and OpenAI Gym.
The following two key technologies are under active development:
- Massively parallel simulations of gym environments on GPU, using thousands of CUDA cores and tensor cores.
- Podracer scheduling on a GPU cloud, e.g., DGX-2 SuperPod.
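The idea behind parallel simulation can be sketched as a batched environment interface, where a single `reset` or `step` call advances every environment copy at once. The class below is a hypothetical, CPU-only illustration in plain Python: `BatchedMaxCutEnv`, its method signatures, and the reward definition are assumptions, not this project's API. On a GPU, the per-environment loop would be replaced by tensor operations over the batch dimension.

```python
import random


class BatchedMaxCutEnv:
    """Hypothetical batched environment: one call advances every copy."""

    def __init__(self, num_nodes, edges, num_envs, seed=0):
        self.num_nodes = num_nodes
        self.edges = edges          # list of (u, v, weight) triples
        self.num_envs = num_envs
        self.rng = random.Random(seed)
        self.states = []

    def reset(self):
        # One random 0/1 partition per environment copy.
        self.states = [
            [self.rng.randint(0, 1) for _ in range(self.num_nodes)]
            for _ in range(self.num_envs)
        ]
        return self.states

    def _cut_value(self, state):
        return sum(w for u, v, w in self.edges if state[u] != state[v])

    def step(self, actions):
        # actions[i] is the node to flip in environment i.
        # Reward is the change in cut value caused by the flip.
        rewards = []
        for state, node in zip(self.states, actions):
            before = self._cut_value(state)
            state[node] ^= 1
            rewards.append(self._cut_value(state) - before)
        return self.states, rewards


edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
env = BatchedMaxCutEnv(num_nodes=4, edges=edges, num_envs=8)
states = env.reset()
actions = [0] * env.num_envs      # flip node 0 in every environment
states, rewards = env.step(actions)
```

Keeping the batch dimension explicit in the interface is what lets the same agent code drive eight environments or several thousand without modification.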
Several key references:
- Mazyavkina, Nina, et al. "Reinforcement learning for combinatorial optimization: A survey." Computers & Operations Research 134 (2021): 105400.
- Bengio, Yoshua, Andrea Lodi, and Antoine Prouvost. "Machine learning for combinatorial optimization: a methodological tour d'horizon." European Journal of Operational Research 290.2 (2021): 405-421.
- Makoviychuk, Viktor, et al. "Isaac Gym: High performance GPU based physics simulation for robot learning." Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2). 2021.
- Nair, Vinod, et al. "Solving mixed integer programs using neural networks." arXiv preprint arXiv:2012.13349 (2020).
News
- We are currently developing optimization (OPT) environments that utilize massively parallel simulation on GPU; the first version will be available at the end of January 2023. We welcome any suggestions or feedback!
Outline
File Structure
RLSolver
├── optimal
|   ├── branch-and-bound.py
|   └── cutting_plane.py
├── helloworld
|   ├── milp
|   ├── tsp
|   └── graph_maxcut
└── rlsolver (main folder)
    ├── envs
    |   (nonconvex optimizations)
    |   ├── learn2optimize
    |   ├── mimo_beamforming
    |   (combinatorial optimizations)
    |   ├── portfolio_management
    |   ├── quantum_circuits
    |   ├── vehicle_routing
    |   ├── virtual_machine_placement
    |   └── chip_design
    ├── rlsolver_learn2optimize
    ├── rlsolver_mimo_beamforming
    ├── rlsolver_portfolio_management
    ├── rlsolver_quantum_circuits
    └── utils
Progress
- mimo_beamforming
- graph_maxcut
- traveling salesman problem
- portfolio_management