RLSolver

RLSolver: High-performance RL solvers.


Keywords
Reinforcement, Learning, Solver, Combinatorial, optimization, Non-convex, gpu-acceleration, massively-parallel
License
MIT
Install
pip install RLSolver==0.0.1

Documentation

ElegantRL_Solver: High-performance RL solvers

For nonconvex optimization (continuous variables) and combinatorial optimization (discrete variables), we aim to find high-quality optima, or even (nearly) global optima.

For combinatorial optimization problems, we compare our results against a benchmark.
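As a rough illustration of what such a comparison can look like, the sketch below computes an exact optimum for a tiny max-cut instance by enumeration and measures how far a candidate solution falls short of it. The brute-force reference is only a stand-in and is not the benchmark used by this project; the function names (`cut_value`, `brute_force_max_cut`, `relative_gap`) and the 5-node instance are hypothetical.

```python
from itertools import product


def cut_value(edges, assignment):
    """Total weight of edges whose endpoints lie on opposite sides of the partition."""
    return sum(w for u, v, w in edges if assignment[u] != assignment[v])


def brute_force_max_cut(num_nodes, edges):
    """Exact optimum by enumerating all 2**num_nodes partitions (tiny graphs only)."""
    return max(cut_value(edges, bits) for bits in product((0, 1), repeat=num_nodes))


def relative_gap(candidate_value, reference_value):
    """Fraction by which a candidate falls short of the reference (0.0 means equal)."""
    return (reference_value - candidate_value) / reference_value


# Hypothetical instance: a 5-cycle with unit edge weights, given as (u, v, weight).
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 0, 1)]

reference = brute_force_max_cut(5, edges)       # exact optimum of the toy instance
candidate = cut_value(edges, (0, 1, 1, 0, 0))   # stand-in for a solver's output
gap = relative_gap(candidate, reference)
print(reference, candidate, gap)
```

Enumeration is exponential in the number of nodes, so it only serves as a reference on very small instances; on larger ones the reference value would come from an exact solver or a published best-known result.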

This project is built on ElegantRL and OpenAI Gym.

The following two key technologies are under active development:

  • Massively parallel simulations of Gym environments on GPU, using thousands of CUDA cores and tensor cores.

  • Podracer scheduling on a GPU cloud, e.g., DGX-2 SuperPod.
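The idea behind massively parallel simulation can be sketched as a batched environment in which a single array operation advances every environment copy at once. The example below is a minimal sketch under stated assumptions: NumPy stands in for a GPU tensor library, the task is a toy graph max-cut, and the class `BatchedMaxCutEnv` with its `reset`/`step` methods is hypothetical and not part of the RLSolver API.

```python
import numpy as np


class BatchedMaxCutEnv:
    """Hypothetical batched environment: each row of `state` is one independent copy."""

    def __init__(self, adjacency, num_envs, seed=0):
        # Symmetric (n, n) weight matrix with a zero diagonal.
        self.adjacency = np.asarray(adjacency, dtype=np.float64)
        self.num_nodes = self.adjacency.shape[0]
        self.num_envs = num_envs
        self.rng = np.random.default_rng(seed)
        # state[b, i] in {0, 1} is the side of the cut that node i takes in copy b.
        self.state = np.zeros((num_envs, self.num_nodes), dtype=np.int64)

    def reset(self):
        # Draw one random partition per environment copy.
        self.state = self.rng.integers(0, 2, size=(self.num_envs, self.num_nodes))
        return self.state.copy()

    def cut_values(self):
        # sum_{i,j} w_ij * x_i * (1 - x_j) counts every cut edge exactly once.
        x = self.state.astype(np.float64)
        return np.einsum("bi,ij,bj->b", x, self.adjacency, 1.0 - x)

    def step(self, actions):
        # actions[b] is the node to flip in copy b; all copies advance together.
        before = self.cut_values()
        rows = np.arange(self.num_envs)
        self.state[rows, actions] ^= 1
        after = self.cut_values()
        return self.state.copy(), after - before  # reward = change in cut value


# Toy graph: a 4-cycle 0-1-2-3-0 with unit weights.
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]])

env = BatchedMaxCutEnv(ring, num_envs=1024)
states = env.reset()
actions = env.rng.integers(0, env.num_nodes, size=env.num_envs)
next_states, rewards = env.step(actions)
print(next_states.shape, rewards.shape)
```

Because `step` contains no Python loop over environments, its cost is dominated by a few array operations. This is the property that lets the same pattern scale to thousands of copies when the arrays live on a GPU.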

Several key references:

  • Mazyavkina, Nina, et al. "Reinforcement learning for combinatorial optimization: A survey." Computers & Operations Research 134 (2021): 105400.

  • Bengio, Yoshua, Andrea Lodi, and Antoine Prouvost. "Machine learning for combinatorial optimization: a methodological tour d’horizon." European Journal of Operational Research 290.2 (2021): 405-421.

  • Makoviychuk, Viktor, et al. "Isaac Gym: High performance GPU based physics simulation for robot learning." Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2). 2021.

  • Nair, Vinod, et al. "Solving mixed integer programs using neural networks." arXiv preprint arXiv:2012.13349 (2020).

News

  • We are currently developing optimization (OPT) environments that utilize massively parallel simulation on GPU; the first version will be available at the end of January 2023. We welcome any suggestions or feedback!

Outline

File Structure

RLSolver
├── optimal
|   ├── branch-and-bound.py
|   └── cutting_plane.py
├── helloworld
|   ├── milp
|   ├── tsp
|   └── graph_maxcut
└── rlsolver (main folder)
    ├── envs
    |   (nonconvex optimizations)
    |   ├── learn2optimize
    |   └── mimo_beamforming
    |   (combinatorial optimizations)
    |   ├── portfolio_management
    |   ├── quantum_circuits
    |   ├── vehicle_routing
    |   ├── virtual_machine_placement
    |   └── chip_design
    ├── rlsolver_learn2optimize
    ├── rlsolver_mimo_beamforming
    ├── rlsolver_portfolio_management
    ├── rlsolver_quantum_circuits
    └── utils


Progress

  • mimo_beamforming
  • graph_maxcut
  • traveling salesman problem
  • portfolio_management