# MoE Mamba
Implementation of MoE-Mamba from the paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts", in PyTorch and Zeta.
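The paper's core idea is to interleave Mamba (selective state space) layers with sparse Mixture-of-Experts feed-forward layers, so each token activates only a single expert's parameters per MoE layer. The snippet below is a minimal, self-contained sketch of that top-1 ("switch"-style) routing step in plain PyTorch; it illustrates the concept only, is not this package's internals, and every name in it is hypothetical.

```python
import torch
import torch.nn as nn


class Top1MoEFeedForward(nn.Module):
    """Hypothetical switch-style MoE feed-forward: each token is routed
    to the single expert with the highest gate score."""

    def __init__(self, dim: int, num_experts: int, hidden_mult: int = 4):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)  # router scoring the experts
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(dim, dim * hidden_mult),
                nn.GELU(),
                nn.Linear(dim * hidden_mult, dim),
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq, dim = x.shape
        flat = x.reshape(-1, dim)                # (batch * seq, dim)
        probs = self.gate(flat).softmax(dim=-1)  # (batch * seq, num_experts)
        weight, expert_idx = probs.max(dim=-1)   # top-1 routing decision per token
        out = torch.zeros_like(flat)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                # Run only the tokens routed to this expert, scaled by the gate.
                out[mask] = weight[mask].unsqueeze(-1) * expert(flat[mask])
        return out.reshape(batch, seq, dim)


tokens = torch.randn(1, 10, 512)
moe_ffn = Top1MoEFeedForward(dim=512, num_experts=4)
print(moe_ffn(tokens).shape)  # torch.Size([1, 10, 512])
```

In MoE-Mamba, layers like this alternate with Mamba blocks; the `num_experts` argument in the usage example below configures that MoE side of the model.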
## Install

```bash
pip install moe-mamba
```
## Usage

### MoEMambaBlock
```python
import torch
from moe_mamba import MoEMambaBlock

# Input of shape (batch, seq_len, dim)
x = torch.randn(1, 10, 512)

model = MoEMambaBlock(
    dim=512,        # model dimension
    depth=6,        # number of layers
    d_state=128,    # SSM state dimension
    expand=4,       # Mamba block expansion factor
    num_experts=4,  # experts per MoE feed-forward layer
)

out = model(x)  # output keeps the input shape: (1, 10, 512)
print(out)
```
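Since the block maps a `(batch, seq_len, dim)` tensor to one of the same shape, it can be wrapped into a small language model. The embedding, head, and vocabulary size below are assumptions added for illustration, not part of this package:

```python
import torch
import torch.nn as nn
from moe_mamba import MoEMambaBlock

vocab_size = 1000  # hypothetical vocabulary size

embed = nn.Embedding(vocab_size, 512)  # token ids -> (batch, seq_len, 512)
block = MoEMambaBlock(dim=512, depth=6, d_state=128, expand=4, num_experts=4)
lm_head = nn.Linear(512, vocab_size)   # project back to vocabulary logits

ids = torch.randint(0, vocab_size, (1, 10))  # (batch, seq_len) token ids
logits = lm_head(block(embed(ids)))          # (1, 10, vocab_size)
print(logits.shape)
```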
## Code Quality 🧹

- `make style` to format the code
- `make check_code_quality` to check code quality (PEP8 basically)
- `black .`
- `ruff . --fix`
## Citation

```bibtex
@misc{pióro2024moemamba,
    title={MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts},
    author={Maciej Pióro and Kamil Ciebiera and Krystian Król and Jan Ludziejewski and Sebastian Jaszczur},
    year={2024},
    eprint={2401.04081},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```
## License

MIT