MoE Mamba

Implementation of MoE-Mamba from the paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Zeta.

PAPER LINK
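
At a high level, the paper interleaves Mamba (selective state space) blocks with sparse Mixture-of-Experts feedforward layers: the SSM handles sequence mixing, while the routed MoE layer provides conditional channel mixing. The sketch below is a minimal, purely illustrative rendering of that alternating pattern in plain PyTorch; the class name MoEMambaLayerSketch and the token_mixer / moe_ffn arguments are hypothetical stand-ins, not part of this package's API.

import torch
from torch import nn

class MoEMambaLayerSketch(nn.Module):
    """Illustrative only: alternate a sequence-mixing block with a sparse MoE FFN.

    token_mixer stands in for a Mamba / SSM block and moe_ffn for a
    switch-style Mixture-of-Experts feedforward layer; neither name is
    part of the moe-mamba package.
    """

    def __init__(self, dim: int, token_mixer: nn.Module, moe_ffn: nn.Module):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mixer = token_mixer  # e.g. a Mamba block
        self.norm2 = nn.LayerNorm(dim)
        self.moe_ffn = moe_ffn          # e.g. a routed MoE feedforward layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pre-norm residual sublayers, alternating SSM mixing and MoE FFN.
        x = x + self.token_mixer(self.norm1(x))
        x = x + self.moe_ffn(self.norm2(x))
        return x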

Install

pip install moe-mamba

Usage

MoEMambaBlock

import torch
from moe_mamba import MoEMambaBlock

# Random input of shape (batch, sequence length, model dimension)
x = torch.randn(1, 10, 512)

# MoE-Mamba block: Mamba SSM layers interleaved with a mixture-of-experts feedforward
model = MoEMambaBlock(
    dim=512,        # model / embedding dimension
    depth=6,        # number of layers
    d_state=128,    # SSM state dimension
    expand=4,       # expansion factor of the Mamba block
    num_experts=4,  # number of experts in the MoE layer
)

# Output has the same shape as the input: (1, 10, 512)
out = model(x)
print(out)

Code Quality 🧹

  • make style to format the code
  • make check_code_quality to check code quality (basically PEP 8 compliance)
  • black . to format with Black directly
  • ruff . --fix to lint and auto-fix with Ruff directly

Citation

@misc{pióro2024moemamba,
    title={MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts},
    author={Maciej Pióro and Kamil Ciebiera and Krystian Król and Jan Ludziejewski and Sebastian Jaszczur},
    year={2024},
    eprint={2401.04081},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

License

MIT