FMN-GPT Unified

Status: In Development
Vocab Size: 491
Architecture: FMN Transformer
Tokenization: Character-level
Availability: Coming Soon
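
As a rough illustration of the character-level tokenization listed above, the sketch below maps each character to a single token id. The class name, method names, and example vocabulary are assumptions for illustration; the actual tokenizer and its 491-entry vocabulary have not been released.

```python
# Minimal character-level tokenizer sketch. Names and the example vocabulary
# are illustrative; only the one-id-per-character idea comes from the spec above.
class CharTokenizer:
    def __init__(self, chars: list[str]):
        self.stoi = {ch: i for i, ch in enumerate(chars)}   # char -> id
        self.itos = {i: ch for ch, i in self.stoi.items()}  # id -> char

    def encode(self, text: str) -> list[int]:
        return [self.stoi[ch] for ch in text]

    def decode(self, ids: list[int]) -> str:
        return "".join(self.itos[i] for i in ids)

# Usage with a toy vocabulary (the real model lists 491 entries).
tok = CharTokenizer(sorted(set("Hello, FMN-GPT!")))
assert tok.decode(tok.encode("Hello")) == "Hello"
```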

Model Specifications (Subject to Change)

Model Dimension: 112
Layers: 6
Attention Heads: 4
Context Length: 65K
FMN Rank: 40
Max Loops/Neuron: 120
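
The specifications above amount to a small set of hyperparameters. A hedged sketch of how they might be grouped into a configuration object follows; the field names are assumptions, and the 65K context length is assumed to mean 65,536 tokens.

```python
# Hypothetical configuration object mirroring the listed specifications.
# Field names are assumptions; only the values come from the list above.
from dataclasses import dataclass

@dataclass
class FMNGPTConfig:
    vocab_size: int = 491            # character-level vocabulary
    d_model: int = 112               # model dimension
    n_layers: int = 6
    n_heads: int = 4
    context_length: int = 65_536     # listed as "65K" (exact value assumed)
    fmn_rank: int = 40
    max_loops_per_neuron: int = 120
```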

Feature Flags

Dynamic Routing: REINFORCE-based
QK Normalization: Enabled
Gated Residuals: Enabled
Recurrent Mixer: Enabled
SwiGLU FFN: Disabled
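
For context on the QK Normalization flag: in common usage this means normalizing query and key vectors before the attention dot product to keep the logits well behaved. The sketch below shows that generic technique with L2 normalization; it is not FMN-GPT's actual implementation, and the function name and fixed scale are assumptions.

```python
# Generic QK-normalized attention (illustrative only, not FMN-GPT's code).
import torch
import torch.nn.functional as F

def qk_norm_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """q, k, v: (batch, heads, seq, head_dim)."""
    q = F.normalize(q, dim=-1)        # unit-length queries
    k = F.normalize(k, dim=-1)        # unit-length keys
    scale = q.shape[-1] ** -0.5       # fixed scale; real models often learn this
    attn = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
    return attn @ v
```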

Model Availability

Model weights will be released on HuggingFace once development is complete. Everything here is subject to change.

View CompactAI on HuggingFace