FMN-GPT Unified

Status: In Development
Vocab Size: 491
Architecture: FMN Transformer
Tokenization: Character-level
Availability: Coming Soon
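
As a rough illustration of the character-level tokenization listed above, the sketch below maps each character to a single token id. The class name, method names, and example vocabulary are assumptions for illustration; the actual tokenizer and its 491-entry vocabulary have not been released.

```python
# Minimal character-level tokenizer sketch. Names and the example vocabulary
# are illustrative; only the one-id-per-character idea comes from the spec above.
class CharTokenizer:
    def __init__(self, chars: list[str]):
        self.stoi = {ch: i for i, ch in enumerate(chars)}   # char -> id
        self.itos = {i: ch for ch, i in self.stoi.items()}  # id -> char

    def encode(self, text: str) -> list[int]:
        return [self.stoi[ch] for ch in text]

    def decode(self, ids: list[int]) -> str:
        return "".join(self.itos[i] for i in ids)

# Usage with a toy vocabulary (the real model lists 491 entries).
tok = CharTokenizer(sorted(set("Hello, FMN-GPT!")))
assert tok.decode(tok.encode("Hello")) == "Hello"
```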

Model Specifications (Subject to Change)

Model Dimension: 112
Layers: 6
Attention Heads: 4
Context Length: 65K
FMN Rank: 40
Max Loops/Neuron: 120
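
The specifications above amount to a small set of hyperparameters. A hedged sketch of how they might be grouped into a configuration object follows; the field names are assumptions, and the 65K context length is assumed to mean 65,536 tokens.

```python
# Hypothetical configuration object mirroring the listed specifications.
# Field names are assumptions; only the values come from the list above.
from dataclasses import dataclass

@dataclass
class FMNGPTConfig:
    vocab_size: int = 491            # character-level vocabulary
    d_model: int = 112               # model dimension
    n_layers: int = 6
    n_heads: int = 4
    context_length: int = 65_536     # listed as "65K" (exact value assumed)
    fmn_rank: int = 40
    max_loops_per_neuron: int = 120
```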

Feature Flags

Dynamic Routing: REINFORCE-based
QK Normalization: Enabled
Gated Residuals: Enabled
Recurrent Mixer: Enabled
SwiGLU FFN: Disabled
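
For context on the QK Normalization flag: in common usage this means normalizing query and key vectors before the attention dot product to keep the logits well behaved. The sketch below shows that generic technique with L2 normalization; it is not FMN-GPT's actual implementation, and the function name and fixed scale are assumptions.

```python
# Generic QK-normalized attention (illustrative only, not FMN-GPT's code).
import torch
import torch.nn.functional as F

def qk_norm_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """q, k, v: (batch, heads, seq, head_dim)."""
    q = F.normalize(q, dim=-1)        # unit-length queries
    k = F.normalize(k, dim=-1)        # unit-length keys
    scale = q.shape[-1] ** -0.5       # fixed scale; real models often learn this
    attn = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
    return attn @ v
```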

Model Availability

Model weights will be released on HuggingFace once development is complete. Everything here is subject to change.

View CompactAI on HuggingFace