
📦 Model Configuration

Models available for E2E testing

Configuration Summary

Total Models: 55
Enabled: 32
NLP: 23
Vision: 19
Multimodal: 2
LLM: 11
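The summary counts above can be derived directly from the model config. A minimal sketch, assuming a dict shaped like the per-model entries in config/models.yaml (the toy data here is illustrative, not the full 55-model list):

```python
from collections import Counter

# Toy stand-in for the parsed models.yaml: each entry carries the
# "category" and "enabled" fields shown in this report.
models = {
    "gpt2":      {"category": "nlp", "enabled": True},
    "xlnet":     {"category": "nlp", "enabled": False},
    "resnet":    {"category": "vision", "enabled": True},
    "clip":      {"category": "multimodal", "enabled": True},
    "tinyllama": {"category": "llm", "enabled": True},
}

total = len(models)
enabled = sum(1 for m in models.values() if m["enabled"])
by_category = Counter(m["category"] for m in models.values())

print(total, enabled, dict(by_category))
```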

📝 NLP Models

23 models

Gpt2

Enabled

DistilGPT-2 - Lightweight text generation

hf/distilgpt2@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Bert

Enabled

BERT base - Masked language model

hf/bert-base-uncased@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Roberta

Enabled

RoBERTa base - Robust BERT variant

hf/roberta-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

T5

Enabled

T5 small - Text-to-text transformer

hf/t5-small@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens
✅ Fully supported: Encoder-decoder architecture enabled in Axon v3.1.6 + Core v3.2.9-alpha

Distilbert

Enabled

DistilBERT - Smaller, faster BERT variant

hf/distilbert-base-uncased@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Albert

Enabled

ALBERT - Parameter-efficient BERT variant

hf/albert-base-v2@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Sentence-Transformers

Enabled

Sentence-BERT - Text embeddings for semantic search

hf/sentence-transformers/all-MiniLM-L6-v2@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 128 tokens
⚠️ Produces 384-dimensional embeddings
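Embedding models like this one are typically consumed via cosine similarity for semantic search. A minimal stdlib-only sketch; the toy 3-dim vectors stand in for the model's real 384-dim embeddings:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for 384-dimensional sentence embeddings.
query = [0.1, 0.9, 0.2]
doc = [0.2, 0.8, 0.1]
print(cosine_similarity(query, doc))
```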

Xlnet

Disabled

XLNet - Generalized autoregressive pretraining

hf/xlnet-base-cased@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens
⚠️ ONNX conversion fails - xlnet not in supported model list

Electra

Disabled

ELECTRA - Efficient pre-training with replaced token detection

hf/google/electra-base-discriminator@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Deberta

Disabled

DeBERTa - Decoding-enhanced BERT with disentangled attention

hf/microsoft/deberta-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Deberta-V3

Disabled

DeBERTa v3 - Latest version with improved performance

hf/microsoft/deberta-v3-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Mpnet

Disabled

MPNet - Masked and Permuted Pre-training

hf/microsoft/mpnet-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Xlm-Roberta

Disabled

XLM-RoBERTa - Cross-lingual RoBERTa (100 languages)

hf/xlm-roberta-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Distilroberta

Enabled

DistilRoBERTa - Distilled RoBERTa for faster inference

hf/distilroberta-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Squeezebert

Enabled

SqueezeBERT - Mobile-optimized BERT variant

hf/squeezebert/squeezebert-uncased@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Minilm

Enabled

MiniLM - Compact model with deep self-attention distillation

hf/microsoft/MiniLM-L12-H384-uncased@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Bart-Base

Enabled

BART base - Denoising autoencoder for pretraining

hf/facebook/bart-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Bge-Small

Enabled

BGE Small - BAAI General Embedding (384-dim)

hf/BAAI/bge-small-en-v1.5@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ State-of-the-art embedding model from BAAI

Bge-Base

Enabled

BGE Base - BAAI General Embedding (768-dim)

hf/BAAI/bge-base-en-v1.5@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ Higher quality embeddings, larger model

E5-Small

Enabled

E5 Small - Text embeddings for retrieval (384-dim)

hf/intfloat/e5-small-v2@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ Microsoft E5 embeddings

E5-Base

Enabled

E5 Base - Text embeddings for retrieval (768-dim)

hf/intfloat/e5-base-v2@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ Microsoft E5 embeddings - larger variant

Gte-Small

Enabled

GTE Small - General Text Embeddings (384-dim)

hf/thenlper/gte-small@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ Alibaba GTE embeddings

Gte-Base

Enabled

GTE Base - General Text Embeddings (768-dim)

hf/thenlper/gte-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ Alibaba GTE embeddings - larger variant

👁️ Vision Models

19 models

Resnet

Enabled

ResNet-50 - Image classification (1000 classes)

hf/microsoft/resnet-50@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Vit

Enabled

Vision Transformer (ViT) - Image classification

hf/google/vit-base-patch16-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Convnext

Enabled

ConvNeXt Tiny - Modern CNN architecture

hf/facebook/convnext-tiny-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Mobilenet

Enabled

MobileNetV2 - Efficient mobile architecture

hf/google/mobilenet_v2_1.0_224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Deit

Enabled

DeiT Small - Data-efficient Image Transformer

hf/facebook/deit-small-patch16-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Swin

Disabled

Swin Transformer - Shifted window attention

hf/microsoft/swin-tiny-patch4-window7-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3
⚠️ Known PyTorch-to-ONNX conversion issues

Efficientnet

Enabled

EfficientNet-B0 - Compound scaling CNN

hf/google/efficientnet-b0@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Regnet

Enabled

RegNet - Designing Network Design Spaces

hf/facebook/regnet-y-040@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Beit

Enabled

BEiT - BERT Pre-Training of Image Transformers

hf/microsoft/beit-base-patch16-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Dinov2

Disabled

DINOv2 - Self-supervised vision transformer

hf/facebook/dinov2-base@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Poolformer

Enabled

PoolFormer - MetaFormer baseline with pooling

hf/sail/poolformer_s12@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Levit

Disabled

LeViT - Vision Transformer in ConvNet's Clothing

hf/facebook/levit-128@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Efficientnet-B1

Disabled

EfficientNet-B1 - Larger EfficientNet variant

hf/google/efficientnet-b1@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 240x240x3

Efficientnet-B2

Disabled

EfficientNet-B2 - Medium EfficientNet variant

hf/google/efficientnet-b2@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 260x260x3

Densenet

Disabled

DenseNet-121 - Densely Connected Convolutional Networks

hf/facebook/densenet-121@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Convnext-Small

Enabled

ConvNeXt Small - Larger ConvNeXt variant

hf/facebook/convnext-small-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Vit-Large

Disabled

ViT Large - Large Vision Transformer

hf/google/vit-large-patch16-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Detr

Disabled

DETR - End-to-end object detection with transformer

hf/facebook/detr-resnet-50@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 800x600x3
⚠️ Blocked: Object detection output format not yet supported. Needs: Core bbox parsing, test framework IoU validation, reference annotations.

Segformer

Disabled

SegFormer - Semantic segmentation

hf/nvidia/segformer-b0-finetuned-ade-512-512@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 512x512x3
⚠️ Blocked: Segmentation mask output format not yet supported. Needs: Core 2D mask output, test framework pixel accuracy validation.

🎭 Multimodal Models

2 models

Clip

Enabled

CLIP - Image-text matching and zero-shot classification

hf/openai/clip-vit-base-patch32@latest
Category
MULTIMODAL
Input Type
Multimodal

Input Specifications

Text + Image input

✅ Fully supported: Multi-encoder architecture (text_model.onnx + vision_model.onnx) enabled in Axon v3.1.6 + Core v3.2.9-alpha
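Zero-shot classification with a dual-encoder model like CLIP reduces to a softmax over the image-text similarity scores. A minimal sketch with made-up logits; in practice the scores come from the two ONNX encoders:

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scaled image-text similarity logits for three candidate captions.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
logits = [24.2, 19.8, 12.1]

probs = softmax(logits)
best = labels[probs.index(max(probs))]
print(best)
```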

Wav2Vec2

Disabled

Wav2Vec2 - Speech recognition

hf/facebook/wav2vec2-base-960h@latest
Category
MULTIMODAL
Input Type
Audio

Input Specifications

⚠️ Audio models require waveform input - pending Core support

🤖 LLM Models (GGUF)

11 models

Large language models in GGUF format. These require the Core GGUF runtime plugin (llama.cpp).
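Under the hood, running a GGUF model amounts to handing the weights file to llama.cpp. A sketch of assembling such an invocation; the `-m`, `-p`, and `-n` flags follow llama.cpp's CLI, while the model path and the helper function are illustrative assumptions:

```python
from pathlib import Path

def build_llama_cmd(model_path, prompt, n_predict):
    # Hypothetical helper: builds an argv list for llama.cpp's CLI binary.
    return [
        "llama-cli",
        "-m", str(model_path),   # path to the .gguf weights
        "-p", prompt,            # prompt text
        "-n", str(n_predict),    # number of tokens to generate
    ]

cmd = build_llama_cmd(Path("models/tinyllama-q4.gguf"), "Hello", 32)
print(" ".join(cmd))
```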

Tinyllama

Enabled

TinyLlama 1.1B - Small but capable chat model

hf/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 32 tokens
Large Test: 256 tokens
⚠️ Native GGUF execution via llama.cpp plugin. 4-bit quantized (~637MB)

Phi2

Disabled

Microsoft Phi-2 - 2.7B parameter small language model

hf/TheBloke/phi-2-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ Ready for GGUF execution. 4-bit quantized (~1.6GB) - disabled for CI performance

Qwen2-0.5B

Enabled

Qwen2 0.5B - Ultra-small instruction-tuned model (Alibaba)

hf/Qwen/Qwen2-0.5B-Instruct-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 32 tokens
Large Test: 128 tokens
⚠️ Tested working. 4-bit quantized (~380MB) - Smallest viable LLM for CI

Llama-3.2-1B

Enabled

Meta Llama 3.2 1B - Latest small model optimized for mobile

hf/bartowski/Llama-3.2-1B-Instruct-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 32 tokens
Large Test: 256 tokens
⚠️ Meta's latest 1B model (Dec 2024). 4-bit quantized (~700MB)

Llama-3.2-3B

Disabled

Meta Llama 3.2 3B - Excellent quality/size ratio

hf/bartowski/Llama-3.2-3B-Instruct-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ Meta's latest 3B model. Best small-model quality. 4-bit quantized (~1.8GB)

Deepseek-Coder-1.3B

Enabled

DeepSeek Coder 1.3B - Code generation specialist

hf/TheBloke/deepseek-coder-1.3b-instruct-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ Code-focused LLM from DeepSeek. 4-bit quantized (~750MB)

Deepseek-Llm-7B

Disabled

DeepSeek LLM 7B Chat - High-quality open model

hf/TheBloke/deepseek-llm-7B-chat-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ DeepSeek's flagship 7B chat model. 4-bit quantized (~4GB) - Local testing only

Stablelm-2-Zephyr

Disabled

StableLM 2 Zephyr 1.6B - Stability AI's chat model

hf/stabilityai/stablelm-2-zephyr-1_6b-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 32 tokens
Large Test: 128 tokens
⚠️ Stability AI's latest small chat model. 4-bit quantized (~1GB) - CI timeout

Gemma-2B

Disabled

Gemma 2B - Google's lightweight open model

hf/google/gemma-2b-it-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ Google's efficient 2B parameter model. 4-bit quantized (~1.5GB) - CI timeout

Openchat-3.5

Disabled

OpenChat 3.5 - High-quality chat model (7B)

hf/TheBloke/openchat-3.5-0106-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ OpenChat 3.5 - reported to outperform ChatGPT on several benchmarks. Q4 (~4GB) - CI timeout

Mistral-7B

Disabled

Mistral 7B Instruct - Leading open 7B model

hf/TheBloke/Mistral-7B-Instruct-v0.2-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 512 tokens
⚠️ Mistral AI's flagship model. Best 7B performance. Q4 (~4.1GB) - CI timeout

➕ How to Add a New Model

  1. Edit config/models.yaml
  2. Add your model under the appropriate category
  3. Set enabled: true to include in tests
  4. Run make config to verify the configuration
  5. Run make test to run the test suite

Example:

my_model:
  enabled: true
  category: nlp
  axon_id: "hf/my-org/my-model@latest"
  description: "My awesome model"
  input_type: text
  small_input:
    tokens: 7
  large_input:
    tokens: 128
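A minimal sketch of the kind of check make config might perform, with a plain dict standing in for the parsed YAML entry. The field names are taken from the example above; the validator itself is hypothetical:

```python
REQUIRED_FIELDS = {"enabled", "category", "axon_id", "description", "input_type"}
VALID_CATEGORIES = {"nlp", "vision", "multimodal", "llm"}

def validate_model(name, cfg):
    # Collect human-readable problems instead of failing on the first one.
    errors = []
    missing = REQUIRED_FIELDS - cfg.keys()
    if missing:
        errors.append(f"{name}: missing fields {sorted(missing)}")
    if cfg.get("category") not in VALID_CATEGORIES:
        errors.append(f"{name}: unknown category {cfg.get('category')!r}")
    if not str(cfg.get("axon_id", "")).startswith("hf/"):
        errors.append(f"{name}: axon_id should start with 'hf/'")
    return errors

# The example entry from this section, as it would look after YAML parsing.
my_model = {
    "enabled": True,
    "category": "nlp",
    "axon_id": "hf/my-org/my-model@latest",
    "description": "My awesome model",
    "input_type": "text",
    "small_input": {"tokens": 7},
    "large_input": {"tokens": 128},
}

print(validate_model("my_model", my_model))
```

An entry that passes yields an empty error list; anything else is reported per field, which keeps config mistakes out of the test run itself.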