
📦 Model Configuration

Models available for E2E testing

Configuration Summary

Total Models: 55
Enabled: 32
NLP: 23
Vision: 19
Multimodal: 2
LLM: 11
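The summary counts above can be derived directly from the model config. A minimal sketch, assuming a dict shaped like the per-model entries in config/models.yaml (the toy data here is illustrative, not the full 55-model list):

```python
from collections import Counter

# Toy stand-in for the parsed models.yaml: each entry carries the
# "category" and "enabled" fields shown in this report.
models = {
    "gpt2":      {"category": "nlp", "enabled": True},
    "xlnet":     {"category": "nlp", "enabled": False},
    "resnet":    {"category": "vision", "enabled": True},
    "clip":      {"category": "multimodal", "enabled": True},
    "tinyllama": {"category": "llm", "enabled": True},
}

total = len(models)
enabled = sum(1 for m in models.values() if m["enabled"])
by_category = Counter(m["category"] for m in models.values())

print(total, enabled, dict(by_category))
```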

📝 NLP Models

23 models

Gpt2

Enabled

DistilGPT-2 - Lightweight text generation

hf/distilgpt2@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Bert

Enabled

BERT base - Masked language model

hf/bert-base-uncased@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Roberta

Enabled

RoBERTa base - Robust BERT variant

hf/roberta-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

T5

Enabled

T5 small - Text-to-text transformer

hf/t5-small@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens
✅ Fully supported: Encoder-decoder architecture enabled in Axon v3.1.6 + Core v3.2.9-alpha

Distilbert

Enabled

DistilBERT - Smaller, faster BERT variant

hf/distilbert-base-uncased@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Albert

Enabled

ALBERT - Parameter-efficient BERT variant

hf/albert-base-v2@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Sentence-Transformers

Enabled

Sentence-BERT - Text embeddings for semantic search

hf/sentence-transformers/all-MiniLM-L6-v2@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 128 tokens
⚠️ Produces 384-dimensional embeddings
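Embedding models like this one are typically consumed via cosine similarity for semantic search. A minimal stdlib-only sketch; the toy 3-dim vectors stand in for the model's real 384-dim embeddings:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for 384-dimensional sentence embeddings.
query = [0.1, 0.9, 0.2]
doc = [0.2, 0.8, 0.1]
print(cosine_similarity(query, doc))
```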

Xlnet

Disabled

XLNet - Generalized autoregressive pretraining

hf/xlnet-base-cased@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens
⚠️ ONNX conversion fails - xlnet not in supported model list

Electra

Disabled

ELECTRA - Efficient pre-training with replaced token detection

hf/google/electra-base-discriminator@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Deberta

Disabled

DeBERTa - Decoding-enhanced BERT with disentangled attention

hf/microsoft/deberta-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Deberta-V3

Disabled

DeBERTa v3 - Latest version with improved performance

hf/microsoft/deberta-v3-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Mpnet

Disabled

MPNet - Masked and Permuted Pre-training

hf/microsoft/mpnet-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Xlm-Roberta

Disabled

XLM-RoBERTa - Cross-lingual RoBERTa (100 languages)

hf/xlm-roberta-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Distilroberta

Enabled

DistilRoBERTa - Distilled RoBERTa for faster inference

hf/distilroberta-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Squeezebert

Enabled

SqueezeBERT - Mobile-optimized BERT variant

hf/squeezebert/squeezebert-uncased@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Minilm

Enabled

MiniLM - Compact model with deep self-attention distillation

hf/microsoft/MiniLM-L12-H384-uncased@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Bart-Base

Enabled

BART base - Denoising autoencoder for pretraining

hf/facebook/bart-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 7 tokens
Large Test: 128 tokens

Bge-Small

Enabled

BGE Small - BAAI General Embedding (384-dim)

hf/BAAI/bge-small-en-v1.5@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ State-of-the-art embedding model from BAAI

Bge-Base

Enabled

BGE Base - BAAI General Embedding (768-dim)

hf/BAAI/bge-base-en-v1.5@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ Higher quality embeddings, larger model

E5-Small

Enabled

E5 Small - Text embeddings for retrieval (384-dim)

hf/intfloat/e5-small-v2@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ Microsoft E5 embeddings

E5-Base

Enabled

E5 Base - Text embeddings for retrieval (768-dim)

hf/intfloat/e5-base-v2@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ Microsoft E5 embeddings - larger variant

Gte-Small

Enabled

GTE Small - General Text Embeddings (384-dim)

hf/thenlper/gte-small@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ Alibaba GTE embeddings

Gte-Base

Enabled

GTE Base - General Text Embeddings (768-dim)

hf/thenlper/gte-base@latest
Category
NLP
Input Type
Text

Input Specifications

Small Test: 16 tokens
Large Test: 512 tokens
⚠️ Alibaba GTE embeddings - larger variant

👁️ Vision Models

19 models

Resnet

Enabled

ResNet-50 - Image classification (1000 classes)

hf/microsoft/resnet-50@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Vit

Enabled

Vision Transformer (ViT) - Image classification

hf/google/vit-base-patch16-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Convnext

Enabled

ConvNeXt Tiny - Modern CNN architecture

hf/facebook/convnext-tiny-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Mobilenet

Enabled

MobileNetV2 - Efficient mobile architecture

hf/google/mobilenet_v2_1.0_224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Deit

Enabled

DeiT Small - Data-efficient Image Transformer

hf/facebook/deit-small-patch16-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Swin

Disabled

Swin Transformer - Shifted window attention

hf/microsoft/swin-tiny-patch4-window7-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3
⚠️ Known PyTorch-to-ONNX conversion issues

Efficientnet

Enabled

EfficientNet-B0 - Compound scaling CNN

hf/google/efficientnet-b0@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Regnet

Enabled

RegNet - Designing Network Design Spaces

hf/facebook/regnet-y-040@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Beit

Enabled

BEiT - BERT Pre-Training of Image Transformers

hf/microsoft/beit-base-patch16-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Dinov2

Disabled

DINOv2 - Self-supervised vision transformer

hf/facebook/dinov2-base@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Poolformer

Enabled

PoolFormer - MetaFormer baseline with pooling

hf/sail/poolformer_s12@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Levit

Disabled

LeViT - Vision Transformer in ConvNet's Clothing

hf/facebook/levit-128@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Efficientnet-B1

Disabled

EfficientNet-B1 - Larger EfficientNet variant

hf/google/efficientnet-b1@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 240x240x3

Efficientnet-B2

Disabled

EfficientNet-B2 - Medium EfficientNet variant

hf/google/efficientnet-b2@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 260x260x3

Densenet

Disabled

DenseNet-121 - Densely Connected Convolutional Networks

hf/facebook/densenet-121@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Convnext-Small

Enabled

ConvNeXt Small - Larger ConvNeXt variant

hf/facebook/convnext-small-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Vit-Large

Disabled

ViT Large - Large Vision Transformer

hf/google/vit-large-patch16-224@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 224x224x3

Detr

Disabled

DETR - End-to-end object detection with transformer

hf/facebook/detr-resnet-50@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 800x600x3
⚠️ Blocked: Object detection output format not yet supported. Needs: Core bbox parsing, test framework IoU validation, reference annotations.

Segformer

Disabled

SegFormer - Semantic segmentation

hf/nvidia/segformer-b0-finetuned-ade-512-512@latest
Category
VISION
Input Type
Image

Input Specifications

Small Test: 64x64x3
Large Test: 512x512x3
⚠️ Blocked: Segmentation mask output format not yet supported. Needs: Core 2D mask output, test framework pixel accuracy validation.

🎭 Multimodal Models

2 models

Clip

Enabled

CLIP - Image-text matching and zero-shot classification

hf/openai/clip-vit-base-patch32@latest
Category
MULTIMODAL
Input Type
Multimodal

Input Specifications

Text + Image input

✅ Fully supported: Multi-encoder architecture (text_model.onnx + vision_model.onnx) enabled in Axon v3.1.6 + Core v3.2.9-alpha
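Zero-shot classification with a dual-encoder model like CLIP reduces to a softmax over the image-text similarity scores. A minimal sketch with made-up logits; in practice the scores come from the two ONNX encoders:

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scaled image-text similarity logits for three candidate captions.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
logits = [24.2, 19.8, 12.1]

probs = softmax(logits)
best = labels[probs.index(max(probs))]
print(best)
```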

Wav2Vec2

Disabled

Wav2Vec2 - Speech recognition

hf/facebook/wav2vec2-base-960h@latest
Category
MULTIMODAL
Input Type
Audio

Input Specifications

⚠️ Audio models require waveform input - pending Core support

🤖 LLM Models (GGUF)

11 models

Large language models in GGUF format. These require the Core GGUF runtime plugin (llama.cpp).
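Under the hood, running a GGUF model amounts to handing the weights file to llama.cpp. A sketch of assembling such an invocation; the `-m`, `-p`, and `-n` flags follow llama.cpp's CLI, while the model path and the helper function are illustrative assumptions:

```python
from pathlib import Path

def build_llama_cmd(model_path, prompt, n_predict):
    # Hypothetical helper: builds an argv list for llama.cpp's CLI binary.
    return [
        "llama-cli",
        "-m", str(model_path),   # path to the .gguf weights
        "-p", prompt,            # prompt text
        "-n", str(n_predict),    # number of tokens to generate
    ]

cmd = build_llama_cmd(Path("models/tinyllama-q4.gguf"), "Hello", 32)
print(" ".join(cmd))
```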

Tinyllama

Enabled

TinyLlama 1.1B - Small but capable chat model

hf/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 32 tokens
Large Test: 256 tokens
⚠️ Native GGUF execution via llama.cpp plugin. 4-bit quantized (~637MB)

Phi2

Disabled

Microsoft Phi-2 - 2.7B parameter small language model

hf/TheBloke/phi-2-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ Ready for GGUF execution. 4-bit quantized (~1.6GB) - disabled for CI performance

Qwen2-0.5B

Enabled

Qwen2 0.5B - Ultra-small instruction-tuned model (Alibaba)

hf/Qwen/Qwen2-0.5B-Instruct-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 32 tokens
Large Test: 128 tokens
⚠️ Tested working. 4-bit quantized (~380MB) - Smallest viable LLM for CI

Llama-3.2-1B

Enabled

Meta Llama 3.2 1B - Latest small model optimized for mobile

hf/bartowski/Llama-3.2-1B-Instruct-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 32 tokens
Large Test: 256 tokens
⚠️ Meta's latest 1B model (Dec 2024). 4-bit quantized (~700MB)

Llama-3.2-3B

Disabled

Meta Llama 3.2 3B - Excellent quality/size ratio

hf/bartowski/Llama-3.2-3B-Instruct-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ Meta's latest 3B model. Best small-model quality. 4-bit quantized (~1.8GB)

Deepseek-Coder-1.3B

Enabled

DeepSeek Coder 1.3B - Code generation specialist

hf/TheBloke/deepseek-coder-1.3b-instruct-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ Code-focused LLM from DeepSeek. 4-bit quantized (~750MB)

Deepseek-Llm-7B

Disabled

DeepSeek LLM 7B Chat - High-quality open model

hf/TheBloke/deepseek-llm-7B-chat-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ DeepSeek's flagship 7B chat model. 4-bit quantized (~4GB) - Local testing only

Stablelm-2-Zephyr

Disabled

StableLM 2 Zephyr 1.6B - Stability AI's chat model

hf/stabilityai/stablelm-2-zephyr-1_6b-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 32 tokens
Large Test: 128 tokens
⚠️ Stability AI's latest small chat model. 4-bit quantized (~1GB) - CI timeout

Gemma-2B

Disabled

Gemma 2B - Google's lightweight open model

hf/google/gemma-2b-it-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ Google's efficient 2B parameter model. 4-bit quantized (~1.5GB) - CI timeout

Openchat-3.5

Disabled

OpenChat 3.5 - High-quality chat model (7B)

hf/TheBloke/openchat-3.5-0106-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 256 tokens
⚠️ OpenChat 3.5 - reported to outperform ChatGPT on several benchmarks. Q4 (~4GB) - CI timeout

Mistral-7B

Disabled

Mistral 7B Instruct - Leading open 7B model

hf/TheBloke/Mistral-7B-Instruct-v0.2-GGUF@latest
Category
LLM
Input Type
Text_Generation

Input Specifications

Format: GGUF
Small Test: 64 tokens
Large Test: 512 tokens
⚠️ Mistral AI's flagship model. Best 7B performance. Q4 (~4.1GB) - CI timeout

➕ How to Add a New Model

  1. Edit config/models.yaml
  2. Add your model under the appropriate category
  3. Set enabled: true to include in tests
  4. Run make config to verify the configuration
  5. Run make test to run the test suite

Example:

my_model:
  enabled: true
  category: nlp
  axon_id: "hf/my-org/my-model@latest"
  description: "My awesome model"
  input_type: text
  small_input:
    tokens: 7
  large_input:
    tokens: 128
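A minimal sketch of the kind of check make config might perform, with a plain dict standing in for the parsed YAML entry. The field names are taken from the example above; the validator itself is hypothetical:

```python
REQUIRED_FIELDS = {"enabled", "category", "axon_id", "description", "input_type"}
VALID_CATEGORIES = {"nlp", "vision", "multimodal", "llm"}

def validate_model(name, cfg):
    # Collect human-readable problems instead of failing on the first one.
    errors = []
    missing = REQUIRED_FIELDS - cfg.keys()
    if missing:
        errors.append(f"{name}: missing fields {sorted(missing)}")
    if cfg.get("category") not in VALID_CATEGORIES:
        errors.append(f"{name}: unknown category {cfg.get('category')!r}")
    if not str(cfg.get("axon_id", "")).startswith("hf/"):
        errors.append(f"{name}: axon_id should start with 'hf/'")
    return errors

# The example entry from this section, as it would look after YAML parsing.
my_model = {
    "enabled": True,
    "category": "nlp",
    "axon_id": "hf/my-org/my-model@latest",
    "description": "My awesome model",
    "input_type": "text",
    "small_input": {"tokens": 7},
    "large_input": {"tokens": 128},
}

print(validate_model("my_model", my_model))
```

An entry that passes yields an empty error list; anything else is reported per field, which keeps config mistakes out of the test run itself.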