🚀 MLOS Release E2E Validation

Comprehensive testing of Axon and MLOS Core releases

Overall Status: 100.0%
Total Duration: 808.3s
Inferences: 36/36
Models Tested: 18

📦 Release Versions

Axon Version: v3.1.7
MLOS Core Version: 4.1.3-alpha
Runtime Mode: Userspace only (no kernel optimizations) | Kernel Module: No
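
The userspace-only claim can be spot-checked on the host by confirming that no MLOS kernel module is loaded. A minimal sketch, assuming a hypothetical module name `mlos_core` (the actual name, if any, would come from the release notes):

```python
from pathlib import Path

def kernel_module_loaded(name: str) -> bool:
    """Return True if `name` appears as a loaded module in /proc/modules."""
    # /proc/modules lists one loaded module per line; the name is the first field.
    for line in Path("/proc/modules").read_text().splitlines():
        if line.split(maxsplit=1)[0] == name:
            return True
    return False

# "mlos_core" is a hypothetical module name used purely for illustration.
print("kernel module loaded:", kernel_module_loaded("mlos_core"))
```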

💻 Hardware Specifications

Operating System: Ubuntu 24.04.3 LTS (Linux, x86_64)
CPU: AMD EPYC 7763 64-Core Processor | Cores: 4 | Threads: 4
Memory: 15 GB
GPU: none (Count: 0)
Disk: 72 GB total | Available: 21 GB

📊 Resource Usage

MLOS Core (Idle): CPU: 0.0% | Memory: 18 MB
MLOS Core (Under Load): CPU: 17.8% (max: 17.8%) | Memory: 9908 MB (max: 9908 MB)
Axon: CPU: 0% | Memory: 0 MB
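
The report does not say how these per-process figures were collected; one common approach is sampling with psutil. A minimal sketch under that assumption (the PID is a placeholder, and this is not necessarily the harness the report used):

```python
import psutil

def sample_process(pid: int) -> tuple[float, float]:
    """Sample CPU percent and resident memory (MB) for one process."""
    proc = psutil.Process(pid)
    cpu = proc.cpu_percent(interval=1.0)           # blocks 1 s, then reports usage
    rss_mb = proc.memory_info().rss / (1024 ** 2)  # resident set size in MB
    return cpu, rss_mb

# Example: sample whichever PID the MLOS Core server runs under.
# (PID discovery is environment-specific; 1234 is a placeholder.)
print(sample_process(1234))
```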

What Each Resource Is Used For

CPU: model inference execution, ONNX Runtime operations, and HTTP request handling.
Memory: model loading, input/output tensor buffers, and the ONNX Runtime workspace.
GPU: not used (CPU-only inference).
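
To make that CPU/memory split concrete, the sketch below shows a minimal CPU-only ONNX Runtime session of the kind this run exercises; the model path and input shape are placeholders, not artifacts from the release under test:

```python
import numpy as np
import onnxruntime as ort

# Memory cost: loading the model graph and allocating the runtime workspace.
# CPUExecutionProvider pins execution to the CPU, matching this GPU-less run.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path; any exported ONNX model works here
    providers=["CPUExecutionProvider"],
)

# CPU cost: running the graph; tensor buffers hold the inputs and outputs.
input_name = session.get_inputs()[0].name
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)  # example vision input
outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)
```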

โฑ๏ธ Installation & Setup Times

Axon Download Time: 565 ms
Core Download Time: 2.0s
Core Startup Time: 1.1s
Total Model Install Time: 12.6 min

🚀 Inference Performance

🔤 NLP Models

ALBERT: ✅ | Small inference: 202 ms | Large inference: 917 ms
ROBERTA: ✅ | Small inference: 249 ms | Large inference: 1.3s
SENTENCE-TRANSFORMERS: ✅ | Small inference: 133 ms | Large inference: 212 ms
T5: ✅ | Small inference: 122 ms | Large inference: 215 ms
DISTILBERT: ✅ | Small inference: 117 ms | Large inference: 253 ms
GPT2: ✅ | Small inference: 472 ms | Large inference: 329 ms
BERT: ✅ | Small inference: 346 ms | Large inference: 1.3s

๐Ÿ‘๏ธ Vision Models

CONVNEXT: ✅ | Small inference: 1.2s | Large inference: 1.2s
RESNET: ✅ | Small inference: 1.2s | Large inference: 1.2s
EFFICIENTNET: ✅ | Small inference: 1.1s | Large inference: 1.1s
VIT: ✅ | Small inference: 1.3s | Large inference: 1.3s
MOBILENET: ✅ | Small inference: 1.1s | Large inference: 1.1s
DEIT: ✅ | Small inference: 1.2s | Large inference: 1.2s

🎨 Multimodal Models

CLIP: ✅ | Small inference: 1.2s | Large inference: 1.2s

🤖 LLM Models

LLAMA-3.2-1B: ✅ | Small inference: 1.3s | Large inference: 1.3s
QWEN2-0.5B: ✅ | Small inference: 691 ms | Large inference: 5.3s
TINYLLAMA: ✅ | Small inference: 599 ms | Large inference: 6.9s
DEEPSEEK-CODER-1.3B: ✅ | Small inference: 1.6s | Large inference: 1.5s

🤖 Model Support by Category

😊 NLP Models

  • ✅ GPT-2
  • ✅ BERT
  • ✅ RoBERTa
  • ✅ T5
Status: ✅ Passing

🔥 Vision Models

  • ✅ ResNet-50
  • ✅ ViT
  • ✅ ConvNeXt
  • ✅ MobileNet
  • ✅ DeiT
  • ✅ EfficientNet
Status: ✅ Passing

🎨 Multi-Modal

  • ✅ CLIP
  • ⏳ Wav2Vec2 (audio)
Status: ✅ Passing

📊 Model Details

🔤 NLP Models

ALBERT: ✅ | Install: 25.4s | Register: 311 ms
ROBERTA: ✅ | Install: 1.9 min | Register: 595 ms
SENTENCE-TRANSFORMERS: ✅ | Install: 25.8s | Register: 249 ms
T5: ✅ | Install: 1.2 min | Register: 495 ms
DISTILBERT: ✅ | Install: 45.7s | Register: 323 ms
GPT2: ✅ | Install: 1.2 min | Register: 762 ms
BERT: ✅ | Install: 32.8s | Register: 880 ms

๐Ÿ‘๏ธ Vision Models

CONVNEXT: ✅ | Install: 28.4s | Register: 189 ms
RESNET: ✅ | Install: 27.2s | Register: 201 ms
EFFICIENTNET: ✅ | Install: 18.6s | Register: 164 ms
VIT: ✅ | Install: 56.7s | Register: 420 ms
MOBILENET: ✅ | Install: 16.7s | Register: 135 ms
DEIT: ✅ | Install: 25.8s | Register: 200 ms

🎨 Multimodal Models

CLIP: ✅ | Install: 58.7s | Register: 822 ms

🤖 LLM Models

LLAMA-3.2-1B: ✅ | Install: 36.0s | Register: 875 ms
QWEN2-0.5B: ✅ | Install: 18.1s | Register: 367 ms
TINYLLAMA: ✅ | Install: 36.1s | Register: 583 ms
DEEPSEEK-CODER-1.3B: ✅ | Install: 47.4s | Register: 794 ms

📈 Performance Breakdown

📦 Model Installation (not shown in chart): 12.6 min

Model installation covers downloading each model from HuggingFace and converting it to ONNX via Docker. At 12.6 of the ~13.5 minutes of total run time (~94%), it dominates the run, so it is reported separately from the per-inference timings.
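
The conversion pipeline itself is not part of this report, but the download-and-convert step it describes can be approximated with Hugging Face Optimum. A minimal sketch, run outside the report's Docker-based converter and using `distilbert-base-uncased` purely as an example (the export settings the release actually uses are unknown):

```python
from optimum.onnxruntime import ORTModelForFeatureExtraction
from transformers import AutoTokenizer

model_id = "distilbert-base-uncased"  # example only; the run covered 18 models

# Download the model from HuggingFace and export it to ONNX in one step.
model = ORTModelForFeatureExtraction.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Persist the converted artifacts; this is roughly what a runtime would register.
model.save_pretrained("onnx/distilbert")
tokenizer.save_pretrained("onnx/distilbert")
```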