Complete transparency on test inputs, expected outputs, and validation results
Model documentation, tokenizer configs, and example inputs from official model cards.
huggingface.co/models
Validated ONNX models with test data and expected outputs for vision models.
github.com/onnx/models
Text generation model - validates output tensor exists
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | output_exists | output_exists | INFO |
| Output Elements | >= 10,000 | 12,288 | PASS |
| Inference Time | - | 42.97 ms | INFO |
Source: HuggingFace Model Hub
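The `output_exists` row above reads as a lower bound on the number of elements in the output tensor. The following is a minimal sketch of such a check, not the harness's actual code: the function name `check_output_exists`, the `min_elements` parameter, and the example shape are all hypothetical.

```python
from math import prod


def check_output_exists(shape, min_elements=10_000):
    """Return (passed, element_count) for an output tensor of the given shape."""
    count = prod(shape)  # total number of elements across all dimensions
    return count >= min_elements, count


# Hypothetical shape, chosen only so the element count matches the 12,288 in the table.
passed, count = check_output_exists((1, 12288))
print("PASS" if passed else "FAIL", count)
```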
Masked language model - validates inference produces output
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | status_success | status_success | INFO |
| Status | success | success | PASS |
| Output Size | >= 100,000 bytes | 4,952,939 bytes | PASS |
| Inference Time | - | 192.32 ms | INFO |
Source: HuggingFace Model Hub
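The `status_success` tables in this report pair a status comparison with an output size threshold. The Output Size row is present in most of them, but not all. One way such a check might be written is sketched below; the function name, the parameter names, and the choice to skip the size check when no size is reported are assumptions, not the harness's documented behavior.

```python
def check_status_success(status, output_size_bytes=None, min_bytes=100_000):
    """Return a dict of per-field results for a status_success validation."""
    results = {"status": status == "success"}
    # Assumption: the size threshold only applies when an output size was recorded.
    if output_size_bytes is not None:
        results["output_size"] = output_size_bytes >= min_bytes
    return results


# Values taken from the masked language model table above.
print(check_status_success("success", 4_952_939))
```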
Robust BERT - validates inference produces output
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | status_success | status_success | INFO |
| Status | success | success | PASS |
| Output Size | >= 100,000 bytes | 7,933,700 bytes | PASS |
| Inference Time | - | 169.59 ms | INFO |
Source: HuggingFace Model Hub
Seq2seq model - encoder-decoder architecture
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | status_success | status_success | INFO |
| Status | success | success | PASS |
| Inference Time | - | 37.52 ms | INFO |
Source: HuggingFace Model Hub
Distilled BERT - validates inference produces output
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | status_success | status_success | INFO |
| Status | success | success | PASS |
| Output Size | >= 100,000 bytes | 116,880 bytes | PASS |
| Inference Time | - | 14.69 ms | INFO |
Source: HuggingFace Model Hub
ALBERT - validates inference produces output
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | status_success | status_success | INFO |
| Status | success | success | PASS |
| Output Size | >= 100,000 bytes | 4,805,253 bytes | PASS |
| Inference Time | - | 103.29 ms | INFO |
Source: HuggingFace Model Hub
Sentence embedding model - validates embedding output
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | status_success | status_success | INFO |
| Status | success | success | PASS |
| Output Size | >= 100,000 bytes | 469,489 bytes | PASS |
| Inference Time | - | 37.09 ms | INFO |
Source: HuggingFace Model Hub
ResNet-50 ImageNet classifier - validates classification output
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | output_shape | output_shape | INFO |
| Output Shape | [1000] | [1000] | PASS |
| Inference Time | - | 72.05 ms | INFO |
Source: ONNX Model Zoo / ImageNet
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | top_k_class_match | top_k_class_match | INFO |
| Expected Class | tabby cat (class 281) or [282, 283, 284, 285] | - | INFO |
| Top-K Threshold | 5 | - | INFO |
| Top-5 Predictions | - | 632(-0.749), 409(-1.264), 818(-2.071), 507(-2.405), 567(-2.755) | INFO |
| Classification Result | Class 281 in top-5 | Class 281 not in top-5 | FAIL |
| Inference Time | - | 72.05 ms | INFO |
Source: ONNX Model Zoo / ImageNet
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | top_k_class_match | top_k_class_match | INFO |
| Expected Class | golden retriever (class 207) or [206, 208, 209] | - | INFO |
| Top-K Threshold | 5 | - | INFO |
| Top-5 Predictions | - | 632(-0.749), 409(-1.264), 818(-2.071), 507(-2.405), 567(-2.755) | INFO |
| Classification Result | Class 207 in top-5 | Class 207 not in top-5 | FAIL |
| Inference Time | - | 72.05 ms | INFO |
Source: ONNX Model Zoo / ImageNet
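The `top_k_class_match` tables list an expected class, a bracketed set of alternative classes, and a top-K threshold of 5. A rough sketch of how such a comparison could work follows. The helper names are hypothetical, and whether the harness counts the bracketed alternatives as a match is an assumption: the Classification Result row only names the primary class. The score vector is synthetic and only reproduces the reported ResNet-50 top-5 values.

```python
import heapq


def top_k_classes(scores, k=5):
    """Return the indices of the k highest scores, best first."""
    return heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)


def check_top_k(scores, accepted_classes, k=5):
    """Pass if any accepted class index appears among the top-k predictions."""
    top = top_k_classes(scores, k)
    return any(cls in accepted_classes for cls in top), top


# Synthetic scores: the five reported values, with every other class set lower.
scores = [-10.0] * 1000
for cls, value in [(632, -0.749), (409, -1.264), (818, -2.071),
                   (507, -2.405), (567, -2.755)]:
    scores[cls] = value

# Assumption: class 281 and the listed alternatives are all accepted.
passed, top = check_top_k(scores, {281, 282, 283, 284, 285})
print(passed, top)  # False [632, 409, 818, 507, 567]
```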
Vision Transformer - validates transformer-based classification
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | output_shape | output_shape | INFO |
| Output Shape | [1000] | [1000] | PASS |
| Inference Time | - | 269.18 ms | INFO |
Source: ONNX Model Zoo / ImageNet
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | top_k_class_match | top_k_class_match | INFO |
| Expected Class | tabby cat (class 281) or [282, 283, 284, 285] | - | INFO |
| Top-K Threshold | 5 | - | INFO |
| Top-5 Predictions | - | 868(5.282), 646(4.419), 599(4.118), 611(4.040), 506(3.681) | INFO |
| Classification Result | Class 281 in top-5 | Class 281 not in top-5 | FAIL |
| Inference Time | - | 269.18 ms | INFO |
Source: ONNX Model Zoo / ImageNet
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | top_k_class_match | top_k_class_match | INFO |
| Expected Class | coffee mug (class 504) or [968] | - | INFO |
| Top-K Threshold | 5 | - | INFO |
| Top-5 Predictions | - | 868(5.282), 646(4.419), 599(4.118), 611(4.040), 506(3.681) | INFO |
| Classification Result | Class 504 in top-5 | Class 504 not in top-5 | FAIL |
| Inference Time | - | 269.18 ms | INFO |
Source: ONNX Model Zoo / ImageNet
ConvNeXt - modern CNN architecture
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | output_shape | output_shape | INFO |
| Output Shape | [1000] | [1000] | PASS |
| Inference Time | - | 96.93 ms | INFO |
Source: HuggingFace Model Hub
MobileNetV2 - efficient mobile classifier
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | output_shape | output_shape | INFO |
| Output Shape | [1001] | [1001] | PASS |
| Inference Time | - | 34.66 ms | INFO |
Source: ONNX Model Zoo / ImageNet
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | top_k_class_match | top_k_class_match | INFO |
| Expected Class | sports car (class 817) or [511, 609, 627, 656, 717, 751, 864] | - | INFO |
| Top-K Threshold | 5 | - | INFO |
| Top-5 Predictions | - | 972(8.685), 712(6.836), 645(6.623), 620(6.241), 563(6.176) | INFO |
| Classification Result | Class 817 in top-5 | Class 817 not in top-5 | FAIL |
| Inference Time | - | 34.66 ms | INFO |
Source: ONNX Model Zoo / ImageNet
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | top_k_class_match | top_k_class_match | INFO |
| Expected Class | analog clock (class 409) or [530, 892] | - | INFO |
| Top-K Threshold | 5 | - | INFO |
| Top-5 Predictions | - | 972(8.685), 712(6.836), 645(6.623), 620(6.241), 563(6.176) | INFO |
| Classification Result | Class 409 in top-5 | Class 409 not in top-5 | FAIL |
| Inference Time | - | 34.66 ms | INFO |
Source: ONNX Model Zoo / ImageNet
DeiT - data-efficient ViT
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | output_shape | output_shape | INFO |
| Output Shape | [1000] | [1000] | PASS |
| Inference Time | - | 85.85 ms | INFO |
Source: HuggingFace Model Hub
EfficientNet-B0 - compound scaled CNN
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | output_shape | output_shape | INFO |
| Output Shape | [1000] | [1000] | PASS |
| Inference Time | - | 35.01 ms | INFO |
Source: HuggingFace Model Hub
CLIP - image-text similarity model
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | status_success | status_success | INFO |
| Status | success | success | PASS |
| Inference Time | - | 165.53 ms | INFO |
Source: HuggingFace Model Hub
TinyLlama 1.1B GGUF - validates text generation
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | generation_contains | generation_contains | INFO |
| Expected Keywords | ['Paris'] | Found: ['Paris'] | PASS |
| Generated Text | (any containing keywords) | "{'generated_text': '\nYes, the capital of France is Paris.', 'tokens_generated': 10}" | MATCH |
| Inference Time | - | 453.90 ms | INFO |
Source: HuggingFace Model Hub
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | generation_contains | generation_contains | INFO |
| Expected Keywords | ['Einstein', 'relativity', 'physics', 'time', 'space'] | Found: ['Einstein', 'relativity', 'physics', 'time', 'space'] | PASS |
| Generated Text | (any containing keywords) | "{'generated_text': '\n\nRelativity is a theory that describes the behavior of matter and energy in space and time. It is based on the principle of relativity, which states that the laws of physics are..." | MATCH |
| Inference Time | - | 6574.45 ms | INFO |
Source: HuggingFace Model Hub
Qwen2 0.5B GGUF - validates instruction following
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | generation_contains | generation_contains | INFO |
| Expected Keywords | ['4'] | Found: ['4'] | PASS |
| Generated Text | (any containing keywords) | "{'generated_text': '4', 'tokens_generated': 1}" | MATCH |
| Inference Time | - | 173.20 ms | INFO |
Source: HuggingFace Model Hub
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | generation_contains | generation_contains | INFO |
| Expected Keywords | ['AI', 'learning', 'neural', 'model'] | Found: ['learning', 'neural', 'model'] | PASS |
| Generated Text | (any containing keywords) | "{'generated_text': 'The key developments in artificial intelligence over the past decade include:\n\n1. Deep Learning: Deep learning is a type of artificial intelligence that uses neural networks to l..." | MATCH |
| Inference Time | - | 4942.51 ms | INFO |
Source: HuggingFace Model Hub
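In the table above, three of the four expected keywords are found and the row is still marked PASS. This suggests the `generation_contains` check passes when at least one keyword appears in the generated text. The sketch below shows that interpretation only. The function name is hypothetical, the case-sensitive substring match is an assumption (the truncated output in the report does not settle it), and the sample sentence is illustrative rather than actual model output.

```python
def check_generation_contains(text, keywords):
    """Return (passed, found), where found lists the keywords present in text."""
    # Assumption: plain case-sensitive substring matching.
    found = [kw for kw in keywords if kw in text]
    # Assumption: one matching keyword is enough to pass.
    return bool(found), found


# Illustrative sentence, not taken from the model's output.
sample = "Deep learning uses neural networks as the model."
passed, found = check_generation_contains(sample, ["AI", "learning", "neural", "model"])
print(passed, found)  # True ['learning', 'neural', 'model']
```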
Llama 3.2 1B GGUF - validates instruction following
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | generation_contains | generation_contains | INFO |
| Expected Keywords | ['Tokyo'] | Found: ['Tokyo'] | PASS |
| Generated Text | (any containing keywords) | "{'generated_text': 'Tokyo.', 'tokens_generated': 3}" | MATCH |
| Inference Time | - | 406.84 ms | INFO |
Source: HuggingFace Model Hub
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | generation_contains | generation_contains | INFO |
| Expected Keywords | ['data', 'learn', 'train', 'model', 'algorithm'] | Found: ['data', 'learn', 'train', 'model'] | PASS |
| Generated Text | (any containing keywords) | "{'generated_text': "Machine learning is a way for computers to learn from data and make predictions or decisions on their own. Here are the simple principles of machine learning:\n\n**1. Data Collecti..." | MATCH |
| Inference Time | - | 8173.24 ms | INFO |
Source: HuggingFace Model Hub
DeepSeek Coder 1.3B GGUF - validates code generation
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | generation_contains | generation_contains | INFO |
| Expected Keywords | ['def add', 'return'] | Found: ['def add', 'return'] | PASS |
| Generated Text | (any containing keywords) | "{'generated_text': 'def add(num1, num2):\n return num1 + num2\n\n<|assistant|>\nprint(add(5, 3))\n\n<|assistant|>\nprint(add(10, 20))\n\n<|assistant|>\n', 'tokens_generated': 64}" | MATCH |
| Inference Time | - | 2650.28 ms | INFO |
Source: HuggingFace Model Hub
| Field | Expected | Actual | Result |
|---|---|---|---|
| Validation Type | generation_contains | generation_contains | INFO |
| Expected Keywords | ['def', 'binary', 'return'] | Found: ['def', 'binary', 'return'] | PASS |
| Generated Text | (any containing keywords) | "{'generated_text': 'Sure, here is a Python function that implements binary search on a sorted list:\n\n```python\ndef binary_search(arr, low, high, x):\n \n if high >= low:\n \n mid = (high ..." | MATCH |
| Inference Time | - | 9015.38 ms | INFO |
Source: HuggingFace Model Hub