Comprehensive testing of Axon and MLOS Core releases
CPU: Used for model inference execution, ONNX Runtime operations, and HTTP request handling.
Memory: Used for model loading, input/output tensor buffers, and ONNX Runtime workspace.
GPU: Not used (CPU-only inference; see the sketch below).
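To make the CPU-only setup concrete, here is a minimal sketch of an ONNX Runtime inference call pinned to the CPU execution provider. The model path and input shape are hypothetical placeholders, not values from this project.

```python
# Minimal sketch: CPU-only inference with ONNX Runtime.
# "model.onnx" and the (1, 3, 224, 224) shape are hypothetical.
import numpy as np
import onnxruntime as ort

# Register only the CPU execution provider, so no GPU is ever used.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

input_name = session.get_inputs()[0].name

# Input/output tensor buffers account for part of the memory footprint.
x = np.zeros((1, 3, 224, 224), dtype=np.float32)

# ONNX Runtime allocates its own workspace for intermediate tensors.
outputs = session.run(None, {input_name: x})
print(outputs[0].shape)
```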
Model installation includes downloading the model from HuggingFace and converting it to ONNX via Docker. This step dominates the total time (~99%), so it is reported separately.
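A minimal sketch of that installation flow is below, assuming a HuggingFace Hub download followed by a containerized conversion. The repo ID, Docker image name, and paths are hypothetical placeholders; the project's actual conversion tooling may differ.

```python
# Sketch of the installation step: download from HuggingFace, then
# run an ONNX conversion inside Docker. All names here are hypothetical.
import subprocess
from huggingface_hub import snapshot_download

# Download model weights and config from the HuggingFace Hub.
local_dir = snapshot_download(repo_id="org/model-name")  # hypothetical repo

# Run the conversion in a container so the toolchain is reproducible.
subprocess.run(
    [
        "docker", "run", "--rm",
        "-v", f"{local_dir}:/model",
        "converter-image:latest",  # hypothetical conversion image
        "--input", "/model",
        "--output", "/model/model.onnx",
    ],
    check=True,  # raise if the conversion container exits non-zero
)
```

Because download and conversion dwarf everything else (~99% of total time), timing them separately keeps the inference benchmarks from being swamped by one-time setup cost.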