Tether is seeking an AI Inference Engineer to build and optimize the C++ inference layer powering local and edge AI. You’ll enhance engines like llama.cpp, ggml, and ONNX to deliver fast, efficient model performance across diverse hardware.
What you’ll do
- Optimize and deploy LLM inference on edge devices
- Improve model load times, performance, and stability
- Collaborate with researchers to move models from research to production
- Integrate cutting-edge AI features into Tether products
What we’re looking for
- Strong C++ expertise with hands-on llama.cpp/ggml experience
- Solid understanding of LLMs, transformers, and deep learning
- Experience with ONNX; JavaScript is a plus
Join Tether and help push the boundaries of efficient, production-ready AI in global digital finance.