AI Inference Engineer at Tether
Tether is seeking an AI Inference Engineer to build and optimize the C++ inference layer powering local and edge AI. You’ll enhance engines such as llama.cpp, ggml, and ONNX Runtime to deliver fast, efficient model performance across diverse hardware.

What you’ll do

  • Optimize and deploy LLM inference on edge devices
  • Improve model load times, performance, and stability
  • Collaborate with researchers to move models from research to production
  • Integrate cutting-edge AI features into Tether products

What we’re looking for

  • Strong C++ expertise with hands-on llama.cpp/ggml experience
  • Solid understanding of LLMs, transformers, and deep learning
  • Experience with ONNX; JavaScript is a plus

Join Tether and help push the boundaries of efficient, production-ready AI in global digital finance.

Apply Here
