Simplify Orchestration & Optimization for AI Accelerators 

Drive Software Architecture to Achieve Scalable AI ops 

Achieve up to 70% GPU utilization.

  • US-based startup (Series A)
  • Company Size: 30
  • 100% REMOTE (in North America)
  • Compensation: Base Salary 200k - 300k + Equity

 

Key Responsibilities:

  • GPU Optimization for ML/AI: Lead the continued expansion into GPU optimization for ML/AI workloads.
  • Design and deliver the Autonomous Cloud Optimization product.
  • Port the architecture to Kubernetes.
  • Establish product architecture and development plans.
  • Customer and Sales Support: Provide training and documentation to support sales and customers.

Qualifications:

  • Entrepreneurial mindset and startup experience.
  • 10+ years of infrastructure-level software architecture and development.

 

Extensive Experience:

  • Linux, Virtualization platforms (hands-on)
  • AWS, GCP or Azure.

 

Strong experience: 

  • Kubernetes-based ML/AI systems (Kubeflow, Kueue, KServe, GPU Operators, DRA, Karpenter)

 

Deep knowledge: 

  • ML/AI use cases & customer stories of model development, training, inference, & hardware accelerator usage (CPU, GPU, TPU).
  • Modern cloud-native architectures (scalability, availability, reliability, security, observability).
  • Proven track record of delivering complex distributed systems.
  • Active involvement in open-source communities, particularly CNCF and related projects.
  • Strong leadership and team collaboration skills.
  • Excellent communication skills, both verbal and written.

 

Preferred Qualifications:

  • Knowledge of additional ML/AI frameworks and tools.
  • Experience in DevOps practices and tools.
  • Certification in Kubernetes or related technologies.
  • Awareness of FinOps and SRE best practices.
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.