Location: San Francisco, CA (4 days in office)
Client: A healthcare-sector client of Ashton North
Ashton North’s client is seeking a Staff Software Engineer to design, build, and scale backend systems that power Large Language Model (LLM) applications in the healthcare sector. This is a high-impact role for an engineer who thrives at the intersection of backend architecture and applied AI—developing APIs, pipelines, and infrastructure that make LLMs reliable, secure, and cost-efficient in production.
If you’re passionate about taking LLMs beyond demos and embedding them into mission-critical healthcare workflows, this role offers the chance to shape the future of AI-driven innovation in a regulated, high-stakes environment.
Responsibilities:
Backend Architecture for LLMs: Design and implement scalable, low-latency APIs and services that orchestrate, optimize, and secure LLMs for healthcare applications.
Data & Retrieval Pipelines: Build data ingestion, preprocessing, and retrieval-augmented generation (RAG) pipelines to ground LLMs in clinical and revenue-cycle datasets.
Systems Reliability: Drive high availability, observability, and fault tolerance across distributed systems handling sensitive healthcare data.
Operational Excellence: Apply best practices in MLOps and LLMOps—ensuring continuous integration, evaluation, and monitoring of models in production.
Cross-Functional Collaboration: Partner closely with AI researchers, product teams, and healthcare domain experts to translate requirements into robust technical solutions.
Experience:
5+ years of backend or full-stack software engineering experience.
3+ years of hands-on experience developing ML or LLM-enabled applications.
Proven experience applying LLMs in healthcare or other regulated industries (familiarity with FHIR, HL7, and HIPAA preferred).
Technical Skills:
Strong coding proficiency in Python.
Experience with LLM integration frameworks and RAG pipelines.
Deep understanding of distributed systems, API design, and microservices architecture.
Expertise with cloud-native technologies (AWS, GCP, or Azure), containerization and orchestration (Docker, Kubernetes), and infrastructure-as-code tooling (Terraform).
Familiarity with MLOps/LLMOps concepts including CI/CD for models, evaluation harnesses, and reproducibility.
Additional coding proficiency in a statically typed language (Go, Java, or TypeScript) is a plus.
Education:
Bachelor’s degree in Computer Science or a related technical field from a leading institution.
Additional Qualifications:
Strong communication and collaboration skills—comfortable interfacing directly with customers and internal stakeholders.
Highly analytical with the ability to balance speed and precision in complex, data-driven environments.
Entrepreneurial mindset with a drive to build systems that push AI into real-world production.
This is an on-site role requiring 4 days per week in the San Francisco office.