Senior ML Solutions Architect

 

Description:

Senior ML Solutions Architect (AI Inference)

United States 

The Company

Our client is a well-funded cloud platform provider building critical infrastructure for the global AI ecosystem. They offer a serverless AI inference platform that enables businesses to deploy and scale open-source large language models (LLMs) without the burden of managing GPU infrastructure or building specialised in-house ML teams. Backed by significant investment and led by a world-class engineering team, they are scaling rapidly across the US and internationally.

The Role

This is a high-impact, customer-facing technical role sitting at the intersection of ML engineering and solutions architecture. You will work directly with customers to design, build, and deploy production-grade applications on the company’s serverless LLM inference platform. You’ll be the technical bridge between the platform and its users — architecting custom solutions, driving adoption, and feeding critical insights back to the product and engineering teams.

 

What You’ll Do

  • Design and implement bespoke LLM-based solutions using serverless AI inference services to solve real business problems and drive customer outcomes.
  • Build and demonstrate production-ready applications leveraging LLM APIs across multimodal capabilities (text, vision, audio) and domain-specific models.
  • Provide expert technical guidance on prompt engineering, RAG architectures, model selection, and inference optimisation.
  • Collaborate closely with product and engineering teams to relay customer feedback and shape the platform’s roadmap.
  • Guide customers through the full lifecycle, from proof of concept to production, ensuring solutions are performant, reliable, and cost-effective.

 

What You’ll Bring

  • 5+ years of experience in ML/AI systems, with at least 2 years focused on LLMs and generative AI.
  • Deep knowledge of the LLM ecosystem: model architectures, training, and fine-tuning methodologies.
  • Hands-on experience with prompt engineering, end-to-end LLM pipeline development, and evaluation.
  • Practical experience with agentic frameworks and related tooling (LangChain, LangSmith, smolagents, or equivalents).
  • Experience implementing vector databases and RAG patterns in production environments.
  • Strong Python proficiency and experience deploying LLM applications via APIs.

 

Nice to Have

  • Experience with inference frameworks such as vLLM, SGLang, TensorRT-LLM, or Hugging Face Transformers.
  • Knowledge of inference optimisation techniques: quantisation, dynamic batching, KV caching, model routing.
  • Experience with multimodal AI models (vision-language, speech).
  • Proficiency with Docker, Kubernetes, and MLOps tooling.
  • Open-source ML/AI contributions.

Organization: Asobbi
Industry: IT / Telecom / Software Jobs
Occupational Category: Senior ML Solutions Architect
Job Location: New York, USA
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Experienced Professional
Experience: 5 Years
Posted at: 2026-03-06 12:43 pm
Expires on: 2026-04-20