SalaryPeak

AI Engineer

ELLIOTT MOSS CONSULTING PTE. LTD.
Singapore | 3+ years | Posted Jan 16, 2026

Salary Range

SGD 66,000 - SGD 75,600/year

SGD 5,500 - SGD 6,300/month

Skills Required

TensorFlow, Airflow, Pipelines, Information Technology, Keras, PyTorch, Python, Containerization, Docker, Orchestration, API, Scheduling, Apache, Databases, Linux, C++

Job Description

We are seeking a skilled AI Engineer with 3+ years of experience to implement AI solutions using large language models (LLMs). The role involves managing the end-to-end lifecycle of LLM-based applications, including configuring high-performance inference engines, architecting advanced Agentic AI workflows, and bridging model capabilities with business logic using RAG and CAG patterns.

Responsibilities

  • Configure and optimize vLLM and other inference frameworks to deliver low-latency, high-throughput model serving that meets performance targets
  • Design and implement Retrieval-Augmented Generation (RAG) pipelines using vector databases and Cache-Augmented Generation (CAG) strategies to minimize redundant computation and enhance efficiency
  • Deploy and tune vLLM clusters to provide scalable, high-throughput, low-latency API endpoints for open-source LLMs, ensuring reliable service delivery
  • Design, develop, and maintain Apache Airflow DAGs and RAGFlow workflows to automate AI lifecycle tasks including data ingestion, automated evaluation, and prompt versioning for continuous improvement
  • Develop, version-control, and refine system prompts using Chain-of-Thought (CoT) techniques to enhance LLM reasoning capabilities and output quality
  • Implement CAG strategies to optimize key-value (KV) cache reuse, reducing compute costs for long-context tasks and improving resource efficiency
  • Author and refine system prompts using Agentic AI techniques to ensure consistent and robust performance across multiple LLM backends

Required Competencies and Certifications

  • Bachelor’s degree in Information Technology, Computer Science, Finance, or a related field
  • Minimum 3 years of hands-on experience working with large language models (LLMs), including expertise in vLLM and model quantization techniques such as AWQ and GPTQ
  • Strong proficiency in Apache Airflow for designing and scheduling complex AI and data pipelines
  • Experience with RAGFlow or similar deep-document Retrieval-Augmented Generation frameworks and vector databases
  • Proven experience building multi-agent AI systems using external APIs and tools to execute multi-step tasks effectively
  • Advanced programming skills in Python, and practical experience with containerization and orchestration technologies including Docker and Kubernetes
  • Experience using AI observability tools to monitor and analyze latency, cost, and hallucination rates for model performance optimization