AI Engineer

ELLIOTT MOSS CONSULTING PTE. LTD.
Singapore | 3+ years | Posted Jan 21, 2026

Salary Range

SGD 57,600 - SGD 84,000/year

SGD 4,800 - SGD 7,000/month

Skills Required

TensorFlow, Airflow, Scalability, Kubernetes, Pipelines, Throughput, Reliability, PyTorch, Python, Docker, Orchestration, API, Scheduling, Apache, Databases, C++

Job Description

· We are seeking a skilled AI Engineer with at least 3 years of hands-on experience in designing, building, and deploying Large Language Model (LLM)-based solutions.

· The ideal candidate will be responsible for the end-to-end lifecycle of AI applications, from high-performance model inference and optimization to the development of advanced Agentic AI workflows using RAG and CAG patterns.

· This role requires close collaboration with product, data, and engineering teams to translate business requirements into scalable, reliable, and cost-efficient AI systems.

Required Skills & Qualifications 

· Bachelor’s degree in Information Technology, Computer Science, Finance, or a related field.

· At least 3 years of experience working with Large Language Models (LLMs) in production environments.

· Hands-on expertise with vLLM and model quantization techniques such as AWQ and GPTQ.

· Strong proficiency in Apache Airflow for scheduling and orchestrating complex data and AI pipelines.

· Experience with RAGFlow or similar deep-document Retrieval-Augmented Generation (RAG) frameworks.

· Practical experience with vector databases (e.g., FAISS, Milvus, Pinecone, Weaviate).

· Proven ability to design and implement multi-agent systems that leverage tools and external APIs to perform multi-step tasks.

· Advanced proficiency in Python, Docker, and Kubernetes.

· Experience using AI observability and monitoring tools to track latency, cost, throughput, and hallucination rates.

Key Responsibilities 

· Configure, deploy, and optimize vLLM and other inference frameworks to ensure low-latency, high-throughput LLM serving.

· Design and implement RAG pipelines using vector databases and Cache-Augmented Generation (CAG) strategies to reduce redundant computation and improve response quality.

· Deploy and tune vLLM clusters to support scalable, production-grade API endpoints for multiple open-source LLMs.

· Design, implement, and maintain Apache Airflow DAGs and RAGFlow pipelines to automate the AI lifecycle, including data ingestion, indexing, evaluation, and prompt/version management.

· Develop, version-control, and continuously refine system prompts, applying techniques such as Chain-of-Thought (CoT) to improve reasoning accuracy and consistency.

· Implement CAG strategies to optimize KV cache reuse and minimize compute costs for long-context and multi-step AI tasks.

· Build and refine Agentic AI workflows, enabling autonomous task planning, tool usage, and API orchestration across different LLM backends.

· Monitor and analyze AI system performance using observability tools, ensuring reliability, cost efficiency, and controlled hallucination rates.

· Collaborate with cross-functional teams to align AI solutions with business objectives, security standards, and scalability requirements.

Experience Level

· 3+ years of relevant experience in AI/ML engineering, with demonstrated production experience in LLM-based systems.