SalaryPeak

Senior / Staff SLM & VLM Engineer - Post-Training, Tool Calling & Agents

JABIL CIRCUIT (SINGAPORE) PTE. LTD.
Singapore | 5+ years | Posted 2w ago

Salary Range

SGD 120,000 - SGD 180,000/year

SGD 10,000 - SGD 15,000/month

Skills Required

Excellent Communication Skills, Continuous Improvement, A/B Testing, Software Engineering, Problem Solving, PyTorch, CUDA, Python, High Performance Computing, Distillation, C++

Job Description

Job Summary

We are looking for a highly capable engineer/researcher to lead the R&D of Small Language Models (SLMs) and Vision-Language Models (VLMs) for edge, low-latency, and cost-efficient production scenarios. You will own the continuous pretraining, supervised instruction tuning (SFT), and compression/distillation pipelines, and work closely with platform teams to deliver reliable, measurable improvements in inference efficiency, tool-use success rate, and overall model quality.

Key Responsibilities

1) SLM/VLM Training: Continuous Pretraining & Instruction Tuning (SFT)

• Conduct continuous pretraining and SFT for SLMs and VLMs to improve task performance and domain adaptation.

• Build reproducible training workflows in PyTorch, including data processing, training, evaluation, and model versioning.

2) Compression, Distillation & Edge/Low-Latency Inference Optimization

• Design and implement efficient compression strategies for SLMs and VLMs, including knowledge distillation, pruning, and quantization-oriented training or post-training optimization.

• Optimize model serving and inference for low-latency / edge scenarios by improving throughput and cost-per-token via techniques such as quantization, caching/KV optimizations, batching strategies, and decoding-time optimizations.

3) Tool Calling System: Catalog, Routing, Validation, Fallback & Observability

• Architect and implement a production-grade tool calling (function calling) framework, covering:

    • Tool cataloging and metadata/schema design

    • Tool selection/routing and argument construction

    • Parameter validation, result verification, and safe fallback/retry strategies

    • Call-chain tracing, monitoring, and observability to improve success rate and ROI

4) RL & Reward Modeling for Alignment and Tool-Use Reliability

• Apply post-training methods such as PPO / DPO / GRPO-like optimization and reward modeling to align the model toward objectives including:

    • Semantic understanding

    • Tool-use success rate

    • Content generation quality and consistency

• Support both offline and online iteration loops, including policy evaluation, regression checks, and safe deployment gating.

5) Data Pipeline Automation (Collection, Cleaning, Curation)

• Design automated pipelines for data collection, filtering, cleaning, de-duplication, labeling/weak supervision, and dataset version management to continuously improve training quality.

• Ensure datasets support both SFT and preference/RL style post-training.

6) Rigorous Evaluation, Testing & Iteration

• Build robust evaluation mechanisms: offline benchmarks, task suites for tool-use, regression tests, and reliability metrics.

• Drive rapid iteration through A/B comparisons, ablations, and failure analysis, improving both quality and efficiency over time.

Required Qualifications

• Strong software engineering skills in Python and C++, including experience building ML training/evaluation pipelines in PyTorch.

• Hands-on experience in model efficiency and inference optimization (e.g., distillation, quantization, pruning, serving optimization).

• Experience with high-performance computing and acceleration: CUDA and/or SIMD, profiling and performance tuning.

• Ability to read and reproduce key ideas from recent papers and implement algorithms with strong experimental discipline.

• Ability to communicate effectively in both Chinese (Mandarin) and English, as the successful candidate will have to liaise with our counterparts in China.