About Nscale
Nscale is taking on the hyperscalers by building a vertically integrated GenAI cloud platform.
We own the data centres, software, and applications that power today's AI stack using sustainable technology solutions.
We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency.
As an Nscaler, you'll build trust through openness and transparency, where everyone is inspired to do their best work.
Collaboration is key, and we work together swiftly and respectfully, embracing adaptability and resilience in all we do.
About the Role
Nscale is looking for AI Engineers to join our core AI team and drive the development of training, fine-tuning, and inference products for our GenAI cloud platform.
In this role, you will design and optimise scalable systems for generative AI models, tackle complex performance challenges, and implement cutting-edge AI solutions.
This is a unique opportunity to redefine how generative AI is built and consumed globally, working alongside a world-class team renowned for record-setting performance and a strong commitment to open-source innovation.
Responsibilities
- Collaborate with researchers, engineers, and product teams to develop innovative platform features and integrate cutting-edge advancements.
- Architect, build, and optimise scalable, high-performance systems for training, fine-tuning, and inference of generative AI models.
- Design and implement advanced methodologies like LoRA, prefix-tuning, and adapter-based approaches for fine-tuning AI models.
- Develop robust, fault-tolerant systems for data ingestion, processing, and model customisation.
- Optimise GPU utilisation and system performance using frameworks like DeepSpeed, Triton Inference Server, TensorRT, and custom CUDA/ROCm kernels.
- Conduct performance testing and resolve bottlenecks across training, fine-tuning, and inference workflows.
- Document and build tooling to ensure successful use of Nscale's training, fine-tuning, and inference APIs.
Requirements
- 5+ years of experience building and deploying machine learning systems in production environments.
- Proficiency in Python and PyTorch, with a strong understanding of transformer architectures, LLMs, and multimodal generative models.
- Expertise in distributed training frameworks like DeepSpeed or Fully Sharded Data Parallel (FSDP).
- Experience with GPU programming and optimisation, e.g. CUDA, TensorRT, or ROCm.
- Knowledge of fine-tuning methods, such as LoRA, prefix-tuning, and adapter-based techniques, and experience improving model performance.
- Experience with containerised environments and Kubernetes for managing AI workloads, including tools like Kubeflow.
- Familiarity with inference systems such as Triton Inference Server, vLLM, or TGI.
- Demonstrated ability to write performant, well-tested, and production-quality code.
- Passion for building tools and systems that enhance the AI developer experience.
Preferred
- Experience developing large-scale and high-load production systems.
- Contributions to open-source AI frameworks or inference systems.
- Hands-on experience with advanced training and inference optimisation techniques, such as KV caching, MoE, adaptive batching, or gradient checkpointing.
- Strong understanding of low-level operating system concepts, including multi-threading, memory management, and performance tuning.
- Experience developing APIs using OpenAPI 3.0+ specifications.
- Knowledge of efficient training and inference evaluation strategies, with demonstrated success in improving model efficiency.
In all we do, our core values guide us.
Relentless Innovation
At Nscale, we constantly push the boundaries of innovation, embracing creative risks to shape the future.
Our aim is to deliver products that not only meet but exceed today's expectations, setting new standards for tomorrow.
Ownership and Accountability
Every Nscaler is fully accountable for their work, driving it with excellence and urgency.
We set high standards, ensuring that our contributions are not just good but exceptional.
Openness and Transparency
We believe trust and transparency are key to our success.
We maintain open communication within our teams and with stakeholders, sharing both successes and challenges.
Our open-source approach allows customers to explore our technology, building trust and ensuring our solutions are innovative, secure, and reliable.
Customer-Centric Focus
Our customers are central to our mission, and we are committed to delivering impactful solutions that drive real-world success.
We focus on deeply understanding their needs and challenges, striving to exceed expectations in both product quality and service.
Sustainability
We are dedicated to considering the long-term environmental and societal impacts of our technologies.
By integrating sustainability into our operations and product development, we ensure that our innovations are both effective and responsible, contributing positively to the world around us.
Full-Speed Collaboration
Collaboration at Nscale is fast, efficient, and respectful.
We work together seamlessly, with clear communication and mutual respect, ensuring our shared goals are met with high standards and impactful outcomes.
Equal Opportunities Statement
At Nscale, we are committed to fostering an inclusive, diverse, and equitable workplace.
We believe that a variety of perspectives enriches our work environment, and we warmly welcome applications from individuals of all backgrounds, experiences, and perspectives.
We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.