Senior Cloud Engineer
San Francisco Bay Area (Hybrid)
As a Senior Cloud Engineer, you will build the next-generation, highly available, global, multi-cloud PaaS using open-source technologies to enable and accelerate Together AI’s rapid growth.
This system will span many heterogeneous environments (Kubernetes, VMs, bare-metal compute, and edge deployments) and will provide a cohesive, reliable abstraction for running AI workloads across them. You will be a technology thought leader, evangelize new, cutting-edge technologies, and solve complex problems.
To be successful, you’ll need to be deeply technical and capable of holding your own among strong peers. You’ll have excellent communication, collaboration, and diplomacy skills, along with experience practicing infrastructure-as-code using tools like Terraform and Ansible. You’ll also have strong software development fundamentals, systems knowledge, and troubleshooting abilities.
Requirements
- 5+ years of professional software development experience and proficiency in at least one backend programming language (Golang desired).
- Demonstrated experience with high-performance or distributed cloud microservice architectures, ideally including building and operating them at global scale across multiple cloud providers such as AWS, Azure, or GCP.
- Excellent understanding of low-level operating system concepts, including multithreading, memory management, networking, storage, performance, and scale.
- Pragmatic, methodical, well-organized, detail-oriented, and self-starting.
- Experience with Kubernetes and containerization, VPNs, AI workloads, and blockchain-based protocols is a plus.
- Knowledge of GPU programming, NCCL, and CUDA is a plus.
- Experience with PyTorch or TensorFlow is a plus.
- 5+ years of experience writing high-performance, well-tested, production-quality code.
Responsibilities
- Perform architecture and research work for decentralized AI workloads.
- Work on the core, open-source Together AI platform.
- Create services, tools, and developer documentation.
- Create testing frameworks for robustness and fault-tolerance.
About Together AI
Together AI is a research-driven artificial intelligence company. We contribute leading open-source research, models, and datasets to advance the frontier of AI. Our decentralized cloud services empower developers and researchers at organizations of all sizes to train, fine-tune, and deploy generative AI models. We believe open and transparent AI systems will drive innovation and create the best outcomes for society.
We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility for remote work.
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.