CLYSO is a global leader in open-source IT solutions, specializing in Ceph and Kubernetes. Through our core platform, CLYSO Enterprise Storage (CES), we empower companies worldwide to build highly scalable, vendor-independent storage infrastructures in the peta- and exabyte range.
To drive our international growth, we are looking for a Ceph Storage Engineer (m/f/d) – Remote India. If you are an expert in distributed storage systems and are passionate about optimizing complex Ceph clusters and shaping the future of our storage infrastructure from India, we want you on our global team!
Your Mission
As a Ceph Storage Engineer on our team in India, your key responsibilities are:
- Cluster Maintenance: Manage cluster scaling and capacity expansion.
- Deep-Dive Debugging: Use Ceph logs and diagnostic tools to identify and resolve issues in production environments.
- Linux & Hardware Tuning: Optimize the Linux OS (kernel, networking, and I/O) and server hardware (BIOS/firmware) for peak Ceph performance.
- Lifecycle Management: Execute major and minor Ceph version upgrades to ensure cluster stability and security.
- Performance Engineering: Conduct deep-dive performance investigations and tuning as required.
- Automation: Develop and maintain scripts and automation for operational tasks, including capacity expansion, CRUSH map cleanups and optimizations, and routine maintenance.
- Cluster Architecture: Build and configure new Ceph clusters from the ground up to support our rapidly scaling infrastructure.
- Incident Response: Participate in a 24x7 on-call rotation, monitoring cluster health and resolving complex issues efficiently.
- Upstream Contribution: Participate in the Ceph community through bug fixes, feature enhancements, and code reviews as needed.
What We’re Looking For
- Storage Expertise: Proven experience managing and tuning production-grade Ceph clusters.
- Deep understanding of server hardware profiles, disk controllers, and low-level system troubleshooting.
- Experience using tools like Ansible, Python, or Go to automate repetitive infrastructure tasks.
- A desire to contribute to the upstream Ceph ecosystem (or a history of doing so). An appetite for public speaking and presenting at industry conferences would be a bonus.
- Ability to thrive in a fast-paced environment and handle high-pressure troubleshooting scenarios.
- Exceptional problem-solving skills with meticulous attention to detail.
Why Join Us?
You won't just be "keeping the lights on." You will be part of a collaborative team that treats storage as a product. We value engineering over manual intervention and encourage our team to spend time on upstream contributions that benefit the wider community.