MSG
Roche
Principal Data Scientist
09 April 2026
Department
Health & Biotech
Job Category
Health & Biotech
Description
- Provide technical direction and mentorship to hybrid teams of Data Scientists and Bioinformatics Software Engineers
- Establish best practices for code quality, collaborative development, and model lifecycle management across diverse teams
- Lead the development of algorithms for DNA sequence analysis, including basecalling and post-primary analyses
- Innovate on bioinformatics methods such as string matching, graph assembly, and Hidden Markov Models to address the data challenges of SBX (Sequencing by Expansion)
- Design and deploy advanced deep learning models, such as Transformers, CNNs, and RNNs/LSTMs, for analyzing electrical signal data and predicting sequencing outcomes
- Advocate for MLOps practices to ensure model reproducibility, version control, and monitoring in production environments
- Architect scalable workflows using tools like Airflow and Nextflow for research exploration and production deployment
- Manage and optimize HPC workloads using SLURM, while writing Bash and Python scripts to integrate complex systems efficiently
Qualifications
- MS/Ph.D. in Bioinformatics, Computer Science, Computational Biology, Physics, or a related discipline
- 5+ years of post-Ph.D. industry experience in similar fields
- Deep theoretical and practical knowledge of algorithms used in DNA sequence analysis (e.g., dynamic programming, BWT, de Bruijn graphs, HMMs) and experience implementing them from scratch or optimizing existing implementations
- Expert-level proficiency in applying Machine Learning and Deep Learning frameworks (PyTorch, TensorFlow, Keras) to biological data; experience with supervised/unsupervised learning and sequence modeling is essential
- Advanced proficiency in Linux/Unix environments, including complex Bash scripting and workload management on HPC clusters using SLURM
- Mastery of workflow management systems, specifically Nextflow (DSL2), and experience deploying pipelines in cloud or cluster environments
- Expert-level proficiency in Python and a strong command of software engineering principles (OOP, Unit Testing, CI/CD, Git)