Description:
The Data Scientist on the Veracyte Data Engineering team will leverage data from diverse internal and external sources to derive actionable insights, develop predictive models, and support the company’s mission of improving diagnostic accuracy. This role collaborates closely with data engineers, the Technical Program Manager (TPM), and cross-functional teams in a Scrum environment to deliver high-impact data solutions aligned with Veracyte’s global data strategy and digital transformation goals.
This position is open to remote candidates in the US who work Pacific Time hours.
Key Responsibilities
- Data Analysis and Modeling:
  - Analyze large, complex datasets from the Veracyte Lakehouse (e.g., genomic, clinical, operational data) to identify trends and patterns.
  - Develop and deploy machine learning models using tools like Amazon SageMaker and Python for applications such as biomarker discovery and clinical decision support.
- Collaboration and Requirement Gathering:
  - Work with the TPM and stakeholders to define data science requirements and user stories for inclusion in the team’s Jira backlog.
  - Partner with data engineers to ensure data pipelines and cataloged datasets meet analytical needs.
- Model Development and Optimization:
  - Build, validate, and refine AI/ML models (e.g., LLM refinement, Verachat RAG) to support AI training and operational dashboards.
  - Optimize models for performance and scalability within cloud environments like AWS and Snowflake.
- Data Interpretation and Reporting:
  - Translate data insights into actionable recommendations for business operations, R&D, and healthcare providers.
  - Create visualizations and reports to communicate findings to technical and non-technical audiences.
- Support Data Strategy:
  - Contribute to the development of data-driven strategies by providing analytical expertise.
  - Ensure models and analyses comply with data governance and security policies.
- Knowledge Sharing:
  - Mentor junior team members and promote a culture of continuous learning in data science practices.
  - Stay updated on emerging tools and techniques to enhance team capabilities.
Who You Are
- Education: Bachelor’s or Master’s degree in Data Science, Computer Science, Statistics, or a related field.
- Experience:
  - 5+ years (BS) or 3+ years (MS) of experience in data science or a similar role.
  - Experience with healthcare or genomic data is a plus.
- Technical Skills:
  - Proficiency in Python, R, or similar languages for data analysis and modeling.
  - Experience with AWS services (e.g., SageMaker, Redshift) and Snowflake for data processing and storage.
  - Familiarity with machine learning frameworks (e.g., TensorFlow, PyTorch) and data visualization tools (e.g., Matplotlib, Tableau).
  - Knowledge of SQL and data cataloging concepts is advantageous.
- Soft Skills:
  - Strong analytical and problem-solving skills.
  - Excellent communication skills to collaborate with cross-functional teams in a Scrum setting.
  - Ability to work effectively in a fast-paced, innovative environment.