DevOps for Data Scientists
32m | Intermediate | 2018-05-10
Authors

Dan Sullivan
Enterprise Architect, Big Data Expert
Course details
Data scientists create data models that need to run in production environments. Many DevOps practices are relevant to production-oriented data science applications, but these practices are often overlooked in data science training. In addition, data science and machine learning have distinct requirements, such as the need to revise models while in use. This course was designed for data scientists who need to support their models in production, as well as for DevOps professionals who are tasked with supporting data science and machine learning applications. Learn about key data science development practices, including the testing and validation of data science models. This course also covers how to use the Predictive Model Markup Language (PMML), monitor models in production, work with Docker containers, and more.
Learning objectives
Using Git for version control
Incorporating model testing into the deployment process
Working with the Predictive Model Markup Language
Securing the data science models in production
Monitoring models in production
Creating a Dockerfile for data science models
Skills covered
Data Science Foundations, DevOps Foundations, DevOps, Data Science, Deep Dive (X:Y)
Concepts
0. Introduction
- 01 - Welcome
- 02 - Target audience
1. Data Science Development Practices
- 03 - Data science and software engineering
- 04 - Collecting and munging data
- 05 - Experimenting with data, features, and algorithms
- 06 - Testing and validating models
2. Data Science Models to Production
- 07 - Version control for data science models
- 08 - Predictive Model Markup Language
- 09 - Deploying models with automation tools
3. Deployment Practices
- 10 - Deploying to staging environment
- 11 - Canary deployments
- 12 - Securing the data science models in production
- 13 - Monitoring models in production
4. Data Science Models in Containers
- 14 - Introduction to Docker
- 15 - Creating a Dockerfile for data science models
- 16 - Data science Docker image repository
Conclusion
- 17 - Overview of DevOps best practices for data science
Related courses
- Python for Data Science and Machine Learning Essential Training Part 1
- Data Literacy: Exploring and Describing Data
- Big Data in the Age of AI
- Decision Science Fundamentals
- Did It Work? Program Evaluation in Data Science
- Program Evaluation for Data Science
- Cleaning Data for Effective Data Science: Data Ingestion, Anomaly Detection, Value Imputation, and Feature Engineering
- Scala Essential Training for Data Science