Data Science Foundations: Data Engineering
53m · Beginner · 2017-01-23
Authors

Ben Sullins
Data Geek, Tech Consultant
Course details
Approach big data with confidence by mastering the core skills needed to put data to work for your business. This course covers the basics of data engineering, system design, analytics, and business intelligence. Data science expert Ben Sullins explains how to collect and organize your data so you can deliver results that your organization can leverage. Ben starts by examining the modern data ecosystem and how it relates to running a smart and efficient data hub. Then, he shows you how to perform the principal tasks involved in managing, loading, extracting, and transforming data. He also takes you through staging, profiling, cleansing, and migrating data. Along the way, he provides actionable recommendations that are applicable to data experts throughout an organization: analysts, engineers, scientists, modelers, and more.
Learning objectives
Working with systems and schemas
Managing a good data pipeline
Setting up an environment
Loading and profiling data
Testing quality
Adding data types
Handling missing values and inferred members
Performing master data lookups
Loading schemas and tables
Creating views
Skills covered
Data Engineering, Foundations, Data Science
Concepts
0. Introduction
- 01 - Welcome
- 02 - What you should know before watching this course
- 03 - Using the exercise files
1. Ecosystem Overview
- 04 - Data science system overview
- 05 - Star schema design overview
- 06 - Where does data engineering fit
- 07 - Components of a good data pipeline
- 08 - Environment setup
2. Staging Data
- 09 - Loading and profiling data
- 10 - Data quality testing
3. Cleansing Data
- 11 - Adding data types
- 12 - Handling missing values
- 13 - Verifying addresses
4. Conforming Data
- 14 - Performing master data lookups
- 15 - Handling inferred members
5. Delivering Analytical Data Sets
- 16 - Loading the star schema
- 17 - Loading dimension tables
- 18 - Loading fact tables
- 19 - Creating views
Conclusion
- 20 - Next steps
Related courses
- Big Data in the Age of AI
- Complete Guide to Analytics Engineering
- Advanced Analytics Engineering: Real-World Practice
- Complete Guide to Google BigQuery for Data and ML Engineers
- PySpark Essential Training: Introduction to Building Data Pipelines
- Cleaning Data for Effective Data Science: Data Ingestion, Anomaly Detection, Value Imputation, and Feature Engineering
- Scala Essential Training for Data Science
- SPSS: Wrangling, Visualizing, and Modeling Data