AWS Machine Learning: Building an Expense Tracker Using Amazon Textract
24m · Intermediate · 2021-02-10
Authors

Carlos Rivas
AWS Infrastructure Expert
Course details
If you’ve ever used an optical character recognition (OCR) program, you know that scanning documents for text can be hit or miss. If you’re a developer working on an app that depends on accurate text scanning, say for an expense report, hit or miss just won’t cut it. In this project-based course, Carlos Rivas shows you how to use Amazon Textract to analyze scanned documents and convert them to text. Textract removes the need to train machine learning models from scratch for data capture tasks. And as Carlos points out, the service not only recognizes text but also considers the layout of the scanned document. Follow along with Carlos as he creates a serverless expense tracker that reads text from images using Textract, starting with the core Textract concepts, moving through project implementation, and finishing with parsing and totaling the extracted receipt values.
Skills covered
Machine Learning, Amazon Web Services (AWS), Amazon, Persona, Cloud Services, Cloud Platforms, Artificial Intelligence (AI), Cloud Computing
Concepts
0. Introduction
- 01 - Machine learning for optical character recognition
1. Architecture
- 02 - Textract concepts
- 03 - AWS Textract overview
- 04 - Expense tracker architecture
2. Project Implementation
- 05 - Implementing an AWS blueprint to integrate Lambda with S3
- 06 - Using S3 uploads to trigger a Lambda function
- 07 - Integrating Textract into the Python Lambda
- 08 - Using Textract in Python to process an image
3. Textract Implementation
- 09 - Parsing Textract metadata to get the required information
- 10 - Using regular expressions to find the desired values
- 11 - Looking for keywords within the extracted text
4. Totaling
- 12 - Updating JSON file with the current receipt total
- 13 - Using S3 as persistent storage for receipt details
- 14 - Validating and summarizing several executions of the code
Conclusion
- 15 - Next steps
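
The steps named in lessons 09 through 13 can be sketched roughly as follows. This is a minimal illustration, not the course's actual code: the function names, the `total` keyword, the amount pattern, and the shape of the JSON summary are all assumptions. The only part taken from the Textract API itself is the response layout, where text detection returns a `Blocks` list whose `LINE` entries carry a `Text` field. In a Lambda, that response would typically come from a call such as `boto3.client("textract").detect_document_text(...)`, and the summary string would be read from and written back to S3; both are left out here so the sketch runs without AWS credentials.

```python
import json
import re

# Assumed pattern for a currency amount such as "$1,234.56" or "10.80".
AMOUNT_RE = re.compile(r"\$?\s*(\d+(?:,\d{3})*\.\d{2})")


def extract_lines(response):
    """Return the text of every LINE block in a Textract response dict."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def find_total(lines, keyword="total"):
    """Return the first amount found on, or right after, a keyword line."""
    # Word boundaries keep "Subtotal" from matching the "total" keyword.
    keyword_re = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    for i, line in enumerate(lines):
        if not keyword_re.search(line):
            continue
        # The label and the amount may arrive as separate lines,
        # so check the keyword line and the one that follows it.
        for candidate in lines[i:i + 2]:
            match = AMOUNT_RE.search(candidate)
            if match:
                return float(match.group(1).replace(",", ""))
    return None


def add_receipt(summary_json, receipt_name, amount):
    """Record one receipt in a JSON summary string and refresh the total."""
    if summary_json:
        summary = json.loads(summary_json)
    else:
        summary = {"receipts": {}, "total": 0.0}
    # Keying by receipt name means reprocessing a file does not double count.
    summary["receipts"][receipt_name] = amount
    summary["total"] = round(sum(summary["receipts"].values()), 2)
    return json.dumps(summary)


if __name__ == "__main__":
    # Hand-written sample in the shape of a Textract response.
    sample = {
        "Blocks": [
            {"BlockType": "PAGE"},
            {"BlockType": "LINE", "Text": "Subtotal 10.00"},
            {"BlockType": "LINE", "Text": "TOTAL"},
            {"BlockType": "LINE", "Text": "$10.80"},
        ]
    }
    amount = find_total(extract_lines(sample))
    print(add_receipt("", "receipt-1.jpg", amount))
```

Storing the summary as a single JSON document keyed by receipt name keeps the persistence step to one read and one write per upload, at the cost of not handling two uploads processed at the same moment.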
Related courses
- Python for Data Science and Machine Learning Essential Training Part 1
- Artificial Intelligence Foundations: Neural Networks
- Spatial Machine Learning and Statistics in Python
- Complete Guide to Google BigQuery for Data and ML Engineers
- Applied Machine Learning: Value Estimation
- Applied Machine Learning: Supervised Learning
- Machine Learning in Telecommunication: From Basics to Real-World Cases
- Power BI: Integrating AI