R Programming in Data Science: High Volume Data

1h 25m | Intermediate | 2018-10-26

Authors

Mark Niemann-Ross

Technologist experienced in hardware, software, and science fiction

Course details

Data fills all available space, and now that storage is cheap, the amount of data has exploded. However, all that information is useless without analysis and context. The R programming language is designed to make it easier to analyze and visualize massive amounts of data. For example, R can multiply one block of variables by another in a single vectorized operation, a design assumption that gives it inherent advantages over other languages. This course shows why R is well suited to high volumes of data, introduces more efficient ways to use the language, and explains how to avoid the problems and capitalize on the opportunities of big data. Learn how to determine whether you have enough memory and processing power, produce visualizations of big data, optimize your R code, and use advanced techniques such as parallel processing to speed up your computations. Plus, discover how to integrate R with big-data solutions such as SQL databases and Apache Spark.
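As a rough illustration of the "multiply one block of variables by another" idea mentioned above, the sketch below multiplies two numeric vectors element by element in a single expression and checks the result against an explicit loop. The variable names and values (`price`, `quantity`) are made up for this example and are not taken from the course materials.

```r
# Hypothetical data: unit prices and quantities for five orders.
price    <- c(2.50, 4.00, 1.25, 10.00, 3.00)
quantity <- c(4, 1, 8, 2, 5)

# Vectorized: one expression multiplies the two vectors element by element.
revenue_vectorized <- price * quantity

# Equivalent explicit loop, shown only for comparison.
revenue_loop <- numeric(length(price))
for (i in seq_along(price)) {
  revenue_loop[i] <- price[i] * quantity[i]
}

# Both approaches should produce the same values.
stopifnot(isTRUE(all.equal(revenue_vectorized, revenue_loop)))
print(revenue_vectorized)
```

On vectors with millions of elements, the vectorized form is usually much faster than the loop because the iteration happens in compiled code; wrapping each version in `system.time()` is a simple way to compare them on your own machine.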

Learning objectives
Assessing memory and processing power
Visualizing high-volume data
Profiling and optimizing R code
Compiling R functions
Parallel processing with R
Using R with other big data solutions

Skills covered

RStudio, R, Statistics, Data Engineering, Data Analysis, Programming Languages, Data Science, Business Analysis and Strategy, Business Software and Tools, Open Source, Software Development, Deep Dive (X:Y)

Concepts

0. Introduction

  • 01 - Wrangling high-volume data with R
  • 02 - Sample data set

1. Problems and Opportunities with High-Volume Data

  • 03 - Perspectives on high-volume data
  • 04 - Big data and available memory
  • 05 - Code - Finding available memory
  • 06 - Big data and CPU cycles
  • 07 - Code - How fast is your computer

2. Visualizing High-Volume Data

  • 08 - High-volume data and visualizations
  • 09 - Code - Graphs for high-volume data
  • 10 - Code - rug() and jitter()
  • 11 - Code - Applying statistics to plots
  • 12 - Code - Subsampled graphs for high-volume data
  • 13 - Code - Trellising data across multiple charts

3. Working within the R Programming Language

  • 14 - R programming tools for high-volume data
  • 15 - Downsampling
  • 16 - Profile R code to find inefficiencies
  • 17 - Code - Profile R code to find inefficiencies
  • 18 - Avoid the copy-on-modify problem with R
  • 19 - Code - Avoid copy-on-modify with data.table
  • 20 - Optimization versus readability

4. Advanced High-Volume Techniques

  • 21 - Compile R functions
  • 22 - Parallel processing with R
  • 23 - Code - Parallel R functions
  • 24 - bigmemory, LaF, and ff packages

5. Use R with External Big Data Solutions

  • 25 - Store high-volume data in a database
  • 26 - Code - R with databases
  • 27 - Cloud computing with R
  • 28 - Sparklyr with R
  • 29 - Code - R with Sparklyr

Conclusion

  • 30 - Summary of high-volume data with R
