R - Exploratory Data Analysis using R

Details

ID 2858624
Classroom 3 days
Webinar 5 days
Method Presentation with examples and hands-on labs.
Prerequisite Basics in R and Statistics
Audience Data Analysts

Services:

  • Lunch / Catering
  • Assistance for hotel / travel bookings
  • Comelio certificate
  • Flexible: Free cancellation up until 10 days before the training

Summary

Exploratory Data Analysis (EDA) is a statistical approach to analyzing data sets in order to summarize their main characteristics. This training focuses on four main EDA techniques: Principal Component Analysis (PCA) for quantitative variables, Correspondence Analysis (CA) and Multiple Correspondence Analysis (MCA) for categorical variables, and finally hierarchical and partitioning clustering methods. As umbrella techniques, the training also covers Factor Analysis (FA) and Multiple Factor Analysis (MFA). For the hands-on labs and practical examples, participants use R and, in particular, FactoMineR, an R package dedicated to exploratory data analysis.

Training Dates

  • 2020-Nov-02 - Nov-06
  • 2021-Jan-11 - Jan-15
  • 2021-Mar-22 - Mar-26
  • 2021-May-31 - Jun-04

950 EUR +VAT

Agenda

Principal Component Analysis (PCA)

Objectives of PCA and Introduction to PCA - Studying Individuals: The Cloud of Individuals, Fitting the Cloud of Individuals - Variables: The Cloud of Variables, Fitting the Cloud of Variables - Relationships - Interpreting the Data - Testing the Significance of the Components - Implementation with R and FactoMineR

Correspondence Analysis (CA)

Objectives and the Independence Model - Fitting the Clouds: Row and Column Profiles - Interpreting the Data - Implementation with R and FactoMineR

Multiple Correspondence Analysis (MCA)

Objectives: Studying Individuals, Variables, and Categories - Defining Distances between Individuals and Distances between Categories - CA on the Indicator Matrix: Relationship between MCA and CA, The Cloud of Individuals, Variables, and Categories - Implementation with R and FactoMineR

Clustering

Concepts of Similarity and Distance: Similarity between Individuals and Groups - Ward's Method - Partitioning and Hierarchical Clustering - Direct Search for Partitions: K-means Algorithm - Clustering and Principal Component Methods - Implementation with R and FactoMineR

Multiple Factor Analysis (MFA)

Factorial Analysis of Mixed Data - Weighting Groups of Variables - Comparing Groups of Variables and the INDSCAL Model - Qualitative and Mixed Data - Multiple Factor Analysis and Procrustes Analysis - Hierarchical Multiple Factor Analysis - Implementation with R and FactoMineR

Trainer

Marco Skulschus (born in Germany in 1978) studied economics in Wuppertal (Germany) and Paris (France) and wrote his master's thesis on semantic data modeling. He started working as a lecturer and consultant in 2002.

Publications

  • Grundlagen empirische Sozialforschung ISBN 978-3-939701-23-1
  • System und Systematik von Fragebögen ISBN 978-3-939701-26-2
  • Oracle PL/SQL ISBN 978-3-939701-40-8
  • MS SQL Server - T-SQL Programmierung und Abfragen ISBN 978-3-939701-69-9

Projects

He works as an IT consultant and project manager. He has developed various Business Intelligence systems for industry clients and the public sector. For several years, he has been responsible for a BI team in India that mainly works on BI and OLAP projects, reporting systems, statistical analysis, and data mining.

Research

He has led several research projects and was the lead scientist and project manager of a publicly funded project on interactive questionnaires and online surveys.