R - Statistical Analysis with Graphics using R

Details

ID 2858711
Classroom 4 days
Webinar 5 days
Method Lecture with examples and exercises.
Prerequisite Basic knowledge of statistics
Audience Data Analysts

Services:

  • Lunch / Catering
  • Assistance for hotel / travel bookings
  • Comelio certificate
  • Flexible: Free cancellation up until 10 days before the training

Summary

Graphics can effectively complement statistical data analysis in various ways. Successful graphics arise from a combination of good design and good implementation. This training mainly explores two R packages for statistical graphics: lattice and ggplot2. The lattice package extends the R language with a coherent set of tools for producing statistical graphics, with an emphasis on multivariate data. ggplot2 is an R package for producing statistical, or data, graphics; unlike most other graphics packages, it is built on a deep underlying grammar. This makes ggplot2 very powerful: you are not limited to a set of pre-specified graphics and can create new graphics that are precisely tailored to your problem. The training is divided into two parts. The first part introduces the development of graphics with the lattice package. The second part uses the ggplot2 package for similar visualizations as well as far more complex and sophisticated visual analyses.

Training Dates

  • 2020-Nov-13 - Nov-12
  • 2021-Jan-22 - Jan-21
  • 2021-Apr-02 - Apr-01
  • 2021-Jun-11 - Jun-10

1350 EUR +VAT


Agenda

Introduction to graphics in R

Introduction to the lattice package - Multipanel conditioning - The "trellis" object and its properties: the formula, data, conditioning and various plots/tiles in one diagram - Dimension and physical layout - Grouped displays - Annotation: Captions, labels, and legends
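
As a rough illustration of the topics in this block, the following minimal sketch combines a formula, a conditioning variable, a grouping variable, a layout, and annotations in one lattice call. It assumes R's built-in mtcars dataset and the variable names d and p, none of which are taken from the course material.

```r
library(lattice)

# Assumption: the built-in mtcars data stands in for the course datasets.
d <- mtcars
d$cyl <- factor(d$cyl)                                    # conditioning variable
d$am  <- factor(d$am, labels = c("automatic", "manual"))  # grouping variable

# Formula y ~ x | g: mpg against weight, one panel per cylinder count.
p <- xyplot(mpg ~ wt | cyl, data = d,
            groups = am,                    # grouped display within each panel
            layout = c(3, 1),               # physical layout: 3 columns, 1 row
            auto.key = list(columns = 2),   # legend for the groups
            xlab = "Weight (1000 lbs)",
            ylab = "Miles per gallon",
            main = "Fuel economy by weight, conditioned on cylinders")

# xyplot() returns a "trellis" object; nothing is drawn until it is printed.
class(p)
print(p)
```

Because the result is an ordinary object, it can be stored, inspected, and modified before it is drawn.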

Graphics for Univariate Distributions

Density Plot - Histograms - Normal Q–Q plots - The empirical CDF (Cumulative Distribution Function) - Box-and-whisker plots - Strip plots - Working with small and large datasets

Graphics for Multivariate Distributions

Displaying Multiway Tables: Dot plots, Bar charts, Visualizing categorical data - Scatter Plots and Extensions - Trivariate Displays: Three-dimensional scatter plots, Surfaces and two-way tables

Advanced Graphical Parameters of the lattice Package

The parameter system: Themes and devices - Plot Coordinates and Axis Annotation: Axis annotations (ticks and labels), Limits and aspect ratio, Scale components and the axis function, Labels and Legends - Data Manipulation: Combining data sources, Subsetting, Ordering levels of categorical variables, Manipulating the "trellis" Object

Introduction to graphics using ggplot2

Datasets - Basic use - Colour, size, shape and other aesthetic attributes - Plot geometries - Components of the layered ggplot2 grammar: Layers, Scales, Coordinate system, Faceting
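
The components of the layered grammar can be sketched in a single plot specification. This is a minimal, illustrative example only: it assumes the built-in mtcars dataset and the variable names d and p, which do not come from the course material.

```r
library(ggplot2)

# Assumption: the built-in mtcars data stands in for the course datasets.
d <- mtcars
d$cyl <- factor(d$cyl)
d$am  <- factor(d$am, labels = c("automatic", "manual"))

p <- ggplot(d, aes(x = wt, y = mpg)) +             # data and aesthetic mappings
  geom_point(aes(colour = cyl), size = 2) +        # layer 1: the observations
  geom_smooth(method = "lm", formula = y ~ x) +    # layer 2: a linear fit per facet
  scale_colour_brewer(palette = "Dark2",
                      name = "Cylinders") +        # scale for the colour aesthetic
  coord_cartesian(ylim = c(5, 40)) +               # coordinate system
  facet_wrap(~ am) +                               # faceting by transmission type
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# As with lattice, the plot is an object that is drawn when printed.
print(p)
```

Each line after the first adds one grammar component, so a layer, scale, or facet can be swapped without touching the rest of the specification.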

Visualization in ggplot2

Layers - Overall layering strategy - Aesthetic mappings - Creating a plot - Basic plot types - Displaying distributions - Dealing with overplotting - Surface plots - Drawing maps - Revealing uncertainty - Statistical summaries - Annotating a plot

Optimizing plots for publication and presentation

Themes - Customising scales and geoms - Multiple plots on the same page

Trainer

Marco Skulschus (born in Germany in 1978) studied economics in Wuppertal (Germany) and Paris (France) and wrote his master's thesis on semantic data modeling. He started working as a lecturer and consultant in 2002.

Publications

  • Grundlagen empirische Sozialforschung ISBN 978-3-939701-23-1
  • System und Systematik von Fragebögen ISBN 978-3-939701-26-2
  • Oracle PL/SQL ISBN 978-3-939701-40-8
  • MS SQL Server - T-SQL Programmierung und Abfragen ISBN 978-3-939701-69-9

Projects

He works as an IT consultant and project manager. He has developed various Business Intelligence systems for industry clients and the public sector. For several years, he has been responsible for a BI team in India that is mainly involved in BI and OLAP projects, reporting systems, statistical analysis, and data mining.

Research

He has led several research projects and was the lead scientist and project manager of a publicly funded project on interactive questionnaires and online surveys.