R - Data Mining using R

Details

ID 2858811
Classroom 2 days
Webinar 4 days
Method Lecture with examples and exercises.
Prerequisite General knowledge of mathematics
Audience Information workers, IT professionals

Services:

  • Lunch / Catering
  • Assistance for hotel / travel bookings
  • Comelio certificate
  • Flexible: Free cancellation up until 10 days before the training

Summary

Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD) is the computational process of discovering patterns in large data sets, using methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Beyond the raw analysis step, it covers database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

Training Dates

  • 2020-Oct-26 - Oct-29
  • 2021-Jan-04 - Jan-07
  • 2021-Mar-15 - Mar-18
  • 2021-May-24 - May-27

650 EUR +VAT

Agenda

Data Mining Fundamentals

Statistics, multivariate statistics, and data mining - The data mining cycle - Data preprocessing: descriptive data summarization, data cleaning, data integration and transformation - Data reduction - Discretization and concept hierarchies - Data mining and business intelligence: databases, data warehouses, and OLAP as the basis for data mining

Data Mining with Association Analysis

Finding frequent combinations (frequent itemset mining) - Apriori algorithm - Association rules and association analysis - Market basket analysis

Data Mining with Decision Trees

Decision tree induction - Attribute selection - Tree pruning - Rule extraction - Quality measures and model comparison

Data Mining with Probability Theory

Probability theory and Bayes' theorem - Naïve Bayes algorithm - Bayesian networks

Advanced Data Mining Methods for Classification

Artificial neural networks and the backpropagation algorithm - Support vector machines for linearly and non-linearly separable data - Classification using association analysis - Lazy and eager learners

Cluster Analysis

Introduction to cluster analysis - Similarity and distance measures - Variants and basic techniques - Partitioning methods: k-means - Hierarchical methods: agglomerative and divisive approaches - Other methods: density-based and grid-based methods

Trainer

Marco Skulschus (born in Germany in 1978) studied economics in Wuppertal (Germany) and Paris (France) and wrote his master's thesis on semantic data modeling. He started working as a lecturer and consultant in 2002.

Publications
  • Grundlagen empirische Sozialforschung ISBN 978-3-939701-23-1
  • System und Systematik von Fragebögen ISBN 978-3-939701-26-2
  • Oracle PL/SQL ISBN 978-3-939701-40-8
  • MS SQL Server - T-SQL Programmierung und Abfragen ISBN 978-3-939701-69-9

Projects

He works as an IT consultant and project manager. He has developed various Business Intelligence systems for industry clients and the public sector. For several years, he has been responsible for a BI team in India that works mainly on BI and OLAP projects, reporting systems, statistical analysis, and data mining.

Research

He has led several research projects and was lead scientist and project manager of a publicly funded project on interactive questionnaires and online surveys.