Data Mining

Course Description

Data mining: definitions and areas of application. Types of data. Data sources and data acquisition. Data preprocessing: data manipulation, filtering, and transformation. Unbalanced datasets. Machine learning algorithms for data processing: feature selection methods, classification algorithms, clustering methods, association rules. Interpretable models based on rule induction. Model learning and evaluation. Time series analysis. Deep learning in data mining. Domain-specific aspects of data mining in different fields of application. Use of freely available tools for data mining. Data mining project.

Learning Outcomes

  1. identify any potential shortcomings of the analyzed data set
  2. evaluate the suitability of the used sequence of machine learning methods in various fields of application
  3. combine feature selection methods on a given problem
  4. analyze the given data set using a suitable sequence of machine learning methods in at least one existing software tool
  5. develop your own software to analyze a particular dataset
  6. classify machine learning techniques by the type of problem they are solving
  7. analyze time series from different domains with predictive analytics techniques
  8. construct explainable machine learning models to support decision-making in a specific domain

Forms of Teaching

Lectures

In person or online

Independent assignments

Data mining project

Grading Method

                     Continuous Assessment          Exam
Type                 Threshold  Percent of Grade    Threshold  Percent of Grade
Seminar/Project      40 %       60 %                40 %       60 %
Final Exam: Written  40 %       40 %                -          -
Exam: Written        -          -                   40 %       40 %

Week by Week Schedule

  1. Course administration. Introduction to data mining. Description of the field. References.
  2. Data preparation for data mining. Project.
  3. Data transformation and feature extraction. Project.
  4. Feature selection (filter methods, wrapper methods, embedded methods, hybrid methods), dimensionality reduction. Project.
  5. Dataset problems and their solutions: unbalanced data, concept drift. Project.
  6. Interpretable machine learning: rule-based inductive systems. Project.
  7. Noninterpretable or partially interpretable machine learning: data clustering, ensembles. Project.
  8. -
  9. Frequent pattern mining, association rules. Project.
  10. Time series data mining: preprocessing and classification methods. Project.
  11. Time series data mining: prediction algorithms and significant event detection. Project.
  12. Deep learning in data mining. Project.
  13. Applied data mining in multiple fields: biomedicine, computational biology, finance. Project submission.
  14. Project presentations
  15. Final exam

Study Programmes

University graduate
Data Science (profile)
Recommended elective courses (2nd semester)

Literature

Witten IH, Frank E, Hall MA, Pal CJ. Data Mining: Practical Machine Learning Tools and Techniques. 4th ed. Morgan Kaufmann, 2016.
Fuernkranz J, Gamberger D, Lavrač N. Foundations of Rule Learning. Heidelberg: Springer, 2012.
James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning: with Applications in R. Springer, 2014.
Raschka S, Mirjalili V. Python Machine Learning. 2nd ed. Packt Publishing, Birmingham UK, 2017.
Ryza S, Laserson U, Owen S, Wills J. Advanced Analytics with Spark: Patterns for Learning from Data at Scale. 2nd ed. O'Reilly Media, Sebastopol CA, USA, 2017.
Mitchell R. Web Scraping with Python: Collecting More Data from the Modern Web. 2nd ed. O'Reilly Media, Sebastopol CA, USA, 2018.

For students

General

ID 223066
Summer semester
5 ECTS
L3 English Level
L1 e-Learning
45 Lectures
18 Laboratory exercises

Grading System

88 Excellent
75 Very Good
63 Good
50 Acceptable