Course Description

This course introduces students to five key facets of data-based research: (1) data wrangling, cleaning, and sampling to obtain a suitable data set, (2) data management to facilitate efficient access to big data, (3) exploratory data analysis to generate hypotheses and intuition, (4) prediction based on statistical methods such as regression and classification, and (5) communication of results through visualization, stories, and interpretable summaries.

Learning Outcomes

  1. Use Python and other tools to scrape, clean, and process data
  2. Use data management techniques to store data locally and in cloud infrastructures
  3. Use statistical methods and visualization to quickly explore data
  4. Apply statistics and computational analysis to make predictions based on data
  5. Describe the outcome of data analysis using descriptive statistics and visualizations
  6. Use cluster and cloud infrastructure to perform data-intensive computation

Forms of Teaching

Lectures

Lectures and examples in Jupyter notebooks

Laboratory

Help with projects

Grading Method

                      Continuous Assessment           Exam
Type                  Threshold   Percent of Grade    Threshold   Percent of Grade
Seminar/Project       25 %        40 %                25 %        40 %
Final Exam: Written   50 %        60 %
Comment: The final exam is a data analysis task carried out on a computer.
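
The arithmetic behind the continuous-assessment column can be sketched in Python. This is an illustration only, under two assumptions that the table does not state explicitly: each threshold is the minimum percentage a student must reach on that component, and each "Percent of Grade" value is that component's weight in the total. The names COMPONENTS and weighted_total are hypothetical.

```python
# Thresholds and weights (in percent) as read from the continuous-assessment column.
# Assumption: a threshold is the minimum score required on that component.
COMPONENTS = {
    "seminar_project": {"threshold": 25, "weight": 40},
    "final_exam": {"threshold": 50, "weight": 60},
}


def weighted_total(scores):
    """Return (total, passed) for per-component scores given in percent (0-100).

    total  -- weighted sum of the component scores, in percent
    passed -- True only if every component reaches its threshold
    """
    weighted_sum = 0.0
    passed = True
    for name, rule in COMPONENTS.items():
        score = scores[name]
        if score < rule["threshold"]:
            passed = False
        weighted_sum += rule["weight"] * score
    # Weights are percentages that sum to 100, so divide once at the end.
    return weighted_sum / 100, passed


total, passed = weighted_total({"seminar_project": 80, "final_exam": 70})
print(total, passed)  # 74.0 True
```

Under these assumptions, a student with 80 % on the project and 70 % on the final exam ends up with a total of 74 %, and a score below either threshold marks the attempt as not passed regardless of the total.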

Week by Week Schedule

  1. Course administration. Overview of the data science field. Supporting technologies for data science.
  2. Data handling: data acquisition, data models, common dataset issues, data reshaping, data cleanup. Laboratory: data handling in Python.
  3. Data visualization: various graphs for dataset visualization, best practices for data visualization, visualization for special purposes, visualization tools. Laboratory: data visualization in Python.
  4. Hypothesis testing. Confidence intervals. Relating two variables.
  5. Applied linear regression in descriptive data analysis. Data transformation. Linear regression assumptions.
  6. Data collection by observation.
  7. Applied supervised machine learning (classification and prediction).
  8. --
  9. Applied machine learning (data collection, labeling, discretization, features, normalization, model selection, metrics, model assessment).
  10. Applied unsupervised machine learning (clustering).
  11. Text handling.
  12. Text handling.
  13. Handling graphs and networks.
  14. Project presentations.
  15. Final exam.

Study Programmes

University undergraduate
Computing (study)
Free Elective Courses (5. semester)
Electrical Engineering and Information Technology (study)
Free Elective Courses (5. semester)
University graduate
Audio Technologies and Electroacoustics (profile)
Free Elective Courses (1. semester) (3. semester)
Communication and Space Technologies (profile)
Free Elective Courses (1. semester) (3. semester)
Computational Modelling in Engineering (profile)
Free Elective Courses (1. semester) (3. semester)
Computer Engineering (profile)
Elective Course of the Profile (1. semester)
Computer Science (profile)
Free Elective Courses (1. semester) (3. semester)
Control Systems and Robotics (profile)
Free Elective Courses (1. semester) (3. semester)
Data Science (profile)
Core-elective courses (1. semester)
Electrical Power Engineering (profile)
Free Elective Courses (1. semester) (3. semester)
Electric Machines, Drives and Automation (profile)
Free Elective Courses (1. semester) (3. semester)
Electronic and Computer Engineering (profile)
Elective Courses of the Profile (1. semester) (3. semester)
Electronics (profile)
Free Elective Courses (1. semester) (3. semester)
Information and Communication Engineering (profile)
Elective Courses of the Profile (1. semester) (3. semester)
Network Science (profile)
Free Elective Courses (1. semester) (3. semester)
Software Engineering and Information Systems (profile)
Elective Course of the Profile (1. semester) (3. semester)

Literature

Jake VanderPlas (2016), Python Data Science Handbook, O'Reilly Media
Matt Harrison, Theodore Petrou (2020), Pandas 1.x Cookbook, Packt Publishing
Alice Zheng, Amanda Casari (2018), Feature Engineering for Machine Learning, O'Reilly Media

For students

General

ID 183455
Winter semester
5 ECTS
L3 English Level
L1 e-Learning
45 Lectures
12 Laboratory exercises

Grading System

88 Excellent
75 Very Good
63 Good
50 Acceptable
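
The grade bands above can be expressed as a small lookup in Python. This is a minimal sketch that assumes the listed numbers are inclusive lower bounds on a total score in percent; the page does not name the band below 50, so the function returns None there. GRADE_BANDS and grade_label are hypothetical names.

```python
# Lower bounds (in percent) and labels from the Grading System list,
# ordered from the highest band to the lowest.
# Assumption: each bound is inclusive.
GRADE_BANDS = [
    (88, "Excellent"),
    (75, "Very Good"),
    (63, "Good"),
    (50, "Acceptable"),
]


def grade_label(total):
    """Return the label of the first band whose lower bound the total reaches."""
    for lower_bound, label in GRADE_BANDS:
        if total >= lower_bound:
            return label
    # The source does not name the band below 50, so nothing is invented here.
    return None


print(grade_label(74.0))  # Good
print(grade_label(88))    # Excellent
```

Checking the bands from highest to lowest means the first match is always the best grade the total qualifies for.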
