Statistical Programming Fundamentals

Course Description

The course begins with the fundamentals of statistical programming: standard programming elements such as data types, packages and data structures, and the design of user-defined functions and objects. It then covers importing data from different sources and preparing it for analysis: transforming and tidying data, handling missing values, deriving new variables from existing ones, and working with date/time and textual data. Students learn the basics of statistical and exploratory analysis of data sets, the concept of the grammar of graphics, and ways of designing professional visualizations. They also learn to work with different types of distributions, to create basic simulations, and to implement selected machine learning methods. Finally, the course covers the programmatic approach to data mining: sampling, splitting data into training and test sets, and creating and evaluating predictive and descriptive models.

Learning Outcomes

  1. analyze small and large data sets in a meaningful and organized manner
  2. identify the nature of the data and the nature of its processing
  3. use the interactive programming approach to data analysis
  4. modify the raw data into a form suitable for analysis
  5. prepare complex functions and packages
  6. create professional visualizations of datasets
  7. apply machine learning methods in the programming environment
  8. apply the methodology of preparing reports

Forms of Teaching

Lectures

Lectures in the classroom with prepared digital workbooks

Laboratory

Working through digital workbooks and solving programming tasks

Grading Method

                          Continuous Assessment          Exam
Type                      Threshold  Percent of Grade    Threshold  Percent of Grade
Laboratory Exercises      10 %       20 %                10 %       0 %
Homework                  5 %        10 %                5 %        5 %
Class participation       0 %        10 %                0 %        0 %
Seminar/Project           5 %        20 %                5 %        20 %
Mid Term Exam: Written    5 %        15 %                0 %
Final Exam: Written       10 %       25 %
Exam: Written                                            50 %       75 %

Week by Week Schedule

  1. Basic syntax and semantics of higher-level languages. Variables and simple data types (e.g. numbers, characters, logical values). Expressions and assignments. The notion of missing values.
  2. Complex data structures - vectors, matrices and lists. The principles of vectorization and recycling. The index operator. Positional, logical and name-based referencing of elements in complex structures.
  3. Data frames as the main structure for storing datasets. Internal representation of data frames. Categorical variables.
  4. Program flow control commands - conditional execution and loops.
  5. Built-in functions. The notion of search path, lexical scope and environment. User-defined functions. Functional programming. Declarative alternatives to programming loops.
  6. Object-oriented programming in the context of statistical programming and data analysis environment.
  7. Pipeline operator and code chaining. The notion of tidy data. Preparing data for analysis: transforming raw data and reshaping it into a tidy format.
  8. Midterm exam
  9. Dates and timestamps. The notion of temporal data in the context of data analysis. Character strings and string processing. Regular expressions and text analysis.
  10. Methods for data management and exploratory analysis. Procedural equivalents of language commands for retrieving relational data. Set operations. Missing value management.
  11. Basic elements of grammar of graphics. Data visualization. The notion of aesthetics and geometry in the context of visualization.
  12. Programming methods for descriptive and inferential statistics. Simulations.
  13. Selected machine learning methods - linear regression, kNN classification.
  14. Introduction to predictive modeling. Training and testing dataset splits. Cross-validation methods. Declarative approach to the development and evaluation of predictive models.
  15. Final exam

Study Programmes

University undergraduate
Free Elective Courses (5. semester)
University graduate
[FER3-HR] Audio Technologies and Electroacoustics - profile
Elective Courses (1. semester)
[FER3-HR] Communication and Space Technologies - profile
Elective Courses (1. semester)
[FER3-HR] Computational Modelling in Engineering - profile
Elective Courses (1. semester)
[FER3-HR] Computer Engineering - profile
Elective Courses (1. semester)
[FER3-HR] Computer Science - profile
Elective Courses (1. semester)
[FER3-HR] Control Systems and Robotics - profile
Elective Courses (1. semester)
[FER3-HR] Data Science - profile
Elective Courses (1. semester)
[FER3-HR] Electrical Power Engineering - profile
Elective Courses (1. semester)
[FER3-HR] Electric Machines, Drives and Automation - profile
Elective Courses (1. semester)
[FER3-HR] Electronic and Computer Engineering - profile
Elective Courses (1. semester)
[FER3-HR] Electronics - profile
Elective Courses (1. semester)
[FER3-HR] Information and Communication Engineering - profile
Elective Courses (1. semester)
[FER3-HR] Network Science - profile
Elective Courses (1. semester)
Elective Courses of the Profile (1. semester)
[FER3-HR] Software Engineering and Information Systems - profile
Elective Course of the Profile (1. semester)
Elective Courses (1. semester)

Literature

Programirajmo u R-u
R for Data Science
OpenIntro Statistics
Introduction to Statistical Learning
Advanced R

General

ID: 222597
Semester: Winter
ECTS: 5
English Level: L3
e-Learning: L2
Lectures: 45
Seminar: 0
Exercises: 0
Laboratory exercises: 15
Project laboratory: 0

Grading System

87 Excellent
75 Very Good
62 Good
50 Acceptable
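
The grading scale above can be read as a lookup from a total score to a grade label. The following is a minimal sketch of that reading, not an official rule of the course. It assumes that each number is an inclusive lower bound on the total percentage, and that a score below 50 is reported as "Fail" (a label that does not appear in the list above). The names `GRADE_BANDS` and `grade_label` are hypothetical.

```python
# Grade bands taken from the "Grading System" list: (minimum score, label),
# ordered from the highest threshold to the lowest.
GRADE_BANDS = [
    (87, "Excellent"),
    (75, "Very Good"),
    (62, "Good"),
    (50, "Acceptable"),
]


def grade_label(score: float) -> str:
    """Return the grade label for a total score given in percent.

    Assumptions (not stated in the source): every threshold is an
    inclusive lower bound, and any score below the lowest threshold
    is labeled "Fail".
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # The bands are sorted in descending order, so the first match wins.
    for minimum, label in GRADE_BANDS:
        if score >= minimum:
            return label
    return "Fail"


if __name__ == "__main__":
    for total in (91, 75, 61.5, 50, 42):
        print(total, grade_label(total))
```

Keeping the thresholds in a single ordered list means a change to the scale only touches the data, not the lookup logic.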