Statistical Programming Fundamentals
The course begins with the fundamentals of statistical programming, covering standard programming elements: data types, data structures, packages, and the design of user-defined functions and objects. It then describes how to import data from different sources and prepare it for analysis: transforming and tidying data, handling missing values, deriving new variables from existing ones, and working with date/time and text data. Students learn the basics of statistical and exploratory analysis of data sets. The concept of the grammar of graphics and ways of designing professional visualizations are discussed. Students also learn to work with different types of distributions, to create basic simulations, and to implement selected machine learning methods. Finally, the course covers the programmatic approach to data mining: sampling, splitting data into training and test sets, and creating and evaluating predictive and descriptive models.
- analyze small and large data sets in a meaningful and organized manner
- identify the nature of the data and the nature of its processing
- use the interactive programming approach to data analysis
- transform raw data into a form suitable for analysis
- prepare complex functions and packages
- create professional visualizations of datasets
- apply machine learning methods in the programming environment
- apply the methodology of preparing reports
Forms of Teaching
Lectures: in the classroom, with prepared digital workbooks
Laboratory: solving digital workbooks and programming tasks
|Type|Continuous Assessment: Threshold|Continuous Assessment: Percent of Grade|Exam: Threshold|Exam: Percent of Grade|
|---|---|---|---|---|
|Laboratory Exercises|10 %|20 %|10 %|0 %|
|Homeworks|5 %|10 %|5 %|5 %|
|Class participation|0 %|10 %|0 %|0 %|
|Seminar/Project|5 %|20 %|5 %|20 %|
|Mid Term Exam: Written|5 %|15 %|0 %||
|Final Exam: Written|10 %|25 %|||
|Exam: Written|||50 %|75 %|
Week by Week Schedule
- Basic syntax and semantics of higher-level languages. Variables and simple data types (e.g. numbers, characters, logical values). Expressions and assignments. The notion of missing values.
- Complex data structures: vectors, matrices and lists. The principle of vectorization and recycling. The index operator. Positional, logical and name-based referencing of elements in complex structures.
- Data frames as the main structure for storing datasets. Internal representation of data frames. Categorical variables.
- Program flow control commands - conditional execution and loops.
- Built-in functions. The notion of search path, lexical scope and environment. User-defined functions. Functional programming. Declarative alternatives to programming loops.
- Object-oriented programming in the context of statistical programming and data analysis environment.
- Pipeline operator and code chaining. The notion of tidy data. Preparing data for analysis: transforming raw data and reshaping it into a tidy format.
- Midterm exam
- Dates and timestamps. The notion of temporal data in the context of data analysis. Character strings and string processing. Regular expressions and text analysis.
- Methods for data management and exploratory analysis. Procedural equivalents of language commands for retrieving relational data. Set operations. Missing value management.
- Basic elements of grammar of graphics. Data visualization. The notion of aesthetics and geometry in the context of visualization.
- Programming methods for descriptive and inferential statistics. Simulations.
- Selected machine learning methods - linear regression, kNN classification.
- Introduction to predictive modeling. Training and testing dataset splits. Cross-validation methods. Declarative approach to the development and evaluation of predictive models.
- Final exam
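As an informal illustration of two schedule topics above, the train/test split and kNN classification, the following is a minimal sketch in Python using only the standard library. The course literature suggests the course itself works in R, so the choice of language, the helper names (`train_test_split`, `knn_predict`, `accuracy`) and the toy data are all assumptions made for this example, not course material.

```python
import random
from collections import Counter
from math import dist


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, point, k=3):
    """Predict a label by majority vote among the k nearest training rows."""
    neighbours = sorted(train, key=lambda row: dist(row[0], point))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


# Toy data (invented for illustration): two well-separated clusters,
# each row is ((x, y), label).
data = [((x, y), "a") for x in range(0, 4) for y in range(0, 4)] + \
       [((x, y), "b") for x in range(10, 14) for y in range(10, 14)]

train, test = train_test_split(data, test_fraction=0.25, seed=42)
print(len(train), len(test))        # 24 8
print(accuracy(train, test, k=3))   # 1.0, since the clusters do not overlap
```

The model is evaluated only on rows it never saw during training, which is the point of the split. On real, overlapping data the accuracy would depend on the choice of `k` and on the random split, which is what motivates cross-validation.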
[FER3-EN] Computing - study: Elective Courses (5th semester)
[FER3-EN] Electrical Engineering and Information Technology - study: Elective Courses (5th semester)
[FER3-EN] Control Systems and Robotics - profile: Elective Courses (1st semester)
[FER3-EN] Data Science - profile: Elective Courses (1st semester)
[FER3-EN] Electrical Power Engineering - profile: Elective Courses (1st semester)
- Programirajmo u R-u
- R for Data Science
- OpenIntro Statistics
- Introduction to Statistical Learning
- Advanced R
English Level: L3
Laboratory exercises: 15
Project laboratory: 0
Very Good: 75