Big spatial and spatio-temporal data management

Course Description

Introduction. Systems and programming frameworks for big data and data stream management. Lambda and Kappa architectures for big data. Basic principles and features of big spatial and spatio-temporal data. Modelling of spatial and spatio-temporal data. Specification of relevant operations on spatial and spatio-temporal data. Indexing. Global and local indexes. Static and dynamic indexes. Geohashes. Spatio-temporal data streams. SQL-based analysis of spatio-temporal data streams within integrated big data platforms. Implementation of data types and operations in an object-functional programming language and on distributed dataflow platforms. Implementation based on the API of an integrated platform for distributed batch and data stream processing. Development of user-defined functions. Specification of spatial and spatio-temporal queries in SQL-like query languages. Data mining of big spatio-temporal data.
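
To give a feel for the geohash topic listed above, here is a minimal sketch of standard geohash encoding: longitude and latitude intervals are halved alternately, and every five resulting bits are mapped to one base-32 character. The function name geohash_encode and the sample coordinates are illustrative assumptions and are not taken from the course material.

```python
# Standard geohash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string (illustrative sketch)."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # accumulates up to 5 bits for the current character
    bit_count = 0
    use_lon = True    # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


if __name__ == "__main__":
    # Hypothetical sample point; a longer hash refines the same cell.
    print(geohash_encode(57.64911, 10.40744, 5))
    print(geohash_encode(57.64911, 10.40744, 9))
```

Because a longer geohash only refines the cell of a shorter one, points that share a prefix lie in the same cell, which is what makes geohashes usable as keys for partitioning and indexing spatial data.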

Learning Outcomes

  1. Identify fundamental features of spatial and spatio-temporal big data
  2. Identify fundamental features of spatio-temporal data streams
  3. Design and implement spatial and spatio-temporal data types in an object-functional programming language and on distributed dataflow platforms
  4. Develop simple algorithms for big spatio-temporal data management
  5. Develop simple algorithms for spatio-temporal data streams management
  6. Develop spatial and spatio-temporal queries using SQL-like expressions
  7. Develop simple algorithms for spatio-temporal data mining and knowledge discovery
  8. Choose big data management technologies for the spatio-temporal application domain

Forms of Teaching

Lectures

Theory foundation with examples.

Other Forms of Group and Self Study

Students are divided into groups of two. Each group is assigned a separate data set. By completing a project on the assigned data set, students demonstrate relevant practical skills and apply the theoretical concepts learned in the area of big spatial and spatio-temporal data management.

Grading Method

     
                         Continuous Assessment            Exam
Type                     Threshold   Percent of Grade     Threshold   Percent of Grade
Class participation      0 %         5 %                  0 %         5 %
Seminar/Project          20 %        45 %                 20 %        40 %
Attendance               5 %         10 %                 0 %         5 %
Mid Term Exam: Written   0 %         20 %                 0 %
Final Exam: Written      0 %         20 %
Exam: Written                                             0 %         50 %
Exam: Oral                                                            50 %

Week by Week Schedule

  1. Introduction. Systems and programming frameworks for big data and data streams management. Lambda and Kappa architectures for big data.
  2. Basic principles and features of big spatial and spatio-temporal data. Modelling of spatial and spatio-temporal data types.
  3. Specification of relevant operations on spatial and spatio-temporal data types.
  4. Implementation of data types and operations in an object-functional programming language and on distributed dataflow platforms.
  5. Development of user-defined functions. Specification of spatial and spatio-temporal queries in SQL-like query languages.
  6. Spatial and spatio-temporal queries in SQL-like expressions of integrated big data platforms.
  7. Indexing. Global and local indexes. Static and dynamic indexes. Geohashes.
  8. Midterm exam
  9. Midterm exam
  10. Spatio-temporal data streams.
  11. Management and processing of spatio-temporal data streams using object-functional programming languages on distributed data flow platforms.
  12. SQL-based analysis of spatio-temporal data streams within integrated big data platforms.
  13. Data mining of big spatio-temporal data within integrated big data platforms.
  14. Data mining of spatio-temporal data streams within integrated big data platforms.
  15. Final exam

Study Programmes

University graduate
Software Engineering and Information Systems (profile)
Specialization Course (1. semester) (3. semester)

Literature

Zdravko Galic (2016.), Spatio-Temporal Data Streams, Springer
Nikos Pelekis, Yannis Theodoridis (2014.), Mobility Data Management and Exploration, Springer
Nathan Marz, James Warren (2015.), Big Data: Principles and Best Practices of Scalable Realtime Data Systems, Manning Publications Company
Fabian Hueske, Vasiliki Kalavri (2018.), Stream Processing with Apache Flink: Fundamentals, Implementation, and Operation of Streaming Applications, O'Reilly Media
Edward Capriolo, Dean Wampler, Jason Rutherglen (2012.), Programming Hive, O'Reilly Media
Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills (2015.), Advanced Analytics with Spark, O'Reilly Media

Associate Lecturers

General

ID 155248
  Winter semester
4 ECTS
L2 English Level
L1 e-Learning
30 Lectures
0 Exercises
0 Laboratory exercises
0 Project laboratory

Grading System

90 Excellent
75 Very Good
65 Good
55 Acceptable