Analysis of Massive Datasets

Data is displayed for academic year: 2023./2024.

Course Description

An introduction to the analysis of large datasets. Finding similar entities. Data stream analysis. Link analysis of graph-structured data. Finding frequent itemsets. Clustering of large datasets. Recommendation systems. Social network graph analysis. Web advertising models. Dimensionality reduction. Scalable machine learning.

Study Programmes

University graduate

[FER3-EN] Data Science - profile

Recommended elective courses (2. semester)

Learning Outcomes

identify and understand why a problem belongs to the Big Data category
apply the MapReduce programming model to suitable types of problems
design and evaluate a system for finding similar entities in a large data set
design and evaluate a system for finding frequent itemsets in a large data set
design and evaluate a node ranking system for a very large data set represented by a graph
design and evaluate a recommendation system
apply appropriate algorithms to find clusters in a large set of cases
apply appropriate algorithms to process data streams

Forms of Teaching

Lectures

Lecturer-driven classroom presentations of theoretical concepts.

Exercises

Examples and problem solving during lectures.

Laboratory

Software implementation of selected massive dataset analysis methods. Students individually implement a given assignment in a recommended programming language or tool and submit their solutions for automated online evaluation.

Week by Week Schedule

Locality-sensitive hashing (LSH), minhash and simhash algorithms
Locality-sensitive hashing (LSH), minhash and simhash algorithms
Graph mining
Web search (PageRank and HITS)
Data mining with MapReduce; feature selection (filter methods, subset selection, wrapper methods)
Data stream mining
Data stream mining
Midterm exam
Time series and sequences mining
Collaborative filtering and recommender engines
Clustering algorithms for large datasets (BFR, CURE)
Sampling, filtering and estimating data stream moments
Large-scale algorithms for mining frequent item sets (Apriori, PCY, SON)
Detecting communities in large graphs (Girvan-Newman, Affiliation-Graph Model)
Final exam

Literature

Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman (2014), Mining of Massive Datasets, Cambridge University Press

Michael Manoochehri (2013), Data Just Right, Addison-Wesley

Jiawei Han, Jian Pei, Micheline Kamber (2011), Data Mining: Concepts and Techniques, Elsevier

For students

General

ID 222942

Summer semester

5 ECTS

L1 English Level

L1 e-Learning

45 Lectures

0 Seminar

0 Exercises

5 Laboratory exercises

0 Project laboratory

0 Physical education exercises

Grading System

Excellent

Very Good

Good

Sufficient

Similar Courses

Mining Massive Data Sets, Stanford
