School / Prep
ENSEIRB-MATMECA
Internal code
EI9IS323
Description
The objective of this course is to teach students the basic concepts needed to process large amounts of data in a distributed way.
To this end, the following points will be discussed:
1. Introduction to big data
2. The Hadoop framework and the HDFS distributed file system
3. The MapReduce paradigm
4. Building applications with Spark
5. Introduction to NoSQL databases
Teaching hours
- CI (Integrated courses): 16h
Syllabus
Big data management:
Big data presentation: Issues and challenges
Storage
Processing and querying (NoSQL)
Infrastructures:
Virtualisation
Cloud-based
Technologies: Hadoop
Visualisation:
Representation, navigation, correlation
Data analysis and extraction
Data mining
Assessment of knowledge
Initial assessment / Main session - Tests
Type of assessment | Type of test | Duration (in minutes) | Number of tests | Test coefficient | Eliminatory mark in the test | Remarks |
---|---|---|---|---|---|---|
Full continuous assessment | Continuous assessment | 1 | Evaluation on Moodle | |||
Project | Defense | 1 |