School / Prep
ENSEIRB-MATMECA
Internal code
EI9IS323
Description
The objective of this course is to teach students the basic concepts needed to process large amounts of data in a distributed way.
To this end, the following points will be discussed:
1. Introduction to big data
2. The Hadoop framework and the HDFS distributed file system
3. The MapReduce paradigm
4. Building applications with Spark
5. Introduction to NoSQL databases
Teaching hours
- CI (Integrated courses): 16h
Syllabus
Big data management:
Big data presentation: Issues and challenges
Storage
Processing and querying (NoSQL)
Infrastructures:
Virtualisation
Cloud-based
Technologies: Hadoop
Visualisation:
Representation, navigation, correlation
Data analysis and extraction
Data mining
Assessment of knowledge
Initial assessment / Main session
| Type of assessment | Nature of assessment | Duration (in minutes) | Number of tests | Evaluation coefficient | Eliminatory evaluation mark | Remarks |
|---|---|---|---|---|---|---|
| Full continuous assessment | Continuous assessment | | | 1 | | Evaluation on Moodle |
| Project | Defense | | | 1 | | |
