School / Prep
ENSEIRB-MATMECA
Internal code
EI9IT360
Description
1- Introduction to Big Data issues
-Notion of orders of magnitude
-Scientific/societal/economic issues
-Problems
2- Basics of how a large-scale distributed system works.
Introduction to the Hadoop ecosystem
-Basics of Hadoop administration
3- Introduction to distributed file systems.
HDFS: how it works
-Using the HDFS client
-Introduction to the Java programming framework for handling HDFS
4- Introduction to the Map/Reduce programming paradigm
-Basic principles
-Implementation with Hadoop/HDFS
-Introduction to the Hadoop 2 MapReduce Java programming framework.
5- Introduction to Map/Reduce design patterns
-Filtering
-Summarization
-Organization
-Join
6- Introduction to BigTable (NoSQL)
-Demonstrating BigTable
-Introduction to HBase, the Hadoop BigTable implementation
-Introduction to the Java programming framework for HBase
Teaching hours
- CI (Integrated Courses): 22h
- TD (Tutorial): 2h
- TI (Individual work): 10h
Syllabus
1/ Mass data management:
- Presentation on big data: Issues and challenges
- Storage
- Processing and querying (NoSQL)
2/ Infrastructures:
- Virtualization
- Cloud-based infra.
- Technologies: Hadoop
3/ Visualization:
- Representation, navigation, correlation
4/ Data analysis and extraction:
- Data mining
Assessment of knowledge
Initial assessment / Main session
| Type of assessment | Nature of assessment | Duration (in minutes) | Number of tests | Evaluation coefficient | Eliminatory evaluation mark | Remarks |
|---|---|---|---|---|---|---|
| Continuous assessment | Continuous assessment | | | 0.5 | | |
| Project | Report | | | 0.5 | | |
Second chance / Catch-up session
| Type of assessment | Nature of assessment | Duration (in minutes) | Number of tests | Evaluation coefficient | Eliminatory evaluation mark | Remarks |
|---|---|---|---|---|---|---|
| Final test | Written | 60 | 1 | | | No documents allowed |
