School / Prep
ENSEIRB-MATMECA
Internal code
EIN9-PROG4
Description
1- Introduction to Big Data issues
-Orders of magnitude
-Scientific/societal/economic issues
-Problems
2- Basics of how a large-scale distributed system works
-Introduction to the Hadoop ecosystem
-Basics of Hadoop administration
3- Introduction to distributed file systems
-HDFS: how it works
-Using the HDFS client
-Introduction to the Java programming framework for handling HDFS
4- Introduction to the Map/Reduce programming paradigm
-Basic principles
-Implementation with Hadoop/HDFS
-Introduction to the Hadoop 2 MapReduce Java programming framework
5- Introduction to Map/Reduce design patterns
-Filtering
-Summarization
-Organization
-Joins
6- Introduction to BigTable (NoSQL)
-Overview of BigTable
-Introduction to HBase, the Hadoop BigTable implementation
-Introduction to the Java programming framework for HBase
Teaching hours
- CI: Integrated courses, 22 h
- TD: Tutorials, 2.5 h
- TI: Individual work, 10 h
Syllabus
1/ Mass data management:
- Presentation on big data: Issues and challenges
- Storage
- Processing and querying (NoSQL)
2/ Infrastructures:
- Virtualization
- Cloud-based infrastructure
- Technologies: Hadoop,
3/ Visualization:
- Representation, navigation, correlation
4/ Data analysis and extraction:
- Data mining
Assessment of knowledge
Initial assessment / Main session - Tests
Type of assessment | Type of test | Duration (in minutes) | Number of tests | Test coefficient | Eliminatory mark in the test | Remarks |
---|---|---|---|---|---|---|
Continuous assessment | Continuous assessment | | | 0.5 | | |
Project | Report | | | 0.5 | | |
Second chance / Catch-up session - Tests
Type of assessment | Type of test | Duration (in minutes) | Number of tests | Test coefficient | Eliminatory mark in the test | Remarks |
---|---|---|---|---|---|---|
Final test | Written | 60 | 1 | | | No documents allowed |