School / Prep
ENSEIRB-MATMECA
Internal code
EI7IF252
Teaching hours
- CI (Integrated courses): 26h
Further information
In data science, researchers observe the emergence of three professional communities: (i) database management, (ii) statistics and machine learning, which convert data into knowledge, and (iii) computer systems, which enable efficient processing of these data.
This course focuses on point (ii) from a computer scientist's point of view, relying on simplifying assumptions for points (i) and (iii). The following topics will be covered in varying degrees of detail:
- scientific approaches, including modeling and experimental design,
- descriptive statistics, classical probability distributions, and estimators, from the point of view of probabilistic algorithm analysis,
- statistical inference: frequentist and Bayesian paradigms, statistical tests,
- causal inference, including a discussion of causation versus correlation and Pearl's Structural Causal Model (SCM),
- (briefly) data visualization and (if time permits) a little topological data analysis,
- ethics of data analysis (bias, experimental conditions, ...).
Assessment of knowledge
Initial assessment / Main session - Tests
Type of assessment | Type of test | Duration (in minutes) | Number of tests | Test coefficient | Eliminatory mark in the test | Remarks |
---|---|---|---|---|---|---|
Full continuous assessment | Continuous assessment | 1 |