Section Name Description
SIGARRA Course Info UC info
URL Self-assessment #1
File Introduction to Data Science (in Portuguese)
Assignment File DATASET
File Guideline for diagnostic (in Portuguese)
File Table with BMI information for Portugal (in Portuguese)
File Explanation about the PPA variable (in Portuguese)
File Tables of blood pressure for children (in English)
URL Paper #1 with results on a similar dataset (169 subjects)
URL Paper #2 with statistical results on the same dataset (7199 subjects)
File Data Analysis: some good practices
Homeworks URL To read #1: Apples-to-apples: the pitfalls of cross-validation
URL To read #2: The relationship between ROC and Precision-Recall curves
URL To read #3: refutation of the second paper
Python Books URL Python Data Science Handbook
URL Suggestions from python.org
URL O'Reilly Python books
URL Python resources
Theoretical Classes File Presentation
File Introduction to Data Mining
File Data understanding and manipulation
File Distances and dimensionality reduction (up to slide 41, inclusive)
URL Recorded class
File Distances and dimensionality reduction (cont. from slide 42)
File Distances and dimensionality reduction (cont. from slide 55)
File Data imputation
File Data visualization
File Basic Concepts in Classification
File Basic Concepts in Classification: Decision Trees (from slide 15)
File Naive Bayes classifier
File Naive Bayes classifier (from slide 9) and Belief Networks
File Evaluating the Performance of a Classifier
File Evaluation Metrics
File Evaluating the Performance of a Classifier (from slide 10)
File Evaluation Metrics (revisited)
File Regression and KNN
File Python code associated with the regression and KNN slides
File Melbourne data associated with regression and KNN slides
File Support Vector Machines (SVM)
File A little more detail on SVMs

Section 2.6.1.4 of this dissertation gives a clear, detailed explanation of SVMs.

File Artificial Neural Networks
File Clustering
File Ensembles
File Basic Association Analysis
Practical Classes File Entropy revisited
File Distances revisited
File species.csv
URL Code for decision trees (iris, with pruning)
URL Code for naive Bayes (German credit dataset)
URL Decision boundaries
URL Performance Evaluation of Classifiers
URL Regression and Logistic Regression
URL SVM exercises
URL Hierarchical Clustering

Here is some Python code applying hierarchical clustering to the iris dataset.

Explore the various clustering options, including k-means, k-means++ and DBSCAN. Identify the differences between these clustering methods.

Apply these methods and evaluate the quality of the generated clusters using your favorite dataset.
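As a possible starting point for this exercise, the sketch below compares k-means with random initialization against k-means++ seeding and scores the resulting clusters with inertia and the silhouette coefficient. It is not the course's code: it uses only the Python standard library, the synthetic blobs stand in for "your favorite dataset", and all function names (make_blobs, init_random, init_kmeanspp, kmeans, silhouette) are hypothetical helpers introduced for illustration.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_blobs(seed=42):
    """Three synthetic 2-D Gaussian blobs (an assumed toy dataset)."""
    rng = random.Random(seed)
    blob_centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in blob_centers for _ in range(30)]


def init_random(points, k, rng):
    """Plain k-means seeding: k distinct points chosen uniformly at random."""
    return rng.sample(points, k)


def init_kmeanspp(points, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to its squared distance to the nearest chosen center.
    Assumes the data contains at least k distinct points."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, init, seed=0, max_iter=100):
    """Lloyd's algorithm; returns (labels, centers, inertia)."""
    rng = random.Random(seed)
    centers = list(init(points, k, rng))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, inertia


def silhouette(points, labels):
    """Mean silhouette coefficient; values closer to 1 mean tighter,
    better separated clusters. Singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in set(labels):
            ds = [math.dist(p, q) for j, (q, lab) in enumerate(zip(points, labels))
                  if lab == c and j != i]
            if ds:
                mean_dist[c] = sum(ds) / len(ds)
        own = labels[i]
        if own not in mean_dist or len(mean_dist) < 2:
            scores.append(0.0)
            continue
        a = mean_dist[own]
        b = min(v for c, v in mean_dist.items() if c != own)
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)


points = make_blobs()
for name, init in [("random init", init_random), ("k-means++", init_kmeanspp)]:
    labels, centers, inertia = kmeans(points, 3, init, seed=1)
    print(name, "inertia:", round(inertia, 2),
          "silhouette:", round(silhouette(points, labels), 3))
```

The only difference between the two variants is the seeding step, so rerunning with several values of seed is one way to see how sensitive each is to initialization. The same silhouette function can score labels produced by any other method, such as hierarchical clustering or DBSCAN, which makes it a convenient common yardstick for the comparison asked for above.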

Folder PPT_to_PDF