Data Wrangling in Python (intermediate level)
Course goals
At the end of this course, you will be able to wrangle data:
- Exploring: loading data, creating reports and data visualizations.
- Transforming, filtering: structuring, normalizing, and cleaning your datasets.
- Validating: verifying that your data is consistent, of sufficient quality, and secure.
- Storing: preserving the final product, along with all the steps and transformations that took place, so that it can be audited, understood, and repeated in the future.
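
As a rough illustration of these four steps, here is a minimal sketch that uses only the Python standard library. The sample data, the column names, and the `wrangle` function are hypothetical and invented for this example; the tools actually used in the course may differ.

```python
import csv
import io
import json

# Hypothetical raw data, invented for this illustration.
RAW_CSV = """sample,weight_kg
a, 1.5
b,
c,2.0
"""


def wrangle(raw_text):
    """Explore, transform and validate a small CSV string.

    Returns the cleaned rows and a log of the steps that were applied.
    """
    log = []  # record of every step, so the process can be audited and repeated

    # Exploring: load the data and look at its size.
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    log.append(f"loaded {len(rows)} rows")

    # Transforming, filtering: strip whitespace, drop rows without a weight,
    # and convert the weight to a float.
    cleaned = []
    for row in rows:
        weight = row["weight_kg"].strip()
        if weight:
            cleaned.append(
                {"sample": row["sample"].strip(), "weight_kg": float(weight)}
            )
    log.append(f"kept {len(cleaned)} rows with a weight")

    # Validating: check that the cleaned data is consistent.
    if not all(r["weight_kg"] > 0 for r in cleaned):
        raise ValueError("weights must be positive")
    log.append("validated: all weights are positive")

    return cleaned, log


rows, log = wrangle(RAW_CSV)

# Storing: serialise the final product together with the log of the steps.
stored = json.dumps({"data": rows, "steps": log}, indent=2)
print(stored)
```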
Who should follow this course?
Anyone (researcher, engineer, or student) who already has some knowledge of Python programming and statistics and needs to analyse datasets.
Prerequisites:
- Proficiency in manipulating file paths.
    - file paths
    - permissions
- Basic programming skills in Python (all of the concepts below are covered in the course https://hub-courses.pages.pasteur.fr/python_one_week_4_biologists/):
    - data types (int, float, boolean, ...)
    - basic data structures (tuple, list, dictionary, ...) and how to create and manipulate them
    - ability to write loops and nested loops
    - understanding of the concepts of class and object in Python, and ability to use objects
- Basic knowledge of statistics.
    - ...
- Familiarity with virtual environments is a plus.
- You will need to bring your own laptop to the course and set up the environment on it. Once you are enrolled, if you are unable to set up the environment by yourself, we can organize a dedicated session to help you with this task.
What this course is not
This course is not an introduction to programming. You must have basic skills in Python (see Prerequisites).
This course is not an introduction to statistics. You have to know XXX
Program
For each topic, we will spend 50% of our time on lectures and 50% on practicals.
Day | Topic |
---|---|
Day 1 | |
Day 2 | |
Day 3 | |
Day 4 | |
Day 5 | |