Training 3: Pitfalls in data and data visualisation - introduction


Welcome to the “Pitfalls in data and data visualisation” training!

In this introduction, I will walk you through the topics covered in the training, so you will have a good idea of what to expect. I will also explain the educational approach of the training and give you all the practical information you need to follow and complete it.

And of course, there is some time for questions and answers at the end of the session.

Educational approach

This training is set up to be fully remote and, for the most part, asynchronous.

A fully remote training means that the trainer and the participants do not meet physically during the training. Participants can follow the training from anywhere, as long as they have access to a computer and an internet connection.

An asynchronous training means that the trainer and the participants, apart from not sharing the same geographical space, are also separated in time. The training is set up so that participants can follow it at their own pace, at the times that suit them best.

The training is not fully asynchronous, as you can tell from this live session: currently, we are in one of the few synchronous parts of the training. We will have a few more of these during the training.

But most of the training will be asynchronous. The training is split up into modules, which you can follow in your own time and at your own pace. The training contains the following types of modules:

We will discuss the training schedule and the modules of this training later in this session. Each module comes with an estimate of the time needed to complete it. Altogether, the modules in the training amount to 16 hours. The training runs over 2 weeks, from the start of this live session to the end of the closing session on Friday afternoon of next week.

Participants can schedule the modules as they like, but in addition to the fixed schedule of the live sessions, there are some deadlines to be met during the training (see below).

Pitfalls?

When working with data, there are many pitfalls you want to avoid.

The first kind of pitfall featured in this training is related to metadata. Metadata is data about the data, and to understand your data well, you should find and consult this fine print. How the data was collected, for what purpose it was collected, and which definitions were used can have big implications for the conclusions you draw from your data.

From simple averages and percentages to more complicated concepts like the effect of sample sizes: whenever you analyse your data or prepare it for visualisation, chances are high that you will have to deal with statistics in some way. Statistics is notorious for its pitfalls and paradoxes. But with plenty of examples and simple insights, you will learn how to avoid them.