The data for your data visualisations can come in many forms and formats.

CSV

One of the most widely used file formats for exchanging data is CSV. CSV stands for Comma Separated Values. It is a text-based format, which means you can open and edit it with any text editor. CSV files have a .csv file extension.

Year,Make,Model
1997,Ford,E350
2000,Mercury,Cougar

<aside> 🔎 Below you can find the data we are going to use in the “Cleaning data in practice” assignment. The data is a CSV file, so you can download it and open it with a text editor.

</aside>

reactors.csv

CSV files contain data in a tabular format. The first line of the file usually contains the column names, separated by commas. Each subsequent line contains the values for one record in the data, also separated by commas.
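To make the header-and-records structure concrete, here is a minimal sketch that reads the car example from above with the csv module from Python's standard library. The data is held in a string purely for illustration; the variable names are our own and not part of the course material.

```python
import csv
import io

# The sample data from earlier in this section, held in a string for illustration.
sample = "Year,Make,Model\n1997,Ford,E350\n2000,Mercury,Cougar\n"

# DictReader treats the first line as the column names
# and returns every following line as one record.
reader = csv.DictReader(io.StringIO(sample))
records = list(reader)

print(reader.fieldnames)
for record in records:
    # Every value is read as text, including the year.
    print(record["Year"], record["Make"], record["Model"])
```

Note that CSV carries no type information: the year comes back as the text "1997", and converting it to a number is up to you.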

CSV is an open file format, and most spreadsheet programs are able to open and export CSV files. Many data visualisation tools also accept the format as input.

However, the CSV file format is not strictly standardised, and several variations exist. In many countries, including many European ones, the comma is used as the decimal separator, so a semicolon “;” is often used as the delimiter instead of a comma “,”. This can cause confusion, as most of these files still carry the .csv extension.

Other delimiters are used too: TSV files use tabs as the delimiter, for example. Most software programs that are able to import or export CSV files have an option to set the delimiter to be used.
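As an illustration of setting the delimiter, the sketch below reads semicolon-separated and tab-separated text with Python's csv module. The city and temperature values are made up for this example, and `read_rows` is a hypothetical helper name, not something from the course files.

```python
import csv
import io


def read_rows(text, delimiter=","):
    """Parse CSV-like text into a list of rows using the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Made-up sample data: semicolons as delimiter, commas as decimal sign.
semicolon_text = "City;Temperature\nAmsterdam;12,5\nBrussels;11,0\n"
# The same kind of data as tab-separated values (TSV).
tab_text = "City\tTemperature\nAmsterdam\t12.5\n"

semicolon_rows = read_rows(semicolon_text, delimiter=";")
tab_rows = read_rows(tab_text, delimiter="\t")

# Decimal commas are still plain text after parsing,
# so replace them with a point before converting to numbers.
temperatures = [float(value.replace(",", ".")) for _, value in semicolon_rows[1:]]

print(semicolon_rows)
print(tab_rows)
print(temperatures)
```

Reading the semicolon file with the default comma delimiter would not raise an error. It would silently split "12,5" into two fields, which is why it pays to check the delimiter before importing.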

Some pitfalls to watch out for when working with CSV files are:

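Rows that do not have the same number of fields as the header, for example because of a trailing comma or a missing value:

Year,Make,Model
1997,Ford,E350,
2000,Mercury

Values that contain a comma themselves, so that a single value is split into two fields:

Year,Make,Model,Description
1997,Ford,E350,A very fast car
2000,Mercury,Cougar,A slow, but very nice car

Wrapping text values in double quotes avoids this problem:

Year,Make,Model,Description
1997,"Ford","E350","A very fast car"
2000,"Mercury","Cougar","A slow, but very nice car"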
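One simple way to spot these problems is to count the fields on every line and compare the counts with the header. The sketch below does this for the example data above, using Python's csv module; `field_counts` is a hypothetical helper written for this illustration.

```python
import csv
import io


def field_counts(text):
    """Return the number of fields found on each line of CSV text."""
    return [len(row) for row in csv.reader(io.StringIO(text))]


# Trailing comma on the second line, missing value on the third.
uneven = "Year,Make,Model\n1997,Ford,E350,\n2000,Mercury\n"

# An unquoted comma inside the last description.
unquoted = (
    "Year,Make,Model,Description\n"
    "1997,Ford,E350,A very fast car\n"
    "2000,Mercury,Cougar,A slow, but very nice car\n"
)

# The same data with text values wrapped in double quotes.
quoted = (
    "Year,Make,Model,Description\n"
    '1997,"Ford","E350","A very fast car"\n'
    '2000,"Mercury","Cougar","A slow, but very nice car"\n'
)

print(field_counts(uneven))    # [3, 4, 2]
print(field_counts(unquoted))  # [4, 4, 5]
print(field_counts(quoted))    # [4, 4, 4]
```

Only the quoted version gives the same count on every line. A line whose count differs from the header is a good place to start looking when an import produces shifted or missing columns.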

Excel files

Spreadsheet files are probably even more popular for exchanging data than CSV files, and Excel files are the most used among them. These files have an .xlsx extension (older files can have an .xls extension, and files that contain macros have an .xlsm extension).

XLSX files follow an open standard, and many other spreadsheet programs, such as Apple Numbers, Google Sheets and OpenOffice, can open and save XLSX files.