The Geometric objects in detail and the Aesthetics in detail modules introduced the geometric objects and their aesthetics. But

Scales

Scales are at the heart of the Grammar of Graphics. Scales are the functions that turn the values of input variables into values for the aesthetics of geometric objects. Or, as the Vega-Lite documentation puts it a bit less abstractly:

Scales are functions that transform a domain of data values (numbers, dates, strings, etc.) to a range of visual values (pixels, colors, sizes).

A simple scale

In order to understand scales, let’s return to the plot we discussed in one of the previous modules:

Source: Maarten Lambrechts, CC BY SA 4.0

This plot is based on a data set that looks like this:

country        continent  population     life expectancy  income
China          Asia       1.420.000.000  76,9             16.000
India          Asia       1.350.000.000  69,1             6.890
United States  Americas   327.000.000    79,1             54.900
Indonesia      Asia       267.000.000    72               11.700
Brazil         Americas   211.000.000    75,7             14.300

Let’s focus on the life expectancy variable first. It is mapped to the y aesthetic of the circle geometry. In this data set, the minimum life expectancy is 51,1 years (Lesotho), and the maximum life expectancy is 84,2 (Japan). The range of values a variable has in a data set is often called the domain of the variable. So in this case the domain for the y scale ranges from 51,1 years to 84,2 years.

Let’s suppose the height of the plot area is 400 pixels (the plot area is the space enclosed between the x and the y axis). The y scale is the function that maps the life expectancy of each country to a position along this distance of 400 pixels, from 0 pixels at the bottom of the y axis to 400 pixels at the top. This interval, between the start and end of an axis, is often called the range of a scale.

In the simplest approach, the minimum value of the variable domain is mapped to the minimum value of the range, and the maximum value of the domain is mapped to the maximum value of the range.
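As a minimal sketch of this simplest approach, the mapping can be written as a linear interpolation between domain and range. The function name make_linear_scale and its parameters are hypothetical names chosen for illustration; they are not taken from any particular Grammar of Graphics tool. The domain (51,1 to 84,2 years) and range (0 to 400 pixels) are the ones from the example above, written with decimal points as Python requires.

```python
def make_linear_scale(domain, range_):
    """Return a function that maps values in `domain` linearly onto `range_`.

    Hypothetical helper for illustration only, not the API of a real library.
    """
    d_min, d_max = domain
    r_min, r_max = range_
    if d_max == d_min:
        raise ValueError("domain must span more than a single value")

    def scale(value):
        # Fraction of the way through the domain (0 at d_min, 1 at d_max).
        t = (value - d_min) / (d_max - d_min)
        # The same fraction of the way through the range.
        return r_min + t * (r_max - r_min)

    return scale


# Domain: life expectancy in years; range: pixels along the y axis.
y_scale = make_linear_scale(domain=(51.1, 84.2), range_=(0, 400))

print(y_scale(51.1))  # domain minimum lands at the start of the range
print(y_scale(84.2))  # domain maximum lands at the end of the range
print(y_scale(76.9))  # China ends up a bit under 80% of the way up the axis
```

Every other value falls proportionally in between: a country whose life expectancy lies halfway between the minimum and the maximum of the domain is drawn halfway up the axis.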

A visual representation of a scale that maps a domain of [0, 5000] to a range of [250 pixels, 550 pixels]. Source: observablehq.com/@observablehq/plot-cheatsheets-layouts

This simple approach has some small issues, however:

To overcome both of these issues, Grammar of Graphics tools allow users to configure scales and set options on them. For position scales (x and y scales), for example, these options include: