Lecture 6: Data Abstraction

Brian J. Smith

2026-02-03

Data Vis: What and Why?

This lecture is based on Chapter 2, “Data Abstraction,” of Munzner’s Visualization Analysis & Design.

Data Abstraction

The goal of this chapter is to understand what can be visualized.

Figure 2.1 (next slide) summarizes the topic.

Data Abstraction

Semantics and Types

Why do data semantics and types matter?

The semantics of the data is its real-world meaning.
The type of the data is its structural or mathematical interpretation.
- For example, a number might represent a count of items. In this case, it makes sense to add such numbers together to get a total count.
- Alternatively, a number might represent a postal code. In this case, it is really a name for a category that happens to be written with digits rather than letters. It doesn’t make any sense to add postal codes together.
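
The contrast between a count and a postal code can be sketched in a few lines of Python. This is only an illustration, not something taken from the book; the variable names and sample values are hypothetical.

```python
from collections import Counter

# Hypothetical sample values, chosen only for illustration.
item_counts = [3, 5, 2]                      # quantities: arithmetic is meaningful
postal_codes = ["52242", "52240", "52242"]   # category names written with digits

# Adding counts together gives a total count.
total_items = sum(item_counts)

# Postal codes are names, so they are stored as strings and tallied,
# never added together.
code_frequency = Counter(postal_codes)

print(total_items)               # 10
print(code_frequency["52242"])   # 2
```

Storing the postal codes as strings is one simple way to make the intended type explicit, so that nobody sums them by accident.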

Semantics and Types

Imagine the following data:

Basil, 7, S, Pear

What do these data mean?

Semantics and Types

Basil, 7, S, Pear

Maybe a food shipment of produce arrived in satisfactory condition on the 7th day of the month, containing basil and pears?
Maybe the Basil Point neighborhood had 7 inches of snow cleared by the Pear Creek Limited snow removal service?
Maybe the lab rat named Basil has made 7 attempts to navigate the south section of the maze and was given a pear as a reward?

Semantics and Types

Here’s the full table, including column titles that provide the intended semantics.

ID	Name	Age	Shirt.Size	Favorite.Fruit
1	Amy	8	S	Apple
2	Basil	7	S	Pear
3	Clara	9	M	Durian
4	Desmond	13	L	Elderberry
5	Ernest	12	L	Peach
6	Fanny	10	S	Lychee
7	George	9	M	Orange
8	Hector	8	L	Loquat
9	Ida	10	M	Pear
10	Amy	12	M	Orange

Semantics and Types

Sometimes, types and semantics can be correctly inferred from the syntax of a data file or from names of variables.

Often, however, this information must be supplied separately, alongside the dataset. Such additional information is called metadata.
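
As a rough sketch of what metadata might look like in practice, the Python below attaches a description (semantics) and a converter (type) to each column of the example table. The `metadata` dictionary, its wording, and the assumption that Age is measured in years are all hypothetical; only the first three rows of the table are used.

```python
import csv
import io

# First three rows of the example table, as tab-separated text.
raw = (
    "ID\tName\tAge\tShirt.Size\tFavorite.Fruit\n"
    "1\tAmy\t8\tS\tApple\n"
    "2\tBasil\t7\tS\tPear\n"
    "3\tClara\t9\tM\tDurian\n"
)

# Hypothetical metadata: the intended meaning of each column and a
# converter that applies the intended interpretation of its values.
metadata = {
    "ID": {"semantics": "unique identifier for a person", "convert": str},
    "Name": {"semantics": "first name", "convert": str},
    "Age": {"semantics": "age in years (assumed unit)", "convert": int},
    "Shirt.Size": {"semantics": "shirt size: S, M, or L", "convert": str},
    "Favorite.Fruit": {"semantics": "favorite fruit", "convert": str},
}

# Read the table and convert each value according to the metadata.
reader = csv.DictReader(io.StringIO(raw), delimiter="\t")
rows = [
    {column: metadata[column]["convert"](value) for column, value in record.items()}
    for record in reader
]

print(rows[1])
```

Note that ID is kept as a string even though it is written with digits: like a postal code, it names an item rather than measuring a quantity.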

Data Types

Earlier, Munzner used the term type to refer to the structural or mathematical interpretation of the data.
Now, she uses the term data type to mean something different.
The 5 basic data types discussed in this book are:

The 5 data types: items, attributes, links, positions, and grids

Data Types

An item is a discrete individual entity, such as a row in a table or a node in a network.

Data Types

An attribute is some specific property that can be measured, observed, or logged.

Data Types

A link is a relationship between items, typically within a network.
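
Items, attributes, and links can be sketched as simple Python data structures. This is one possible representation under stated assumptions, not a definition from the book: the `Item` and `Link` classes and the `neighbors` helper are hypothetical, the attribute values come from the example table, and the links are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A discrete individual entity, e.g. a table row or a network node."""
    key: str
    attributes: dict = field(default_factory=dict)  # measured or observed properties


@dataclass(frozen=True)
class Link:
    """A relationship between two items, referenced by their keys."""
    source: str
    target: str


# Items with attribute values from the example table.
items = {
    "1": Item("1", {"Name": "Amy", "Age": 8}),
    "2": Item("2", {"Name": "Basil", "Age": 7}),
    "3": Item("3", {"Name": "Clara", "Age": 9}),
}

# Invented links, purely for illustration.
links = [Link("1", "2"), Link("2", "3")]


def neighbors(key):
    """Keys of items directly linked to the given item (links treated as undirected)."""
    found = set()
    for link in links:
        if link.source == key:
            found.add(link.target)
        elif link.target == key:
            found.add(link.source)
    return found


print(sorted(neighbors("2")))   # ['1', '3']
```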

Data Types

A position is spatial data that gives a location in 2D or 3D space.

Data Types

A grid specifies the strategy for sampling continuous data in terms of both geometric and topological relationships between its cells.
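
To make the geometric and topological parts of a grid concrete, here is a minimal Python sketch that samples a continuous quantity on a small uniform 2D grid. The function `field_value`, the grid dimensions, and the spacing are all hypothetical choices made for illustration.

```python
import math


def field_value(x, y):
    """A hypothetical continuous quantity defined at every point in the plane."""
    return math.sin(x) + math.cos(y)


# Geometry: where the samples are taken (a uniform 3 x 4 grid, spacing 0.5).
n_rows, n_cols, spacing = 3, 4, 0.5
positions = {
    (r, c): (c * spacing, r * spacing)
    for r in range(n_rows)
    for c in range(n_cols)
}

# One sampled attribute value per cell.
samples = {cell: field_value(x, y) for cell, (x, y) in positions.items()}


def adjacent_cells(r, c):
    """Topology: the cells that share an edge with cell (r, c)."""
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < n_rows and 0 <= j < n_cols]


print(positions[(1, 2)])      # (1.0, 0.5)
print(adjacent_cells(0, 0))   # [(1, 0), (0, 1)]
```

Here `positions` carries the geometry (where each cell sits in space), while `adjacent_cells` carries the topology (which cells are connected to which).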

Dataset Types

A dataset is any collection of information that is the target of analysis.
The 4 basic dataset types discussed in this book are: