What is the difference between long data and wide data?
There are many different ways that you can present the same dataset to the world.
Let's take a look at one of the most important and fundamental distinctions: whether a dataset is wide or long.
The difference between wide and long datasets boils down to whether we prefer to have more columns in our dataset or more rows.
Wide Data
A dataset that emphasizes putting additional data about a single subject in columns is called a wide dataset.
The name comes from the shape: as we add more columns, the dataset becomes wider.
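As a minimal sketch of the wide layout, here is a small made-up dataset in plain Python. The student names, subjects, and scores are hypothetical, as is the `add_column` helper; the point is only that new information about a subject arrives as a new column while the number of rows stays the same.

```python
# Hypothetical wide dataset: one row per student, one column per subject.
wide_rows = [
    {"student": "Ana", "math": 90, "science": 85},
    {"student": "Ben", "math": 72, "science": 88},
]


def add_column(rows, name, values):
    """Return new rows with one extra column; the row count is unchanged."""
    return [{**row, name: value} for row, value in zip(rows, values)]


# Adding history scores makes the dataset wider, not longer.
wider_rows = add_column(wide_rows, "history", [78, 95])

print(len(wide_rows), len(wider_rows))        # same number of rows
print(len(wide_rows[0]), len(wider_rows[0]))  # one more column per row
```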
Long Data
A dataset that emphasizes including additional data about a subject in rows is called a long dataset.
The name comes from the shape: as we add more rows, the dataset becomes longer.
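A comparable sketch of the long layout, again with made-up students and scores and a hypothetical `add_measurements` helper. Each row holds a single measurement, so new information about a subject arrives as new rows while the set of columns stays fixed.

```python
# Hypothetical long dataset: one row per (student, subject) measurement.
long_rows = [
    {"student": "Ana", "subject": "math", "score": 90},
    {"student": "Ana", "subject": "science", "score": 85},
    {"student": "Ben", "subject": "math", "score": 72},
    {"student": "Ben", "subject": "science", "score": 88},
]


def add_measurements(rows, subject, scores_by_student):
    """Return the rows plus one new row per student; the columns are unchanged."""
    new_rows = [
        {"student": student, "subject": subject, "score": score}
        for student, score in scores_by_student.items()
    ]
    return rows + new_rows


# Adding history scores makes the dataset longer, not wider.
longer_rows = add_measurements(long_rows, "history", {"Ana": 78, "Ben": 95})

print(len(long_rows), len(longer_rows))  # more rows than before
print(sorted(longer_rows[-1].keys()))    # still the same three columns
```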
It's important to point out that there's nothing inherently good or bad about wide or long data.
In the world of data wrangling, we sometimes need to make a long dataset wider, and we sometimes need to make a wide dataset longer.
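Both directions of reshaping can be sketched in a few lines of plain Python. The function names `wide_to_long` and `long_to_wide`, their parameters, and the sample data are assumptions made for illustration; data libraries offer their own versions of these operations (in pandas, for example, `DataFrame.melt` and `DataFrame.pivot`).

```python
def wide_to_long(rows, id_key, var_name, value_name):
    """Turn every non-id column of each wide row into its own long row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column != id_key:
                long_rows.append(
                    {id_key: row[id_key], var_name: column, value_name: value}
                )
    return long_rows


def long_to_wide(rows, id_key, var_name, value_name):
    """Collect the long rows that share an id into a single wide row."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row[id_key], {id_key: row[id_key]})
        record[row[var_name]] = row[value_name]
    return list(wide.values())


# Hypothetical sample data: two students, two subjects.
wide_rows = [
    {"student": "Ana", "math": 90, "science": 85},
    {"student": "Ben", "math": 72, "science": 88},
]

long_rows = wide_to_long(wide_rows, "student", "subject", "score")
round_trip = long_to_wide(long_rows, "student", "subject", "score")

print(len(long_rows))           # 2 students x 2 subjects
print(round_trip == wide_rows)  # reshaping back recovers the wide rows
```

This sketch assumes every subject has exactly one value per variable; real data with duplicate or missing measurements needs an explicit rule for how to combine or fill them.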
As a general rule, though, data scientists who embrace the concept of tidy data usually prefer longer datasets over wider ones.