#45 Machine Learning & Data Science Challenge 45

What is the difference between long data and wide data?

  • There are many different ways that you can present the same dataset to the world.

  • Let's take a look at one of the most important and fundamental distinctions: whether a dataset is wide or long.

  • The difference between wide and long datasets boils down to whether we prefer to have more columns in our dataset or more rows.

Wide Data

A dataset that emphasizes putting additional data about a single subject in columns is called a wide dataset.

  • The name fits: as we add more columns, the dataset becomes wider.
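As a rough illustration of the wide layout, here is a minimal sketch using pandas. The exam-score data, the student names, and the column names are all made up for the example; they are not from any real dataset.

```python
import pandas as pd

# Hypothetical exam scores: one row per student, one column per subject.
wide_df = pd.DataFrame(
    {
        "student": ["Ana", "Ben"],
        "math": [90, 75],
        "physics": [85, 80],
    }
)

# Recording a new subject adds a column, so the table grows wider.
wide_df["chemistry"] = [70, 95]

print(wide_df.shape)  # (2, 4): still two rows, now four columns
```

Each student still occupies exactly one row; only the number of columns changed.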

Long Data

A dataset that emphasizes including additional data about a subject in rows is called a long dataset.

  • The name fits: as we add more rows, the dataset becomes longer.

  • It's important to point out that there's nothing inherently good or bad about wide or long data.
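To make the long layout concrete, here is a minimal pandas sketch. The exam-score data and the column names (`student`, `subject`, `score`) are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical exam scores: one row per (student, subject) pair.
long_df = pd.DataFrame(
    {
        "student": ["Ana", "Ana", "Ben", "Ben"],
        "subject": ["math", "physics", "math", "physics"],
        "score": [90, 85, 75, 80],
    }
)

# Recording a new subject adds rows, so the table grows longer.
new_rows = pd.DataFrame(
    {
        "student": ["Ana", "Ben"],
        "subject": ["chemistry", "chemistry"],
        "score": [70, 95],
    }
)
long_df = pd.concat([long_df, new_rows], ignore_index=True)

print(long_df.shape)  # (6, 3): six rows, still three columns
```

The set of columns stays fixed; a subject is a value in the `subject` column rather than a column of its own.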

In the world of data wrangling, we sometimes need to make a long dataset wider, and we sometimes need to make a wide dataset longer.
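One common way to do this reshaping in Python is with pandas, where `melt` makes a wide table longer and `pivot` makes a long table wider. The sketch below uses made-up exam scores and hypothetical column names, and is only one possible approach.

```python
import pandas as pd

# Hypothetical wide table: one row per student, one column per subject.
wide_df = pd.DataFrame(
    {
        "student": ["Ana", "Ben"],
        "math": [90, 75],
        "physics": [85, 80],
    }
)

# Wide -> long: every subject column becomes a (subject, score) row.
long_df = wide_df.melt(
    id_vars=["student"], var_name="subject", value_name="score"
)

# Long -> wide: every distinct subject becomes a column again.
wide_again = long_df.pivot(
    index="student", columns="subject", values="score"
).reset_index()
wide_again.columns.name = None  # drop the leftover "subject" axis label

print(long_df)
print(wide_again)
```

Note that `pivot` requires each (student, subject) pair to appear at most once; if duplicates exist, an aggregating function such as `pivot_table` would be needed instead.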

As a general rule, though, data scientists who embrace the concept of tidy data, where each variable is a column and each observation is a row, usually prefer longer datasets over wider ones.