When learning the R programming language, you will often work with data. Data can come in many shapes and sizes, such as numbers, text, or lists. In R, there is a special way to organize data into tables. These tables are easy to work with and are called tibbles.

Tibbles are an improved version of data frames in R. If you already know what a data frame is, think of a tibble as a smarter and more user-friendly version. If you don’t know what a data frame is, don’t worry! We’ll explain tibbles from the beginning.


What Is a Tibble?

A tibble is like a spreadsheet. Imagine a table with rows and columns, where:

  • Each column has a name (like “Name”, “Age”, or “Salary”).
  • Each row is a single observation or entry (like information about one person).

For example, here is a simple table:

Name      Age   Salary
Alice     25    50000
Bob       30    60000
Charlie   35    70000

A tibble stores data in exactly this shape: one named column per variable and one row per entry.


Why Use Tibbles?

Tibbles are better than regular data frames because:

  1. They are easier to read.
    • When a tibble has many rows, printing it shows only the first 10 rows and as many columns as fit on your screen, with each column's type listed under its name. This avoids overwhelming you with too much information.
  2. They don’t change data types automatically.
    • In versions of R before 4.0.0, data.frame() turned text into factors (a type of category in R) by default, and it still renames columns whose names are not valid R names. Tibbles keep the data and the column names exactly as you provide them.
  3. They are more predictable.
    • Tibbles don’t surprise you with hidden rules. For example, if you select one column with square brackets, the result is still a tibble. A regular data frame would quietly turn it into a plain vector.
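
The short sketch below illustrates points 2 and 3. The names df and tb are made up for this illustration, and the factor conversion is requested explicitly with stringsAsFactors = TRUE to mimic what older versions of R did by default.

```r
library(tibble)

# Hypothetical example objects: the same data as a data frame and as a tibble
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = TRUE)
tb <- tibble(id = 1:3, label = c("a", "b", "c"))

# Column types: the data frame converted the text to a factor, the tibble did not
class(df$label)        # "factor"
class(tb$label)        # "character"

# Selecting one column with [ ]: the data frame drops to a vector,
# while the tibble stays a tibble
class(df[, "id"])      # "integer"
is_tibble(tb[, "id"])  # TRUE
```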

How to Create a Tibble

To create a tibble, you can use the tibble() function from the tibble package. Let’s make the table above as a tibble.

# Load the tibble package
library(tibble)

# Create a tibble
data <- tibble(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(25, 30, 35),
  Salary = c(50000, 60000, 70000)
)

# Print the tibble
data

When you run this code, R will display the tibble like this:

# A tibble: 3 × 3
  Name      Age Salary
  <chr>   <dbl>  <dbl>
1 Alice      25  50000
2 Bob        30  60000
3 Charlie    35  70000

The abbreviations under the column names show each column's type:

  • <chr> means the column contains text (character data).
  • <dbl> means the column contains numbers (double-precision values).
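
The tibble package offers other ways to build a tibble as well. As a minimal sketch, tribble() lets you type a small table row by row, and as_tibble() converts an existing data frame. The object names data_rows, df, and data_converted are made up for this example.

```r
library(tibble)

# tribble() takes the table row by row; ~ marks a column name
data_rows <- tribble(
  ~Name,     ~Age, ~Salary,
  "Alice",     25,   50000,
  "Bob",       30,   60000,
  "Charlie",   35,   70000
)

# as_tibble() converts an existing data frame into a tibble
df <- data.frame(Name = c("Alice", "Bob"), Age = c(25, 30))
data_converted <- as_tibble(df)

data_rows
data_converted
```

tribble() is mostly useful for small, hand-typed tables, because the code reads like the table itself.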

How to Work with Tibbles

1. Access Columns

You can get a single column using the $ symbol, which returns the column's values as a plain vector:

# Get the Age column
data$Age

This will show:

[1] 25 30 35
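
Square brackets give two more ways to reach a column. The sketch below rebuilds the example tibble so it runs on its own. The names ages_vector and ages_tibble are made up for this illustration.

```r
library(tibble)

data <- tibble(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(25, 30, 35),
  Salary = c(50000, 60000, 70000)
)

# Double brackets return the values as a vector, just like data$Age
ages_vector <- data[["Age"]]

# Single brackets keep the result as a one-column tibble
ages_tibble <- data["Age"]

# A vector can go straight into functions such as mean()
mean(ages_vector)  # 30
```

Use $ or [[ ]] when you need the values themselves, and [ ] when you want to keep working with a table.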

2. Add a New Column

You can add a new column by assigning values to it:

# Add a new column called Bonus
data$Bonus <- c(5000, 6000, 7000)

data

Now, the tibble looks like this:

# A tibble: 3 × 4
  Name      Age Salary Bonus
  <chr>   <dbl>  <dbl> <dbl>
1 Alice      25  50000  5000
2 Bob        30  60000  6000
3 Charlie    35  70000  7000
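
A new column can also be calculated from existing ones. As a sketch, the code below uses mutate() from the dplyr package and assumes, purely for illustration, that the bonus is one tenth of the salary. The name data_with_bonus is made up for this example.

```r
library(tibble)
library(dplyr)

data <- tibble(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(25, 30, 35),
  Salary = c(50000, 60000, 70000)
)

# mutate() returns a new tibble with the extra column;
# the original data is left unchanged
data_with_bonus <- mutate(data, Bonus = Salary / 10)

data_with_bonus$Bonus  # 5000 6000 7000
```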

3. Filter Rows

To filter rows, use the filter() function from the dplyr package:

# Load dplyr package
library(dplyr)

# Filter rows where Age is greater than 25
filtered_data <- filter(data, Age > 25)

filtered_data

This will show:

# A tibble: 2 × 4
  Name      Age Salary Bonus
  <chr>   <dbl>  <dbl> <dbl>
1 Bob        30  60000  6000
2 Charlie    35  70000  7000
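
filter() accepts several conditions at once, and a row is kept only when all of them are true. The sketch below also uses the %>% pipe and select(), both provided by dplyr, to pass the result along and keep two columns. The name result is made up for this example.

```r
library(tibble)
library(dplyr)

data <- tibble(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(25, 30, 35),
  Salary = c(50000, 60000, 70000),
  Bonus = c(5000, 6000, 7000)
)

# Keep rows where Age is above 25 AND Salary is below 70000,
# then keep only the Name and Salary columns
result <- data %>%
  filter(Age > 25, Salary < 70000) %>%
  select(Name, Salary)

result
```

Only Bob meets both conditions, so the result is a tibble with one row and two columns.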