How do I perform different types of SQL-style joins like inner, outer, left, and right joins on data frames in R?

Rashmihasija · June 21, 2025, 10:48am

Good point, @emma-crepeau ! I’ve definitely used merge() before, but lately, I’ve been all about performance with larger datasets, so I’ve turned to the data.table package. It’s a game-changer, especially when handling huge data frames. Here’s how I perform SQL-style joins in data.table:

First, you’ll want to convert your data frames into data.table objects: setDT(df1); setDT(df2)

Inner join: df1[df2, on = "key", nomatch = 0]
Left join: df1[df2, on = "key"]
Right join: df2[df1, on = "key"]

As for outer join in data.table, it’s a bit more work, but you can still use merge() with the all = TRUE argument, or you could use more specific data.table syntax.

Overall, if you’re working with large datasets, data.table is blazing fast and efficient. It really shines when you need speed alongside SQL-style joins, like a left join r, but if you’re working with smaller data or want simplicity, dplyr or even merge() are great options.