This notebook accompanies the blog post on concatenating columns in R (https://www.marsja.se/how-to-concatenate-two-columns-or-more-in-r-stringr-tidyr/).
First, download the example data file (combine_columns_in_R.xlsx) and place it in the same directory as this notebook, or change the path passed to read_excel() to wherever you saved it.
# Example Data:
library(readxl)
dataf <- read_excel("combine_columns_in_R.xlsx")
str(dataf)
Classes 'tbl_df', 'tbl' and 'data.frame': 7 obs. of 5 variables:
 $ Date : num 15 11 11 10 14 10 12
 $ Month: chr "Jun" "Jul" "Aug" "Sep" ...
 $ Year : num 2015 2016 2017 2018 2019 ...
 $ Snake: chr "Python" "Boa" "Python" "Boa" ...
 $ Size : chr "Small" "Large" "Medium" "Large" ...
head(dataf, 5)
Date | Month | Year | Snake | Size |
---|---|---|---|---|
15 | Jun | 2015 | Python | Small |
11 | Jul | 2016 | Boa | Large |
11 | Aug | 2017 | Python | Medium |
10 | Sep | 2018 | Boa | Large |
14 | Oct | 2019 | Python | Small |
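If the spreadsheet is not at hand, the same seven rows can be rebuilt by hand. The sketch below copies the values from the tables printed in this notebook. It is a stand-in for read_excel(), not the original file, and data.frame() returns a plain data frame rather than a tibble, which should not matter for the examples that follow.

```r
# Hand-built copy of the example data, taken from the output tables in this
# notebook (an assumption-free substitute only as far as those tables go).
dataf <- data.frame(
  Date  = c(15, 11, 11, 10, 14, 10, 12),
  Month = c("Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"),
  Year  = c(2015, 2016, 2017, 2018, 2019, 2020, 2021),
  Snake = c("Python", "Boa", "Python", "Boa", "Python", "Python", "Boa"),
  Size  = c("Small", "Large", "Medium", "Large", "Small", "Medium", "Medium"),
  stringsAsFactors = FALSE  # keep the text columns as character, as in str() above
)

# Same structure check as with the Excel file
str(dataf)
```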
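# paste() separates its arguments with a single space by default, e.g. "Jun 2015"
dataf$MY <- paste(dataf$Month, dataf$Year)
# Passing "-" as an extra argument keeps that default, so spaces surround the hyphen
dataf$MY <- paste(dataf$Month, "-", dataf$Year)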
dataf
Date | Month | Year | Snake | Size | MY |
---|---|---|---|---|---|
15 | Jun | 2015 | Python | Small | Jun - 2015 |
11 | Jul | 2016 | Boa | Large | Jul - 2016 |
11 | Aug | 2017 | Python | Medium | Aug - 2017 |
10 | Sep | 2018 | Boa | Large | Sep - 2018 |
14 | Oct | 2019 | Python | Small | Oct - 2019 |
10 | Nov | 2020 | Python | Medium | Nov - 2020 |
12 | Dec | 2021 | Boa | Medium | Dec - 2021 |
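The spaces around the hyphen come from the default separator of paste(). To get no whitespace between the values, set the separator with the sep argument instead: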
dataf$MY <- paste(dataf$Month, dataf$Year, sep= "-")
dataf
Date | Month | Year | Snake | Size | MY |
---|---|---|---|---|---|
15 | Jun | 2015 | Python | Small | Jun-2015 |
11 | Jul | 2016 | Boa | Large | Jul-2016 |
11 | Aug | 2017 | Python | Medium | Aug-2017 |
10 | Sep | 2018 | Boa | Large | Sep-2018 |
14 | Oct | 2019 | Python | Small | Oct-2019 |
10 | Nov | 2020 | Python | Medium | Nov-2020 |
12 | Dec | 2021 | Boa | Medium | Dec-2021 |
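The effect of the sep argument can be checked on a couple of short vectors. This is a minimal sketch; the vector names (month, year, and the my_* results) are made up for illustration and are not part of the notebook's data. It also shows base R's paste0(), which is paste() with an empty separator.

```r
# Small stand-alone vectors, so the example runs without the Excel file
month <- c("Jun", "Jul", "Aug")
year  <- c(2015, 2016, 2017)

my_space <- paste(month, year)             # default separator is a single space
my_dash  <- paste(month, year, sep = "-")  # separator set explicitly
my_tight <- paste0(month, year)            # no separator at all

print(my_space)
print(my_dash)
print(my_tight)
```

Numeric values such as the years are converted to character before they are joined, so the result is always a character vector.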
dataf$DMY <- paste(dataf$Date, dataf$Month, dataf$Year)
head(dataf, 2)
Date | Month | Year | Snake | Size | MY | DMY |
---|---|---|---|---|---|---|
15 | Jun | 2015 | Python | Small | Jun-2015 | 15 Jun 2015 |
11 | Jul | 2016 | Boa | Large | Jul-2016 | 11 Jul 2016 |
library(stringr)
# str_c() uses an empty separator by default, so the space is passed as its own argument
dataf$SnakeNSize <- str_c(dataf$Snake, " ", dataf$Size)
head(dataf)
Date | Month | Year | Snake | Size | MY | DMY | SnakeNSize |
---|---|---|---|---|---|---|---|
15 | Jun | 2015 | Python | Small | Jun-2015 | 15 Jun 2015 | Python Small |
11 | Jul | 2016 | Boa | Large | Jul-2016 | 11 Jul 2016 | Boa Large |
11 | Aug | 2017 | Python | Medium | Aug-2017 | 11 Aug 2017 | Python Medium |
10 | Sep | 2018 | Boa | Large | Sep-2018 | 10 Sep 2018 | Boa Large |
14 | Oct | 2019 | Python | Small | Oct-2019 | 14 Oct 2019 | Python Small |
10 | Nov | 2020 | Python | Medium | Nov-2020 | 10 Nov 2020 | Python Medium |
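One practical difference between paste() and str_c() shows up with missing values, which the example data does not contain. The sketch below uses made-up vectors (snake, size) with one NA added on purpose; it also uses the sep argument of str_c() as an alternative to passing " " as a separate argument.

```r
library(stringr)

# Illustrative vectors; the NA is added here and is not in the notebook's data
snake <- c("Python", "Boa", NA)
size  <- c("Small", "Large", "Medium")

with_paste <- paste(snake, size)             # NA is turned into the text "NA"
with_str_c <- str_c(snake, size, sep = " ")  # NA stays missing in the result

print(with_paste)
print(with_str_c)
```

Which behaviour is preferable depends on the data: str_c() keeps missingness visible, while paste() always returns a string.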
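library(tidyverse) # or library(tidyr)
# unite() joins Date and Month into a new column DM, separated by "_" by default,
# and drops the two original columns
dataf <- dataf %>%
  unite("DM", Date:Month)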
dataf
Registered S3 methods overwritten by 'ggplot2':
  method         from
  [.quosures     rlang
  c.quosures     rlang
  print.quosures rlang
Registered S3 method overwritten by 'rvest':
  method            from
  read_xml.response xml2
-- Attaching packages --------------------------------------- tidyverse 1.2.1 --
v ggplot2 3.1.1     v purrr   0.3.2
v tibble  2.1.1     v dplyr   0.8.0.1
v tidyr   0.8.3     v forcats 0.4.0
v readr   1.3.1
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
DM | Year | Snake | Size | MY | DMY | SnakeNSize |
---|---|---|---|---|---|---|
15_Jun | 2015 | Python | Small | Jun-2015 | 15 Jun 2015 | Python Small |
11_Jul | 2016 | Boa | Large | Jul-2016 | 11 Jul 2016 | Boa Large |
11_Aug | 2017 | Python | Medium | Aug-2017 | 11 Aug 2017 | Python Medium |
10_Sep | 2018 | Boa | Large | Sep-2018 | 10 Sep 2018 | Boa Large |
14_Oct | 2019 | Python | Small | Oct-2019 | 14 Oct 2019 | Python Small |
10_Nov | 2020 | Python | Medium | Nov-2020 | 10 Nov 2020 | Python Medium |
12_Dec | 2021 | Boa | Medium | Dec-2021 | 12 Dec 2021 | Boa Medium |
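The underscore in DM and the disappearance of Date and Month are both defaults of unite(). As a minimal sketch, the sep and remove arguments change them; the small data frame snakes below is a hypothetical stand-in built from the first two rows of the example data.

```r
library(tidyr)

# Two rows copied from the example data, so the block runs on its own
snakes <- data.frame(
  Date  = c(15, 11),
  Month = c("Jun", "Jul"),
  Year  = c(2015, 2016),
  stringsAsFactors = FALSE
)

# sep sets the separator; remove = FALSE keeps the input columns
united <- unite(snakes, "DM", Date:Month, sep = "-", remove = FALSE)

print(united)
```

Keeping the original columns is handy when they are still needed for sorting or filtering after the combined label has been created.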