Exercises

In this exercise we will learn to use packages and write data to/from external sources


  1. Install the package called RSQLite using the console and skim the console-output.

To install packages we use the function install.packages

install.packages("RSQLite")

There is a lot of text in the output, including the website where the package is fetched from and the path it is installed to. If the package depended on other packages they would also be shown here. If the package is already installed and up-to-date it will not be installed again, but a short information will be shown here.


  1. Load the package dplyr and read the output in the console. What do you think it means?

We load a packages with the function library

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

It informs us that dplyr is masking the functions filter and lag from the package stats. This is because there is a naming conflict. Both dplyr and stats has functions by those names, so when you write filter in the console R does not know which one to choose. To avoid this the latest package you loaded masks the others, so in this case when you write filter it will use the function from the dplyr package.
To be helpful library prints this informational message to avoid confusion.


  1. Open the vignette for the package DBI. If you do not have it installed, do so. This might come in handy later
browseVignettes("DBI")

  1. Together with 3-4 coleagues make a dataframe with three columns. One column should be a character column with your names. The second column should be a numeric column with your shoe size. The third column should be a logical columnd with TRUE if you have used R before and FALSE if you have not. Assign the dataframe to a name you choose yourselves

To create the dataframe use the function data.frame. Speak with your colleagues to get values for your dataframe.

personal_data <- data.frame(names = c("Morten", "Philip", "Karen", "Emil"),
                            shoe_size = c(45, 46, 41, 42),
                            used_r = c(TRUE, TRUE, TRUE, TRUE))

Here it is possible to make problems with the names. Spaces should be avoided and it is a convention to use snakecase (words seperated by _).


  1. Write your dataframe to an excel-file (.xlsx).

Use the function write.xlsx from the package openxlsx.

library(openxlsx)
## Warning: package 'openxlsx' was built under R version 4.4.2
openxlsx::write.xlsx(personal_data,
                     "files/personal_file.xlsx")

  1. Open your file in Excel and edit a value. Save it and load it back into R. Does it look like you expected?

Hopefully you have excel installed, else this exercise will is difficult. But you can read the data into R using the function read.xlsx.

read.xlsx("files/personal_file.xlsx")
##    names shoe_size used_r
## 1 Morten        45   TRUE
## 2 Philip        46   TRUE
## 3  Karen        41   TRUE
## 4   Emil        42   TRUE

As you can see the logical column has switched from logical to character. This is a classical problem when switching between formats. In this case it is because Excel doesn’t recognise R’s logical datatype. It is always a good idea to check your data when reading from external sources.


  1. Open a connection to a SQLite database on the shared drive and give it a unique name. Warning: Be sure not to overwrite each others databases. Give it a name that makes it clear it is yours.

First you should find the path to the shared drive, we will determine this name together before we start the exercises. This might not be possible if you are working on your laptops. So work together with your colleagues who got a desktop. For this example we will assume the path is "H:/"

Once you have the path you need a unique name for your database. Could for instance be your name and two numbers: "peter23.sqlite".

The database will automatically be created when we try to connect to it. Use the function dbConnect to from the package DBI to open the connection:

library(DBI)
con <- dbConnect(drv = RSQLite::SQLite(),
                 dbname = "H:/peter23.sqlite")

  1. Write the dataset to your database using the functions from the lecture. Check that there is actually create a table in the database.

To write our dataset to the database we use the function dbWriteTable. Check the help page for the function to see the arguments.

dbWriteTable(conn = con,
             name = "my_tablename",
             value = personal_data)

dbWriteTable just returns a logical value TRUE if it succeeded. But to make sure that the table with our data has been created, use the function dbListTables

dbListTables(con)
## [1] "my_tablename"

If the name of your table does not show up, something went wrong.


  1. Read the dataset from the database back into R and inspect it. Does it look like you expected?

To read the full dataset we use the function dbReadTable. It takes two required arguments: A connection to a database and the name of the table you want to read:

personal_data_from_db <- dbReadTable(conn = con,
                                     name = "my_tablename")

Inspect the data either by printing it in the console or use the View-function. You will see that once again, the logical column has changed datatype. This time to a numeric/integer.


  1. Disconnect from your database

This is done using the function dbDisconnect which takes your connection as an argument.

dbDisconnect(con)

  1. Ask another group for the path to their database and open a connection to it. Use this connection to show the name of their table.

If the other group named their database "H:/anna93.sqlite" then you can connect to it just like if it was your own, so you can reuse your code from exercise 7:

con2 <- dbConnect(RSQLite::SQLite(),
                  dbname = "H:/anna93.sqlite")
dbListTables(con2)
## [1] "their_table_name"

  1. Append your own dataset to the other groups table. Appending is when you add your data the existing data. Read the help-file for dbWriteTable to see how it is done. Close the connection when you are done.

The first thing you have to do is make your dataset look like theirs. They may have used different names for their columns, so start by reading their table and inspecting it using the tablename you found in exercise 11.

their_data <- dbReadTable(con2, "their_table_name")
View(their_data)

Now rename your dataframe so it matches theirs. This can be done using the names function

names(personal_data) <- c("Their", "column", "names") ##Remember R is casesensitive!

Now you should be able to add your dataset to their table:

dbWriteTable(conn = con2, 
             name = "their_table_name", 
             value = personal_data, 
             append = TRUE)

Disconnect from their database

dbDisconnect(con2)

  1. Reopen the connection to your own database and read your table. Can you see the new data that has been added?

Use the same code you made in exercise 7 and 9. If the other group successfully appended their dataset, you should see more rows than you did earlier.


  1. If you have more time, try making an SQL-query to select only the data where shoe size is greater than 40

End of practical