Code
library(sqldf)
tblStudent = read.table(file = "tblStudent.txt", sep = "\t", header = TRUE)
sSQL = "SELECT * FROM tblStudent LIMIT 5"
sqldf::sqldf(sSQL)
# You can also store the results
out = sqldf::sqldf(sSQL)sqldf
RSQLie
duckdb
In this section, we will not connect to any standard RDBMS. We will execute SQL on an in-memory data.frame (table) that we have read into R and that are available in the R environment.
We read the following data in the R environment as a data.frame object tblStudent.
sqldf
The sqldf() function is typically passed a single argument which is an SQL select statement where the table names are ordinary R data.frame names.
It transparently sets up a database, imports the data frames into that database, performs the SQL select or other statement and returns the data.frame
The sqldf() or read.csv.sql() functions can also be used to read filtered files into R even if the original files are larger than R itself can handle.
It supports backend databases like RSQLite, RH2, RMySQL and RPostgreSQL
sqldf: Exampleinstall.packages("sqldf")
Read the data.frame
Write down the SQL statement as a character string and assign it to a variable
Example: sSQL = "SELECT * FROM tblStudent LIMIT 5"
Execute the statement as: sqldf::sqldf(sSQL)
You can also assign the executed object to an R data.frame object
library(sqldf)
tblStudent = read.table(file = "tblStudent.txt", sep = "\t", header = TRUE)
sSQL = "SELECT * FROM tblStudent LIMIT 5"
sqldf::sqldf(sSQL)
# You can also store the results
out = sqldf::sqldf(sSQL)| StudentID | SName | Gender | Age | ClassID | Attendance |
|---|---|---|---|---|---|
| 1 | Alice | Female | 15 | 1 | 95 |
| 2 | Bob | Male | 16 | 1 | 88 |
| 3 | Eve | Female | 16 | 2 | 96 |
| 4 | Charlie | Male | 15 | 2 | 85 |
| 5 | David | Male | 16 | 3 | 90 |
RSQLite
The library embeds the SQLite database engine in R and provides an interface compliant with the DBI package.
RSQLite is the easiest way to use a database from R because the package itself contains SQLite; no external software is needed.
RSQLite is a DBI-compatible interface which means you primarily use functions defined in the DBI package.
It can also run many complex queries like batch queries and multiple parameterised queries.
RSQLite: Exampleinstall.packages("RSQLite")
Read the data.frame
For running RSQLite query, we need to set a temporary database connection:
"" to connect to a database from the current environment.
dbConnect(RSQLite::SQLite(), "")
":memory:" or "file::memory:" for an in-memory database.
dbConnect(RSQLite::SQLite(), "mydatabase.db")
Set the connection using one of the above options.
Write the table on the database using dbWriteTable function.
Write down the SQL statement as a character string and assign it to a variable
Example: sSQL = "SELECT * FROM tblStudent"
Execute the statement as: dbGetQuery(conn, sSQL)
Disconnect the connection
library(DBI)
library(RSQLite)
tblStudent = read.table(file = "tblStudent.txt", sep = "\t", header = TRUE)
conn = dbConnect(RSQLite::SQLite(), "")
dbWriteTable(conn, name = "tblStudent", value = tblStudent)
sSQL = "SELECT * FROM tblStudent LIMIT 5"
dbGetQuery(conn, sSQL)
# You can also store the results
out = dbGetQuery(conn, sSQL)
dbDisconnect(conn, shutdown = TRUE)| StudentID | SName | Gender | Age | ClassID | Attendance |
|---|---|---|---|---|---|
| 1 | Alice | Female | 15 | 1 | 95 |
| 2 | Bob | Male | 16 | 1 | 88 |
| 3 | Eve | Female | 16 | 2 | 96 |
| 4 | Charlie | Male | 15 | 2 | 85 |
| 5 | David | Male | 16 | 3 | 90 |
duckdb
duckdb: DBI Package for the DuckDB Database Management System
The DuckDB project is an embedded analytical data management system with support for the SQL.
The package includes all of DuckDB and a R Database Interface (DBI) connector.
DuckDB is an in-process analytical database and supports a feature-rich SQL dialect complemented with deep integration into client application programming interfaces (APIs).
Besides R, it offers client APIs for several languages like C, C++, Python, Java, Julia, Rust. See the DuckDB website for further details.
duckdb: Exampleinstall.packages("duckdb")
Read the data.frame
For running duckdb query, we need to set a temporary database connection:
Register the connection using the function: duckdb_register. The function registers a data.frame as a virtual table in the DuckDB environment.
Write down the SQL statement as a character string and assign it to a variable
Example: sSQL = "SELECT * FROM tblStudent"
Execute the statement as: dbGetQuery(conn, sSQL)
Disconnect the connection
library(DBI)
library(duckdb)
conn <- dbConnect(duckdb())
duckdb_register(conn = conn, name = "tblStudent", df = tblStudent)
sSQL = "SELECT * FROM tblStudent LIMIT 5"
dbGetQuery(conn, sSQL)
# You can also store the results
out = dbGetQuery(conn, sSQL)
dbDisconnect(conn, shutdown = TRUE)| StudentID | SName | Gender | Age | ClassID | Attendance |
|---|---|---|---|---|---|
| 1 | Alice | Female | 15 | 1 | 95 |
| 2 | Bob | Male | 16 | 1 | 88 |
| 3 | Eve | Female | 16 | 2 | 96 |
| 4 | Charlie | Male | 15 | 2 | 85 |
| 5 | David | Male | 16 | 3 | 90 |