Code
library(sqldf)
tblStudent = read.table(file = "tblStudent.txt", sep = "\t", header = TRUE)
sSQL = "SELECT * FROM tblStudent LIMIT 5"
sqldf::sqldf(sSQL)
# You can also store the results
out = sqldf::sqldf(sSQL)
sqldf
RSQLie
duckdb
In this section, we will not connect to any standard RDBMS. We will execute SQL on an in-memory data.frame (table) that we have read into R and that are available in the R environment.
We read the following data in the R environment as a data.frame object tblStudent
.
sqldf
The sqldf()
function is typically passed a single argument which is an SQL select statement where the table names are ordinary R data.frame names.
It transparently sets up a database, imports the data frames into that database, performs the SQL select or other statement and returns the data.frame
The sqldf()
or read.csv.sql()
functions can also be used to read filtered files into R even if the original files are larger than R itself can handle.
It supports backend databases like RSQLite
, RH2
, RMySQL
and RPostgreSQL
sqldf
: Exampleinstall.packages("sqldf")
Read the data.frame
Write down the SQL statement as a character string and assign it to a variable
Example: sSQL = "SELECT * FROM tblStudent LIMIT 5"
Execute the statement as: sqldf::sqldf(sSQL)
You can also assign the executed object to an R data.frame object
library(sqldf)
tblStudent = read.table(file = "tblStudent.txt", sep = "\t", header = TRUE)
sSQL = "SELECT * FROM tblStudent LIMIT 5"
sqldf::sqldf(sSQL)
# You can also store the results
out = sqldf::sqldf(sSQL)
StudentID | SName | Gender | Age | ClassID | Attendance |
---|---|---|---|---|---|
1 | Alice | Female | 15 | 1 | 95 |
2 | Bob | Male | 16 | 1 | 88 |
3 | Eve | Female | 16 | 2 | 96 |
4 | Charlie | Male | 15 | 2 | 85 |
5 | David | Male | 16 | 3 | 90 |
RSQLite
The library embeds the SQLite
database engine in R and provides an interface compliant with the DBI
package.
RSQLite
is the easiest way to use a database from R because the package itself contains SQLite
; no external software is needed.
RSQLite
is a DBI-compatible interface which means you primarily use functions defined in the DBI
package.
It can also run many complex queries like batch queries and multiple parameterised queries.
RSQLite
: Exampleinstall.packages("RSQLite")
Read the data.frame
For running RSQLite query, we need to set a temporary database connection:
""
to connect to a database from the current environment.
dbConnect(RSQLite::SQLite(), "")
":memory:"
or "file::memory:"
for an in-memory database.
dbConnect(RSQLite::SQLite(), "mydatabase.db")
Set the connection using one of the above options.
Write the table on the database using dbWriteTable
function.
Write down the SQL statement as a character string and assign it to a variable
Example: sSQL = "SELECT * FROM tblStudent"
Execute the statement as: dbGetQuery(conn, sSQL)
Disconnect the connection
library(DBI)
library(RSQLite)
tblStudent = read.table(file = "tblStudent.txt", sep = "\t", header = TRUE)
conn = dbConnect(RSQLite::SQLite(), "")
dbWriteTable(conn, name = "tblStudent", value = tblStudent)
sSQL = "SELECT * FROM tblStudent LIMIT 5"
dbGetQuery(conn, sSQL)
# You can also store the results
out = dbGetQuery(conn, sSQL)
dbDisconnect(conn, shutdown = TRUE)
StudentID | SName | Gender | Age | ClassID | Attendance |
---|---|---|---|---|---|
1 | Alice | Female | 15 | 1 | 95 |
2 | Bob | Male | 16 | 1 | 88 |
3 | Eve | Female | 16 | 2 | 96 |
4 | Charlie | Male | 15 | 2 | 85 |
5 | David | Male | 16 | 3 | 90 |
duckdb
duckdb: DBI Package for the DuckDB Database Management System
The DuckDB
project is an embedded analytical data management system with support for the SQL.
The package includes all of DuckDB
and a R Database Interface (DBI
) connector.
DuckDB
is an in-process analytical database and supports a feature-rich SQL dialect complemented with deep integration into client application programming interfaces (APIs).
Besides R, it offers client APIs for several languages like C, C++, Python, Java, Julia, Rust. See the DuckDB website for further details.
duckdb
: Exampleinstall.packages("duckdb")
Read the data.frame
For running duckdb
query, we need to set a temporary database connection:
Register the connection using the function: duckdb_register
. The function registers a data.frame
as a virtual table in the DuckDB environment.
Write down the SQL statement as a character string and assign it to a variable
Example: sSQL = "SELECT * FROM tblStudent"
Execute the statement as: dbGetQuery(conn, sSQL)
Disconnect the connection
library(DBI)
library(duckdb)
conn <- dbConnect(duckdb())
duckdb_register(conn = conn, name = "tblStudent", df = tblStudent)
sSQL = "SELECT * FROM tblStudent LIMIT 5"
dbGetQuery(conn, sSQL)
# You can also store the results
out = dbGetQuery(conn, sSQL)
dbDisconnect(conn, shutdown = TRUE)
StudentID | SName | Gender | Age | ClassID | Attendance |
---|---|---|---|---|---|
1 | Alice | Female | 15 | 1 | 95 |
2 | Bob | Male | 16 | 1 | 88 |
3 | Eve | Female | 16 | 2 | 96 |
4 | Charlie | Male | 15 | 2 | 85 |
5 | David | Male | 16 | 3 | 90 |