All examples in this chapter are presented using duckdb
For running the SQL statements with other Python libraries, see the previous chapter.
In principle, just copy and paste the SQL statement (assigned as sSQL) in the appropriate function.
The SQL commands and clauses are not case sensitive; in general, they are written in upper case.
Code
import pandas as pdimport duckdbtblStudent = pd.read_csv('./tblStudent.txt', sep ='\t')# Note: We write the SQL statement in multiple lines# This will be easy to read for long SQL statements# Note for Python: We need \ to write multiple lines# The following is equivalent to: sSQL = "SELECT * FROM tblStudent"sSQL ="\SELECT * FROM tblStudent \"duckdb.sql(sSQL)
15.3SELECT
Example 1
Select ALL columns from tblStudent
Code
sSQL="SELECT * FROM tblStudent"
Example 2
Select SName, Gender, Age columns from tblStudent
Code
sSQL="SELECT SName, Gender, Age FROM tblStudent"
Example 3
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Code
sSQL="SELECT SName AS StudentName, Gender, Age FROM tblStudent"
Example 4
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Retrieve the first FIVE rows
Code
sSQL="SELECT SName AS StudentName, Gender, Age FROM tblStudent LIMIT 5"
15.4WHERE
Example 1
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Match records where Age > 16
Code
sSQL="SELECT SName AS StudentName, Gender, AgeFROM tblStudentWHERE Age > 16"
Example 2
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Match records where Age > 16ANDAttendance greater than 85%
Code
sSQL="SELECT SName AS StudentName, Gender, AgeFROM tblStudentWHERE Age > 16 AND Attendance > 85"
Example 3
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Match records where Age > 16ANDAttendanceBETWEEN 85% and 95%
Code
sSQL="SELECT SName AS StudentName, Gender, AgeFROM tblStudentWHERE Age > 16 AND Attendance BETWEEN 85 AND 90"
Example 4
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Match records where Age values include 15 and 16 years
Code
sSQL="SELECT SName AS StudentName, Gender, AgeFROM tblStudentWHERE Age IN (15, 16)"
Example 5
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Match records where StudentName starts with Al
Code
sSQL="SELECT SName AS StudentName, Gender, AgeFROM tblStudentWHERE StudentName LIKE 'Al%'"
Example 6
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Match records where StudentName ends with an e
Code
sSQL="SELECT SName AS StudentName, Gender, AgeFROM tblStudentWHERE StudentName LIKE '%e'"
Example 7
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Match records where Attendance is greater than any attendance from ClassID1
Code
sSQL="SELECT SName AS StudentName, Gender, Age, AttendanceFROM tblStudentWHERE Attendance > ANY (SELECT Attendance FROM tblStudent WHERE ClassID=1)"
15.5ORDER BY
Example 1
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Match records where Age is greater than 16 years and Attendance is greater than 80%
Order records in ascending order
Code
sSQL="SELECT SName AS StudentName, Gender, Age, AttendanceFROM tblStudentWHERE Age > 16 AND Attendance > 80ORDER BY Attendance ASC"
Example 2
Select SName, Gender, Age columns from tblStudent
Rename SName as StudentName
Match records where Age is greater than 16 years and Attendance is greater than 80%
Retrieve first two highest attendance records
Code
sSQL="SELECT SName AS StudentName, Gender, Age, AttendanceFROM tblStudentWHERE Age > 16 AND Attendance > 80ORDER BY Attendance DESCLIMIT 2"
15.6SELECT with Aggregate functions
Example 1
From tblStudent, find all unique ClassID
Code
sSQL="SELECT DISTINCT ClassIDFROM tblStudent"
Example 2
From tblStudent, find total number of unique ClassID
From tblStudent, find all unique Age values in the data
Code
sSQL="SELECT DISTINCT AgeFROM tblStudent"
Example 4
From tblStudent, find total number of unique Age values in the data
Code
sSQL="SELECT COUNT(DISTINCT Age)FROM tblStudent"
Example 5
From tblStudent, find total number of students with Attendancegreater than 85%
Code
sSQL="SELECT COUNT(StudentID) AS NFROM tblStudentWHERE Attendance > 85"
Example 6
From tblStudent, find total number of students where Attendancebetween 85% and 95%
Code
sSQL="SELECT COUNT(StudentID) AS NFROM tblStudent WHERE Attendance BETWEEN 85 AND 95"
Example 7
From tblStudent, find total number of students and their average Age where Attendancebetween 85% and 95%
Code
sSQL="SELECT COUNT(StudentID) AS N, AVG(Age) as AvgAgeFROM tblStudent WHERE Attendance BETWEEN 85 AND 95"
Example 8
From tblStudent, find minimum, maximum, sum, average of Age and total number of students
Code
sSQL="SELECT MIN(Age) AS Min, MAX(Age) AS Max, SUM(Age) AS Sum, AVG(Age) AS Avg, COUNT(Age) AS NFROM tblStudent "
15.7GROUP BY
Example 1
From tblStudent, find total number of students in each class
Code
sSQL="SELECT ClassID, COUNT(StudentID) AS NFROM tblStudentGROUP BY ClassID"
Example 2
From tblStudent, find total number of students in each class where Attendance was between 85% and 95%
Code
sSQL="SELECT ClassID, COUNT(StudentID) AS NFROM tblStudentWHERE Attendance BETWEEN 85 AND 95GROUP BY ClassID"
Example 3
From tblStudent, find total number of male and female students where Attendancebetween 85% and 95%
Code
sSQL="SELECT Gender, COUNT(StudentID) AS NFROM tblStudent WHERE Attendance BETWEEN 85 AND 95GROUP BY Gender"
Example 4
From tblStudent, find total number of male and female students by age where Attendance was between 85% and 95%
Get the records in ascending order by Gender and Age
Code
sSQL="SELECT Gender, Age, COUNT(StudentID) AS NFROM tblStudent WHERE Attendance BETWEEN 85 AND 95GROUP BY Gender, AgeORDER BY Gender, Age ASC"
Example 5
From tblStudent, find total number of male and female students and their average age where Attendance was between 85% and 95%
Get the records in ascending order by Gender
Code
sSQL="SELECT Gender, COUNT(StudentID) AS N, AVG(Age) AS AvgAgeFROM tblStudent WHERE Attendance BETWEEN 85 AND 95GROUP BY GenderORDER BY Gender ASC"
Example 6
From tblStudent, find minimum, maximum, sum, average of Age and total number of students where Attendance was between 85% and 95%
Get the records in ascending order by Gender
Code
sSQL="SELECT Gender, COUNT(StudentID) AS N, MIN(Age) AS Min, MAX(Age) AS Max, SUM(Age) AS Sum, AVG(Age) AS AvgFROM tblStudent WHERE Attendance BETWEEN 85 AND 95GROUP BY GenderORDER BY Gender ASC"
15.8HAVING
Example 1
From tblStudent, find total number of male and female students where Attendancegreater than 85%
Present records where the count is greater than 3
Code
sSQL="SELECT Gender, COUNT(StudentID) AS NFROM tblStudentWHERE Attendance > 85GROUP BY Gender"sSQL="SELECT Gender, COUNT(StudentID) AS NFROM tblStudentWHERE Attendance > 85GROUP BY GenderHAVING N > 3"
Example 2
From tblStudent, find total number of male and female students where Attendancebetween 85% and 95%
Present records where the average age is greater than 16 years
Code
sSQL="SELECT Gender, COUNT(ClassID) AS N, AVG(Age) as AvgAgeFROM tblStudent WHERE Attendance BETWEEN 85 AND 95GROUP BY Gender"sSQL="SELECT Gender, COUNT(ClassID) AS N, AVG(Age) as AvgAgeFROM tblStudent WHERE Attendance BETWEEN 85 AND 95GROUP BY GenderHAVING AvgAge > 16"
15.9JOIN
STEPS
Read all tables as data.frame
Register the connection with duckdb
Remember to provide separate table names for each table while registering it
The following script assumes that all data are in your current working directory
LEFT JOIN: Everything on the LEFTAND those matches on the RIGHT
Code
sSQL="SELECT * FROM tblClassLEFT JOIN tblTeacher ON tblClass.TeacherID = tblTeacher.TeacherID"sSQL="SELECT * FROM tblClass AS CLEFT JOIN tblTeacher AS TON C.TeacherID = T.TeacherID"sSQL="SELECT tblClass.ClassID, tblClass.Subject, tblClass.TeacherID FROM tblClassLEFT JOIN tblTeacher AS TON tblClass.TeacherID = T.TeacherID"sSQL="SELECT C.ClassID, C.Subject, C.TeacherID FROM tblClass AS CLEFT JOIN tblTeacher AS TON C.TeacherID = T.TeacherID"sSQL="SELECT C.ClassID, C.Subject, C.TeacherID, T.TeacherID, T.TName, T.ExperienceFROM tblClass AS CLEFT JOIN tblTeacher AS TON C.TeacherID = T.TeacherID"sSQL="SELECT T.TeacherID, T.TName, T.Experience, C.ClassID, C.Subject, C.TeacherIDFROM tblTeacher AS TLEFT JOIN tblClass AS C ON T.TeacherID = C.TeacherID"
15.9.2 Anti LEFT JOIN
Anti LEFT JOIN
Anti LEFT JOIN or LEFT JOIN Exclusive
Everything on the LEFT that is NOT on the RIGHT
Code
sSQL="SELECT T.TeacherID, T.TName, T.Experience, C.ClassID, C.Subject, C.TeacherIDFROM tblTeacher AS TLEFT JOIN tblClass AS C ON T.TeacherID = C.TeacherIDWHERE C.TeacherID IS NULL"
15.9.3RIGHT JOIN
RIGHT JOIN
RIGHT JOIN: Everything on the RIGHTAND those matches on the LEFT
Code
sSQL="SELECT C.ClassID, C.Subject, S.StudentID, S.SName, S.ClassIDFROM tblClass AS C RIGHT JOIN tblStudent AS SON C.ClassID = S.ClassIDORDER BY S.StudentID"
15.9.4 Anti RIGHT JOIN
Anti RIGHT JOIN
Anti RIGHT JOIN or RIGHT JOIN Exclusive
Everything on the RIGHT that is NOT on the LEFT
Code
sSQL="SELECT C.ClassID, C.Subject, S.StudentID, S.SName, S.ClassIDFROM tblClass AS C RIGHT JOIN tblStudent AS SON C.ClassID = S.ClassIDWHERE C.ClassID IS NULLORDER BY S.StudentID"
15.9.5INNER JOIN
INNER JOIN
INNER JOIN: Only those matches on the LEFTAND the RIGHT
Code
sSQL="SELECT S.StudentID, S.SName, S.ClassID, M.StudentID, M.MarksFROM tblStudent AS SINNER JOIN tblMarks AS MON S.StudentID = M.StudentIDORDER BY S.StudentID"
15.9.6FULL JOIN
FULL JOIN
FULL JOIN or OUTER JOIN: Everything on the LEFTAND everything on the RIGHT
Code
sSQL="SELECT S.StudentID, S.SName, S.ClassID, M.StudentID, M.MarksFROM tblStudent AS SFULL JOIN tblMarks AS MON S.StudentID = M.StudentIDORDER BY S.StudentID"