R is the choice of language for many statisticians. While R supports a comprehensive range of statistical tools in diverse disciplines, we recognise the extensive development of Python in recent times, particularly in the data science world of machine learning and deep learning.
This workshop is aimed at both R and Python users who wish to embrace both languages and enhance their productivity by leveraging the best of both worlds. Using simple, bite-sized R and Python scripts, the workshop will highlight similarities and differences between these two programming environments: their syntax, semantics, and implementation framework. It will demonstrate how understanding these basic and subtle concepts can make the use of both programming languages more efficient.
We will organise the workshop in four sections across two parts (one hour each):
Section 1: An overview of both languages and an introduction to objects such as vector, matrix, array, list, dictionary, and data frame.
Section 2: Essential data manipulation functionalities covering the R data.frame and Python pandas library.
Section 3: Plotting functionalities using well-known libraries in both languages, such as ggplot2, matplotlib, plotnine and seaborn.
Section 4: Implementation of standard statistical models; the framework of the scikit-learn library in Python and the principles of training and evaluating machine-learning models.
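As a small taste of the kind of comparison Section 1 refers to, the sketch below uses only the Python standard library and shows the R equivalents in comments. The variable names and values are illustrative assumptions, not taken from the workshop materials.

```python
# Python counterparts of a few R objects, using only the standard library.
# R equivalents are shown in comments for comparison.

# R: v <- c(10, 20, 30)      (atomic vector)
v = [10, 20, 30]             # Python list

# R: v[1] is the first element (1-based); Python counts from 0.
first = v[0]

# R: v[2:3] includes both ends; a Python slice excludes the stop index.
tail = v[1:3]

# R: person <- list(name = "Ada", age = 36)   (named list)
person = {"name": "Ada", "age": 36}           # Python dictionary

# R: w <- v; w[1] <- 99 leaves v unchanged (copy-on-modify).
# Python: assignment binds a second name to the same list object.
w = v
w[0] = 99
shared = v[0]                # v changed as well

# An explicit copy gives R-like independence.
u = list(v)
u[0] = -1
independent = v[0]           # v is unaffected this time

print(first, tail, person["name"], shared, independent)
# prints: 10 [20, 30] Ada 99 99
```

The two differences most likely to surprise users moving between the languages are the index base (1 in R, 0 in Python) and the fact that plain assignment in Python does not copy a list.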
Part 1: 09:00-10:00 Thursday, 15 September, 2022
Part 2: 14:00-15:00 Thursday, 15 September, 2022
Check the room details at the conference site.
The workshop will also be streamed live for the online audience.
Click here to access the workshop resources
Installation of different programming environments may need more time and attention.
So, don’t get distracted by technical issues.
Relax, join the workshop and listen to us.
All resources are available online to follow at your leisure.
Select ONE of the following options and run R and Python scripts without installing anything on your system.
Go to Google Colaboratory.
You will need a Google account to use Google Colaboratory.
Create a Google account when prompted or log into your account.
Click New Notebook if the prompt shows, or File > New Notebook if presented with the Welcome to Colaboratory notebook.
You can now run Python code directly in the cells; press the Run button (Play icon) or Shift + Enter to execute a code cell.
To run R code, insert a new code cell, write %load_ext rpy2.ipython and run the cell to load R support (the rpy2 extension).
Once the extension is loaded, write %%R at the top of every new code cell that contains R code.
Execute the R code in each cell in the usual way.
You can run R and Python code in the same notebook, but add %%R at the top of each cell in which you wish to run R code.
Running Python scripts
Go to the Jupyter Notebooks website.
Click the Jupyter Notebook image (under Applications) to start a demo Python Notebook page.
Once loaded, it will present an example Python notebook.
Click File > New > Notebook to create a new notebook with the Python kernel.
Running R scripts
Go to the Jupyter Notebooks website.
Scroll down to ‘Kernels’ and click the R image to start a demo R Notebook page.
Note that it may take some time to start the R repository.
Once loaded, it will present an example R notebook.
Click File > New Notebook > R to create a new notebook with the R kernel (check that the right-hand corner shows the R kernel).
Note that, unlike Google Colaboratory, you cannot run Python and R in the same notebook.
You can now move between the two browser tabs to use both the R and Python notebooks.
You do not need an account to use Jupyter Notebook.
Go to the CoCalc website.
You will need a CoCalc account to use this resource.
Create a new account or log into your account.
Head to the link, give your project a name, then click Create Project.
Next, click + New from the tabs at the top of the page, then select Jupyter Notebook from the list.
Then choose either the Python or the R kernel.
You can enter your code in the code cells and execute a cell by pressing the Play button or Shift + Enter.
Follow Steps 1 to 4 depending on your preferences.
Depending on your system, additional configuration may still be required before all the environments work together seamlessly.
Please note we cannot cover or support technical issues related to installation or configuration during the workshop.
Download and install R.
Optionally, download and install Python.
For the workshop, however, download and install Anaconda Distribution, which is a popular, user-friendly Python distribution platform.
Anaconda includes an integrated and typical Python environment and comes with several important Python packages.
Go to Step 3 or 4 depending on your familiarity with one of the following integrated development environments (IDE):
RStudio (Go to Step 3)
Jupyter Notebook (Go to Step 4)
This step is probably applicable to users who work predominantly in R.
(A) Download RStudio
Download and install RStudio; it is the most popular integrated development environment (IDE) for R. If you already use RStudio, skip this step. Go to Step 3B.
(B) Install the R package reticulate
Install the package reticulate, an interface to ‘Python’ from the RStudio IDE.
install.packages('reticulate')
library(reticulate)
use_python("/usr/local/bin/python")
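The path passed to use_python() above is only an example and will differ between systems. One way to find the path of the interpreter you actually want reticulate to use is to run the following short Python script with that interpreter; this is a minimal sketch, and the variable names are illustrative.

```python
import sys

# Full path of the interpreter running this script; this is the kind of
# value use_python() expects in place of "/usr/local/bin/python".
interpreter_path = sys.executable
print(interpreter_path)

# Major and minor version of that interpreter, joined with a dot.
version = f"{sys.version_info.major}.{sys.version_info.minor}"
print(version)
```

Copy the printed path into the use_python() call in R. With an Anaconda installation, the path normally points inside the Anaconda directory or the selected environment.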
(C) Install the R package ggplot2
In your current R environment:
install.packages('ggplot2')
Alternatively, from RStudio: Tools > Install Packages and select ggplot2 in the package name text box.
(D) Install Python packages
If you installed Anaconda Distribution, most packages used in the workshop will be pre-installed.
Install plotnine:
pip install plotnine
It is simple and easy to install plotnine from the Anaconda environment.
This step is probably applicable to users who work predominantly in Python.
Steps to follow with standard Anaconda distribution (from Step 2)
Using the Anaconda distribution is the simpler and easier option for the workshop content.
• Open the Anaconda Navigator, select the Environments tab and click Create.
• Give your environment a name and select either Python or R.
• Once created, install Jupyter Notebook from the panel on the right.
• Now click the Play button on the environment you just created and select Open with Jupyter Notebook.
• If you wish to install any other packages, such as plotnine, you can select Open with Terminal from the previous step and use the pip command to install them.
pip install plotnine
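After installing, it can help to confirm that the Python packages named on this page are visible to the environment you will use. The sketch below relies only on the standard library; the package list and the helper name check_packages are assumptions made for illustration, not an official list of workshop requirements.

```python
from importlib.util import find_spec

# Import names of packages named on this page. Note that scikit-learn
# is imported as "sklearn".
PACKAGES = ["numpy", "pandas", "matplotlib", "seaborn", "plotnine", "sklearn"]


def check_packages(names):
    """Return a dict mapping each import name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}


status = check_packages(PACKAGES)
for name, found in status.items():
    print(f"{name:<12} {'installed' if found else 'missing'}")
```

Run it in a notebook cell or terminal opened from the same environment; anything reported as missing can then be installed with pip as shown above.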
Here is a quick summary of Jupyter Notebook shortcuts.
Optional: Steps to follow with standard Python distribution (from Step 2)
Not required for the workshop
(A) Install Jupyter Notebook
jupyterlab is an interactive environment for many programming languages including R and Python.
pip install jupyterlab
Here is a quick summary of Jupyter Notebook shortcuts.
(B) Install required Python packages
pip install numpy
pip install plotnine
Optional: Other Python IDE
Not required for the workshop
A list of other excellent integrated development environments (IDE) for Python:
IDLE: IDLE is Python’s Integrated Development and Learning Environment bundled with Python installation.
Spyder: The Scientific Python Development Environment.
PyCharm: A free version of the IDE is available from jetbrains.com.
Visual Studio Code: A versatile IDE for multiple languages from Microsoft. Check this link to run R in Visual Studio Code.
Note: The above IDEs do not have direct R integration. We will not use these IDEs in the workshop.
Check resources, packages, manuals, FAQs, vignettes at the Comprehensive R Archive Network (CRAN)
Many R blogs, dedicated websites, YouTube videos
Online community: StackOverflow
And of course, your friend Google
An Introduction to R from CRAN. Web link
Contributed Documentation from CRAN: It includes several documents with more than 100 pages. Some interesting ones for beginners are mentioned below. Web link.
Using R for Data Analysis and Graphics - Introduction, Examples and Commentary by John Maindonald: This book covers the basics of R as well as graphics and simple to advanced statistical tools. Web link
Kickstarting R by Jim Lemon is a perfect book for the beginner. The book covers many of the basic statistical principles covered in this course. Web link
Introductory Statistics with R by Peter Dalgaard: The first two chapters cover the basics of R and the rest of the book includes several topics in statistics using R.
R Programming for Data Science by Roger D Peng covers several aspects of R programming in greater depth. Web link
And many more…
Check resources, packages, documents at the Python website
Python Package Index is a helpful site to learn individual packages.
Many Python blogs, dedicated websites, YouTube videos
Online community: StackOverflow
And of course, your friend Google
Real Python includes excellent blog posts on all things Python, for novice to advanced users.
A YouTube series of Python tutorials, from the basics all the way up to implementing Python websites and applications.
Deep Learning with Python by Francois Chollet is a strong introductory guide to implementing deep learning models.
The Hitchhiker’s Guide to Python! is about using and deploying Python.
A GitHub repository with a list of interesting Jupyter notebooks.
And many more…
Director, Interdisciplinary Centre for Data and AI & Reader in Machine Learning, University of Aberdeen
PhD Student in Machine Learning
Email: a.durrant.20@abdn.ac.uk