Home




About the Workshop

R is the language of choice for many statisticians. While R supports a comprehensive range of statistical tools across diverse disciplines, we recognise the extensive recent development of Python, particularly in the data science world of machine learning and deep learning.

This workshop is aimed at R and Python users who wish to embrace both languages and enhance their productivity by leveraging the best of both worlds. Using simple, bite-sized R and Python scripts, the workshop will highlight similarities and differences between the two programming environments: their syntax, semantics, and implementation frameworks. It will demonstrate how understanding these basic and subtle concepts can help you use both languages efficiently.

The workshop is organised in four sections across two parts (one hour each):

  • Section 1: An overview of both languages and an introduction to objects such as vector, matrix, array, list, dictionary, and data frame.

  • Section 2: Essential data manipulation functionality, covering the R data.frame and the Python pandas library.

  • Section 3: Plotting functionality using well-known libraries in both languages, such as ggplot2, matplotlib, plotnine and seaborn.

  • Section 4: Implementation of standard statistical models; the framework of the scikit-learn library in Python and the principles of training and evaluating machine-learning models.
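As a small taste of the kind of difference the workshop highlights, the sketch below uses only built-in Python types. The variable names and values are illustrative and not taken from the workshop material. R indexes from 1 and includes both ends of a range, whereas Python indexes from 0 and excludes the end of a slice; the R equivalents appear as comments.

```python
# Illustrative values only; not from the workshop scripts.
scores = [10, 20, 30, 40]             # R: scores <- c(10, 20, 30, 40)

first = scores[0]                     # R: scores[1]   (R counts from 1)
middle = scores[1:3]                  # R: scores[2:3] (Python excludes the end index)

# A Python dictionary plays a role similar to a named list in R.
person = {"name": "Ada", "age": 36}   # R: person <- list(name = "Ada", age = 36)
age = person["age"]                   # R: person$age or person[["age"]]

print(first, middle, age)
```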


Workshop Time
  • Part 1: 09:00-10:00 Thursday, 15 September, 2022

  • Part 2: 14:00-15:00 Thursday, 15 September, 2022

Check the room details at the conference site.

The workshop will also be streamed live for the online audience.


Workshop Resources

Click here to access the workshop resources








Option 1

Option 1: Relax and just join us


Installation of different programming environments may need more time and attention.

So, don’t get distracted by technical issues.

Relax, join the workshop and listen to us.

All resources are available online to follow at your leisure.




Option 2

Option 2: No installation; watch and work on a browser


Select ONE of the following options and run R and Python scripts without installing anything on your system.




  1. Google Colaboratory
  • Go to Google Colaboratory.

  • You will need a Google account to use Google Colaboratory.

  • Create a Google account when prompted, or log into your existing account.

  • Click New Notebook if the prompt appears, or File > New Notebook if you are presented with the Welcome to Colaboratory notebook.

  • You can now run Python code directly in the cells. To execute a code cell, press the Run button (Play icon) or Shift + Enter.

  • To run R code, insert a new code cell, write %load_ext rpy2.ipython and run the cell to load the rpy2 extension, which lets the notebook run R.

  • Once the extension is loaded, write %%R at the top of every new code cell that contains R code.

  • Write and run the R code in each of these cells in the usual way.

  • You can run R and Python code in the same notebook, but remember to add %%R whenever you wish to run R code.




  2. Jupyter Notebooks


Running Python scripts

  • Go to the Jupyter Notebooks website.

  • Click the Jupyter Notebook image (under Applications) to start a demo Python notebook page.

  • Once loaded, it will present an example Python notebook.

  • Click File > New > Notebook to create a new notebook with the Python kernel.


Running R scripts

  • Go to the Jupyter Notebooks website.

  • Scroll down to ‘Kernels’ and click the R image to start a demo R notebook page.

  • Note that it may take some time for the R repository to start.

  • Once loaded, it will present an example R notebook.

  • Click File > New Notebook > R to create a new notebook with the R kernel (check that the right-hand corner shows the R kernel).

  • Note that, unlike Google Colaboratory, you cannot run Python and R in the same notebook.

  • You can now move between the two browser tabs to use both the R and Python notebooks.

  • You do not need an account to use Jupyter Notebook.


Here is a quick summary of Jupyter Notebook shortcuts.




  3. CoCalc
  • Go to the CoCalc website.

  • You will need a CoCalc account to use this resource.

  • Create a new account or log into your existing account.

  • Head to the link, give your project a name, then click Create Project.

  • Next, click + New from the tabs at the top of the page, then select Jupyter Notebook from the list.

  • Then choose either the Python or the R kernel.

  • You can enter your code in the code cells and execute a cell by pressing the Play button or Shift + Enter.




Option 3

Option 3: Install bells & whistles; watch and work on your system


Follow Steps 1 to 4 depending on your preferences.

Depending on your system, additional configuration may still be needed before all the environments work together seamlessly.

Please note we cannot cover or support technical issues related to installation or configuration during the workshop.


Step 1: Download R

Download and install R.


Step 2: Download Python

Optionally, download and install Python.

For the workshop, however, download and install Anaconda Distribution, a popular and user-friendly Python distribution.

Anaconda provides an integrated Python environment and comes with several important Python packages pre-installed.

Go to Step 3 or 4 depending on your familiarity with one of the following integrated development environments (IDE):

  • RStudio (Go to Step 3)

  • Jupyter Notebook (Go to Step 4)


Step 3: R Users

This step is most relevant to users who work predominantly in R.

(A) Download RStudio

Download and install RStudio, the most popular integrated development environment (IDE) for R. If you already use RStudio, skip this step and go to Step 3(B).

(B) Install the R package reticulate

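Install the reticulate package, which provides an interface to Python from R, then load it and point it at your Python installation. The path below is only an example; replace it with the path to your own Python executable.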

install.packages('reticulate')

library(reticulate)

use_python("/usr/local/bin/python")
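If you are unsure which path to give use_python(), one way to find it is to ask Python itself. The short sketch below, run in the Python installation you want reticulate to use, prints the full path of that interpreter. The variable name python_path is illustrative.

```python
import sys

# Full path of the Python interpreter that is running this script.
# This is the kind of value use_python() expects; the path in the R
# snippet above is only an example and will differ between systems.
python_path = sys.executable
print(python_path)
```

Copy the printed path into the use_python() call in R.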

(C) Install the R package ggplot2

  • In your current R environment: install.packages('ggplot2')

  • Alternatively, from RStudio: Tools > Install Packages and select ggplot2 in the package name text box

(D) Install Python packages

  • If you installed Anaconda Distribution, most packages used in the workshop will be pre-installed.

  • Install plotnine: pip install plotnine

It is simplest to install plotnine from within the Anaconda environment.


Step 4: Python Users

This step is most relevant to users who work predominantly in Python.


Steps to follow with standard Anaconda distribution (from Step 2)

Using the Anaconda distribution is the simpler option for the workshop content.

• Open Anaconda Navigator, select the Environments tab and click Create.

• Give your environment a name and select either Python or R.

• Once the environment is created, install Jupyter Notebook from the panel on the right.

• Now click the Play button on the environment you just created and select Open with Jupyter Notebook.

• If you wish to install any other packages like plotnine, you can select Open with Terminal from the previous step and use the pip command to install packages.

  • Install plotnine: pip install plotnine

Here is a quick summary of Jupyter Notebook shortcuts.


Optional: Steps to follow with standard Python distribution (from Step 2)

Not required for the workshop

(A) Install Jupyter Notebook

JupyterLab is an interactive environment for many programming languages, including R and Python.

pip install jupyterlab

(B) Install required Python packages

  • pip install numpy

  • pip install plotnine
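After installing, you may want to confirm which packages Python can actually find. The sketch below is one way to do that using only the standard library. The package list is an assumption based on the libraries named in the workshop outline, and find_missing is a hypothetical helper name, not part of any workshop script.

```python
from importlib import util

# Import names assumed from the libraries named in the workshop outline;
# note that scikit-learn is imported as "sklearn".
WORKSHOP_PACKAGES = ["numpy", "pandas", "matplotlib", "seaborn", "plotnine", "sklearn"]


def find_missing(packages):
    """Return the names in `packages` that Python cannot locate for import."""
    return [name for name in packages if util.find_spec(name) is None]


missing = find_missing(WORKSHOP_PACKAGES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed packages are available.")
```

Any name reported as not installed can then be added with pip install, remembering that the pip name for sklearn is scikit-learn.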


Optional: Other Python IDE

Not required for the workshop

A list of other excellent integrated development environments (IDE) for Python:

  • IDLE: Python’s Integrated Development and Learning Environment, bundled with the Python installation.

  • Spyder: The Scientific Python Development Environment.

  • PyCharm: A free version of this IDE is available from jetbrains.com.

  • Visual Studio Code: A versatile IDE for multiple languages from Microsoft. Check this link to run R in Visual Studio Code.

Note: The above IDEs do not have direct R integration. We will not use these IDEs in the workshop.




R Resources


General Help




Websites, Books & Manuals
  • An Introduction to R from CRAN. Web link

  • Contributed Documentation from CRAN: It includes several documents of more than 100 pages. Some that are of interest to beginners are mentioned below. Web link

  • Using R for Data Analysis and Graphics - Introduction, Examples and Commentary by John Maindonald: This book covers the basics of R as well as graphics and simple to advanced statistical tools. Web link

  • Kickstarting R by Jim Lemon is a perfect book for the beginner. The book addresses many of the basic statistical principles covered in this course. Web link

  • Introductory Statistics with R by Peter Dalgaard: The first two chapters cover the basics of R and the rest of the book includes several topics in statistics using R.

  • R Programming for Data Science by Roger D Peng covers several aspects of R programming in greater depth. Web link

  • And many more…




Python Resources


General Help




Websites, Books & Manuals




Other Info


Presentation


Dr Mintu Nath

Senior Lecturer in Medical Statistics, University of Aberdeen

Email:

Website


Dr Georgios Leontidis

Director, Interdisciplinary Centre for Data and AI & Reader in Machine Learning, University of Aberdeen

Email:

Website


Aiden Durrant

PhD Student in Machine Learning

Email:


Organising Committee:
RSS Highlands Local Group


Supported by:
Royal Statistical Society