15 Data: Overview
- Set the current working directory to read the data file
- Check all objects in the current working directory
- The base R format for data is `data.frame`
- The Python format for data is `DataFrame`, based on the `pandas` library
- For Python, `import pandas as pd` is the conventional import statement
- Here we present functions to explore R and Python data frames
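The first two steps above can be sketched in Python with the standard `os` module. This is a minimal illustration, not a prescribed workflow: it reads "objects in the working directory" as the files in that folder, and the temporary folder and placeholder `iris.csv` file are assumptions made only so the example is self-contained.

```python
import os
import tempfile

original_dir = os.getcwd()  # remember where we started

# Hypothetical data folder: a temporary directory stands in for a real one.
with tempfile.TemporaryDirectory() as data_dir:
    # Placeholder file (made-up content) so the listing is not empty.
    with open(os.path.join(data_dir, "iris.csv"), "w") as f:
        f.write("placeholder\n")

    os.chdir(data_dir)            # set the current working directory
    current_dir = os.getcwd()     # confirm the new working directory
    files = sorted(os.listdir())  # list everything in the working directory

    os.chdir(original_dir)        # restore before the temporary folder is removed

print(files)
```

With a real project, `data_dir` would simply be the path to the existing data folder, and the restore step is optional.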
15.1 R
Set the working directory to the data folder and read the iris dataset as an R object `DF`.
DF = read.csv('iris.csv')
Function | Explanation | Example |
---|---|---|
dim | Dimensions of the data.frame | dim(DF) |
nrow | Number of rows in the data.frame | nrow(DF) |
ncol | Number of columns in the data.frame | ncol(DF) |
head | First n (default = 6) rows of the data.frame | head(DF) |
tail | Last n (default = 6) rows of the data.frame | tail(DF) |
rownames | Row names of the data.frame | rownames(DF) |
colnames, names | Column names of the data.frame | names(DF) |
str | Structure of the data.frame | str(DF) |
15.2 Python
Set the working directory to the data folder and read the iris dataset as a Python object `DF` (a pandas DataFrame).
import pandas as pd
DF = pd.read_csv('iris.csv')
Method or attribute | Explanation | Example |
---|---|---|
shape | Dimensions of the DataFrame | DF.shape |
shape with index 0 | Number of rows in the DataFrame | DF.shape[0] |
shape with index 1 | Number of columns in the DataFrame | DF.shape[1] |
head | First n (default = 5) rows of the DataFrame | DF.head() |
tail | Last n (default = 5) rows of the DataFrame | DF.tail() |
index | Row labels (index) of the DataFrame | DF.index |
columns | Column names of the DataFrame | DF.columns |
info, dtypes | Data types of the columns in the DataFrame | DF.info(), DF.dtypes |
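To see the entries in the table in action without needing `iris.csv` on disk, here is a small sketch. The DataFrame below is a hypothetical stand-in with made-up values and column names; it is not the real iris data.

```python
import pandas as pd

# Hypothetical stand-in for iris.csv: a few made-up rows, not the real dataset.
DF = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 6.3, 5.8, 7.0, 6.4, 5.0],
    "species": ["setosa", "setosa", "virginica", "virginica",
                "versicolor", "versicolor", "setosa"],
})

n_rows, n_cols = DF.shape        # shape is a (rows, columns) tuple
first_rows = DF.head()           # first 5 rows by default
last_two = DF.tail(2)            # pass n to change how many rows are returned
row_labels = list(DF.index)      # default index is 0, 1, 2, ...
column_names = list(DF.columns)  # column labels
column_types = DF.dtypes         # one dtype per column

DF.info()                        # prints a summary of columns, counts and dtypes
```

Note that `shape`, `index`, `columns` and `dtypes` are attributes (no parentheses), while `head`, `tail` and `info` are methods and must be called.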
15.3 Note
Read the pandas User Guide sections on DataFrame for in-depth coverage.
Read the short "10 minutes to pandas" introduction.
See the pandas API reference for the complete list of DataFrame attributes and methods.