Prerequisites
The datatable package should be upgraded to the latest version (v1.0.0 or later). numpy, pandas and pyarrow should also be installed.
!python3 -m pip install -U pip
!python3 -m pip install -U datatable
!python3 -m pip install numpy
!python3 -m pip install pandas
!python3 -m pip install pyarrow
!wget https://raw.githubusercontent.com/vopani/datatableton/main/data/datatableton_sample.zip
import datatable as dt
import numpy as np
import pandas as pd
import pyarrow as pa
dtable = dt.Frame(f1=range(10), f2=['Y', 'O', 'U', 'C', 'A', 'N', 'D', 'O', 'I', 'T'])
dframe = pd.DataFrame({'v1': range(11), 'v2': ['N', 'E', 'V', 'E', 'R', 'G', 'I', 'V', 'E', 'U', 'P']})
nparray = np.array([[0, 'C'], [1, 'O'], [2, 'D'], [3, 'E']])
patable = pa.Table.from_pandas(dframe)
dlist = [range(4), ['D', 'A', 'T', 'A']]
ddict = {'x': range(6), 'y': ['P', 'Y', 'T', 'H', 'O', 'N']}
dtup = [(0, 'L'), (1, 'E'), (2, 'A'), (3, 'R'), (4, 'N')]
Exercise 11: Write dtable to a CSV file named data.csv, to a gzip-compressed CSV file named data.gz, and to a Jay file named data.jay
Exercise 12: Read data.csv and assign it to data_csv, read data.gz and assign it to data_gz, and read data.jay and assign it to data_jay
Exercise 13: Read data from this URL: https://raw.githubusercontent.com/vopani/datatableton/main/data/datatableton_sample.csv and assign it to data_url
Exercise 14: Read users.csv from datatableton_sample.zip and assign it to data_zip
Exercise 15: Create a dt.Frame data_pd from the pd.DataFrame dframe and create a pd.DataFrame pd_data from the dt.Frame data_pd
Exercise 16: Create a dt.Frame data_np from the np.array nparray and create a np.array np_data from the dt.Frame data_np
Exercise 17: Create a dt.Frame data_ls from the list dlist and create a list ls_data from the dt.Frame data_ls
Exercise 18: Create a dt.Frame data_dc from the dictionary ddict and create a dictionary dc_data from the dt.Frame data_dc
Exercise 19: Create a dt.Frame data_tp from the list of tuples dtup and create a list of tuples tp_data from the dt.Frame data_tp
Exercise 20: Create a dt.Frame data_pa from the pyarrow.Table patable and create a pyarrow.Table pa_data from the dt.Frame data_pa