DatatableTon

💯 datatable exercises

This is Set 2: Files and Formats (Exercises 11-20) of DatatableTon: 💯 datatable exercises
You can find all the exercises and solutions on GitHub

Prerequisites

  • The datatable package should be upgraded to the latest version (or v1.0.0+).
  • Additional packages numpy, pandas and pyarrow should be installed.
  • Small example values along with the sample dataset datatableton_sample.zip will be used for the exercises.
In [1]:
!python3 -m pip install -U pip
!python3 -m pip install -U datatable
!python3 -m pip install numpy
!python3 -m pip install pandas
!python3 -m pip install pyarrow
!wget https://raw.githubusercontent.com/vopani/datatableton/main/data/datatableton_sample.zip
Requirement already satisfied: pip in /opt/conda/lib/python3.7/site-packages (21.1.2)
Collecting pip
  Downloading pip-21.1.3-py3-none-any.whl (1.5 MB)
     |████████████████████████████████| 1.5 MB 873 kB/s 
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 21.1.2
    Uninstalling pip-21.1.2:
      Successfully uninstalled pip-21.1.2
Successfully installed pip-21.1.3
WARNING: Running pip as root will break packages and permissions. You should install packages reliably by using venv: https://pip.pypa.io/warnings/venv
Requirement already satisfied: datatable in /opt/conda/lib/python3.7/site-packages (1.0.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Requirement already satisfied: numpy in /opt/conda/lib/python3.7/site-packages (1.19.5)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Requirement already satisfied: pandas in /opt/conda/lib/python3.7/site-packages (1.2.4)
Requirement already satisfied: python-dateutil>=2.7.3 in /opt/conda/lib/python3.7/site-packages (from pandas) (2.8.1)
Requirement already satisfied: pytz>=2017.3 in /opt/conda/lib/python3.7/site-packages (from pandas) (2021.1)
Requirement already satisfied: numpy>=1.16.5 in /opt/conda/lib/python3.7/site-packages (from pandas) (1.19.5)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Requirement already satisfied: pyarrow in /opt/conda/lib/python3.7/site-packages (4.0.0)
Requirement already satisfied: numpy>=1.16.6 in /opt/conda/lib/python3.7/site-packages (from pyarrow) (1.19.5)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
--2021-07-13 06:14:50--  https://raw.githubusercontent.com/vopani/datatableton/main/data/datatableton_sample.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1911 (1.9K) [application/zip]
Saving to: ‘datatableton_sample.zip’

datatableton_sample 100%[===================>]   1.87K  --.-KB/s    in 0s      

2021-07-13 06:14:50 (28.2 MB/s) - ‘datatableton_sample.zip’ saved [1911/1911]

In [2]:
import datatable as dt
import numpy as np
import pandas as pd
import pyarrow as pa

dtable = dt.Frame(f1=range(10), f2=['Y', 'O', 'U', 'C', 'A', 'N', 'D', 'O', 'I', 'T'])
dframe = pd.DataFrame({'v1': range(11), 'v2': ['N', 'E', 'V', 'E', 'R', 'G', 'I', 'V', 'E', 'U', 'P']})
nparray = np.array([[0, 'C'], [1, 'O'], [2, 'D'], [3, 'E']])
patable = pa.Table.from_pandas(dframe)
dlist = [range(4), ['D', 'A', 'T', 'A']]
ddict = {'x': range(6), 'y': ['P', 'Y', 'T', 'H', 'O', 'N']}
dtup = [(0, 'L'), (1, 'E'), (2, 'A'), (3, 'R'), (4, 'N')]

Exercise 11: Write dtable to a csv named data.csv, to a compressed gz csv named data.gz and to a jay named data.jay

In [ ]:
 

Exercise 12: Read data.csv and assign it to data_csv, read data.gz and assign it to data_gz, read data.jay and assign it to data_jay

In [ ]:
 
In [ ]:
 
In [ ]:
 

Exercise 13: Read data from this URL: https://raw.githubusercontent.com/vopani/datatableton/main/data/datatableton_sample.csv and assign it to data_url

In [ ]:
 

Exercise 14: Read users.csv from datatableton_sample.zip and assign it to data_zip

In [ ]:
 

Exercise 15: Create a dt.Frame data_pd from the pd.DataFrame dframe and create a pd.DataFrame pd_data from the dt.Frame data_pd

In [ ]:
 
In [ ]:
 

Exercise 16: Create a dt.Frame data_np from the np.array nparray and create a np.array np_data from the dt.Frame data_np

In [ ]:
 
In [ ]:
 

Exercise 17: Create a dt.Frame data_ls from the list dlist and create a list ls_data from the dt.Frame data_ls

In [ ]:
 
In [ ]:
 

Exercise 18: Create a dt.Frame data_dc from the dictionary ddict and create a dictionary dc_data from the dt.Frame data_dc

In [ ]:
 
In [ ]:
 

Exercise 19: Create a dt.Frame data_tp from the tuples dtup and create tuples tp_data from the dt.Frame data_tp

In [ ]:
 
In [ ]:
 

Exercise 20: Create a dt.Frame data_pa from the pyarrow.Table patable and create a pyarrow.Table pa_data from the dt.Frame data_pa

In [ ]:
 
In [ ]:
 

✅ This completes Set 2: Files and Formats (Exercises 11-20) of DatatableTon: 💯 datatable exercises

Set 03 • Data Selection • Beginner • Exercises 21-30
