How to use Charty

Preparation

Load libraries

First, we load the libraries used in this notebook:

  • charty for data visualization
  • datasets-pandas for loading open datasets provided by red-datasets as pandas data frames
  • numo/narray for some numerical array operations

To execute a code cell, select it and press Shift+Enter.

In [1]:
require "charty"
require "datasets-pandas"  # This loads "datasets" and "pandas"
require "numo/narray"
Out[1]:
false
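
The false shown in Out[1] is only the return value of the last require in the cell. Ruby's require returns true when it loads a file and false when that feature was already loaded, so the output most likely means that one of the earlier libraries had already pulled in numo/narray (this is an inference from the output, not something the notebook states). A minimal sketch of the behavior, using the standard library's json purely as a stand-in:

```ruby
# `require` returns true on a fresh load and false when the feature
# is already loaded. "json" is only a stand-in for any library.
require "json"                    # make sure the library is loaded once
loaded_again = require("json")    # already loaded, so this returns false

puts loaded_again.inspect
```

In both cases the library is available afterwards; a failed load raises LoadError instead of returning false.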
In [2]:
{
  charty: Charty::VERSION,
  datasets_pandas: DatasetsPandas::VERSION,
  numo_narray: Numo::NArray::VERSION
}
Out[2]:
{:charty=>"0.2.10", :datasets_pandas=>"0.0.1", :numo_narray=>"0.9.2.0"}
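
If you run this notebook with different library versions, it can help to compare them against the ones shown in Out[2]. The sketch below assumes those versions can be treated as minimum versions, which the notebook itself does not claim, and at_least? is a hypothetical helper name. It relies on Gem::Version from RubyGems, which compares versions segment by segment instead of as plain strings:

```ruby
require "rubygems"  # normally loaded already; listed for completeness

# Versions reported in Out[2]. Treating them as minimums is an assumption.
tested = {
  charty: "0.2.10",
  datasets_pandas: "0.0.1",
  numo_narray: "0.9.2.0"
}

# Hypothetical helper: true when `installed` is at least `minimum`.
# Gem::Version compares numerically per segment, so "0.2.10" > "0.2.9",
# whereas a plain string comparison would order them the other way.
def at_least?(installed, minimum)
  Gem::Version.new(installed) >= Gem::Version.new(minimum)
end

puts at_least?("0.2.10", tested[:charty])  # prints true
puts at_least?("0.2.9", tested[:charty])   # prints false
```

In a real session you would pass Charty::VERSION and the other constants from In [2] as the first argument.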

Select visualization backend

In this notebook, we use the plotly backend to create plots.

In [3]:
Charty::Backends.use(:plotly)
Out[3]:
:plotly

Load dataset

Datasets::Penguins is a Ruby port of the palmerpenguins dataset. For each penguin, it records the species, the island in the Palmer Archipelago, size measurements (flipper length, body mass, bill dimensions), sex, and year.

We will use this dataset in this notebook.

In [4]:
penguins = Datasets::Penguins.new.to_pandas
Out[4]:
     species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year
0    Adelie   Torgersen  39.1            18.7           181.0              3750.0       male    2007
1    Adelie   Torgersen  39.5            17.4           186.0              3800.0       female  2007
2    Adelie   Torgersen  40.3            18.0           195.0              3250.0       female  2007
3    Adelie   Torgersen  NaN             NaN            NaN                NaN          None    2007
4    Adelie   Torgersen  36.7            19.3           193.0              3450.0       female  2007
...  ...      ...        ...             ...            ...                ...          ...     ...
339  Gentoo   Biscoe     NaN             NaN            NaN                NaN          None    2009
340  Gentoo   Biscoe     46.8            14.3           215.0              4850.0       female  2009
341  Gentoo   Biscoe     50.4            15.7           222.0              5750.0       male    2009
342  Gentoo   Biscoe     45.2            14.8           212.0              5200.0       female  2009
343  Gentoo   Biscoe     49.9            16.1           213.0              5400.0       male    2009

344 rows × 8 columns
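
Rows 3 and 339 in Out[4] have NaN for every measurement and None for sex. A point with a NaN coordinate cannot be placed on a numeric axis, so it is useful to know how many rows are affected. How Charty or pandas handles such rows is not covered here; the following is only a plain-Ruby sketch of the idea, with the first five rows copied by hand from Out[4] and a hypothetical complete? helper:

```ruby
# First five rows of Out[4], copied by hand as plain Ruby hashes.
# NaN is written as Float::NAN and None as nil.
rows = [
  { species: "Adelie", bill_length_mm: 39.1, bill_depth_mm: 18.7, sex: "male" },
  { species: "Adelie", bill_length_mm: 39.5, bill_depth_mm: 17.4, sex: "female" },
  { species: "Adelie", bill_length_mm: 40.3, bill_depth_mm: 18.0, sex: "female" },
  { species: "Adelie", bill_length_mm: Float::NAN, bill_depth_mm: Float::NAN, sex: nil },
  { species: "Adelie", bill_length_mm: 36.7, bill_depth_mm: 19.3, sex: "female" }
]

# Hypothetical helper: a row is complete when both bill measurements
# are present and are not NaN.
def complete?(row)
  [row[:bill_length_mm], row[:bill_depth_mm]].none? { |v| v.nil? || v.nan? }
end

plottable = rows.select { |row| complete?(row) }
xs = plottable.map { |row| row[:bill_length_mm] }
ys = plottable.map { |row| row[:bill_depth_mm] }

puts "#{plottable.size} of #{rows.size} rows are complete"
p xs
p ys
```

Note that NaN never compares equal to itself, so the check uses Float#nan? instead of ==.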

We will also use the fmri dataset provided in seaborn for the line plot examples. red-datasets provides this dataset as well.

In [5]:
fmri = Datasets::SeabornData.new("fmri").to_pandas
Out[5]:
      subject  timepoint  event  region    signal
0     s13      18         stim   parietal  -0.017552
1     s5       14         stim   parietal  -0.080883
2     s12      18         stim   parietal  -0.081033
3     s11      18         stim   parietal  -0.046134
4     s10      18         stim   parietal  -0.037970
...   ...      ...        ...    ...       ...
1059  s0       8          cue    frontal   0.018165
1060  s13      7          cue    frontal   -0.029130
1061  s12      7          cue    frontal   -0.004939
1062  s11      7          cue    frontal   -0.025367
1063  s0       0          cue    parietal  -0.006899

1064 rows × 5 columns

Scatter plot

A simple scatter plot shows the relationship between bill_length_mm and bill_depth_mm.

In [6]:
Charty.scatter_plot(
  data: penguins,      # input table data
  x: :bill_length_mm,  # the column name for x-axis
  y: :bill_depth_mm    # the column name for y-axis
)