**Lets-Plot** is an open-source plotting library for statistical data. It is implemented in
Kotlin, a multiplatform programming language, which is why the plotting functionality
is packaged as a JavaScript library, a JVM library, and a native Python extension.

The design of the Lets-Plot library is heavily influenced by the ggplot2 library.

When installing the Lets-Plot library, consider the following requirements.

Supported operating systems:

- macOS
- Linux
- Windows

Supported Python versions:

- 3.6
- 3.7
- 3.8
- 3.9

The `lets-plot` package is available in the pypi.org repository.
Execute the following command to install the `lets-plot` package for your Python interpreter:

`pip install lets-plot`
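
After installation, it can be useful to confirm what is actually present in the current interpreter. The snippet below is a minimal sketch that relies only on the standard library; the helper name `installed_version` is hypothetical, and `importlib.metadata` is assumed to be available (it ships with Python 3.8 and later, so on 3.6 and 3.7 a different check would be needed).

```python
import sys
from importlib import metadata  # standard library in Python 3.8+


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing.

    Hypothetical helper for illustration; not part of Lets-Plot.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report the interpreter version and whether the package can be found.
print("Python:", ".".join(map(str, sys.version_info[:3])))
print("lets-plot:", installed_version("lets-plot") or "not installed")
```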

In `lets-plot`, a **plot** consists of at least one **layer**. A plot can be built from a default dataset
with aesthetic mappings, a set of scales, and additional features applied.

A **layer** is responsible for creating the objects painted on the ‘canvas’ and contains the following elements:

- **Data**: the set of data specified either once for all layers or on a per-layer basis. One plot can combine multiple different datasets (one per layer).
- **Aesthetic mapping**: describes how variables in the dataset are mapped to the visual properties of the layer, such as color, shape, size, or position.
- **Geometric object**: a geometric object that represents a particular type of plot.
- **Statistical transformation**: computes some kind of statistical summary on the raw input data. For example, the `bin` statistic is used for histograms and `smooth` is used for regression lines. Most stats take additional parameters to specify details of the statistical transformation of data.
- **Position adjustment**: a method used to compute the final coordinates of geometry. It is used to build variants of the same `geom` object or to avoid overplotting.

The typical code fragment that renders a plot looks as follows:

```
from lets_plot import *
p = ggplot(<dataf>)
p + geom_<plot_type>(mapping=aes('x', 'y', <other>='<data column name>'), stat=<stat>, position=<adjustment>)
```

`geom`

You can add a new geometric object (or plot layer) by creating it using the `geom_xxx()` function and then adding this object to `ggplot`:

```
p = ggplot(data=df)
p + geom_point()
```

The following plots are supported:

- Area plot: `geom_area()`
- Discrete plot: `geom_bar()`, `geom_pie()`
- Boxplot: `geom_boxplot()`
- Contours: `geom_contour()`, `geom_contourf()`
- Connectors: `geom_path()`, `geom_line()`, `geom_segment()`, `geom_step()`
- Density plot: `geom_density()`, `geom_area_ridges()`, `geom_violin()`, and `geom_density2d()`, `geom_density2df()`
- Error-bar plot: `geom_errorbar()`, `geom_crossbar()`, `geom_linerange()`, `geom_pointrange()`
- Histogram: `geom_freqpoly()`, `geom_histogram()`, and `geom_bin2d()`
- Jitter plot: `geom_jitter()`
- Line plot: `geom_line()`
- Reference lines: `geom_abline()`, `geom_hline()`, `geom_vline()`
- Polygons: `geom_polygon()`
- Rectangles, Tiles, Raster: `geom_rect()`, `geom_tile()`, `geom_raster()`
- Ribbons: `geom_ribbon()`
- Scatter plot: `geom_point()`
- Dot plot: `geom_dotplot()`, `geom_ydotplot()`
- Regression lines: `geom_smooth()`
- Q-Q plot: `geom_qq()`, `geom_qq_line()`, `geom_qq2()`, `geom_qq2_line()`
- Text: `geom_text()`, `geom_label()`
- Map: `geom_map()`
- Image: `geom_imshow()`

See the geom reference for more information about the supported geometric methods, their arguments, and default values.

With `GGBunch()`, you can render a collection of plots.
Use the `add_plot()` method to add a plot to the bunch and set an arbitrary location and size for plots inside the grid:

```
bunch = GGBunch()
bunch.add_plot(plot1, 0, 0)
bunch.add_plot(plot2, 0, 200)
```

See the GGBunch example for more information.

`stat`

Add `stat` as an argument to the `geom_xxx()` function to define statistical data transformations:

`geom_point(stat='count')`

Supported transformations:

- `identity`: leave the data unchanged
- `count`: calculate the number of points with the same x-axis coordinate
- `bin`: calculate the number of points falling in each of adjacent equally sized ranges along the x-axis
- `bin2d`: calculate the number of points falling in each of adjacent equally sized rectangles on the plot plane
- `smooth`: perform smoothing
- `contour`, `contourf`: calculate contours of 3D data
- `boxplot`: calculate components of a box plot
- `density`, `density2d`, `density2df`: perform a kernel density estimation for 1D and 2D data
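
To make the `count` and `bin` descriptions concrete, here is a minimal pure-Python sketch of what such transformations compute. It is an illustration of the idea only, not Lets-Plot's implementation; the function names `count_stat` and `bin_stat` and the sample values are hypothetical.

```python
from collections import Counter


def count_stat(xs):
    """Count the points sharing each x value; returns (x, count) pairs sorted by x."""
    return sorted(Counter(xs).items())


def bin_stat(xs, bins):
    """Split [min(xs), max(xs)] into `bins` equal-width ranges and count points in each.

    Returns (range_start, range_end, count) triples.
    """
    lo, hi = min(xs), max(xs)
    if hi == lo:
        # All values are identical: a single degenerate bin holds everything.
        return [(lo, hi, len(xs))]
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in xs:
        # The maximum value belongs to the last bin rather than a new one.
        index = min(int((x - lo) / width), bins - 1)
        counts[index] += 1
    return [(lo + i * width, lo + (i + 1) * width, n) for i, n in enumerate(counts)]


# Illustrative engine displacement values (made up for this sketch).
displ = [1.8, 1.8, 2.0, 2.0, 2.8, 2.8, 3.1, 1.8]
print(count_stat(displ))
print(bin_stat(displ, bins=2))
```

With `stat='count'` a layer draws one object per distinct x value, while `bin` yields one object per range, which is why it suits histograms.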

`mapping`

With mappings, you can define how variables in the dataset are mapped to the visual elements of the plot.
Pass the result of the `aes(x, y, other)` function to `geom`, where:

- `x`: the dataframe column to map to the x axis.
- `y`: the dataframe column to map to the y axis.
- `other`: other visual properties of the plot, such as color, shape, size, or position.

For example:
`aes(x='cty', y='hwy', color='cyl')`

You can also use a simplified form:
`aes('cty', 'hwy', color='cyl')`

`position`

All layers have a position adjustment that computes the final coordinates of geometry.
Position adjustment is used to build variants of the same plot and to resolve overlapping.
Override the default settings by using the `position` argument in the `geom` functions:

`geom_bar(position='dodge')`

Available adjustments:

- `dodge`
- `jitter`
- `jitterdodge`
- `nudge`
- `identity`
- `fill`
- `stack`

See the position reference for more information about position adjustments.
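
As a rough intuition for three of these adjustments, the sketch below computes where bars sharing one x position would end up under stacking, filling, and dodging. It is a simplified illustration, not Lets-Plot's implementation; the function names and the default `width=0.9` are assumptions made for this example.

```python
def stack_spans(heights):
    """Stack bars that share an x position; returns (ymin, ymax) for each bar."""
    spans, base = [], 0.0
    for h in heights:
        spans.append((base, base + h))
        base += h
    return spans


def fill_spans(heights):
    """Like stacking, but rescaled so that the whole stack has height 1."""
    total = sum(heights)
    return [(lo / total, hi / total) for lo, hi in stack_spans(heights)]


def dodge_centres(x, n_groups, width=0.9):
    """Place n_groups bars side by side within `width`, centred on x.

    Returns the x coordinate of each bar's centre. The default width is an
    assumption for this sketch.
    """
    slot = width / n_groups
    left = x - width / 2
    return [left + slot * (i + 0.5) for i in range(n_groups)]


group_counts = [3, 5, 2]  # three groups observed at the same x position
print(stack_spans(group_counts))
print(fill_spans(group_counts))
print(dodge_centres(1.0, n_groups=len(group_counts)))
```

Stacking and filling change the y coordinates and keep x fixed, while dodging shifts x and leaves the heights alone.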

Lets-Plot chooses a reasonable scale for each mapped variable depending on the variable attributes. Override the default scales to tweak details like the axis labels or legend keys, or to use a completely different translation from data to aesthetic. For example, to override the fill color on the histogram:

`p + geom_histogram() + scale_fill_brewer(name="Trend", palette="RdPu")`

See the list of the available `scale` methods in the scale reference.
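
The phrase "translation from data to aesthetic" can be illustrated with a toy continuous color scale: a function that takes a data value and returns a color. The sketch below interpolates linearly in RGB between two colors; this is an assumption made for illustration, and Lets-Plot's own scales may interpolate differently. All names here are hypothetical.

```python
def lerp_color(low, high, t):
    """Blend two (r, g, b) colors; t=0 gives `low`, t=1 gives `high`."""
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))


def continuous_color_scale(domain, low, high):
    """Return a function that maps a data value within `domain` to a hex color."""
    d_min, d_max = domain

    def scale(value):
        t = (value - d_min) / (d_max - d_min)
        t = min(max(t, 0.0), 1.0)  # clamp values that fall outside the domain
        return "#{:02x}{:02x}{:02x}".format(*lerp_color(low, high, t))

    return scale


BLUE, PINK = (0, 0, 255), (255, 192, 203)
# Map a cylinder count between 4 and 8 onto a blue-to-pink gradient.
cyl_to_color = continuous_color_scale((4, 8), BLUE, PINK)
for cyl in (4, 6, 8):
    print(cyl, cyl_to_color(cyl))
```

Overriding a scale amounts to swapping this translation function for another one, for example a different palette or a different domain.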

The coordinate system determines how the x and y aesthetics combine to position elements in the plot. For example, to override the default X and Y ratio:

`p + coord_fixed(ratio=2)`

See the list of the available methods in the coordinates reference.

The axes and legends help users interpret plots.
Use the `guide` methods or the `guide` argument of the `scale` method to customize the legend.
For example, to define the number of columns in the legend:

`p + scale_color_discrete(guide=guide_legend(ncol=2))`

See more information in the guide reference.

Adjust the legend location on the plot using the `legend_position`, `legend_justification`, and `legend_direction` parameters of the `theme` function, see:
[TBD]

Sampling is a special data transformation technique built into Lets-Plot; it is applied after the stat transformation.
Sampling helps prevent UI freezes and out-of-memory crashes when attempting to plot an excessively large number of geometries.
By default, the technique applies automatically when the data volume exceeds a certain threshold.
The `none` value disables any sampling for the given layer. The sampling methods can be chained together using the `+` operator.

Available methods:

- `sampling_random_stratified`: randomly selects points from each group proportionally to the group size but also ensures that each group is represented by at least a specified minimum number of points.
- `sampling_random`: selects data points at randomly chosen indices without replacement.
- `sampling_pick`: analyses X-values and selects all points whose X-values are in the set of the first `n` X-values found in the population.
- `sampling_systematic`: selects data points at evenly distributed indices.
- `sampling_vertex_dp`, `sampling_vertex_vw`: simplifies plotting of polygons. There is a choice of two implementation algorithms: Douglas-Peucker (`_dp`) and Visvalingam-Whyatt (`_vw`).

For more details, see the sampling reference.
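
The selection rules behind `sampling_systematic`, `sampling_random`, and `sampling_pick` can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the library's code: the function names are hypothetical, rows are modeled as tuples whose first element is the X-value, and keeping the original row order in the random variant is a choice made for this sketch.

```python
import random


def systematic_sample(rows, n):
    """Keep n rows at evenly spaced indices."""
    if n >= len(rows):
        return list(rows)
    step = len(rows) / n
    return [rows[int(i * step)] for i in range(n)]


def random_sample(rows, n, seed=None):
    """Keep n rows at random indices, without replacement, in original order."""
    if n >= len(rows):
        return list(rows)
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(rows)), n))
    return [rows[i] for i in keep]


def pick_sample(rows, n, x_of=lambda row: row[0]):
    """Keep every row whose X-value is among the first n distinct X-values seen."""
    first_x = []
    for row in rows:
        x = x_of(row)
        if x not in first_x and len(first_x) < n:
            first_x.append(x)
    return [row for row in rows if x_of(row) in first_x]


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5), ("d", 6)]
print(systematic_sample(rows, 3))
print(random_sample(rows, 3, seed=0))
print(pick_sample(rows, 2))
```

Note that `pick_sample` can return more than `n` rows, because it limits the number of distinct X-values rather than the number of points.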

Let's build a point plot using the mpg dataset.

Create the `DataFrame` object and retrieve the data.

In [1]:

```
# Data set
import pandas as pd
mpg = pd.read_csv("https://raw.githubusercontent.com/JetBrains/lets-plot-docs/master/data/mpg.csv")
mpg.head()
```

Out[1]:

| | Unnamed: 0 | manufacturer | model | displ | year | cyl | trans | drv | cty | hwy | fl | class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | audi | a4 | 1.8 | 1999 | 4 | auto(l5) | f | 18 | 29 | p | compact |
| 1 | 2 | audi | a4 | 1.8 | 1999 | 4 | manual(m5) | f | 21 | 29 | p | compact |
| 2 | 3 | audi | a4 | 2.0 | 2008 | 4 | manual(m6) | f | 20 | 31 | p | compact |
| 3 | 4 | audi | a4 | 2.0 | 2008 | 4 | auto(av) | f | 21 | 30 | p | compact |
| 4 | 5 | audi | a4 | 2.8 | 1999 | 6 | auto(l5) | f | 16 | 26 | p | compact |

Plot the basic point plot.

In [2]:

```
# Basic plotting
from lets_plot import *
# Load Lets-Plot JS library
LetsPlot.setup_html()
```

Perform the following aesthetic mappings:

- `x` = displ (the **displ** column of the dataframe)
- `y` = hwy (the **hwy** column of the dataframe)
- `color` = cyl (the **cyl** column of the dataframe)

In [3]:

```
p = ggplot(mpg)
p + geom_point(aes('displ', 'hwy', color='cyl'))
```

Out[3]:

Apply statistical data transformation to count the number of cases at each x position.

In [4]:

```
p + geom_point(aes('displ', size='..count..', col='..count..'), stat='count')
```

Out[4]:

Change the palette and the legend, and add the title.

In [5]:

```
p += scale_color_continuous("blue", "pink", guide=guide_legend(ncol=2)) \
+ ggtitle('Highway MPG by displacement')
p + geom_point(aes('displ', 'hwy', color='cyl'), position='jitter')
```

Out[5]:

Apply random stratified sampling to reduce the number of plotted points.

In [6]:

```
p + geom_point(
    aes('displ', 'hwy', color='cyl'),
    position='jitter',
    sampling=sampling_random_stratified(40))
```

Out[6]:
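
The idea behind the `sampling_random_stratified(40)` call in the last cell can be sketched without the library: split the rows into groups, give each group a quota proportional to its size, and raise small quotas to a minimum. The code below is an illustrative approximation only; the function name `stratified_sample`, the `group_of` callback, and the default `min_subsample=2` are assumptions of this sketch rather than documented behavior.

```python
import random
from collections import defaultdict


def stratified_sample(rows, n, group_of, min_subsample=2, seed=None):
    """Sample about n rows, proportionally per group, with a per-group minimum."""
    rng = random.Random(seed)

    # Partition the rows by their group key.
    groups = defaultdict(list)
    for row in rows:
        groups[group_of(row)].append(row)

    total = len(rows)
    picked = []
    for members in groups.values():
        # Proportional share of n, raised to the minimum, capped at the group size.
        quota = max(round(n * len(members) / total), min_subsample)
        quota = min(quota, len(members))
        picked.extend(rng.sample(members, quota))
    return picked


# 90 rows in group "a" and 10 rows in group "b" (made-up data).
rows = [("a", i) for i in range(90)] + [("b", i) for i in range(10)]
sample = stratified_sample(rows, n=10, group_of=lambda row: row[0], seed=0)
print(len(sample), sample)
```

Because of the per-group minimum, the result can contain slightly more than `n` rows; in exchange, a small group never disappears from the plot.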