Apache ECharts is an easy-to-use, highly interactive, and high-performance JavaScript visualization library released under the Apache license. Since its first public release in 2013, it has reportedly captured over 74% of the Chinese web front-end market. Python, meanwhile, is an expressive language loved by the data science community. To combine the strengths of both, developers created pyecharts, a Python package that makes this powerful visualization tool convenient to use from Python.
Here I introduce some basic chart types in pyecharts and render all of them in a Jupyter notebook. I also replaced the fake sample data with real-world data we have used, which should make the charts easier to understand. Pyecharts stands out for its interactivity, so have fun adjusting the dynamic charts.
!pip install pyecharts # Python package, version == 1.9.0
Requirement already satisfied: pyecharts in /Users/mac/opt/anaconda3/envs/my_env/lib/python3.7/site-packages (1.9.0)
Requirement already satisfied: jinja2 in /Users/mac/opt/anaconda3/envs/my_env/lib/python3.7/site-packages (from pyecharts) (3.0.0)
Requirement already satisfied: simplejson in /Users/mac/opt/anaconda3/envs/my_env/lib/python3.7/site-packages (from pyecharts) (3.17.5)
Requirement already satisfied: prettytable in /Users/mac/opt/anaconda3/envs/my_env/lib/python3.7/site-packages (from pyecharts) (2.2.1)
Requirement already satisfied: MarkupSafe>=2.0.0rc2 in /Users/mac/opt/anaconda3/envs/my_env/lib/python3.7/site-packages (from jinja2->pyecharts) (2.0.0)
Requirement already satisfied: wcwidth in /Users/mac/opt/anaconda3/envs/my_env/lib/python3.7/site-packages (from prettytable->pyecharts) (0.2.5)
Requirement already satisfied: importlib-metadata in /Users/mac/opt/anaconda3/envs/my_env/lib/python3.7/site-packages (from prettytable->pyecharts) (3.10.0)
Requirement already satisfied: zipp>=0.5 in /Users/mac/opt/anaconda3/envs/my_env/lib/python3.7/site-packages (from importlib-metadata->prettytable->pyecharts) (3.4.1)
Requirement already satisfied: typing-extensions>=3.6.4 in /Users/mac/opt/anaconda3/envs/my_env/lib/python3.7/site-packages (from importlib-metadata->prettytable->pyecharts) (3.7.4.3)
import pyecharts
import pandas as pd
pyecharts.__version__
'1.9.0'
For the visualizations I use the dataset from PSet3, which reports index crime counts by county, agency, and year in New York State.
df = pd.read_csv("https://data.ny.gov/api/views/ca8h-8gjq/rows.csv")
df.head()
| | County | Agency | Year | Months Reported | Index Total | Violent Total | Murder | Rape | Robbery | Aggravated Assault | Property Total | Burglary | Larceny | Motor Vehicle Theft | Region |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Albany | Albany City PD | 1990 | NaN | 6635.0 | 1052.0 | 9.0 | 82.0 | 386.0 | 575.0 | 5583.0 | 1884.0 | 3264.0 | 435.0 | Non-New York City |
| 1 | Albany | Albany City PD | 1991 | NaN | 7569.0 | 1201.0 | 11.0 | 71.0 | 487.0 | 632.0 | 6368.0 | 1988.0 | 3878.0 | 502.0 | Non-New York City |
| 2 | Albany | Albany City PD | 1992 | NaN | 7791.0 | 1150.0 | 8.0 | 77.0 | 467.0 | 598.0 | 6641.0 | 2246.0 | 3858.0 | 537.0 | Non-New York City |
| 3 | Albany | Albany City PD | 1993 | NaN | 7802.0 | 1238.0 | 6.0 | 59.0 | 481.0 | 692.0 | 6564.0 | 2063.0 | 4030.0 | 471.0 | Non-New York City |
| 4 | Albany | Albany City PD | 1994 | NaN | 8648.0 | 1380.0 | 13.0 | 79.0 | 542.0 | 746.0 | 7268.0 | 2227.0 | 4502.0 | 539.0 | Non-New York City |
df_bar = df[(df['Year']==2020) & (df['Agency']=='County Total') & (df['Region']=='New York City')]
from pyecharts.charts import Bar
from pyecharts import options as opts
bar = (
    Bar()  # defines a chart type
    .add_xaxis(list(df_bar['County'].values))
    .add_yaxis("Murder", list(df_bar['Murder'].values))
    .add_yaxis("Rape", list(df_bar['Rape'].values))
    .add_yaxis("Robbery", list(df_bar['Robbery'].values))
    .add_yaxis("Burglary", list(df_bar['Burglary'].values))
    .set_global_opts(
        title_opts=opts.TitleOpts(
            title="Crime Count in New York City",
            subtitle="2020 data",
            title_textstyle_opts=opts.TextStyleOpts(font_size=20, font_weight='bold'),
            subtitle_textstyle_opts=opts.TextStyleOpts(
                font_style='normal', font_weight='bold', font_size=16
            ),
        ),
        legend_opts=opts.LegendOpts(pos_left='right'),
        yaxis_opts=opts.AxisOpts(
            name='Crime Count',
            name_location='middle',
            name_gap=50,
            name_textstyle_opts=opts.TextStyleOpts(font_size=16, font_weight='bold'),
        ),
        xaxis_opts=opts.AxisOpts(
            name='Location',
            name_location='middle',
            name_gap=30,
            name_textstyle_opts=opts.TextStyleOpts(font_size=16, font_weight='bold'),
        ),
    )
)
bar.render_notebook()
The chart is interactive! Try clicking the legend entries or hovering over any bar.
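The bar chart above depends on `df_bar`, which keeps only the rows that pass three conditions at once (year, agency, and region). As a minimal sketch of that filtering step, the block below applies the same kind of combined boolean mask to a tiny made-up table; the rows and the names `toy`, `mask`, and `toy_bar` are placeholders, not part of the real dataset. It also uses `.tolist()`, which returns plain Python values; this can be a slightly safer choice than `list(series.values)`, since pyecharts serializes the chart options to JSON.

```python
import pandas as pd

# Tiny synthetic stand-in for the crime table (all values are made up).
toy = pd.DataFrame({
    "County": ["Kings", "Kings", "Albany", "Queens"],
    "Agency": ["County Total", "Some PD", "County Total", "County Total"],
    "Year": [2020, 2020, 2020, 2019],
    "Region": ["New York City", "New York City",
               "Non-New York City", "New York City"],
    "Murder": [10.0, 4.0, 2.0, 7.0],
})

# Each comparison yields a boolean Series; & combines them row by row.
# The parentheses are required because & binds tighter than ==.
mask = (
    (toy["Year"] == 2020)
    & (toy["Agency"] == "County Total")
    & (toy["Region"] == "New York City")
)
toy_bar = toy[mask]

# .tolist() converts the columns to plain Python str / float values.
counties = toy_bar["County"].tolist()
murders = toy_bar["Murder"].tolist()
print(counties, murders)
```

Only the first row satisfies all three conditions, so the chart built from this toy table would have a single bar.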
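Next, I sum the crime counts of the New York City counties for each year.
df_barslider = df[(df['Agency']=='County Total') & (df['Region']=='New York City')].groupby('Year').sum(numeric_only=True)  # sum numeric columns only; text columns such as County are skipped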
df_barslider['Year'] = df_barslider.index
from pyecharts import options as opts
from pyecharts.charts import Bar
barslider = (
    Bar()
    .add_xaxis(list(df_barslider.Year))
    .add_yaxis("Violent Total Number", list(df_barslider['Violent Total']))
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Crime Count in New York - DataZoom"),
        datazoom_opts=opts.DataZoomOpts(),
        yaxis_opts=opts.AxisOpts(
            name='Crime Count',
            name_location='middle',
            name_gap=50,
            name_textstyle_opts=opts.TextStyleOpts(font_size=16, font_weight='bold'),
        ),
        xaxis_opts=opts.AxisOpts(
            name='Year',
            name_location='middle',
            name_gap=30,
            name_textstyle_opts=opts.TextStyleOpts(font_size=16, font_weight='bold'),
        ),
    )
)
barslider.render_notebook()
Try dragging the slider to change the year range!
import pyecharts.options as opts
from pyecharts.charts import Bar, Line
bar_line = (
    Bar()
    .add_xaxis(xaxis_data=list(df_barslider.Year))
    .add_yaxis(
        "Murder",
        list(df_barslider.Murder),
        label_opts=opts.LabelOpts(is_show=False),
    )
    .add_yaxis(
        "Rape",
        list(df_barslider.Rape),
        label_opts=opts.LabelOpts(is_show=False),
    )
    .extend_axis(
        yaxis=opts.AxisOpts(
            name="Crime count",
            type_="value",
        )
    )
    .set_global_opts(
        tooltip_opts=opts.TooltipOpts(
            is_show=True,
            trigger="axis",
            axis_pointer_type="cross",
        ),
        xaxis_opts=opts.AxisOpts(
            type_="category",
            axispointer_opts=opts.AxisPointerOpts(is_show=True, type_="shadow"),
        ),
        yaxis_opts=opts.AxisOpts(
            name="Crime Count",
            type_="value",
            axistick_opts=opts.AxisTickOpts(is_show=True),
            splitline_opts=opts.SplitLineOpts(is_show=True),
        ),
    )
)
line = (
    Line()
    .add_xaxis(list(map(str, list(df_barslider.Year))))
    .add_yaxis(
        series_name="Violent Total",
        yaxis_index=1,
        y_axis=list(df_barslider['Violent Total']),
        label_opts=opts.LabelOpts(is_show=False),
    )
)
bar_line.overlap(line).render_notebook()
Hover over the chart, or click a legend entry to toggle its series.
df_boxplot = df[(df['Year']==2020) & (df['Agency']=='County Total')]
from pyecharts import options as opts
from pyecharts.charts import Boxplot
v1 = [
    list(df_boxplot[df_boxplot['Region']=='New York City']['Rape']),
    list(df_boxplot[df_boxplot['Region']=='Non-New York City']['Rape']),
]
v2 = [
    list(df_boxplot[df_boxplot['Region']=='New York City']['Murder']),
    list(df_boxplot[df_boxplot['Region']=='Non-New York City']['Murder']),
]
v3 = [
    list(df_boxplot[df_boxplot['Region']=='New York City']['Robbery']),
    list(df_boxplot[df_boxplot['Region']=='Non-New York City']['Robbery']),
]
v4 = [
    list(df_boxplot[df_boxplot['Region']=='New York City']['Burglary']),
    list(df_boxplot[df_boxplot['Region']=='Non-New York City']['Burglary']),
]
v5 = [
    list(df_boxplot[df_boxplot['Region']=='New York City']['Larceny']),
    list(df_boxplot[df_boxplot['Region']=='Non-New York City']['Larceny']),
]
v6 = [
    list(df_boxplot[df_boxplot['Region']=='New York City']['Motor Vehicle Theft']),
    list(df_boxplot[df_boxplot['Region']=='Non-New York City']['Motor Vehicle Theft']),
]
c = Boxplot()
c.add_xaxis(["New York City", "Non-New York City"])
c.add_yaxis("Rape", c.prepare_data(v1))
c.add_yaxis("Murder", c.prepare_data(v2))
c.add_yaxis("Robbery", c.prepare_data(v3))
c.add_yaxis("Burglary", c.prepare_data(v4))
c.add_yaxis("Larceny", c.prepare_data(v5))
c.add_yaxis("Motor Vehicle Theft", c.prepare_data(v6))
c.set_global_opts(
    title_opts=opts.TitleOpts(
        title="Crime Count boxplot",
        subtitle="2020 data",
        title_textstyle_opts=opts.TextStyleOpts(font_size=20, font_weight='bold'),
        subtitle_textstyle_opts=opts.TextStyleOpts(
            font_style='normal', font_weight='bold', font_size=16
        ),
    ),
    legend_opts=opts.LegendOpts(pos_left='right'),
    yaxis_opts=opts.AxisOpts(
        name='Crime Count',
        name_location='middle',
        name_gap=50,
        name_textstyle_opts=opts.TextStyleOpts(font_size=16, font_weight='bold'),
    ),
    xaxis_opts=opts.AxisOpts(
        name='Location',
        name_location='middle',
        name_gap=30,
        name_textstyle_opts=opts.TextStyleOpts(font_size=16, font_weight='bold'),
    ),
)
c.render_notebook()
The counts of the different crime types differ widely in scale, which makes a static boxplot hard to read; the interactive boxplot above solves this with easy filtering. To hide a crime type, click its entry in the legend and it disappears from the chart.
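Each box in the chart above is drawn from five numbers per group: minimum, first quartile, median, third quartile, and maximum, which is the shape of data that `prepare_data` hands to the chart. The sketch below computes such a five-number summary with the standard library. It is an illustration only: `five_number_summary` is a hypothetical helper, `statistics.quantiles` needs Python 3.8 or later, and the quantile convention chosen here (`method="exclusive"`) is an assumption that may differ in detail from what pyecharts uses internally.

```python
import math
import statistics  # statistics.quantiles requires Python 3.8+


def five_number_summary(values):
    """Return [min, Q1, median, Q3, max] for a list of numbers.

    NaN entries (e.g. missing counts) are dropped first.
    Hypothetical helper for illustration; not part of pyecharts.
    """
    clean = sorted(v for v in values if not math.isnan(v))
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, median, q3 = statistics.quantiles(clean, n=4, method="exclusive")
    return [clean[0], q1, median, q3, clean[-1]]


summary = five_number_summary([7, 1, 3, 5, 2, 6, 4])
print(summary)  # → [1, 2.0, 4.0, 6.0, 7]
```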
import datetime
import random
from pyecharts import options as opts
from pyecharts.charts import Calendar
begin = datetime.date(2021, 1, 1)
end = datetime.date.today()
data = [
    [str(begin + datetime.timedelta(days=i)), random.randint(10000, 20000)]
    for i in range((end - begin).days + 1)
]
# Raise the values for 20-29 October 2021 so they stand out on the heatmap.
for i in range(20, 30):
    data[(datetime.date(2021, 10, i) - begin).days][1] = random.randint(30000, 50000)
c = (
    Calendar()
    .add(
        "My pressure value within 2021",
        data,
        calendar_opts=opts.CalendarOpts(
            pos_top="120",
            pos_left="30",
            pos_right="30",
            range_="2021",
            yearlabel_opts=opts.CalendarYearLabelOpts(is_show=False),
            daylabel_opts=opts.CalendarDayLabelOpts(name_map="en"),
            monthlabel_opts=opts.CalendarMonthLabelOpts(name_map="en"),
        ),
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(
            pos_top="30",
            pos_left="center",
            title="2021 Calendar",
            title_textstyle_opts=opts.TextStyleOpts(font_size=30, font_weight='bold'),
        ),
        visualmap_opts=opts.VisualMapOpts(
            max_=50000, min_=10000, orient="horizontal", is_piecewise=False
        ),
        legend_opts=opts.LegendOpts(pos_left='right', pos_bottom='bottom'),
    )
)
c.render_notebook()
Hover over any cell of the calendar chart to see its value marked on the visual map control in the lower-left corner. You can also select a range of pressure values on that control, which then acts as a filter for the heatmap.
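The calendar chart above takes its data as a list of `[date string, value]` pairs, and the visual map's `min_` and `max_` should bracket the values so that the colour scale covers them. As a minimal sketch under those assumptions, the hypothetical helper `daily_series` below builds one random value per day of a fixed year (with a seed, so the result is reproducible) and derives the bounds from the data instead of hard-coding them.

```python
import datetime
import random


def daily_series(year, low, high, seed=0):
    """Build [[ISO date, value], ...] with one entry per day of `year`.

    Hypothetical helper for illustration; the values are random placeholders.
    """
    rng = random.Random(seed)  # seeded generator for reproducible output
    day = datetime.date(year, 1, 1)
    rows = []
    while day.year == year:
        rows.append([day.isoformat(), rng.randint(low, high)])
        day += datetime.timedelta(days=1)
    return rows


rows = daily_series(2021, 10000, 20000)

# Derive the colour-scale bounds from the data rather than hard-coding them.
values = [value for _, value in rows]
vmin, vmax = min(values), max(values)
print(len(rows), rows[0][0], rows[-1][0])  # → 365 2021-01-01 2021-12-31
```

`rows` could then be passed as the data argument of `Calendar().add(...)`, with `vmin` and `vmax` supplied as `min_` and `max_` of `opts.VisualMapOpts`.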
I have covered several common chart types in data visualization: bar charts, line charts, boxplots, and a calendar heatmap. With more complex data and richer relationships within it, many more striking charts become possible, which is why ECharts is such a powerful and widely used visualization tool. Pyecharts can reproduce many Apache ECharts chart types, such as geographic plots, Sankey diagrams, and even K-line (candlestick) charts. Another impressive example is the relationship graph at https://gallery.pyecharts.org/#/Graph/graph_weibo. Unfortunately, our crime dataset cannot be plotted that way. For more interesting charts, browse the gallery at that link.