import plotly.tools as tls
tls.embed('https://plotly.com/~cufflinks/8')
import plotly.plotly as py
import cufflinks as cf
import pandas as pd
import numpy as np
print(cf.__version__)
0.8.2
With Plotly's Python library, you can describe figures using a DataFrame's series and index:
df = cf.datagen.lines()
py.iplot([{
    'x': df.index,
    'y': df[col],
    'name': col
} for col in df.columns], filename='cufflinks/simple-line')
But with cufflinks, you can plot them directly:
df.iplot(kind='scatter', filename='cufflinks/cf-simple-line')
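To make the comparison concrete, here is a minimal, library-free sketch of what the native-syntax list comprehension builds: one trace dictionary per column. The helper name traces_from_columns and the sample data are illustrative assumptions, not part of cufflinks or plotly.

```python
def traces_from_columns(index, columns):
    """Build one Plotly-style trace dict per column.

    index   -- list of x values shared by every column
    columns -- dict mapping column name -> list of y values
    Hypothetical helper, for illustration only.
    """
    return [
        {'x': list(index), 'y': list(values), 'name': name}
        for name, values in sorted(columns.items())
    ]


# Made-up sample data standing in for a DataFrame
index = ['2015-01-01', '2015-01-02', '2015-01-03']
columns = {'a': [1.0, 1.5, 0.7], 'b': [-0.2, 0.1, 0.4]}

traces = traces_from_columns(index, columns)
print([trace['name'] for trace in traces])
```

A list like this is what the 'data' part of a figure holds; df.iplot saves you from writing it by hand.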
Almost every chart that you make in cufflinks can be created with just one line of code.
df = pd.DataFrame(np.random.randn(1000, 4), columns=['a', 'b', 'c', 'd'])
df.scatter_matrix(filename='cufflinks/scatter-matrix', world_readable=True)
Charts created with cufflinks are synced with your online Plotly account. You'll need to configure your credentials to get started. cufflinks can also be configured to work offline in IPython notebooks with Plotly Offline. To get started with Plotly Offline, download a trial library and run cf.go_offline().
cf.go_online() # switch back to online mode, where graphs are saved on your online plotly account
By default, plotly graphs are public. Make them private by setting world_readable to False:
df.a.iplot(kind='histogram', world_readable=False)
Only you (the creator) will be able to see this chart. Alternatively, change the global default settings with cf.set_config_file:
cf.set_config_file(offline=False, world_readable=True, theme='ggplot')
df = pd.DataFrame(np.random.randn(1000, 2), columns=['A', 'B']).cumsum()
df.iplot(filename='cufflinks/line-example')
Plot one column against another with the x and y keywords:
df.iplot(x='A', y='B', filename='cufflinks/x-vs-y-line-example')
Download some civic data: a time series log of the 311 complaints in NYC.
df = pd.read_csv('https://raw.githubusercontent.com/plotly/widgets/master/ipython-examples/311_150k.csv', parse_dates=True, index_col=1)
df.head(3)
series = df['Complaint Type'].value_counts()[:20]
series.head(3)
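value_counts() counts how often each value occurs and sorts the result in descending order, so slicing with [:20] keeps the 20 most frequent complaint types. The standard-library sketch below shows the same idea on a handful of made-up labels (the sample values are assumptions, not rows from the 311 dataset).

```python
from collections import Counter

# Made-up complaint labels, for illustration only
complaints = ['Noise', 'Heating', 'Noise', 'Street Light', 'Heating', 'Noise']

counts = Counter(complaints)

# most_common(n) returns the n most frequent values, highest count first,
# which mirrors value_counts()[:n] on a pandas Series
top_two = counts.most_common(2)
print(top_two)  # [('Noise', 3), ('Heating', 2)]
```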
Plot a series directly:
series.iplot(kind='bar', yTitle='Number of Complaints', title='NYC 311 Complaints',
             filename='cufflinks/categorical-bar-chart')
Plot a dataframe row as a bar
df = pd.DataFrame(np.random.rand(10, 4), columns=['A', 'B', 'C', 'D'])
row = df.iloc[5]
row.iplot(kind='bar', filename='cufflinks/bar-chart-row')
Call iplot(kind='bar') on a dataframe to produce a grouped bar chart:
df.iplot(kind='bar', filename='cufflinks/grouped-bar-chart')
df.iplot(kind='bar', barmode='stack', filename='cufflinks/grouped-bar-chart')
Remember: plotly charts are interactive. Click on the legend entries to hide-and-show traces, click-and-drag to zoom, double-click to autoscale, shift-click to drag.
Make your bar charts horizontal with kind='barh'
df.iplot(kind='barh',barmode='stack', bargap=.1, filename='cufflinks/barh')
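In Plotly's native figure format, a horizontal bar trace sets orientation to 'h' and swaps the roles of x and y, while stacking is a layout-level setting (barmode). The sketch below builds such a figure dictionary by hand. The helper name and sample numbers are illustrative assumptions, and this is not necessarily how cufflinks assembles the figure internally.

```python
def horizontal_bar_figure(categories, series, barmode='stack', bargap=0.1):
    """Return a Plotly-style figure dict with one horizontal bar trace per series.

    categories -- labels shown along the y axis
    series     -- dict mapping trace name -> list of bar lengths
    Hypothetical helper, for illustration only.
    """
    data = [
        {
            'type': 'bar',
            'orientation': 'h',       # bars run horizontally
            'x': list(values),        # bar lengths go on x
            'y': list(categories),    # category labels go on y
            'name': name,
        }
        for name, values in sorted(series.items())
    ]
    return {'data': data, 'layout': {'barmode': barmode, 'bargap': bargap}}


fig = horizontal_bar_figure(['row 0', 'row 1'], {'A': [0.2, 0.9], 'B': [0.5, 0.1]})
print(fig['layout'])
```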
cufflinks ships with a few themes. View available themes with cf.getThemes and apply them with cf.set_config_file:
cf.getThemes()
['pearl', 'white', 'ggplot', 'solar', 'space']
cf.set_config_file(theme='pearl')
df = pd.DataFrame({'a': np.random.randn(1000) + 1,
                   'b': np.random.randn(1000),
                   'c': np.random.randn(1000) - 1})
df.iplot(kind='histogram', filename='cufflinks/basic-histogram')
Customize your histogram with
barmode (overlay | group | stack)
bins (int)
histnorm ('' | 'percent' | 'probability' | 'density' | 'probability density')
histfunc ('count' | 'sum' | 'avg' | 'min' | 'max')
df.iplot(kind='histogram', barmode='stack', bins=100, histnorm='probability', filename='cufflinks/customized-histogram')
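The histnorm options differ only in how raw bin counts are rescaled. The sketch below applies the definitions given in the iplot help text to a small list of made-up bin counts. The function name normalize_histogram is a hypothetical helper, not a cufflinks or plotly function.

```python
def normalize_histogram(counts, bin_width, histnorm=''):
    """Rescale raw bin counts according to a histnorm mode.

    counts    -- number of occurrences per bin
    bin_width -- width of each bin (all bins assumed equal)
    Hypothetical helper, for illustration only.
    """
    total = float(sum(counts))
    if histnorm == '':                      # plain frequency
        return [float(c) for c in counts]
    if histnorm == 'percent':               # share of all occurrences, in %
        return [100.0 * c / total for c in counts]
    if histnorm == 'probability':           # share of all occurrences, 0..1
        return [c / total for c in counts]
    if histnorm == 'density':               # count per unit of bin width
        return [c / float(bin_width) for c in counts]
    if histnorm == 'probability density':   # probability per unit of bin width
        return [c / total / bin_width for c in counts]
    raise ValueError('unknown histnorm: %r' % (histnorm,))


counts = [2, 5, 3]   # made-up bin counts
bin_width = 0.5

probability = normalize_histogram(counts, bin_width, 'probability')
density = normalize_histogram(counts, bin_width, 'density')
prob_density = normalize_histogram(counts, bin_width, 'probability density')

# The area under 'density' equals the total count; under 'probability density' it equals 1
print(sum(d * bin_width for d in density))
print(sum(p * bin_width for p in prob_density))
```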
Like every chart type, histograms can be split into subplots or small multiples with subplots and, optionally, shape. More on subplots below.
df.iplot(kind='histogram', subplots=True, shape=(3, 1), filename='cufflinks/histogram-subplots')
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
df.iplot(kind='box', filename='cufflinks/box-plots')
To produce a stacked area plot, each column must contain either all positive or all negative values. When input data contains NaN, it is automatically filled with 0. To drop these values or fill them with something else, use dataframe.dropna() or dataframe.fillna() before calling plot.
df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
df.iplot(kind='area', fill=True, filename='cufflinks/stacked-area')
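Conceptually, a stacked area chart draws each column on top of the running total of the columns before it. The sketch below illustrates the rules stated above with plain lists: missing values (None here, standing in for NaN) are filled with 0, columns that mix signs are rejected, and the running total is accumulated. It is a conceptual illustration with a hypothetical helper name, not the actual cufflinks implementation.

```python
def stack_columns(columns, order):
    """Return the cumulative (stacked) values for each column.

    columns -- dict mapping column name -> list of values (None = missing)
    order   -- column names from bottom of the stack to top
    Hypothetical helper, for illustration only.
    """
    stacked = {}
    running = None
    for name in order:
        # Fill missing values with 0, as described in the text
        values = [0.0 if v is None else float(v) for v in columns[name]]
        # Each column must be all non-negative or all non-positive
        if not (all(v >= 0 for v in values) or all(v <= 0 for v in values)):
            raise ValueError('column %r mixes positive and negative values' % name)
        if running is None:
            running = values
        else:
            running = [r + v for r, v in zip(running, values)]
        stacked[name] = list(running)
    return stacked


# Made-up sample data
columns = {'a': [1, 2, None], 'b': [0.5, 0.5, 1.0]}
stacked = stack_columns(columns, ['a', 'b'])
print(stacked['b'])  # [1.5, 2.5, 1.0]
```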
For non-stacked area charts, set kind='scatter' (the default) with fill=True:
df.iplot(fill=True, filename='cufflinks/filled-area')
Set x and y as column names. If x isn't supplied, df.index will be used.
import pandas as pd
df = pd.read_csv('http://www.stat.ubc.ca/~jenny/notOcto/STAT545A/examples/gapminder/data/gapminderDataFiveYear.txt', sep='\t')
df2007 = df[df.year==2007]
df1952 = df[df.year==1952]
df2007.iplot(kind='scatter', mode='markers', x='gdpPercap', y='lifeExp', filename='cufflinks/simple-scatter')
Plotting scatter plots of multiple columns isn't as easy with cufflinks. Here is an example with Plotly's native syntax:
fig = {
    'data': [
        {'x': df2007.gdpPercap, 'y': df2007.lifeExp, 'text': df2007.country, 'mode': 'markers', 'name': '2007'},
        {'x': df1952.gdpPercap, 'y': df1952.lifeExp, 'text': df1952.country, 'mode': 'markers', 'name': '1952'}
    ],
    'layout': {
        'xaxis': {'title': 'GDP per Capita', 'type': 'log'},
        'yaxis': {'title': "Life Expectancy"}
    }
}
py.iplot(fig, filename='cufflinks/multiple-scatter')
Grouping isn't as easy either, but it can be done with Plotly's native syntax:
py.iplot(
    {
        'data': [
            {
                'x': df[df['year']==year]['gdpPercap'],
                'y': df[df['year']==year]['lifeExp'],
                'name': year, 'mode': 'markers',
            } for year in [1952, 1982, 2007]
        ],
        'layout': {
            'xaxis': {'title': 'GDP per Capita', 'type': 'log'},
            'yaxis': {'title': "Life Expectancy"}
        }
    }, filename='cufflinks/scatter-group-by')
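The group-by pattern above generalizes to any grouping column: collect the rows for each group, then emit one markers trace per group. Below is a minimal standard-library sketch of that idea. The helper name group_into_traces and the sample records are illustrative assumptions; the numbers are made up and are not gapminder values.

```python
from collections import OrderedDict


def group_into_traces(records, group_key, x_key, y_key):
    """Build one Plotly-style scatter trace per distinct value of group_key.

    records -- list of dicts, one per row
    Hypothetical helper, for illustration only.
    """
    groups = OrderedDict()  # keeps groups in order of first appearance
    for row in records:
        groups.setdefault(row[group_key], []).append(row)
    return [
        {
            'x': [r[x_key] for r in rows],
            'y': [r[y_key] for r in rows],
            'name': str(group),
            'mode': 'markers',
        }
        for group, rows in groups.items()
    ]


# Made-up rows, for illustration only
records = [
    {'year': 1952, 'gdpPercap': 100.0, 'lifeExp': 40.0},
    {'year': 2007, 'gdpPercap': 900.0, 'lifeExp': 70.0},
    {'year': 1952, 'gdpPercap': 200.0, 'lifeExp': 45.0},
]

group_traces = group_into_traces(records, 'year', 'gdpPercap', 'lifeExp')
print([t['name'] for t in group_traces])  # ['1952', '2007']
```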
Add size to create a bubble chart. Add hover text with the text attribute.
df2007.iplot(kind='bubble', x='gdpPercap', y='lifeExp', size='pop', text='country',
             xTitle='GDP per Capita', yTitle='Life Expectancy',
             filename='cufflinks/simple-bubble-chart')
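A bubble chart is a markers scatter trace whose marker sizes come from a data column. Raw values such as population are far too large to use as pixel sizes, so they have to be rescaled first. The sketch below uses a simple linear min-max rescaling. This is one plausible approach chosen for illustration, and cufflinks may scale sizes differently. The helper name and the pixel range are assumptions.

```python
def scale_sizes(values, min_size=12, max_size=60):
    """Linearly rescale values to marker sizes between min_size and max_size.

    Hypothetical helper, for illustration only.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: give every bubble the same size
        return [float(min_size)] * len(values)
    span = float(hi - lo)
    return [min_size + (v - lo) / span * (max_size - min_size) for v in values]


# Made-up sample values, for illustration only
sizes = scale_sizes([10, 20, 30])
print(sizes)  # [12.0, 36.0, 60.0]

bubble_trace = {
    'x': [1.0, 2.0, 3.0],
    'y': [4.0, 5.0, 6.0],
    'mode': 'markers',
    'marker': {'size': sizes},        # one size per point
    'text': ['A', 'B', 'C'],          # hover text per point
}
```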
subplots=True partitions columns into separate subplots. Specify rows and columns with shape=(rows, cols) and share axes with shared_xaxes=True and shared_yaxes=True.
df=cf.datagen.lines(4)
df.iplot(subplots=True, shape=(4,1), shared_xaxes=True, fill=True, filename='cufflinks/simple-subplots')
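To see how shape=(rows, cols) relates to the number of columns, the sketch below assigns each trace a 1-based (row, col) grid cell. It assumes cells are filled row by row, which is an assumption made for illustration rather than documented cufflinks behavior, and the helper name is hypothetical.

```python
def subplot_positions(n_traces, shape):
    """Assign each trace a 1-based (row, col) cell, filling the grid row by row.

    Hypothetical helper, for illustration only.
    """
    rows, cols = shape
    if n_traces > rows * cols:
        raise ValueError('shape %r is too small for %d traces' % (shape, n_traces))
    return [(i // cols + 1, i % cols + 1) for i in range(n_traces)]


print(subplot_positions(4, (4, 1)))  # [(1, 1), (2, 1), (3, 1), (4, 1)]
print(subplot_positions(4, (2, 2)))  # [(1, 1), (1, 2), (2, 1), (2, 2)]
```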
Add subplot titles with subplot_titles, either as a list of titles or True to use column names.
df.iplot(subplots=True, subplot_titles=True, legend=False)
df.scatter_matrix(filename='cufflinks/scatter-matrix-subplot', world_readable=True)
cf.datagen.heatmap(20,20).iplot(kind='heatmap',colorscale='spectral',
                                filename='cufflinks/simple-heatmap')
Use hline and vline for horizontal and vertical lines.
df=cf.datagen.lines(3,columns=['a','b','c'])
df.iplot(hline=[2,4],vline=['2015-02-10'])
Draw shaded regions with hspan
df.iplot(hspan=[(-1,1),(2,5)], filename='cufflinks/shaded-regions')
Extra parameters can be passed as dictionaries, with keys such as width, fill, color, fillcolor, and opacity:
df.iplot(vspan={'x0':'2015-02-15','x1':'2015-03-15','color':'rgba(30,30,30,0.3)','fill':True,'opacity':.4},
         filename='cufflinks/custom-regions')
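In Plotly's native figure format, lines and shaded regions like these are entries in the layout's shapes list. The sketch below builds the equivalent shape dictionaries by hand, using 'paper' references so that a shape spans the full plot width or height. The helper names and default colors are assumptions, and the exact dictionaries cufflinks generates may differ.

```python
def hline_shape(y, color='#444'):
    """Horizontal line at data value y, spanning the full plot width."""
    return {'type': 'line', 'xref': 'paper', 'x0': 0, 'x1': 1,
            'yref': 'y', 'y0': y, 'y1': y, 'line': {'color': color}}


def hspan_shape(y0, y1, fillcolor='rgba(30,30,30,0.3)'):
    """Shaded horizontal band between data values y0 and y1."""
    return {'type': 'rect', 'xref': 'paper', 'x0': 0, 'x1': 1,
            'yref': 'y', 'y0': y0, 'y1': y1,
            'fillcolor': fillcolor, 'line': {'width': 0}}


def vspan_shape(x0, x1, fillcolor='rgba(30,30,30,0.3)', opacity=0.4):
    """Shaded vertical band between data values x0 and x1."""
    return {'type': 'rect', 'xref': 'x', 'x0': x0, 'x1': x1,
            'yref': 'paper', 'y0': 0, 'y1': 1,
            'fillcolor': fillcolor, 'opacity': opacity, 'line': {'width': 0}}


layout = {
    'shapes': [
        hline_shape(2),
        hspan_shape(-1, 1),
        vspan_shape('2015-02-15', '2015-03-15'),
    ]
}
print(len(layout['shapes']))
```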
cufflinks is designed for simple one-line charting with Pandas and Plotly. Not all Plotly chart attributes are directly assignable in the df.iplot call signature.
To update attributes of a cufflinks chart that aren't available, first convert it to a figure (asFigure=True), then tweak it, then plot it with plotly.plotly.iplot.
Here is an example of a simple plotly figure. You can find more examples in our online python documentation.
from plotly.graph_objs import *
py.iplot({
    'data': [
        Bar(**{
            'x': [1, 2, 3],
            'y': [3, 1, 5],
            'name': 'first trace',
            'type': 'bar'
        }),
        Bar(**{
            'x': [1, 2, 3],
            'y': [4, 3, 6],
            'name': 'second trace',
            'type': 'bar'
        })
    ],
    'layout': Layout(**{
        'title': 'simple example'
    })
}, filename='cufflinks/simple-plotly-example')
cufflinks generates these figures that describe plotly graphs. For example, this graph:
df.iplot(kind='scatter', filename='cufflinks/simple-scatter-example')
has this description:
figure = df.iplot(kind='scatter', asFigure=True)
print(figure.to_string())
Figure(
    data=Data([
        Scatter(
            x=['2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04', '..' ],
            y=array([ 5.35393544e-01, -3.51020567e-01, -1.34207933e+00, ..,
            mode='lines',
            name='a',
            line=Line(
                color='rgba(255, 153, 51, 1.0)',
                width='1.3'
            )
        ),
        Scatter(
            x=['2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04', '..' ],
            y=array([ -2.58404773, -1.91629648, -1.88997988, -1.09846618,..,
            mode='lines',
            name='b',
            line=Line(
                color='rgba(55, 128, 191, 1.0)',
                width='1.3'
            )
        ),
        Scatter(
            x=['2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04', '..' ],
            y=array([ 0.46611148, 1.06107695, 1.06206594, -0.56030965, -0...,
            mode='lines',
            name='c',
            line=Line(
                color='rgba(50, 171, 96, 1.0)',
                width='1.3'
            )
        )
    ]),
    layout=Layout(
        legend=Legend(
            font=Font(
                color='#4D5663'
            ),
            bgcolor='#F5F6F9'
        ),
        paper_bgcolor='#F5F6F9',
        plot_bgcolor='#F5F6F9',
        xaxis1=XAxis(
            title='',
            titlefont=Font(
                color='#4D5663'
            ),
            tickfont=Font(
                color='#4D5663'
            ),
            gridcolor='#E1E5ED',
            zerolinecolor='#E1E5ED'
        ),
        yaxis1=YAxis(
            title='',
            titlefont=Font(
                color='#4D5663'
            ),
            zeroline=False,
            tickfont=Font(
                color='#4D5663'
            ),
            gridcolor='#E1E5ED',
            zerolinecolor='#E1E5ED'
        )
    )
)
So, if you want to edit any attribute of a Plotly graph from cufflinks, first convert it to a figure and then edit the figure objects. Let's add a y-axis title, a tick prefix, and new legend names to this example:
figure['layout']['yaxis1'].update({'title': 'Price', 'tickprefix': '$'})
for i, trace in enumerate(figure['data']):
    trace['name'] = 'Trace {}'.format(i)
py.iplot(figure, filename='cufflinks/customized-chart')
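The same convert-then-tweak workflow can be tried without a Plotly account by working on a plain figure dictionary. The sketch below stands in for the object returned by asFigure=True. The figure contents are made up, and the axis key is assumed to be 'yaxis' here, whereas the cufflinks output shown above uses 'yaxis1'.

```python
# Made-up figure dict standing in for df.iplot(..., asFigure=True)
figure_sketch = {
    'data': [
        {'x': [1, 2, 3], 'y': [3, 1, 5], 'name': 'a', 'mode': 'lines'},
        {'x': [1, 2, 3], 'y': [4, 3, 6], 'name': 'b', 'mode': 'lines'},
    ],
    'layout': {'yaxis': {'title': ''}},
}

# Tweak layout attributes that the one-line call does not expose
figure_sketch['layout']['yaxis'].update({'title': 'Price', 'tickprefix': '$'})

# Rename every trace so the legend reads 'Trace 0', 'Trace 1', ...
for i, trace in enumerate(figure_sketch['data']):
    trace['name'] = 'Trace {}'.format(i)

print([trace['name'] for trace in figure_sketch['data']])  # ['Trace 0', 'Trace 1']
```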
Cufflinks is open source on GitHub!
help(df.iplot)
Help on method _iplot in module cufflinks.plotlytools:

_iplot(self, data=None, layout=None, filename='', world_readable=None, kind='scatter', title='', xTitle='', yTitle='', zTitle='', theme=None, colors=None, colorscale=None, fill=False, width=None, mode='lines', symbol='dot', size=12, barmode='', sortbars=False, bargap=None, bargroupgap=None, bins=None, histnorm='', histfunc='count', orientation='v', boxpoints=False, annotations=None, keys=False, bestfit=False, bestfit_colors=None, categories='', x='', y='', z='', text='', gridcolor=None, zerolinecolor=None, margin=None, subplots=False, shape=None, asFrame=False, asDates=False, asFigure=False, asImage=False, dimensions=(1116, 587), asPlot=False, asUrl=False, online=None, **kwargs) method of pandas.core.frame.DataFrame instance
    Returns a plotly chart either as inline chart, image of Figure object

    Parameters:
    -----------
        data : Data
            Plotly Data Object.
            If not entered then the Data object will be automatically
            generated from the DataFrame.
        layout : Layout
            Plotly layout Object
            If not entered then the Layout objet will be automatically
            generated from the DataFrame.
        filename : string
            Filename to be saved as in plotly account
        world_readable : bool
            If False then it will be saved as a private file
        kind : string
            Kind of chart
                scatter
                bar
                box
                spread
                ratio
                heatmap
                surface
                histogram
                bubble
                bubble3d
                scatter3d
        title : string
            Chart Title
        xTitle : string
            X Axis Title
        yTitle : string
            Y Axis Title
        zTitle : string
            Z Axis Title
            Applicable only for 3d charts
        theme : string
            Layout Theme
                solar
                pearl
                white
            see cufflinks.getThemes() for all
            available themes
        colors : list or dict
            {key:color} to specify the color for each column
            [colors] to use the colors in the defined order
        colorscale : str
            Color scale name
            If the color name is preceded by a minus (-)
            then the scale is inversed
            Only valid if 'colors' is null
            See cufflinks.colors.scales() for available scales
        fill : bool
            Filled Traces
        width : int
            Line width
        mode : string
            Plotting mode for scatter trace
                lines
                markers
                lines+markers
                lines+text
                markers+text
                lines+markers+text
        symbol : string
            The symbol that is drawn on the plot for each marker
            Valid only when mode includes markers
                dot
                cross
                diamond
                square
                triangle-down
                triangle-left
                triangle-right
                triangle-up
                x
        size : string or int
            Size of marker
            Valid only if marker in mode
        barmode : string
            Mode when displaying bars
                group
                stack
                overlay
            * Only valid when kind='bar'
        sortbars : bool
            Sort bars in descending order
            * Only valid when kind='bar'
        bargap : float
            Sets the gap between bars
                [0,1)
            * Only valid when kind is 'histogram' or 'bar'
        bargroupgap : float
            Set the gap between groups
                [0,1)
            * Only valid when kind is 'histogram' or 'bar'
        bins : int
            Specifies the number of bins
            * Only valid when kind='histogram'
        histnorm : string
                '' (frequency)
                percent
                probability
                density
                probability density
            Sets the type of normalization for an histogram trace. By default
            the height of each bar displays the frequency of occurrence, i.e.,
            the number of times this value was found in the
            corresponding bin. If set to 'percent', the height of each bar
            displays the percentage of total occurrences found within the
            corresponding bin. If set to 'probability', the height of each bar
            displays the probability that an event will fall into the
            corresponding bin. If set to 'density', the height of each bar is
            equal to the number of occurrences in a bin divided by the size of
            the bin interval such that summing the area of all bins will yield
            the total number of occurrences. If set to 'probability density',
            the height of each bar is equal to the number of probability that an
            event will fall into the corresponding bin divided by the size of
            the bin interval such that summing the area of all bins will yield 1.
            * Only valid when kind='histogram'
        histfunc : string
                count
                sum
                avg
                min
                max
            Sets the binning function used for an histogram trace.
            * Only valid when kind='histogram'
        orientation : string
                h
                v
            Sets the orientation of the bars. If set to 'v', the length of each
            bar will run vertically. If set to 'h', the length of each bar will
            run horizontally
            * Only valid when kind is 'histogram','bar' or 'box'
        boxpoints : string
            Displays data points in a box plot
                outliers
                all
                suspectedoutliers
                False
        annotations : dictionary
            Dictionary of annotations
            {x_point : text}
        keys : list of columns
            List of columns to chart.
            Also can be usded for custom sorting.
        bestfit : boolean or list
            If True then a best fit line will be generated for
            all columns.
            If list then a best fit line will be generated for
            each key on the list.
        bestfit_colors : list or dict
            {key:color} to specify the color for each column
            [colors] to use the colors in the defined order
        categories : string
            Name of the column that contains the categories
        x : string
            Name of the column that contains the x axis values
        y : string
            Name of the column that contains the y axis values
        z : string
            Name of the column that contains the z axis values
        text : string
            Name of the column that contains the text values
        gridcolor : string
            Grid color
        zerolinecolor : string
            Zero line color
        margin : dict or tuple
            Dictionary (l,r,b,t) or
            Tuple containing the left,
            right, bottom and top margins
        subplots : bool
            If true then each trace is placed in
            subplot layout
        shape : (rows,cols)
            Tuple indicating the size of rows and columns
            If omitted then the layout is automatically set
            * Only valid when subplots=True
        asFrame : bool
            If true then the data component of Figure will
            be of Pandas form (Series) otherwise they will
            be index values
        asDates : bool
            If true it truncates times from a DatetimeIndex
        asFigure : bool
            If True returns plotly Figure
        asImage : bool
            If True it returns Image
            * Only valid when asImage=True
        dimensions : tuple(int,int)
            Dimensions for image
                (width,height)
        asPlot : bool
            If True the chart opens in browser
        asUrl : bool
            If True the chart url is returned. No chart is displayed.
        online : bool
            If True then the chart is rendered on the server
            even when running in offline mode.