I've recently wrapped up these snippets into an easy-to-use Python package you can import anywhere. That means no more cutting and pasting or modifying your IPython/Jupyter config files.
Check it out on github.com/HHammond/PrettyPandas and prettypandas.readthedocs.org.
I love IPython and Pandas, but using them to build reports requires lots of little tricks. Here I've compiled a few things that make building reports much nicer: prettier tables, summary rows and columns, and percentage formatting.
The code is also available as a gist here.
Pretty Tables
There are a few ways to update the CSS of the IPython Notebook. (You can follow my instructions here for a guide on how to format your notebook.) If you don't already have a CSS theme that modifies your table formats, the following code is all you need to beautify your tables.
Open ~/.ipython/profile_default/static/custom/custom.css and add the following lines:
/* Pretty Pandas Dataframes */
.dataframe * {border-color: #c0c0c0 !important;}
.dataframe th {background: #eee;}
.dataframe td {
    background: #fff;
    text-align: right;
    min-width: 5em;
}

/* Format summary rows */
.dataframe-summary-row tr:last-child,
.dataframe-summary-col td:last-child {
    background: #eee;
    font-weight: 500;
}
which formats DataFrames from:
[Image: DataFrame with the default notebook styling]
to:
[Image: DataFrame styled with the CSS above]
Adding a Summary Row/Column
A common request I get when generating reports with IPython and Pandas is to have a subtotal or summary row at the bottom.
Google yielded lots of StackOverflow questions and some messy answers, so I ended up writing my own (which you can use however you want):
import numpy as np
import pandas as pd
from functools import partial


def summary(df, fn=np.sum, axis=0, name='Total',
            table_class_prefix='dataframe-summary'):
    """Append a summary row or column to DataFrame.

    Input:
    ------
    df : Dataframe to be summarized
    fn : Summary function applied over each column
    axis : Axis to summarize on (1: by row, 0: by column)
    name : Index or column label for summary
    table_class_prefix : Custom css class for dataframe

    Returns:
    --------
    Dataframe with applied summary.
    """
    total = df.apply(fn, axis=axis).to_frame(name)

    table_class = ""

    if axis == 0:
        total = total.T
        table_class = "{}-row".format(table_class_prefix)
    elif axis == 1:
        table_class = "{}-col".format(table_class_prefix)

    out = pd.concat([df, total], axis=axis)

    # Patch to_html function to use custom css class
    out.to_html = partial(out.to_html, classes=table_class)

    return out
Result:
summary(df, axis=0)
This snippet also uses the above CSS so you don't need to edit anything to get table formatting. If you want to use your own CSS you just need to edit the .dataframe-summary-row tr:last-child and .dataframe-summary-col td:last-child selectors.
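To make the link between the summary row and those selectors concrete, here is a small sketch. The sample data is made up for illustration, and instead of patching `to_html` it passes `classes` to `to_html` directly, which is a simplification of what `summary()` does rather than a replacement for it.

```python
import pandas as pd

# Hypothetical sample data, not taken from the post.
df = pd.DataFrame(
    {"Visits": [120, 80, 40], "Signups": [12, 4, 2]},
    index=["Mon", "Tue", "Wed"],
)

# Build the summary row the same way summary() does for axis=0.
total = df.sum(axis=0).to_frame("Total").T
out = pd.concat([df, total], axis=0)

# Pass the CSS class explicitly; it is added to the <table> element,
# which is what the .dataframe-summary-row selector matches.
html = out.to_html(classes="dataframe-summary-row")

print(out)
```

The last row of `out` is labelled `Total`, and the generated `<table>` tag carries the `dataframe-summary-row` class alongside the default `dataframe` class.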
Percentage Format
Another common request is for a column to be represented as percentages. Again, Stack Overflow answers suggest setting the DataFrame's float format or other workarounds. Here I just use a format string and convert values to percentage strings (which breaks usability, since the values are no longer numbers, but looks pretty). This works on any of Python's standard numeric types and NumPy types.
from numbers import Number


def as_percent(v, precision='0.2'):
    """Convert number to percentage string."""
    if isinstance(v, Number):
        return "{{:{}%}}".format(precision).format(v)
    else:
        raise TypeError("Numeric type required")
Result:
>>> as_percent(0.5)
'50.00%'
df['Observation Proportion'] = df['Observation Proportion'].apply(as_percent)
df
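If you need the column to stay numeric, one alternative is to apply the format only when the table is rendered. This is a sketch using the `formatters` argument of `DataFrame.to_html`; the sample values are made up, and only the column name follows the example above.

```python
import pandas as pd

# Hypothetical sample data; the column name follows the post.
df = pd.DataFrame({"Observation Proportion": [0.5, 0.125, 0.375]})

# Format only at render time; the column itself stays float.
html = df.to_html(
    formatters={"Observation Proportion": "{:0.2%}".format}
)

# The underlying values can still be used in calculations.
print(df["Observation Proportion"].sum())
```

The trade-off is that the formatting lives in the rendering call rather than in the DataFrame, so it has to be repeated wherever the table is displayed.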
I hope these snippets are as helpful to you as they have been to me! Feel free to contact me if you have any formatting tips that you want to add to this list.