gpplot package

Submodules

gpplot.plots module

Plots module - contains functions to generate plots.

gpplot.plots.add_correlation(data, x, y, method='pearson', signif=2, loc='upper left', fontfamily='Arial', ax=None, **kwargs)

Add correlation to a scatterplot

Parameters:
  • data (DataFrame) – DataFrame with columns x and y, same data used to create the plot
  • x (str) – x variable to correlate
  • y (str) – y variable to correlate
  • method (str, optional) – pearson or spearman
  • signif (int, optional) – number of significant figures
  • loc (str, optional) – location of label, passed to matplotlib.offsetbox.AnchoredText
  • size (int, optional) – text size
  • fontfamily (str, optional) – text font family
  • ax (Axis object, optional) – Plot to add correlation to
  • **kwargs – Other keyword arguments passed to text object
Returns:

Return type:

matplotlib.axes.Axes
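
The signif parameter gives the number of significant figures, not decimal places. A minimal sketch of that rounding follows, using only the standard library; round_sig is a hypothetical helper written for illustration and is not part of gpplot, which may format the value differently.

```python
def round_sig(value, signif=2):
    """Round value to `signif` significant figures (hypothetical helper)."""
    if value == 0:
        return 0.0
    # The 'g' format keeps `signif` significant digits; float() parses it back.
    return float(f"{value:.{signif}g}")


print(round_sig(0.87654, 2))   # 0.88
print(round_sig(123.456, 2))   # 120.0
print(round_sig(0.004567, 3))  # 0.00457
```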

gpplot.plots.add_reg_line(data, x, y, ax=None, linestyle='dashed', linecolor='black', **kwargs)

Add regression line to a scatter plot using seaborn.regplot

Parameters:
  • data (DataFrame) – DataFrame with columns x and y, same data used to create the scatter plot
  • x (str) – x variable to regress
  • y (str) – y variable to regress
  • ax (Axis object, optional) – Plot to add regression line to
  • linestyle (str, optional) – Style of regression line
  • linecolor (str, optional) – Color of regression line
  • **kwargs – Other keyword arguments that are passed through to seaborn.regplot
Returns:

Return type:

matplotlib.axes.Axes

gpplot.plots.add_xy_line(slope=1, intercept=0, ax=None, linestyle='dashed', linecolor='black')

Add a line with the specified slope and intercept to a scatter plot; defaults to the y=x line

Parameters:
  • slope (float) – Value of slope of line to be drawn
  • intercept (float) – Value of intercept of line to be drawn
  • ax (Axis object, optional) – Plot to add line to
  • linestyle (str, optional) – Style of line
  • linecolor (str, optional) – Color of line
Returns:

Return type:

matplotlib.axes.Axes
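
The line drawn is y = slope * x + intercept. As an illustration of the arithmetic only, the sketch below computes the two endpoints of such a line across a given x range; line_endpoints is a hypothetical helper, and how gpplot actually derives the drawn extent from the axes is not stated here.

```python
def line_endpoints(xlim, slope=1, intercept=0):
    """Return the (x, y) endpoints of y = slope * x + intercept over xlim."""
    x_min, x_max = xlim
    return (
        (x_min, slope * x_min + intercept),
        (x_max, slope * x_max + intercept),
    )


# Default arguments give the y = x line.
print(line_endpoints((0, 10)))                        # ((0, 0), (10, 10))
print(line_endpoints((-1, 3), slope=2, intercept=1))  # ((-1, -1), (3, 7))
```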

gpplot.plots.calculate_correlation(data, x, y, type)

Calculate the correlation between two columns of a DataFrame

Parameters:
  • data (DataFrame) – DataFrame with columns x and y
  • x (str) – x variable to correlate
  • y (str) – y variable to correlate
  • type (str) – pearson or spearman
Returns:

(correlation between x and y, significance)

Return type:

tuple
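
For readers unfamiliar with the two options for type, the sketch below computes both coefficients with the standard library only. It is an illustration of the statistics, not gpplot's implementation: the function names are hypothetical, the inputs are plain lists rather than DataFrame columns, and the significance value from the returned tuple is not computed.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the tie
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

On a monotonic but non-linear relationship such as x = [1, 2, 3, 4], y = [1, 4, 9, 16], spearman returns 1 while pearson is slightly below 1, which is the usual reason to choose one method over the other.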

gpplot.plots.dark_boxplot(data, x, y, boxprops=None, medianprops=None, whiskerprops=None, capprops=None, flierprops=None, **kwargs)

Wrapper for seaborn.boxplot, which defaults to black lines for boxplot elements

Parameters:
  • data (DataFrame) – Data to create boxplot
  • x (str) – x value of boxplot
  • y (str) – y value of boxplot
  • boxprops (dict, optional) – Style of box, passed to matplotlib.pyplot.boxplot
  • medianprops (dict, optional) – Style of median line, passed to matplotlib.pyplot.boxplot
  • whiskerprops (dict, optional) – Style of whiskers, passed to matplotlib.pyplot.boxplot
  • capprops (dict, optional) – Style of cap on top of whiskers, passed to matplotlib.pyplot.boxplot
  • flierprops (dict, optional) – Style of outlier points, passed to matplotlib.pyplot.boxplot
  • **kwargs – Other keyword arguments are passed through to seaborn.boxplot
Returns:

Return type:

matplotlib.axes.Axes

gpplot.plots.density_rugplot(data, x, y, y_values, density_height=2, rug_height=1, density_color='black', rug_color='black', rug_alpha=0.5, figsize=[6.4, 4.8], ref_line=None, ref_line_color='black', **kwargs)

Creates a density rugplot

The first subplot shows the distribution of x values; each subsequent subplot is a rug plot of the x values for one of the discrete y values in y_values

Parameters:
  • data (DataFrame) – DataFrame with columns x and y
  • x (str) – Column in data of continuous values
  • y (str) – Column in data of discrete values
  • y_values (list) – List of y values to include as subplots
  • density_height (int, optional) – Relative height of density plot
  • rug_height (int, optional) – Relative height of rug plot
  • density_color (str, optional) – Color of density plot
  • rug_color (str, optional) – Color of rug plot
  • rug_alpha (float, optional) – Opacity of rug plot
  • figsize (tuple, optional) – Size of entire figure
  • ref_line (int, optional) – x value of reference line to include for all plots
  • ref_line_color (str, optional) – Color of reference line
  • **kwargs – Other keyword arguments are passed through to sns.rugplot
Returns:

  • matplotlib.figure.Figure – figure
  • numpy.ndarray of matplotlib.axes.Axes – individual subplots
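
Because density_height and rug_height are relative heights, their effect depends on how many rug plots are requested. The sketch below shows the share of the figure height each subplot would receive, assuming the values are used as ordinary height ratios (one density plot followed by one rug plot per entry in y_values); subplot_height_fractions is a hypothetical helper and this layout rule is an assumption, not taken from gpplot's source.

```python
def subplot_height_fractions(y_values, density_height=2, rug_height=1):
    """Fraction of total height per subplot: density first, then one rug each."""
    ratios = [density_height] + [rug_height] * len(y_values)
    total = sum(ratios)
    return [ratio / total for ratio in ratios]


# With the default ratios and two rug plots, the density plot takes half the height.
print(subplot_height_fractions(["control", "treated"]))  # [0.5, 0.25, 0.25]
```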

gpplot.plots.label_axes(x, color, label, text_xpos, text_ypos, text_ha, text_va)

Helper for ridgeplot: labels each KDE plot, with the text positioned in axes coordinates

gpplot.plots.label_points(data, x, y, label, label_col, arrowstyle='-', arrow_color='black', arrow_lw=1, ax=None, **kwargs)

Label points in a scatterplot

Parameters:
  • data (DataFrame) – Data to create labels
  • x (str) – x position of labels
  • y (str) – y position of labels
  • label (list) – DataFrame elements to label
  • label_col (str) – Column in data whose values are matched against the entries of label
  • arrowstyle (str, optional) – Style of arrow
  • arrow_color (str, optional) – Color of arrow
  • arrow_lw (float, optional) – Line weight of arrow
  • ax (matplotlib.axes.Axes) – Plot to label
  • **kwargs – Other keyword arguments are passed through to matplotlib.pyplot.text
Returns:

Return type:

matplotlib.axes.Axes

gpplot.plots.pandas_barplot(data, x, hue, y, x_order=None, hue_order=None, horizontal=True, stacked=True, **kwargs)

Create a barplot using pandas plot functionality

Mainly allows for stacked barplots

Parameters:
  • data (DataFrame) – DataFrame with columns x and y
  • x (str) – x defines the discrete variable that will be plotted on the x-axis.
  • hue (str) – hue defines the variable that will separate variables with the same x value.
  • y (str) – y is the continuous variable defining the height of each bar
  • x_order (list, optional) – order of x axis
  • hue_order (list, optional) – order of colors
  • horizontal (bool, optional) – whether to lay the bar plot out horizontally
  • stacked (bool, optional) – whether to stack barplots
  • **kwargs – passed on to Pandas’ plot function or matplotlib’s bar function
Returns:

Return type:

matplotlib.axes.Axes
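
To make the roles of x, hue and y concrete, the sketch below computes where each stacked segment would start and end for long-format records. It uses plain dictionaries instead of a DataFrame, and stack_segments is a hypothetical helper; summing rows that share the same x and hue values is an assumption of this sketch, not documented gpplot behaviour.

```python
def stack_segments(rows, x, hue, y, x_order, hue_order):
    """Return {x_value: [(hue_value, start, end), ...]} for stacked bars."""
    totals = {}
    for row in rows:
        key = (row[x], row[hue])
        totals[key] = totals.get(key, 0) + row[y]  # assumption: duplicates are summed

    segments = {}
    for x_value in x_order:
        start = 0
        segments[x_value] = []
        for hue_value in hue_order:
            height = totals.get((x_value, hue_value), 0)
            # Each segment begins where the previous hue level ended.
            segments[x_value].append((hue_value, start, start + height))
            start += height
    return segments


rows = [
    {"gene": "A", "type": "hit", "n": 3},
    {"gene": "A", "type": "miss", "n": 2},
    {"gene": "B", "type": "hit", "n": 1},
]
print(stack_segments(rows, "gene", "type", "n", ["A", "B"], ["hit", "miss"]))
```

Changing hue_order changes which segment sits at the base of each bar, and changing x_order changes the order of the bars.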

gpplot.plots.point_densityplot(data, x, y, bins=None, alpha=0.6, edgecolor=None, marker='o', rasterized=True, palette='viridis', legend=False, **kwargs)

Scatter plot with points colored by density

Points are rasterized by default for easier import into Illustrator

Parameters:
  • data (DataFrame) – DataFrame with columns x and y
  • x (str) – Variable to plot on the x axis
  • y (str) – Variable to plot on the y axis
  • bins (list of ints, optional) – Binsize for density estimate. Defaults to [20, 20]
  • alpha (float, optional) – Opacity of points
  • edgecolor (str, optional) – Point edge color
  • marker (str, optional) – Point shape
  • rasterized (bool, optional) – Whether to rasterize scatterplot
  • palette (str, optional) – Color map
  • legend (bool, optional) – Whether to include legend for density
  • **kwargs – Additional arguments passed to scatterplot function
Returns:

Return type:

matplotlib.axes.Axes
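
One simple way to obtain a per-point density is to lay a grid of bins over the data and give each point the count of the bin it falls in. The sketch below does this with the standard library, assuming bins means the number of bins along each axis. It is a simplified stand-in and not gpplot's estimator, which may smooth or interpolate between bins; bin_index and point_densities are hypothetical names.

```python
from collections import Counter


def bin_index(value, low, high, n_bins):
    """Index of the equal-width bin containing value on [low, high]."""
    if high == low:
        return 0
    idx = int((value - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value belongs to the last bin


def point_densities(xs, ys, bins=(20, 20)):
    """For each point, the number of points sharing its 2D bin."""
    x_low, x_high = min(xs), max(xs)
    y_low, y_high = min(ys), max(ys)
    cells = [
        (bin_index(a, x_low, x_high, bins[0]), bin_index(b, y_low, y_high, bins[1]))
        for a, b in zip(xs, ys)
    ]
    counts = Counter(cells)
    return [counts[cell] for cell in cells]


# Three clustered points and one distant point on a 2 x 2 grid.
print(point_densities([0, 0.1, 0.2, 10], [0, 0.1, 0.2, 10], bins=(2, 2)))  # [3, 3, 3, 1]
```

The returned values could then be mapped onto a colormap such as the one named by palette.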

gpplot.plots.ridgeplot(data, x, hue, aspect=5, height=1, alpha=0.7, text_xpos=0, text_ypos=0.2, text_ha='left', text_va='center', lw=0.5, **kwargs)

Creates a ridgeplot of overlapping kde plots

Parameters:
  • data (DataFrame) – DataFrame with columns x and hue
  • x (str) – Defines the variable that will be plotted on the x-axis
  • hue (str) – Hue defines the variable that will be plotted for the rows
  • aspect (int, optional) – Defines aspect ratio of FacetGrid
  • height (float, optional) – Defines height of each row in FacetGrid
  • alpha (float, optional) – Defines opacity of kde plot
  • text_xpos (float, optional) – Specify the horizontal position of text labels
  • text_ypos (float, optional) – Specify the vertical position of text labels
  • text_ha (str, optional) – Specify the horizontal alignment of text labels
  • text_va (str, optional) – Specify the vertical alignment of text labels
  • lw (float, optional) – Specifies the linewidth for kdeplot
  • **kwargs – Other keyword arguments are passed through to sns.FacetGrid
Returns:

seaborn FacetGrid with ridges

Return type:

sns.FacetGrid

Notes

This code is slightly modified from https://seaborn.pydata.org/examples/kde_ridgeplot

Examples

>>> iris = sns.load_dataset('iris')
>>> g = gpplot.ridgeplot(iris, 'sepal_width', 'species')

gpplot.style module

Style module - contains functions to standardize styles for matplotlib-based plots.

gpplot.style.discrete_palette(palette='Set2', n=8)

Default discrete palette

gpplot.style.diverging_cmap(cmap='RdBu_r')

Default diverging colormap

gpplot.style.savefig(path, fig=None, bbox_inches='tight', transparent=True, **kwargs)

Wrapper function to save figures

Parameters:
  • path (str) – Location to save figure
  • fig (matplotlib.figure.Figure, optional) – Figure to be saved
  • bbox_inches (str, optional) – Bounding box of figure
  • transparent (bool, optional) – Whether to save the figure with a transparent background
  • **kwargs – Other keyword arguments are passed through to matplotlib.pyplot.savefig

gpplot.style.sequential_cmap(cmap='viridis')

Default sequential colormap

gpplot.style.set_aesthetics(style='ticks', context='notebook', font='Arial', font_scale=1, palette=None, rc=None)

Set aesthetics for plotting, using seaborn.set_style and matplotlib.rcParams

Parameters:
  • style (str, optional) – One of darkgrid, whitegrid, dark, white, ticks
  • context (str, optional) – One of paper, notebook, talk, poster
  • font (str, optional) – Font family
  • font_scale (int, optional) – Scaling factor to scale the size of font elements
  • palette (str or seaborn.color_palette, optional) – Discrete color palette to use in plots, defaults to gpplot.discrete_palette
  • rc (dict, optional) – Mappings to pass to matplotlib.rcParams

Module contents

Importing gpplot imports both the plots and style modules