gpplot package
Submodules
gpplot.plots module
Plots module - contains functions to generate plots.
-
gpplot.plots.add_correlation(data, x, y, method='pearson', signif=2, loc='upper left', fontfamily='Arial', ax=None, **kwargs)
Add correlation to a scatterplot
Parameters: - data (DataFrame) – DataFrame with columns x and y, same data used to create the plot
- x (str) – x variable to correlate
- y (str) – y variable to correlate
- method (str, optional) – pearson or spearman
- signif (int, optional) – number of significant figures
- loc (string, optional) – location of label, passed to matplotlib.offsetbox.AnchoredText
- size (int, optional) – text size
- fontfamily (str, optional) – text font family
- ax (Axis object, optional) – Plot to add correlation to
- **kwargs – Other keyword arguments passed to the text object
Returns: Return type: matplotlib.axes.Axes
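The signif parameter above controls how many significant figures appear in the correlation label. As a minimal sketch of that kind of rounding (round_signif is a hypothetical helper written for illustration, not a gpplot function, and gpplot may format the value differently):

```python
def round_signif(value, signif=2):
    """Round value to `signif` significant figures.

    Hypothetical helper for illustration only; not part of gpplot.
    """
    if signif < 1:
        raise ValueError("signif must be a positive integer")
    # The 'g' format keeps `signif` significant digits
    return float(f"{value:.{signif}g}")


print(round_signif(0.87654))       # two significant figures
print(round_signif(0.0123456, 3))  # three significant figures
```

Significant figures differ from decimal places: a small correlation such as 0.0123456 keeps its leading non-zero digits instead of being rounded to 0.01.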
-
gpplot.plots.add_reg_line(data, x, y, ax=None, linestyle='dashed', linecolor='black', **kwargs)
Add regression line to a scatter plot using seaborn.regplot
Parameters: - data (DataFrame) – DataFrame with columns x and y, same data used to create the scatter plot
- x (str) – x variable to regress
- y (str) – y variable to regress
- ax (Axis object, optional) – Plot to add regression line to
- linestyle (str, optional) – Style of regression line
- linecolor (str, optional) – Color of regression line
- **kwargs – Other keyword arguments that are passed through to seaborn.regplot
Returns: Return type: matplotlib.axes.Axes
-
gpplot.plots.add_xy_line(slope=1, intercept=0, ax=None, linestyle='dashed', linecolor='black')
Add a line with the specified slope and intercept to a scatter plot. Defaults to the y = x line.
Parameters: - slope (float) – Value of slope of line to be drawn
- intercept (float) – Value of intercept of line to be drawn
- ax (Axis object, optional) – Plot to add line to
- linestyle (str, optional) – Style of line
- linecolor (str, optional) – Color of line
Returns: Return type: matplotlib.axes.Axes
-
gpplot.plots.calculate_correlation(data, x, y, type)
Parameters: - data (DataFrame) – DataFrame with columns x and y
- x (str) – x variable to correlate
- y (str) – y variable to correlate
- type (str) – pearson or spearman
Returns: (correlation between x and y, significance)
Return type: tuple
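To make the two correlation types concrete, here is a minimal pure-Python sketch of the Pearson and Spearman coefficients. It is not gpplot's implementation: the function names are hypothetical, it takes plain lists rather than a DataFrame, and it returns only the coefficient, not the significance value that calculate_correlation also returns.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    # Undefined (division by zero) if either sequence is constant
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]  # monotonic but not linear in xs
print(pearson(xs, ys), spearman(xs, ys))
```

Pearson measures linear association, while Spearman measures monotonic association, so a curved but strictly increasing relationship scores a perfect Spearman correlation and a Pearson correlation below 1.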
-
gpplot.plots.dark_boxplot(data, x, y, boxprops=None, medianprops=None, whiskerprops=None, capprops=None, flierprops=None, **kwargs)
Wrapper for seaborn.boxplot, which defaults to black lines for boxplot elements
Parameters: - data (DataFrame) – Data to create boxplot
- x (str) – x value of boxplot
- y (str) – y value of boxplot
- boxprops (dict, optional) – Style of box, passed to matplotlib.pyplot.boxplot
- medianprops (dict, optional) – Style of median line, passed to matplotlib.pyplot.boxplot
- whiskerprops (dict, optional) – Style of whiskers, passed to matplotlib.pyplot.boxplot
- capprops (dict, optional) – Style of cap on top of whiskers, passed to matplotlib.pyplot.boxplot
- flierprops (dict, optional) – Style of outlier points, passed to matplotlib.pyplot.boxplot
- **kwargs – Other keyword arguments are passed through to seaborn.boxplot
Returns: Return type: matplotlib.axes.Axes
-
gpplot.plots.density_rugplot(data, x, y, y_values, density_height=2, rug_height=1, density_color='black', rug_color='black', rug_alpha=0.5, figsize=[6.4, 4.8], ref_line=None, ref_line_color='black', **kwargs)
Creates a density rugplot
The first subplot shows the distribution of values, and each subsequent subplot is a rug plot of the values for one of the selected y values.
Parameters: - data (DataFrame) – DataFrame with columns x and y
- x (str) – Column in data of continuous values
- y (str) – Column in data of discrete values
- y_values (list) – List of y values to include as subplots
- density_height (int, optional) – Relative height of density plot
- rug_height (int, optional) – Relative height of rug plot
- density_color (str, optional) – Color of density plot
- rug_color (str, optional) – Color of rug plot
- rug_alpha (float, optional) – Opacity of rug plot
- figsize (tuple, optional) – Size of entire figure
- ref_line (int, optional) – x value of reference line to include for all plots
- ref_line_color (str, optional) – Color of reference line
- **kwargs – Other keyword arguments are passed through to sns.rugplot
Returns: - matplotlib.figure.Figure – figure
- numpy.ndarray of matplotlib.axes.Axes – individual subplots
-
gpplot.plots.label_axes(x, color, label, text_xpos, text_ypos, text_ha, text_va)
Helper for use with ridgeplot: labels the KDE plots in axes coordinates
-
gpplot.plots.label_points(data, x, y, label, label_col, arrowstyle='-', arrow_color='black', arrow_lw=1, ax=None, **kwargs)
Label points in a scatterplot
Parameters: - data (DataFrame) – Data to create labels
- x (str) – x position of labels
- y (str) – y position of labels
- label (list) – DataFrame elements to label
- label_col (str) – Column to match ‘label’ points
- arrowstyle (str, optional) – Style of arrow
- arrow_color (str, optional) – Color of arrow
- arrow_lw (float, optional) – Line weight of arrow
- ax (matplotlib.axes.Axes) – Plot to label
- **kwargs – Other keyword arguments are passed through to matplotlib.pyplot.text
Returns: Return type: matplotlib.axes.Axes
-
gpplot.plots.pandas_barplot(data, x, hue, y, x_order=None, hue_order=None, horizontal=True, stacked=True, **kwargs)
Create a barplot using pandas plot functionality
Mainly allows for stacked barplots
Parameters: - data (DataFrame) – DataFrame with columns x and y
- x (str) – x defines the discrete variable that will be plotted on the x-axis.
- hue (str) – hue defines the variable that will separate variables with the same x value.
- y (str) – y is the continuous variable defining the height of each bar
- x_order (list, optional) – order of x axis
- hue_order (list, optional) – order of colors
- horizontal (bool, optional) – whether to lay the bar plot out horizontally
- stacked (bool, optional) – whether to stack barplots
- **kwargs – passed on to Pandas’ plot function or matplotlib’s bar function
Returns: Return type: matplotlib.axes.Axes
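The roles of x, hue and y in a stacked bar plot can be sketched without any plotting library. The example below is an illustration under stated assumptions, not gpplot's code: both helper functions and the sample records are hypothetical. It reshapes long-format records into one row per x value and then computes where each hue segment starts and ends within a bar.

```python
def pivot_for_stacking(records, x, hue, y):
    """Reshape long-format records into {x_value: {hue_value: y_value}}.

    Hypothetical helper for illustration only; not part of gpplot.
    """
    table = {}
    for row in records:
        table.setdefault(row[x], {})[row[hue]] = row[y]
    return table


def stacked_segments(hue_to_height, hue_order):
    """Return (hue, bottom, top) for each segment of one stacked bar."""
    segments = []
    bottom = 0
    for hue_value in hue_order:
        height = hue_to_height.get(hue_value, 0)  # missing hue: empty segment
        segments.append((hue_value, bottom, bottom + height))
        bottom += height
    return segments


# Made-up example data: counts per group, split by type
records = [
    {"group": "A", "type": "u", "count": 2},
    {"group": "A", "type": "v", "count": 3},
    {"group": "B", "type": "u", "count": 1},
]
table = pivot_for_stacking(records, "group", "type", "count")
print(stacked_segments(table["A"], ["u", "v"]))
```

In this sketch hue_order decides the stacking order of the segments, which is the same role the hue_order parameter plays for the colors in the plot.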
-
gpplot.plots.point_densityplot(data, x, y, bins=None, alpha=0.6, edgecolor=None, marker='o', rasterized=True, palette='viridis', legend=False, **kwargs)
Scatter plot with points colored by density
Rasterized by default for easy import into Illustrator
Parameters: - data (DataFrame) – DataFrame with columns x and y
- x (str) – Variable to plot on the x axis
- y (str) – Variable to plot on the y axis
- bins (list of ints, optional) – Binsize for density estimate. Defaults to [20, 20]
- alpha (float, optional) – Opacity of points
- edgecolor (str, optional) – Point edge color
- marker (str, optional) – Point shape
- rasterized (bool, optional) – Whether to rasterize scatterplot
- palette (str, optional) – Color map
- legend (bool, optional) – Whether to include legend for density
- **kwargs – Additional arguments passed to the scatterplot function
Returns: Return type: matplotlib.axes.Axes
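One simple way to color points by density is to bin them on a grid and give each point the count of its bin. The sketch below assumes that the bins parameter describes such a two-dimensional grid. That is an assumption about the approach, not a description of gpplot's internals, and the function names are hypothetical.

```python
from collections import Counter


def bin_index(value, low, high, n_bins):
    """Index of the equal-width bin in [low, high] that contains value."""
    if high == low:
        return 0
    idx = int((value - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls in the last bin


def point_bin_counts(xs, ys, bins=(20, 20)):
    """For each point, count how many points share its 2D bin."""
    x_low, x_high = min(xs), max(xs)
    y_low, y_high = min(ys), max(ys)
    cells = [
        (bin_index(x, x_low, x_high, bins[0]),
         bin_index(y, y_low, y_high, bins[1]))
        for x, y in zip(xs, ys)
    ]
    counts = Counter(cells)
    return [counts[cell] for cell in cells]


# Three clustered points and one isolated point, on a coarse 2 x 2 grid
xs = [0, 0.01, 0.02, 10]
ys = [0, 0.01, 0.02, 10]
print(point_bin_counts(xs, ys, bins=(2, 2)))
```

The resulting per-point counts could then be mapped onto a color map such as viridis, so that points in crowded regions stand out from isolated ones.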
-
gpplot.plots.ridgeplot(data, x, hue, aspect=5, height=1, alpha=0.7, text_xpos=0, text_ypos=0.2, text_ha='left', text_va='center', lw=0.5, **kwargs)
Creates a ridgeplot of overlapping kde plots
Parameters: - data (DataFrame) – Dataframe with columns x and hue
- x (str) – Defines the variable that will be plotted on the x-axis
- hue (str) – Hue defines the variable that will be plotted for the rows
- aspect (int, optional) – Defines aspect ratio of FacetGrid
- height (float, optional) – Defines height of each row in FacetGrid
- alpha (float, optional) – Defines opacity of kde plot
- text_xpos (float, optional) – Specify the horizontal position of text labels
- text_ypos (float, optional) – Specify the vertical position of text labels
- text_ha (str, optional) – Specify the horizontal alignment of text labels
- text_va (str, optional) – Specify the vertical alignment of text labels
- lw (float, optional) – Specifies the linewidth for kdeplot
- **kwargs – Other keyword arguments are passed through to sns.FacetGrid
Returns: seaborn FacetGrid with ridges
Return type: sns.FacetGrid
Notes
This code is slightly modified from https://seaborn.pydata.org/examples/kde_ridgeplot
Examples
>>> iris = sns.load_dataset('iris')
>>> g = gpplot.ridgeplot(iris, 'sepal_width', 'species')
gpplot.style module
Style module - contains functions to standardize styles for matplotlib-based plots.
-
gpplot.style.savefig(path, fig=None, bbox_inches='tight', transparent=True, **kwargs)
Wrapper function to save figures
Parameters: - fig (matplotlib.figure.Figure) – Figure to be saved
- path (str) – Location to save figure
- bbox_inches (str, optional) – Bounding box of figure
- transparent (bool, optional) – Whether to save the figure with a transparent background
- **kwargs – Other keyword arguments are passed through to matplotlib.pyplot.savefig
-
gpplot.style.set_aesthetics(style='ticks', context='notebook', font='Arial', font_scale=1, palette=None, rc=None)
Set aesthetics for plotting, using seaborn.set_style and matplotlib.rcParams
Parameters: - style (str, optional) – One of darkgrid, whitegrid, dark, white, ticks
- context (str, optional) – One of paper, notebook, talk, poster
- font (str, optional) – Font family
- font_scale (float, optional) – Scaling factor to scale the size of font elements
- palette (str or seaborn.color_palette, optional) – Discrete color palette to use in plots, defaults to gpplot.discrete_palette
- rc (dict, optional) – Mappings to pass to matplotlib.rcParams
Module contents
Importing the package (import gpplot) loads both the plots and style modules.