dcs.view

Language: Python

dcs.view is a module of the dcs package that contains functions for generating visualizations from pandas.DataFrame objects. The charting functions return a dictionary whose image entry is a StringIO buffer holding a Base64-encoded PNG image of the requested chart, as generated by matplotlib.
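The following minimal sketch shows how a caller might consume such an image buffer. It is written for Python 3, where io.StringIO stands in for the StringIO.StringIO type named in this reference. The decode_image helper and the placeholder bytes are hypothetical and are not part of dcs.view.

```python
import base64
import io

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_image(buffer):
    """Return the raw bytes held in a Base64 text buffer (hypothetical helper)."""
    buffer.seek(0)
    return base64.b64decode(buffer.read())


# Stand-in for the 'image' buffer of a chart result: a PNG signature plus
# placeholder bytes, Base64-encoded into a text buffer. Not a real chart.
fake_png = PNG_SIGNATURE + b"placeholder"
image = io.StringIO(base64.b64encode(fake_png).decode("ascii"))

png_bytes = decode_image(image)  # bytes that could be written to a .png file
data_uri = "data:image/png;base64," + image.getvalue()  # embeddable in HTML
```

Because the buffer holds Base64 text rather than raw bytes, it can be placed in a JSON response or a data URI without further escaping.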

dcs.view.date(df, xIndex, yIndices, options={})

Uses matplotlib to generate a time-series chart of the specified pandas.DataFrame column(s)

This function uses the matplotlib.axes.Axes.plot() function to plot a line chart, exports the chart to a PNG image, and encodes the image into a string using Base64.

Note

The function supports plotting multiple columns with respect to one axis, but the number of columns should be limited to 6 for optimal color assignment of the plot points.

Note

The options kwarg can be used to customize the plot and may have the following key-value pairs:

  • axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}.

    The values in the axis dictionary should be strings that are parseable using dateutil.parser.parse()

The function returns a dictionary with the following key-value pairs:

  • image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot

  • axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}

    The values in the axis dictionary are date strings formatted using the ISO8601 date format

Parameters:
  • df (pandas.DataFrame) – data frame
  • xIndex (int) – index of column to plot on x-axis
  • yIndices (list<int>, optional) – indices of columns to plot on y-axis
  • options (dict, optional) – options dictionary
Returns:

dictionary containing image and axis settings

Return type:

dict
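As an illustration of the axis option described above, this sketch builds an x-axis window from datetime objects and reads it back. The iso_window helper is hypothetical. Only the x window is shown because this reference does not spell out how y values are encoded for a time-series chart, and whether the function accepts an axis dictionary without a 'y' entry is an assumption.

```python
from datetime import datetime


def iso_window(start, end):
    """Format a (start, end) pair of datetimes as ISO 8601 strings (hypothetical helper)."""
    return {"start": start.isoformat(), "end": end.isoformat()}


# Options dictionary in the shape documented for the 'axis' key.
options = {"axis": {"x": iso_window(datetime(2016, 1, 1), datetime(2016, 6, 30))}}

# ISO 8601 strings, like those reported in the returned axis dictionary,
# can be turned back into datetime objects with the standard library.
x_start = datetime.fromisoformat(options["axis"]["x"]["start"])
x_end = datetime.fromisoformat(options["axis"]["x"]["end"])
```

ISO 8601 strings are a safe choice for the input window because dateutil.parser.parse() accepts them.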

dcs.view.filterWithSearchQuery(df, columnIndices, query, isRegex=False)

Filters the rows of a pandas.DataFrame object that match a pattern in the specified column(s), returning a pandas.DataFrame object containing the search results.

The search can be performed with a regular expression.

Note

If filtering with multiple columns, a row is considered a match if the pattern occurs in any of the specified columns.

Note

The search is performed on the string representation of the column, meaning a floating point column with value 2 will match the pattern '2.0'

Parameters:
  • df (pandas.DataFrame) – data frame
  • columnIndices (list<int>) – indices of columns to include in search
  • query (str) – search query or regular expression
  • isRegex (bool, optional) – must be set to True if searching using regular expression
Returns:

data frame containing search results (all columns included, not just search columns)

Return type:

pandas.DataFrame
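The matching rules in the notes above can be sketched with plain pandas. This is not the dcs.view implementation. The filter_rows function is a hypothetical stand-in that assumes substring matching on each column's string representation and a logical OR across the selected columns.

```python
import pandas as pd


def filter_rows(df, column_indices, query, is_regex=False):
    """Keep rows where `query` occurs in any selected column (hypothetical sketch)."""
    mask = pd.Series(False, index=df.index)
    for i in column_indices:
        # Compare against the string form of the column, so 2.0 becomes '2.0'.
        as_text = df.iloc[:, i].astype(str)
        mask |= as_text.str.contains(query, regex=is_regex)
    # All columns are returned, not just the searched ones.
    return df[mask]


df = pd.DataFrame({"name": ["alpha", "beta", "gamma"],
                   "score": [2.0, 3.5, 12.0]})

# Literal search: '2.0' occurs in the string forms '2.0' and '12.0'.
plain = filter_rows(df, [1], "2.0")

# Regex search across both columns: names starting with 'al' or scores starting with '3.'.
regex = filter_rows(df, [0, 1], r"^(?:al|3\.)", is_regex=True)
```

The literal search also matches 12.0, which shows why a regular expression with anchors may be preferable when an exact value is wanted.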

dcs.view.frequency(df, columnIndex, options={})

Uses matplotlib to generate a horizontal frequency bar chart of the specified pandas.DataFrame column

This function uses the pandas.Series.value_counts() method (or dcs.analyze.textAnalysis()['word_frequencies'] if plotting word frequency) to get the (value, frequency) tuples for the specified column. A horizontal bar chart is generated with the matplotlib.axes.Axes.barh() function, and the chart is exported to a PNG image and then encoded into a string using Base64.

Note

The options kwarg can be used to customize the plot and may have the following key-value pairs:

  • useWords : a bool flag which may be set to True to plot word frequencies instead of row value frequencies for a string column
  • cutoff : an int specifying the top n values by frequency to plot, default is 50, maximum is 50

The function returns a dictionary with the following key-value pairs:

  • image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
Parameters:
  • df (pandas.DataFrame) – data frame
  • columnIndex (int) – index of column to plot
  • options (dict, optional) – options dictionary
Returns:

dictionary containing image

Return type:

dict
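The counting step behind the chart can be sketched as follows. The top_frequencies function is hypothetical, and splitting on whitespace is only an assumed stand-in for dcs.analyze.textAnalysis, whose behaviour is not described here. The cutoff handling follows the documented default and maximum of 50.

```python
import pandas as pd

MAX_CUTOFF = 50  # documented default and maximum for the 'cutoff' option


def top_frequencies(series, options=None):
    """Return the most frequent values (or words) in a column (hypothetical sketch)."""
    options = options or {}
    cutoff = min(int(options.get("cutoff", MAX_CUTOFF)), MAX_CUTOFF)
    if options.get("useWords"):
        # Assumption: words are whitespace-separated tokens.
        words = series.dropna().astype(str).str.split().explode()
        counts = words.value_counts()
    else:
        counts = series.value_counts()
    # value_counts() sorts by descending frequency, so head() keeps the top n.
    return counts.head(cutoff)


series = pd.Series(["red apple", "green apple", "red apple", "red pear", "red"])
rows = top_frequencies(series, {"cutoff": 1})
words = top_frequencies(series, {"useWords": True, "cutoff": 2})
```

The resulting (value, frequency) pairs are what a horizontal bar chart would then display, one bar per value.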

dcs.view.histogram(df, columnIndices, options={})

Uses matplotlib to generate a histogram of the specified pandas.DataFrame column(s)

The function supports multiple columns. The function uses the matplotlib.axes.Axes.hist() function to plot a histogram, exports the generated chart to a PNG image and encodes the image into a string using Base64.

Note

The options kwarg can be used to customize the plot and may have the following key-value pairs:

  • numberOfBins : an int directly passed to bins argument of matplotlib.axes.Axes.hist()
  • axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}

The function returns a dictionary with the following key-value pairs:

  • image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
  • axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
Parameters:
  • df (pandas.DataFrame) – data frame
  • columnIndices (list<int>) – indices of columns to plot
  • options (dict, optional) – options dictionary
Returns:

dictionary containing image and axis settings

Return type:

dict
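The pipeline described above (plot, export to PNG, encode with Base64, report the axis window) can be sketched as shown here. This is an approximation written for Python 3 and is not the dcs.view source. The histogram_sketch name, the default of 10 bins, and the use of the off-screen Agg backend are assumptions.

```python
import base64
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt
import pandas as pd


def histogram_sketch(df, column_indices, options=None):
    """Plot a histogram and return it as Base64 PNG text plus axis limits."""
    options = options or {}
    fig, ax = plt.subplots()
    # One dataset per selected column, with missing values dropped.
    data = [df.iloc[:, i].dropna().to_numpy() for i in column_indices]
    ax.hist(data, bins=options.get("numberOfBins", 10))  # default is assumed
    axis = options.get("axis")
    if axis:
        ax.set_xlim(axis["x"]["start"], axis["x"]["end"])
        ax.set_ylim(axis["y"]["start"], axis["y"]["end"])
    x_start, x_end = ax.get_xlim()
    y_start, y_end = ax.get_ylim()
    png = io.BytesIO()
    fig.savefig(png, format="png")
    plt.close(fig)  # release the figure's memory
    encoded = base64.b64encode(png.getvalue()).decode("ascii")
    return {
        "image": io.StringIO(encoded),
        "axis": {"x": {"start": float(x_start), "end": float(x_end)},
                 "y": {"start": float(y_start), "end": float(y_end)}},
    }


df = pd.DataFrame({"a": [1, 2, 2, 3, 3, 3], "b": [2, 3, 3, 4, 4, 5]})
window = {"x": {"start": 0, "end": 6}, "y": {"start": 0, "end": 5}}
result = histogram_sketch(df, [0, 1], {"numberOfBins": 4, "axis": window})
```

Returning the axis limits alongside the image lets a client display the current window and send an adjusted one back in a later request.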

dcs.view.line(df, xIndex, yIndices, options={})

Uses matplotlib to generate a line chart of the specified pandas.DataFrame column(s)

This function uses the matplotlib.axes.Axes.plot() function to plot a line chart, exports the chart to a PNG image, and encodes the image into a string using Base64.

Note

The function supports plotting multiple columns with respect to one axis, but the number of columns should be limited to 6 for optimal color assignment of the plot points.

Note

The options kwarg can be used to customize the plot and may have the following key-value pairs:

  • axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}

The function returns a dictionary with the following key-value pairs:

  • image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
  • axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
Parameters:
  • df (pandas.DataFrame) – data frame
  • xIndex (int) – index of column to plot on x-axis
  • yIndices (list<int>, optional) – indices of columns to plot on y-axis
  • options (dict, optional) – options dictionary
Returns:

dictionary containing image and axis settings

Return type:

dict
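Because the returned axis dictionary has the same structure as the axis option, one possible use is to feed an adjusted copy back in a follow-up call, for example to zoom in. The sketch below assumes numeric axis values. The zoom_axis helper and the sample window are hypothetical.

```python
def zoom_axis(axis, factor):
    """Shrink each window around its midpoint by `factor` (hypothetical helper)."""
    zoomed = {}
    for name in ("x", "y"):
        start, end = axis[name]["start"], axis[name]["end"]
        mid = (start + end) / 2.0
        half = (end - start) * factor / 2.0
        zoomed[name] = {"start": mid - half, "end": mid + half}
    return zoomed


# Example window of the kind reported in a result's 'axis' entry (made-up values).
returned_axis = {"x": {"start": 0.0, "end": 10.0},
                 "y": {"start": -4.0, "end": 4.0}}

# Options for a follow-up call that shows the central half of each axis.
options = {"axis": zoom_axis(returned_axis, 0.5)}
```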

dcs.view.scatter(df, xIndex, yIndices, options={})

Uses matplotlib to generate a scatter plot of the specified pandas.DataFrame column(s)

This function uses the matplotlib.axes.Axes.scatter() function to generate a scatter plot, exports the chart to a PNG image, and encodes the image into a string using Base64.

The function also performs linear regression using scipy.stats.linregress() to plot a linear trend-line and compute an R² value for each y-axis column. The R² value, along with the Pearson correlation p-value computed with scipy.stats.pearsonr(), is then rendered next to the trend-line.

Note

The function supports plotting multiple columns with respect to one axis, but the number of columns should be limited to 6 for optimal color assignment of the plot points.

Note

The options kwarg can be used to customize the plot and may have the following key-value pairs:

  • axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}

The function returns a dictionary with the following key-value pairs:

  • image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
  • axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
Parameters:
  • df (pandas.DataFrame) – data frame
  • xIndex (int) – index of column to plot on x-axis
  • yIndices (list<int>, optional) – indices of columns to plot on y-axis
  • options (dict, optional) – options dictionary
Returns:

dictionary containing image and axis settings

Return type:

dict
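The regression step described above can be sketched with the two SciPy calls named in this entry. The trend_line function, the label format, and the sample data are hypothetical. How dcs.view actually formats and positions the label is not documented here.

```python
from scipy import stats


def trend_line(x, y):
    """Fit a least-squares line and summarise the fit (hypothetical sketch)."""
    fit = stats.linregress(x, y)              # slope, intercept, rvalue, ...
    _, p_value = stats.pearsonr(x, y)         # Pearson correlation p-value
    r_squared = fit.rvalue ** 2               # coefficient of determination
    label = "R^2 = {:.3f}, p = {:.3g}".format(r_squared, p_value)
    return fit.slope, fit.intercept, r_squared, p_value, label


# Made-up, nearly linear data for one y-axis column.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r_squared, p_value, label = trend_line(x, y)

# Endpoints of the trend-line across the plotted x range.
line_y = [slope * x[0] + intercept, slope * x[-1] + intercept]
```

With several y-axis columns, the same computation would be repeated once per column, giving each trend-line its own label.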