dcs.view
Language: Python
dcs.view is a module of the dcs package that contains functions for generating visualizations from pandas.DataFrame objects. The plotting functions return a dictionary whose image entry is a StringIO buffer holding a Base64 encoded PNG image of the requested chart as generated by matplotlib.
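As a minimal sketch of how a caller might consume such a buffer, the following decodes Base64 text back into PNG bytes. It uses io.StringIO as a Python 3 stand-in for StringIO.StringIO, and fake_image_buffer and decode_png are hypothetical names that are not part of dcs. The payload is only the 8-byte PNG signature, not a real chart.

```python
import base64
import io

# The fixed 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def fake_image_buffer():
    """Hypothetical stand-in for the buffer a dcs.view function returns.

    It holds only the Base64 encoding of the PNG signature, not a real chart.
    """
    return io.StringIO(base64.b64encode(PNG_SIGNATURE).decode("ascii"))


def decode_png(buffer):
    """Read the Base64 text from the buffer and return the raw PNG bytes."""
    buffer.seek(0)  # rewind in case the buffer was read or written before
    return base64.b64decode(buffer.read())


png_bytes = decode_png(fake_image_buffer())
is_png = png_bytes.startswith(PNG_SIGNATURE)
```

The decoded bytes could then be written to a file opened in binary mode, or the Base64 text could be embedded directly in a data URI.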
dcs.view.date(df, xIndex, yIndices, options={})
Uses matplotlib to generate a time-series chart of the specified pandas.DataFrame column(s).
This function uses the matplotlib.axes.Axes.plot() function to plot a line chart, exports the chart to a PNG image and encodes the image into a string using Base64.
Note
The function supports plotting multiple columns with respect to one axis, but the number of columns should be limited to 6 for optimal color assignment of the plot points.
Note
The options kwarg can be used to customize the plot and may have the following key-value pairs:
- axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}. The values in the axis dictionary should be strings that are parseable using dateutil.parser.parse().
The function returns a dictionary with the following key-value pairs:
- image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
- axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}. The values in the axis dictionary are date strings formatted using the ISO 8601 date format.
Parameters:
- df (pandas.DataFrame) – data frame
- xIndex (int) – index of column to plot on x-axis
- yIndices (list<int>, optional) – indices of columns to plot on y-axis
- options (dict, optional) – options dictionary
Returns: dictionary containing image and axis settings
Return type: dict
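The ISO 8601 strings used in the axis dictionary can be produced and read back with the standard library alone, as in the sketch below. The helper iso_window and the sample dates are hypothetical. Only the x entry is built here, and whether dcs.view.date accepts an axis dictionary without a y entry is not stated above, so treat that as an assumption.

```python
from datetime import datetime


def iso_window(start, end):
    """Return a {'start', 'end'} entry holding ISO 8601 date strings."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    return {"start": start.isoformat(), "end": end.isoformat()}


# Hypothetical x-axis window covering the first quarter of 2016.
x_window = iso_window(datetime(2016, 1, 1), datetime(2016, 4, 1))

# Only the x entry is filled in; the y entry is left out of this sketch.
options = {"axis": {"x": x_window}}

# Reading a window value back into a datetime object.
parsed_start = datetime.fromisoformat(options["axis"]["x"]["start"])
```

Strings in this format are also accepted by dateutil.parser.parse(), which is the parser the options description names.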
dcs.view.filterWithSearchQuery(df, columnIndices, query, isRegex=False)
Filters the rows of a pandas.DataFrame object that match a pattern in the specified column(s), returning a pandas.DataFrame object containing the search results.
The search can be performed with a regular expression.
Note
If filtering with multiple columns, a row is considered a match if the pattern occurs in any of the specified columns.
Note
The search is performed on the string representation of the column, meaning a floating point column with value 2 will match the pattern '2.0'.
Parameters:
- df (pandas.DataFrame) – data frame
- columnIndices (list<int>) – indices of columns to include in search
- query (str) – search query or regular expression
- isRegex (bool, optional) – must be set to True if searching using a regular expression
Returns: data frame containing search results (all columns included, not just search columns)
Return type: pandas.DataFrame
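The matching rules in the notes above can be illustrated in plain Python. This sketch is not the dcs implementation: filter_rows is a hypothetical helper that works on lists of rows instead of data frames, and it assumes case-sensitive substring matching, which the text does not confirm.

```python
import re


def filter_rows(rows, column_indices, query, is_regex=False):
    """Keep rows where the query occurs in any of the selected columns."""
    # A literal query is escaped so that characters like '.' match themselves.
    pattern = re.compile(query if is_regex else re.escape(query))
    return [
        row
        for row in rows
        # Matching runs on the string form of each cell, so 2.0 becomes '2.0'.
        if any(pattern.search(str(row[i])) for i in column_indices)
    ]


rows = [
    ["apple", 2.0],
    ["banana", 3.5],
    ["cherry", 12.0],
]

literal_hits = filter_rows(rows, [1], "2.0")
regex_hits = filter_rows(rows, [0, 1], r"^(b|3)", is_regex=True)
```

Note that the literal query '2.0' also matches the value 12.0 here, because its string form contains the query as a substring. As documented for the real function, every column of a matching row is kept, not just the searched ones.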
dcs.view.frequency(df, columnIndex, options={})
Uses matplotlib to generate a horizontal frequency bar chart of the specified pandas.DataFrame column.
This function uses the pandas.Series.value_counts() method (or dcs.analyze.textAnalysis()['word_frequencies'] if plotting word frequency) to get the (value, frequency) tuples for the specified column. A horizontal bar chart is generated with the matplotlib.axes.Axes.barh() function, and the chart is exported to a PNG image and then encoded into a string using Base64.
Note
The options kwarg can be used to customize the plot and may have the following key-value pairs:
- useWords : a bool flag which may be set to True to plot word frequencies instead of row value frequencies for a string column
- cutoff : an int specifying the top n values by frequency to plot, default is 50, maximum is 50
The function returns a dictionary with the following key-value pairs:
- image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
Parameters:
- df (pandas.DataFrame) – data frame
- columnIndex (int) – index of column to plot
- options (dict, optional) – options dictionary
Returns: dictionary containing image
Return type: dict
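The (value, frequency) tuples and the cutoff behaviour can be approximated with collections.Counter, as sketched below. This stands in for pandas.Series.value_counts() and is not how dcs computes the counts. The helper top_frequencies is hypothetical, and splitting words on whitespace is an assumption about what dcs.analyze.textAnalysis does.

```python
from collections import Counter

MAX_CUTOFF = 50  # documented default and maximum for the cutoff option


def top_frequencies(values, use_words=False, cutoff=MAX_CUTOFF):
    """Return up to `cutoff` (value, frequency) tuples, most frequent first."""
    if use_words:
        # Count whitespace-separated words rather than whole cell values.
        items = [word for value in values for word in str(value).split()]
    else:
        items = list(values)
    cutoff = min(cutoff, MAX_CUTOFF)  # never plot more than the maximum
    return Counter(items).most_common(cutoff)


column = ["red apple", "green apple", "red apple", "red pear"]

value_counts = top_frequencies(column, cutoff=2)
word_counts = top_frequencies(column, use_words=True, cutoff=2)
```

Each resulting tuple corresponds to one bar in the horizontal chart, with the value as the label and the frequency as the bar length.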
dcs.view.histogram(df, columnIndices, options={})
Uses matplotlib to generate a histogram of the specified pandas.DataFrame column(s).
The function supports multiple columns. The function uses the matplotlib.axes.Axes.hist() function to plot a histogram, exports the generated chart to a PNG image and encodes the image into a string using Base64.
Note
The options kwarg can be used to customize the plot and may have the following key-value pairs:
- numberOfBins : an int directly passed to bins argument of matplotlib.axes.Axes.hist()
- axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
The function returns a dictionary with the following key-value pairs:
- image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
- axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
Parameters:
- df (pandas.DataFrame) – data frame
- columnIndices (list<int>) – indices of columns to plot
- options (dict, optional) – options dictionary
Returns: dictionary containing image and axis settings
Return type: dict
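When the bins argument of matplotlib.axes.Axes.hist() is an integer, the data range is divided into that many equal-width bins. The plain-Python sketch below shows what a given numberOfBins value means for the counts. The helper equal_width_counts is hypothetical and simplified: unlike matplotlib, it rejects data in which all values are identical.

```python
def equal_width_counts(values, number_of_bins):
    """Count values in `number_of_bins` equal-width bins spanning the data."""
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("values must span a non-zero range")
    width = (high - low) / number_of_bins
    counts = [0] * number_of_bins
    for value in values:
        # The maximum value falls in the last bin, which is closed on the right.
        index = min(int((value - low) / width), number_of_bins - 1)
        counts[index] += 1
    return counts


data = [1, 2, 2, 3, 5, 8, 9, 9, 9, 10]
counts = equal_width_counts(data, 3)
```

With three bins over the range 1 to 10, the bin edges fall at 1, 4, 7 and 10, so a larger numberOfBins gives narrower bars and a finer view of the distribution.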
dcs.view.line(df, xIndex, yIndices, options={})
Uses matplotlib to generate a line chart of the specified pandas.DataFrame column(s).
This function uses the matplotlib.axes.Axes.plot() function to plot a line chart, exports the chart to a PNG image and encodes the image into a string using Base64.
Note
The function supports plotting multiple columns with respect to one axis, but the number of columns should be limited to 6 for optimal color assignment of the plot points.
Note
The options kwarg can be used to customize the plot and may have the following key-value pairs:
- axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
The function returns a dictionary with the following key-value pairs:
- image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
- axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
Parameters:
- df (pandas.DataFrame) – data frame
- xIndex (int) – index of column to plot on x-axis
- yIndices (list<int>, optional) – indices of columns to plot on y-axis
- options (dict, optional) – options dictionary
Returns: dictionary containing image and axis settings
Return type: dict
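The nested axis structure is easy to mistype, so a small builder can help. The sketch below assembles an options dictionary in the documented shape. The helper make_axis_options is hypothetical, and its check that each start is smaller than its end is a sanity check added for this sketch, not a documented requirement of dcs.

```python
def make_axis_options(x_start, x_end, y_start, y_end):
    """Build an options dictionary with the documented axis structure."""
    if x_start >= x_end or y_start >= y_end:
        raise ValueError("each start must be smaller than its end")
    return {
        "axis": {
            "x": {"start": x_start, "end": x_end},
            "y": {"start": y_start, "end": y_end},
        }
    }


# Hypothetical window: x from 0 to 100, y from -1.5 to 1.5.
options = make_axis_options(0, 100, -1.5, 1.5)
```

The axis entry of the returned dictionary has the same shape, so the window of a generated plot can be inspected with the same keys.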
dcs.view.scatter(df, xIndex, yIndices, options={})
Uses matplotlib to generate a scatter plot of the specified pandas.DataFrame column(s).
This function uses the matplotlib.axes.Axes.scatter() function to generate a scatter plot, exports the chart to a PNG image and encodes the image into a string using Base64.
The function also performs linear regression using scipy.stats.linregress() to plot a linear trend-line and compute an R² value for each y-axis column. The R² value, along with the Pearson correlation p-value computed with scipy.stats.pearsonr(), is then rendered next to the trend-line.
Note
The function supports plotting multiple columns with respect to one axis, but the number of columns should be limited to 6 for optimal color assignment of the plot points.
Note
The options kwarg can be used to customize the plot and may have the following key-value pairs:
- axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
The function returns a dictionary with the following key-value pairs:
- image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
- axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
Parameters:
- df (pandas.DataFrame) – data frame
- xIndex (int) – index of column to plot on x-axis
- yIndices (list<int>, optional) – indices of columns to plot on y-axis
- options (dict, optional) – options dictionary
Returns: dictionary containing image and axis settings
Return type: dict
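The trend-line and R² value can be reproduced by hand with ordinary least squares, as in the sketch below. It deliberately avoids scipy, so it is a stand-in for scipy.stats.linregress() and not the code dcs runs. The helper linear_fit and the sample data are hypothetical, and the Pearson p-value is omitted because it requires the t-distribution.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept, r_squared) for a least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear regression, R^2 is the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r_squared = linear_fit(xs, ys)
```

The trend-line for one y-axis column is then the set of points slope * x + intercept evaluated across the x-axis window, and the fit is repeated for each additional column.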