dcs.view
Language: Python
dcs.view is a module of the dcs package that contains functions for generating visualizations from pandas.DataFrame objects. The charting functions return a dictionary whose image entry is a StringIO buffer holding a Base64 encoded PNG image of the requested chart as generated by matplotlib; filterWithSearchQuery returns a pandas.DataFrame instead.
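The image buffer holds Base64 text rather than raw PNG bytes, so a caller has to decode it before writing it to disk or checking its contents. The following is a minimal sketch of that step, assuming Python 3 with io.StringIO standing in for StringIO.StringIO; decode_image is a hypothetical helper, not part of dcs, and the stand-in buffer encodes only the PNG signature rather than a real chart.

```python
import base64
import io

# First 8 bytes of every PNG file (the PNG signature).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_image(buffer):
    """Return the raw bytes held, Base64 encoded, in a text buffer."""
    buffer.seek(0)  # rewind in case the buffer was already read
    return base64.b64decode(buffer.getvalue())


# Stand-in for the image buffer returned by a charting function.
# Only the PNG signature is encoded here, not a full image.
fake_image = io.StringIO(base64.b64encode(PNG_SIGNATURE).decode("ascii"))

png_bytes = decode_image(fake_image)
is_png = png_bytes.startswith(PNG_SIGNATURE)
```

The decoded bytes can then be written to a file opened in binary mode, or the Base64 text can be embedded unchanged in a data URI.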
-
dcs.view.date(df, xIndex, yIndices, options={})
Uses matplotlib to generate a time-series chart of the specified pandas.DataFrame column(s).
This function uses the matplotlib.axes.Axes.plot() function to plot a line chart, exports the chart to a PNG image and encodes the image into a string using Base64.
Note
The function supports plotting multiple columns with respect to one axis, but the number of columns should be limited to 6 for optimal color assignment of the plot points.
Note
The options kwarg can be used to customize the plot and may have the following key-value pairs:
- axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}. The values in the axis dictionary should be strings that are parseable using dateutil.parser.parse().
The function returns a dictionary with the following key-value pairs:
- image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
- axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}. The values in the axis dictionary are date strings formatted using the ISO8601 date format.
Parameters: - df (pandas.DataFrame) – data frame
- xIndex (int) – index of column to plot on x-axis
- yIndices (list<int>, optional) – indices of columns to plot on y-axis
- options (dict, optional) – options dictionary
Returns: dictionary containing image and axis settings
Return type: dict
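Because the returned axis values are ISO 8601 strings, and ISO 8601 is one of the formats dateutil.parser.parse() accepts, a start/end pair can be produced from datetime objects and read back again. The sketch below shows one such pair only and uses the standard library's datetime.fromisoformat() for the read-back instead of dateutil; to_axis_bounds and from_axis_bounds are hypothetical helper names, not part of dcs.

```python
from datetime import datetime


def to_axis_bounds(start, end):
    """Format two datetimes as the 'start'/'end' strings of one axis entry."""
    return {"start": start.isoformat(), "end": end.isoformat()}


def from_axis_bounds(bounds):
    """Parse the ISO 8601 'start'/'end' strings back into datetimes."""
    return (
        datetime.fromisoformat(bounds["start"]),
        datetime.fromisoformat(bounds["end"]),
    )


x_bounds = to_axis_bounds(datetime(2015, 1, 1), datetime(2015, 12, 31, 23, 59))
# x_bounds == {'start': '2015-01-01T00:00:00', 'end': '2015-12-31T23:59:00'}

start, end = from_axis_bounds(x_bounds)
```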
-
dcs.view.filterWithSearchQuery(df, columnIndices, query, isRegex=False)
Filters the rows of a pandas.DataFrame object that match a pattern in the specified column(s), returning a pandas.DataFrame object containing the search results.
The search can be performed with a regular expression.
Note
If filtering with multiple columns, a row is considered a match if the pattern occurs in any of the specified columns.
Note
The search is performed on the string representation of the column, meaning a floating point column with value 2 will match the pattern '2.0'.
Parameters: - df (pandas.DataFrame) – data frame
- columnIndices (list<int>) – indices of columns to include in search
- query (str) – search query or regular expression
- isRegex (bool, optional) – must be set to True if searching using a regular expression
Returns: data frame containing search results (all columns included, not just search columns)
Return type: pandas.DataFrame
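The matching rules in the two notes can be illustrated without pandas. This is a plain-Python sketch of the described semantics, not the function's implementation: rows are tuples, row_matches is a hypothetical helper, and case-sensitive matching is an assumption the documentation does not confirm.

```python
import re


def row_matches(row, column_indices, query, is_regex=False):
    """True if the query occurs in the string form of any selected cell."""
    # A plain query is escaped so it is matched literally.
    pattern = query if is_regex else re.escape(query)
    return any(re.search(pattern, str(row[i])) for i in column_indices)


rows = [
    ("alice", 2.0, "london"),
    ("bob", 3.5, "paris"),
    ("carol", 12.0, "lima"),
]

# Plain search on the float column: str(2.0) and str(12.0) both contain '2.0'.
plain = [r for r in rows if row_matches(r, [1], "2.0")]

# Regex search across two columns: a hit in either column keeps the row.
regex = [r for r in rows if row_matches(r, [0, 2], r"^(bob|lima)$", is_regex=True)]
```

Each kept row retains all of its columns, mirroring the documented behaviour that the result includes every column and not just the searched ones.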
-
dcs.view.frequency(df, columnIndex, options={})
Uses matplotlib to generate a horizontal frequency bar chart of the specified pandas.DataFrame column.
This function uses the pandas.Series.value_counts() method (or dcs.analyze.textAnalysis()['word_frequencies'] if plotting word frequency) to get the (value, frequency) tuples for the specified column. A horizontal bar chart is generated with the matplotlib.axes.Axes.barh() function, and the chart is exported to a PNG image and then encoded into a string using Base64.
Note
The options kwarg can be used to customize the plot and may have the following key-value pairs:
- useWords : a bool flag which may be set to True to plot word frequencies instead of row value frequencies for a string column
- cutoff : an int specifying the top n values by frequency to plot, default is 50, maximum is 50
The function returns a dictionary with the following key-value pairs:
- image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
Parameters: - df (pandas.DataFrame) – data frame
- columnIndex (int) – index of column to plot
- options (dict, optional) – options dictionary
Returns: dictionary containing image
Return type: dict
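The effect of the useWords and cutoff options can be sketched with the standard library's collections.Counter in place of pandas.Series.value_counts(). Two details here are assumptions rather than documented behaviour: cutoff values above 50 are clamped to 50, and words are split on whitespace (dcs.analyze.textAnalysis may tokenize differently). top_frequencies is a hypothetical helper name.

```python
from collections import Counter

MAX_CUTOFF = 50  # documented default and maximum for the cutoff option


def top_frequencies(values, cutoff=MAX_CUTOFF, use_words=False):
    """Return the top (value, frequency) pairs, most frequent first."""
    if use_words:
        # Count whitespace-separated words instead of whole cell values.
        values = [word for value in values for word in str(value).split()]
    n = max(1, min(cutoff, MAX_CUTOFF))  # assumed clamping behaviour
    return Counter(values).most_common(n)


column = ["red apple", "green apple", "red apple", "red pear"]

by_value = top_frequencies(column, cutoff=2)
by_word = top_frequencies(column, cutoff=2, use_words=True)
```

With this data the most frequent row value is 'red apple' (2 occurrences), while the word counts are led by 'red' and 'apple' (3 occurrences each).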
-
dcs.view.histogram(df, columnIndices, options={})
Uses matplotlib to generate a histogram of the specified pandas.DataFrame column(s).
The function supports multiple columns. The function uses the matplotlib.axes.Axes.hist() function to plot a histogram, exports the generated chart to a PNG image and encodes the image into a string using Base64.
Note
The options kwarg can be used to customize the plot and may have the following key-value pairs:
- numberOfBins : an int directly passed to the bins argument of matplotlib.axes.Axes.hist()
- axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
The function returns a dictionary with the following key-value pairs:
- image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
- axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
Parameters: - df (pandas.DataFrame) – data frame
- columnIndices (list<int>) – indices of columns to plot
- options (dict, optional) – options dictionary
Returns: dictionary containing image and axis settings
Return type: dict
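An options dictionary for this function can be assembled from the two documented keys. The sketch below assumes each key may be omitted independently, which the documentation implies ("may have") but does not state outright; histogram_options is a hypothetical helper, and the final call is shown only as a comment because it needs a real data frame.

```python
def histogram_options(number_of_bins=None, x_range=None, y_range=None):
    """Assemble an options dict using only the documented keys."""
    options = {}
    if number_of_bins is not None:
        options["numberOfBins"] = int(number_of_bins)
    # The axis entry is only added when both windows are known.
    if x_range is not None and y_range is not None:
        options["axis"] = {
            "x": {"start": x_range[0], "end": x_range[1]},
            "y": {"start": y_range[0], "end": y_range[1]},
        }
    return options


options = histogram_options(number_of_bins=20, x_range=(0, 100), y_range=(0, 250))
# A call would then look like: dcs.view.histogram(df, [0, 1], options)
```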
-
dcs.view.line(df, xIndex, yIndices, options={})
Uses matplotlib to generate a line chart of the specified pandas.DataFrame column(s).
This function uses the matplotlib.axes.Axes.plot() function to plot a line chart, exports the chart to a PNG image and encodes the image into a string using Base64.
Note
The function supports plotting multiple columns with respect to one axis, but the number of columns should be limited to 6 for optimal color assignment of the plot points.
Note
The options kwarg can be used to customize the plot and may have the following key-value pairs:
- axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
The function returns a dictionary with the following key-value pairs:
- image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
- axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
Parameters: - df (pandas.DataFrame) – data frame
- xIndex (int) – index of column to plot on x-axis
- yIndices (list<int>, optional) – indices of columns to plot on y-axis
- options (dict, optional) – options dictionary
Returns: dictionary containing image and axis settings
Return type: dict
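One way to respect the recommended limit of 6 columns per chart is to split a longer list of y-axis column indices into batches and draw one chart per batch. This is a caller-side sketch, not something the function does itself; batch_columns is a hypothetical helper name.

```python
MAX_SERIES = 6  # column limit recommended in the note above


def batch_columns(y_indices, size=MAX_SERIES):
    """Split column indices into consecutive groups of at most `size`."""
    return [y_indices[i:i + size] for i in range(0, len(y_indices), size)]


# Fourteen y-axis columns (indices 1 to 14) become three batches.
batches = batch_columns(list(range(1, 15)))
# Each batch could then be passed as yIndices to a separate dcs.view.line call.
```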
-
dcs.view.scatter(df, xIndex, yIndices, options={})
Uses matplotlib to generate a scatter plot of the specified pandas.DataFrame column(s).
This function uses the matplotlib.axes.Axes.scatter() function to generate a scatter plot, exports the chart to a PNG image and encodes the image into a string using Base64.
The function also performs linear regression using scipy.stats.linregress() to plot a linear trend-line and compute an R² value for each y-axis column. The R² value, along with the Pearson correlation p-value computed with scipy.stats.pearsonr(), is then rendered next to the trend-line.
Note
The function supports plotting multiple columns with respect to one axis, but the number of columns should be limited to 6 for optimal color assignment of the plot points.
Note
The options kwarg can be used to customize the plot and may have the following key-value pairs:
- axis : a dict specifying the axis/window settings for the plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
The function returns a dictionary with the following key-value pairs:
- image : StringIO.StringIO – StringIO.StringIO object containing Base64 encoded PNG image of generated plot
- axis : dict – dictionary containing axis/window settings for the generated plot with the structure {'x': {'start': x-axis min, 'end': x-axis max}, 'y': {'start': y-axis min, 'end': y-axis max}}
Parameters: - df (pandas.DataFrame) – data frame
- xIndex (int) – index of column to plot on x-axis
- yIndices (list<int>, optional) – indices of columns to plot on y-axis
- options (dict, optional) – options dictionary
Returns: dictionary containing image and axis settings
Return type: dict
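The trend-line and R² value come from an ordinary least-squares fit of each y-axis column against the x-axis column. The pure-Python sketch below computes the same three quantities by hand to show what they mean; it is not how the function is implemented (that uses scipy.stats.linregress()), it omits the p-value, and linear_fit is a hypothetical helper name.

```python
def linear_fit(xs, ys):
    """Least-squares slope, intercept and R² for paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R² is the square of the Pearson correlation coefficient.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Points lying exactly on y = 2x + 1 give a perfect fit.
slope, intercept, r_squared = linear_fit([1, 2, 3, 4], [3, 5, 7, 9])
```

With scipy, the equivalent R² is the square of the rvalue field returned by scipy.stats.linregress().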