cart

A scenario-discovery-oriented implementation of CART. It is essentially a wrapper around scikit-learn’s implementation of CART.

class ema_workbench.analysis.cart.CART(x, y, mass_min=0.05, mode=RuleInductionType.BINARY)

CART algorithm

Can be used in a manner similar to PRIM. It provides access to the underlying tree, and it can also show the boxes described by the tree in table or graph form, similar to PRIM.

Parameters
  • x (DataFrame) –

  • y (1D ndarray) –

  • mass_min (float, optional) – a value between 0 and 1 indicating the minimum fraction of data points in a terminal leaf. Defaults to 0.05, identical to prim.

  • mode ({BINARY, CLASSIFICATION, REGRESSION}) – indicates the mode in which CART is used. Binary indicates binary classification, classification is multiclass, and regression is regression.
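To make the mass_min parameter concrete, the sketch below converts the fraction into a minimum number of data points per terminal leaf. The helper name min_leaf_size and the truncation to an integer are assumptions made for illustration; they are not taken from the workbench source.

```python
def min_leaf_size(n_points: int, mass_min: float = 0.05) -> int:
    """Translate a fractional mass_min into a minimum leaf size (hypothetical helper)."""
    if not 0 < mass_min < 1:
        raise ValueError("mass_min must lie between 0 and 1")
    # Assumption: the fraction is applied to the number of rows and truncated;
    # at least one data point is always required.
    return max(1, int(mass_min * n_points))


print(min_leaf_size(1000))        # default mass_min of 0.05
print(min_leaf_size(1000, 0.25))
```

With 1000 experiments and the default of 0.05, a leaf under this assumption has to hold at least 50 data points, so a larger mass_min yields fewer and coarser boxes.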

boxes

list of DataFrame box lims

Type

list

stats

list of dicts with stats

Type

list

Notes

This class is a wrapper around scikit-learn’s CART algorithm. It provides an interface to CART that is more oriented towards scenario discovery, and it shares some methods with PRIM.

See also

prim

property boxes

Return type: list with box limits for each terminal leaf
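The relation between a tree and its boxes can be illustrated without scikit-learn. The sketch below walks a small hand-written tree and collects, for every terminal leaf, the lower and upper limit on each feature. The nested-dict tree layout and the function leaf_boxes are hypothetical, the result is a list of plain dicts rather than DataFrames, and the convention that the left branch holds values at or below the threshold is an assumption.

```python
import math


def leaf_boxes(node, limits):
    """Return one {feature: (lower, upper)} dict per terminal leaf."""
    if "feature" not in node:          # terminal leaf: the limits describe its box
        return [dict(limits)]

    name, threshold = node["feature"], node["threshold"]
    lower, upper = limits[name]

    # Left branch: values at or below the threshold, so tighten the upper limit.
    left = leaf_boxes(node["left"], {**limits, name: (lower, min(upper, threshold))})
    # Right branch: values above the threshold, so tighten the lower limit.
    right = leaf_boxes(node["right"], {**limits, name: (max(lower, threshold), upper)})
    return left + right


# Hypothetical tree: split on "a" at 0.5, then on "b" at 2.0 in the right branch.
tree = {
    "feature": "a", "threshold": 0.5,
    "left": {"value": 0},
    "right": {
        "feature": "b", "threshold": 2.0,
        "left": {"value": 1},
        "right": {"value": 0},
    },
}

start = {"a": (-math.inf, math.inf), "b": (-math.inf, math.inf)}
boxes = leaf_boxes(tree, start)
for box in boxes:
    print(box)
```

Each of the three leaves yields one box. A feature that is never split on along the path keeps its unrestricted limits.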

build_tree()

Train CART on the data.

show_tree(mplfig=True, format='png')

Return a PNG (default) or SVG of the tree.

On Windows, graphviz needs to be installed with conda.

Parameters
  • mplfig (bool, optional) – if True (default), return a matplotlib figure with the tree; otherwise, return the output as bytes.

  • format ({'png', 'svg'}, default 'png') – the format of the output.

property stats

Return type: list with scenario discovery statistics for each terminal leaf
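For binary classification, the statistics conventionally used in scenario discovery are coverage, density, and mass. The sketch below computes these three for a single box from plain Python lists. Which statistics the stats property actually reports, and under which keys, is not stated here, so the function box_stats and its dict keys are assumptions.

```python
def box_stats(in_box, y):
    """Compute coverage, density and mass for one box (hypothetical helper).

    in_box -- list of bools, True where a data point falls inside the box
    y      -- list of 0/1 labels, 1 marking a case of interest
    """
    n_in_box = sum(in_box)
    n_interest = sum(y)
    hits = sum(1 for inside, label in zip(in_box, y) if inside and label == 1)

    return {
        # share of all cases of interest that the box captures
        "coverage": hits / n_interest if n_interest else 0.0,
        # share of points in the box that are of interest
        "density": hits / n_in_box if n_in_box else 0.0,
        # share of all data points that fall in the box
        "mass": n_in_box / len(y),
    }


y = [1, 1, 1, 0, 0, 0, 0, 0]
in_box = [True, True, False, True, False, False, False, False]
stats = box_stats(in_box, y)
print(stats)
```

In this made-up data set the box holds three of eight points, two of which are among the three cases of interest.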

ema_workbench.analysis.cart.setup_cart(results, classify, incl_unc=None, mass_min=0.05)

Helper function for performing CART on data generated by the workbench.

Parameters
  • results (tuple of DataFrame and dict with numpy arrays) – the return value of perform_experiments().

  • classify (string, function or callable) – either a string denoting the outcome of interest to use, or a function or other callable.

  • incl_unc (list of strings, optional) –

  • mass_min (float, optional) –

Raises

TypeError – if classify is not a string or a callable.
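The handling of classify described above can be sketched as a small dispatch on its type. The function resolve_classify and the outcome name max_P are hypothetical; the sketch uses plain lists instead of numpy arrays and is not the workbench implementation.

```python
def resolve_classify(outcomes, classify):
    """Turn classify into the target values y (hypothetical helper)."""
    if isinstance(classify, str):
        # A string names the outcome of interest; use its values directly.
        return outcomes[classify]
    if callable(classify):
        # A callable receives all outcomes and returns the target itself.
        return classify(outcomes)
    raise TypeError("classify must be a string or a callable")


outcomes = {"max_P": [0.2, 1.4, 0.9, 2.1]}   # made-up outcome data

y_by_name = resolve_classify(outcomes, "max_P")
y_by_rule = resolve_classify(outcomes, lambda o: [v > 1.0 for v in o["max_P"]])
print(y_by_name)
print(y_by_rule)
```

Passing anything else, for example an integer, raises TypeError in this sketch, mirroring the documented behavior.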