cart

A scenario-discovery-oriented implementation of CART. It is essentially a wrapper around scikit-learn’s version of CART.

ema_workbench.analysis.cart.setup_cart(results, classify, incl_unc=None, mass_min=0.05)

Helper function for performing CART on data generated by the workbench.

Parameters:
  • results (tuple of DataFrame and dict with numpy arrays) – the return from perform_experiments().
  • classify (string, function or callable) – either a string denoting the outcome of interest to use or a function.
  • incl_unc (list of strings, optional) –
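  • mass_min (float, optional) – a value between 0 and 1 indicating the minimum fraction of data points in a terminal leaf. Defaults to 0.05 (see CART).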
Raises:

TypeError – if classify is not a string or a callable.
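The two accepted forms of classify can be illustrated with a minimal sketch. The helper resolve_classify and the outcome name max_P are hypothetical and used only for illustration; this is not the workbench's own code, and plain lists stand in for the numpy arrays found in real results.

```python
def resolve_classify(outcomes, classify):
    """Hypothetical helper, not the workbench's own implementation."""
    if isinstance(classify, str):
        # A string names the outcome of interest directly.
        return outcomes[classify]
    if callable(classify):
        # A callable receives all outcomes and returns the target values.
        return classify(outcomes)
    # Anything else is rejected, mirroring the documented TypeError.
    raise TypeError("classify must be a string or a callable")


# Toy outcomes; "max_P" is an assumed outcome name.
outcomes = {"max_P": [0.2, 1.4, 0.9, 2.1]}

by_name = resolve_classify(outcomes, "max_P")
by_rule = resolve_classify(outcomes, lambda o: [v > 1.0 for v in o["max_P"]])
```

A callable is the more flexible option, because it can combine several outcomes into a single label per experiment.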

class ema_workbench.analysis.cart.CART(x, y, mass_min=0.05, mode=<RuleInductionType.BINARY: 'binary'>)

CART algorithm

Can be used in a manner similar to PRIM. It provides access to the underlying tree, and it can also show the boxes described by the tree in table or graph form, similar to PRIM.

Parameters:
  • x (DataFrame) –
  • y (1D ndarray) –
  • mass_min (float, optional) – a value between 0 and 1 indicating the minimum fraction of data points in a terminal leaf. Defaults to 0.05, identical to prim.
  • mode ({BINARY, CLASSIFICATION, REGRESSION}) – indicates the mode in which CART is used. BINARY indicates binary classification, CLASSIFICATION indicates multiclass classification, and REGRESSION indicates regression.
boxes

list of DataFrame box lims

Type: list
stats

list of dicts with stats

Type: list

Notes

This class is a wrapper around scikit-learn’s CART algorithm. It provides an interface to CART that is more oriented towards scenario discovery, and shares some methods with PRIM.
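Because mass_min is a fraction of the data rather than a count, it can help to see what leaf size it implies for a given number of experiments. The sketch below uses a hypothetical helper, min_leaf_size, and assumes the fraction is rounded up; the workbench may round differently when it passes the value on to scikit-learn.

```python
import math


def min_leaf_size(n_points, mass_min=0.05):
    """Hypothetical helper: smallest leaf allowed for a given mass_min."""
    if not 0 < mass_min <= 1:
        raise ValueError("mass_min must be a fraction between 0 and 1")
    # Assumption: round up so a leaf never holds less than the fraction.
    return max(1, math.ceil(mass_min * n_points))


# With 1000 experiments and the default mass_min of 0.05.
leaf_default = min_leaf_size(1000)
# A coarser tree: every leaf must hold at least a quarter of 10 points.
leaf_coarse = min_leaf_size(10, mass_min=0.25)
```

Raising mass_min gives fewer, larger boxes; lowering it allows a deeper tree with smaller boxes.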

See also

prim

boxes

Property for getting a list of box limits

build_tree()

train CART on the data

show_tree(mplfig=True, format='png')

Return a rendering of the tree, by default as a png.

Parameters:
  • mplfig (bool, optional) – if True (default), returns a matplotlib figure with the tree; otherwise, returns the output as bytes.
  • format ({'png', 'svg'}, default 'png') – the format of the output.
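When mplfig is False, the returned bytes can be written straight to a file. The sketch below shows one way to do that; save_tree_image is a hypothetical helper, and placeholder bytes stand in for real show_tree output so that the block runs without a trained tree.

```python
import tempfile
from pathlib import Path

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_tree_image(data, path, fmt="png"):
    """Hypothetical helper: write bytes from show_tree(mplfig=False) to disk."""
    if fmt not in ("png", "svg"):
        raise ValueError("fmt must be 'png' or 'svg'")
    if fmt == "png" and not data.startswith(PNG_SIGNATURE):
        raise ValueError("data does not look like a PNG image")
    # Give the file a suffix that matches the requested format.
    target = Path(path).with_suffix("." + fmt)
    target.write_bytes(data)
    return target


# With a trained CART instance `alg` (assumed), the bytes would come from:
#   data = alg.show_tree(mplfig=False, format="png")
# Placeholder bytes are used here instead.
fake_png = PNG_SIGNATURE + b"placeholder"
with tempfile.TemporaryDirectory() as tmp:
    written = save_tree_image(fake_png, Path(tmp) / "tree")
    written_name = written.name
    written_size = written.stat().st_size
```

The signature check is only a cheap sanity test for the png case; svg output is text-based and is written as-is.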
stats

property for getting a list of dicts containing the statistics for each box