cart

A scenario-discovery-oriented implementation of CART. It is essentially a wrapper around scikit-learn’s implementation of CART.

class ema_workbench.analysis.cart.CART(x, y, mass_min=0.05, mode=RuleInductionType.BINARY)

CART algorithm

Can be used in a manner similar to PRIM. It provides access to the underlying tree, and it can also show the boxes described by the tree in table or graph form, similar to PRIM.

Parameters:
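  • x (DataFrame) – the independent variables (for example, the experiments)

  • y (1D ndarray) – the dependent variable (the outcome to explain)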

  • mass_min (float, optional) – a value between 0 and 1 indicating the minimum fraction of data points in a terminal leaf. Defaults to 0.05, the same default as PRIM.

  • mode ({BINARY, CLASSIFICATION, REGRESSION}) – indicates the mode in which CART is used. Binary indicates binary classification, classification is multiclass, and regression is regression.
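To make mass_min concrete, the sketch below converts the fraction into a minimum number of data points per terminal leaf. The helper name min_leaf_size is hypothetical, and the rounding rule (truncation, with a floor of one point) is an assumption for illustration, not a statement about the library's internals.

```python
def min_leaf_size(n_points, mass_min=0.05):
    """Translate a leaf-mass fraction into a minimum sample count per leaf.

    Hypothetical helper: the rounding rule is an assumption.
    """
    if not 0 < mass_min <= 1:
        raise ValueError("mass_min must lie between 0 and 1")
    # Truncate toward zero, but never allow an empty leaf.
    return max(1, int(mass_min * n_points))
```

Under these assumptions, 1000 data points with the default mass_min of 0.05 would require at least 50 points in every terminal leaf. A larger mass_min gives fewer, coarser boxes.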

boxes

list of DataFrame box lims

Type:

list

stats

list of dicts with stats

Type:

list

Notes

This class is a wrapper around scikit-learn’s CART algorithm. It provides an interface to CART that is more oriented towards scenario discovery, and shares some methods with PRIM.

See also

prim

property boxes

Return type: list with box limits for each terminal leaf

build_tree()

Train CART on the data.
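A typical workflow is to construct the object, call build_tree(), and then read the boxes and stats properties. The sketch below does this on synthetic data. The function name fit_cart_demo, the column names, and the classification rule are illustrative assumptions. It requires numpy, pandas, and ema_workbench to be installed.

```python
def fit_cart_demo(n_points=500, seed=42):
    """Fit CART on synthetic data and return (boxes, stats)."""
    # Imports are local so the sketch can be loaded even where the
    # third-party dependencies are not installed.
    import numpy as np
    import pandas as pd
    from ema_workbench.analysis import cart

    rng = np.random.default_rng(seed)
    # Two uncertain inputs sampled uniformly on [0, 1] (synthetic data).
    x = pd.DataFrame({"a": rng.random(n_points), "b": rng.random(n_points)})
    # Binary outcome of interest: 1 where both inputs are in the upper half.
    y = ((x["a"] > 0.5) & (x["b"] > 0.5)).to_numpy().astype(int)

    alg = cart.CART(x, y, mass_min=0.05)
    alg.build_tree()
    # One box and one stats entry per terminal leaf.
    return alg.boxes, alg.stats
```

Calling fit_cart_demo() should return two lists of equal length, one entry per terminal leaf of the fitted tree.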

show_tree(mplfig=True, format='png')

Return a PNG (default) or SVG of the tree.

On Windows, graphviz needs to be installed with conda.

Parameters:
  • mplfig (bool, optional) – if True (default), returns a matplotlib figure with the tree; otherwise, returns the output as bytes

  • format ({'png', 'svg'}, default 'png') – the format of the output.

property stats

Return type: list with scenario discovery statistics for each terminal leaf
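This page does not list which statistics are reported. As an illustration only, the sketch below computes three measures commonly used in scenario discovery for a single leaf in the binary case: coverage, density, and mass. The function name leaf_stats and the dictionary keys are assumptions, and the inputs are plain Python lists rather than the library's data structures.

```python
def leaf_stats(in_leaf, y):
    """Compute coverage, density, and mass for one leaf (binary case).

    in_leaf: list of bool, True where a data point falls in the leaf.
    y: list of 0/1 labels, 1 marking a case of interest.
    Assumes the leaf is non-empty and y contains at least one 1.
    """
    n_total = len(y)
    n_in_leaf = sum(in_leaf)
    n_of_interest = sum(y)
    # Cases of interest that fall inside the leaf.
    hits = sum(1 for inside, label in zip(in_leaf, y) if inside and label)
    return {
        "coverage": hits / n_of_interest,  # share of all cases of interest captured
        "density": hits / n_in_leaf,       # share of the leaf that is of interest
        "mass": n_in_leaf / n_total,       # share of all data points in the leaf
    }
```

A leaf with high density but low coverage isolates cases of interest cleanly but misses many of them. The reverse indicates a broad leaf that captures most cases along with many uninteresting ones.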

ema_workbench.analysis.cart.setup_cart(results, classify, incl_unc=None, mass_min=0.05)

Helper function for running CART on data generated by the workbench.

Parameters:
  • results (tuple of DataFrame and dict with numpy arrays) – the return from perform_experiments().

  • classify (string, function or callable) – either a string denoting the outcome of interest to use or a function.

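  • incl_unc (list of strings, optional) – names of the uncertainties to include in the analysis

  • mass_min (float, optional) – minimum fraction of data points in a terminal leaf; see CART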

Raises:

TypeError – if classify is not a string or a callable.
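The two accepted forms of classify, and the TypeError for anything else, can be sketched with a small stand-in. The helper resolve_classify is hypothetical and is not the library's code. Here outcomes is a plain dict of lists, standing in for the dict of numpy arrays returned by perform_experiments(), and the outcome name "cost" is invented for the example.

```python
def resolve_classify(outcomes, classify):
    """Turn `classify` into the dependent variable y (illustrative stand-in)."""
    if isinstance(classify, str):
        # A string selects one outcome by name.
        return outcomes[classify]
    if callable(classify):
        # A callable receives all outcomes and returns the labels.
        return classify(outcomes)
    raise TypeError("classify must be a string or a callable")


def high_cost(outcomes):
    """Example classifier: flag runs whose 'cost' exceeds 10."""
    return [1 if value > 10 else 0 for value in outcomes["cost"]]
```

With outcomes = {"cost": [5, 12, 20]}, passing "cost" selects the raw values, while passing high_cost yields the binary labels [0, 1, 1].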