

C&RTs

Classification and regression trees (C&RTs) are binary decision trees that predict a response $Y$ from a set of features $X$. If $Y$ is qualitative, the tree is a classification tree; if $Y$ is quantitative, it is a regression tree.
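To make the definition concrete, here is a minimal Python sketch of a binary decision tree and how it maps a feature vector to a prediction. It is only an illustration and not the code in carts/: the names Leaf, Split and predict, the convention that the left branch is taken when the feature is at most the threshold, and the labels and thresholds in the toy tree are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    # A class label for a classification tree, a number for a regression tree.
    prediction: Union[str, float]


@dataclass
class Split:
    feature: int       # index of the feature tested at this node
    threshold: float   # the split point
    left: "Node"       # followed when x[feature] <= threshold (assumed convention)
    right: "Node"      # followed otherwise


Node = Union[Leaf, Split]


def predict(node: Node, x: Sequence[float]) -> Union[str, float]:
    """Walk from the root to a leaf and return that leaf's prediction."""
    while isinstance(node, Split):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.prediction


# Hypothetical classification tree over three integer-valued features.
tree = Split(0, 4.5,
             Leaf("neg"),
             Split(2, 2.5, Leaf("neg"), Leaf("pos")))

print(predict(tree, [5, 0, 3]))
```

Replacing the string labels in the leaves with numbers turns the same structure into a regression tree.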

The programs implementing experiments over C&RTs are in carts/. We have implemented two priors based on work done independently by [Chipman H 1998] and [Denison H. A. 1998], and later developed further in [Denison et al. 2002]. Three datasets from the two papers are included: the Wisconsin breast cancer data (BCW), the Kyphosis dataset (K) and the Pima Indians Diabetes dataset (PIMA) (ftp://ftp.ics.uci.edu/pub/machine-learning-databases/pima-indians-diabetes/). BCW contains 16 missing data values. Following [Chipman H 1998] we have simply deleted the datapoints which contain missing values. For both BCW and K the machine learning task is binary classification using integer-valued predictors: nine predictors in the case of BCW, and three for K. All splits are binary: each is made by thresholding a predictor. There are 683 datapoints for BCW and 81 for K.
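The deletion of datapoints with missing values can be sketched as follows. This is an illustrative Python fragment, not the preprocessing actually shipped in carts/. It assumes that each datapoint is a list of string fields with the class label last and that a missing value is marked with "?". The rows shown are made up for the example and are not taken from BCW.

```python
MISSING = "?"  # assumed marker for a missing value


def drop_incomplete(rows):
    """Keep only the datapoints in which every field is present."""
    return [row for row in rows if MISSING not in row]


def to_example(row):
    """Split a complete row into integer-valued predictors and a class label."""
    *predictors, label = row
    return [int(value) for value in predictors], label


# Made-up rows: three predictors followed by a class label.
raw = [
    ["5", "1", "1", "a"],
    ["8", "?", "3", "b"],   # contains a missing value, so it is deleted
    ["4", "2", "1", "a"],
]

complete = drop_incomplete(raw)
examples = [to_example(row) for row in complete]
print(len(raw), "->", len(examples))
```

Dropping whole datapoints is the simplest way to handle missing values. It loses information when many rows are affected, but it leaves every remaining predictor directly comparable against a threshold.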




Nicos Angelopoulos 2008-06-02