TRANCE is a system for generating rough models of data. A rough model consists of a partition of the data set into a number of clusters, which are labelled with decisions. The system automatically searches for partitions which optimize a pre-defined performance measure. The conceptual simplicity of the underlying models makes them easy to interpret. They can also be used for extracting relevant rules. The low complexity of the involved algorithms allows for automatic (or semi-automatic) discovery of relevant attributes.
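To make the notion of a rough model concrete, the following minimal Python sketch builds one from a small numeric table. It is an illustration only and not TRANCE's actual implementation: the function names, the use of per-attribute cut points to define clusters, and the majority-vote labelling are all assumptions made for the example.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def cell_of(row, cut_points):
    """Map a numeric row to a cluster id: one interval index per attribute."""
    return tuple(bisect_right(cuts, value) for value, cuts in zip(row, cut_points))


def build_rough_model(rows, decisions, cut_points):
    """Group rows into clusters and label each cluster with a decision.

    Assumption: a cluster is labelled with its majority decision.
    """
    members = defaultdict(list)
    for row, decision in zip(rows, decisions):
        members[cell_of(row, cut_points)].append(decision)

    model = {}
    for cell, labels in members.items():
        label, count = Counter(labels).most_common(1)[0]
        model[cell] = {
            "decision": label,
            "size": len(labels),
            "purity": count / len(labels),
        }
    return model


# Toy data: two numeric attributes and a binary decision (all values made up).
rows = [(1.0, 10.0), (2.0, 12.0), (8.0, 11.0), (9.0, 30.0), (7.5, 28.0)]
decisions = [0, 0, 1, 1, 1]
cut_points = [[5.0], [20.0]]  # hypothetical thresholds, one list per attribute

model = build_rough_model(rows, decisions, cut_points)
for cell, info in sorted(model.items()):
    print(cell, info)
```

Each key of `model` identifies one cluster of the partition, and the stored decision, size, and purity correspond to one row of a cluster table.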
The current version of TRANCE can handle only numerical data tables which are provided as text files.
The system generates rough data models which are represented in the form of tables (lists of clusters), plots (various performance measures, e.g., gain and response curves), and lists of rules.
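As an illustration of the performance plots mentioned above, the sketch below computes the points of a gain curve and a response curve in Python. It assumes the definitions commonly used in marketing applications: cases are ranked by model score, the gain at a given depth is the share of all positive cases found in the top-ranked group, and the response is the share of positives within that group. The function name and the toy data are hypothetical, and TRANCE's own definitions may differ.

```python
def gain_and_response(scores, targets, steps=4):
    """Return (fraction_selected, gain, response) for evenly spaced depths.

    scores  -- model scores, higher means more likely positive
    targets -- 0/1 outcomes aligned with scores
    """
    if len(scores) < steps:
        raise ValueError("need at least one case per step")
    # Rank the outcomes by descending score.
    ranked = [t for _, t in sorted(zip(scores, targets),
                                   key=lambda pair: pair[0], reverse=True)]
    total_positives = sum(ranked)
    if total_positives == 0:
        raise ValueError("gain is undefined without positive cases")

    points = []
    for k in range(1, steps + 1):
        top = ranked[: round(len(ranked) * k / steps)]
        hits = sum(top)
        points.append((k / steps, hits / total_positives, hits / len(top)))
    return points


# Toy example with made-up scores and outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
targets = [1, 1, 0, 1, 0, 0, 0, 0]

curve = gain_and_response(scores, targets, steps=4)
for fraction, gain, response in curve:
    print(f"top {fraction:.0%}: gain={gain:.2f} response={response:.2f}")
```

Plotting gain or response against the fraction selected yields the corresponding curve.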
TRANCE is a system that supports the process of constructing and evaluating rough data models. Basically, it can be used for processing numeric-valued data tables. It has been implemented in MATLAB, a general-purpose system for matrix computations and visualization. TRANCE contains a number of modules that cover different stages of the model construction process. The Data Pre-Processing Module contains numerous procedures (in MATLAB's terminology: m-files) that are useful for data pre-processing: procedures for handling outliers, for discretizing attributes, for statistical data analysis, for processing time series (rescaling, smoothing, aggregation), etc. The central part of the system is the Search Module. It is responsible for searching through the pre-defined space of models for an optimal one. The results of this module are then evaluated with various performance measures. Finally, the Rule Extraction Module generates rules from the best model. All modules are integrated by the MATLAB command interpreter, which takes care of handling procedure calls and parameter passing. Small data sets can be processed interactively; bigger sets are processed in batch mode.
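The time-series operations named for the Data Pre-Processing Module (rescaling, smoothing, aggregation) can be sketched as follows. This is a generic Python illustration and not a rendering of TRANCE's m-files: the function names, the min-max rescaling, the trailing moving average, and the block-sum aggregation are assumptions chosen for the example.

```python
def rescale(series):
    """Min-max rescaling to the interval [0, 1]."""
    lo, hi = min(series), max(series)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in series]


def smooth(series, window=3):
    """Trailing moving average; the first values use a shorter window."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def aggregate(series, block=2):
    """Sum consecutive blocks, e.g. daily values into two-day totals."""
    return [sum(series[i:i + block]) for i in range(0, len(series), block)]


# Made-up daily measurements.
daily = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

print(rescale(daily))
print(smooth(daily, window=3))
print(aggregate(daily, block=2))
```

Each function takes and returns a plain list, so the steps can be chained in any order before the data are passed on to a model search.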
TRANCE tries to find partitions of the universe which optimize a certain criterion. The objective function is usually expressed in terms of local properties of the resulting classification model (e.g., the highest classification rate of decision rules which cover at least 5% of all cases). The system uses either systematic or local search. It makes no use of background knowledge.
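The following Python sketch illustrates one way such a systematic search could look. It is not TRANCE's algorithm. It assumes that candidate partitions are generated by choosing a subset of attributes and binning each into equal-width intervals, and that the objective is the highest rate of a target decision among clusters covering at least a given share of all cases. All function names, parameters, and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def discretize(column, bins):
    """Equal-width binning of one numeric column into interval indices."""
    lo, hi = min(column), max(column)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant columns
    return [min(int((v - lo) / width), bins - 1) for v in column]


def best_rule_rate(cells, decisions, min_coverage, target):
    """Objective: highest target rate among sufficiently large clusters."""
    groups = defaultdict(list)
    for cell, decision in zip(cells, decisions):
        groups[cell].append(decision)
    needed = min_coverage * len(decisions)
    rates = [sum(d == target for d in g) / len(g)
             for g in groups.values() if len(g) >= needed]
    return max(rates, default=0.0)


def systematic_search(table, decisions, n_attrs=2, bins=2,
                      min_coverage=0.05, target=1):
    """Try every attribute subset of size n_attrs and keep the best one."""
    columns = list(zip(*table))
    binned = [discretize(col, bins) for col in columns]
    best = (-1.0, None)
    for attrs in combinations(range(len(columns)), n_attrs):
        cells = list(zip(*(binned[a] for a in attrs)))
        score = best_rule_rate(cells, decisions, min_coverage, target)
        if score > best[0]:
            best = (score, attrs)
    return best


# Made-up table: the decision depends on attributes 0 and 1; attribute 2 is noise.
table = [(1, 1, 5), (1, 9, 1), (9, 1, 9), (9, 9, 2),
         (2, 2, 8), (2, 8, 3), (8, 2, 1), (8, 8, 9)]
decisions = [0, 0, 0, 1, 0, 0, 0, 1]

best = systematic_search(table, decisions, n_attrs=2, bins=2, min_coverage=0.25)
print(best)
```

On this toy table the search selects attributes 0 and 1, because only their joint partition contains a sufficiently large cluster in which every case has the target decision. A local search would replace the exhaustive loop over `combinations` with stepwise changes to a current candidate.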
The user of the system should know basic concepts of data mining and rough data models. Moreover, (s)he should be able to write simple scripts in MATLAB.
TRANCE is suitable for processing huge data sets with millions of records. It has been successfully applied in the fields of marketing and finance to tasks such as fraud detection, modelling customer behavior, retention, and attrition. Due to the conceptual simplicity of the underlying models, results generated with TRANCE are easy to interpret. The low complexity of the involved algorithms allows for automatic (or semi-automatic) discovery of relevant attributes. On the other hand, models constructed with TRANCE are usually based on only a few (3-5) attributes, which may result in poor accuracy. Another serious limitation of TRANCE is its inability to process non-numerical data. Current research focuses on the elimination of these two drawbacks. The system and some of its applications are described in this book.