LERS is a knowledge discovery system based on rough set theory. The system induces rules from databases. Rule induction is based on four different algorithms: two are used for induction of rules in a minimal discriminant form, and the remaining two are used for induction of all potential rules hidden in a database. Input data may be imperfect: data may contain errors, numerical attributes, missing attribute values, and inconsistencies. LERS classifies new, unseen cases using principles similar to the bucket brigade algorithm. LERS is also equipped with a tool for multiple-fold cross-validation. The system has been used in medicine, nursing, global warming, environmental protection, natural language, and data transmission. LERS can process large data sets and frequently outperforms not only other rule induction systems but also human experts.
The system LERS (Learning from Examples based on Rough Sets) was developed under the guidance of Jerzy Grzymala-Busse at the University of Kansas. The first implementation of the system was ready in 1988.
Input data to LERS may be any database in text format (i.e., saved as an ASCII file). The only required modification is the addition of a header. The header contains two lists. The first list is a declaration of variables. It contains as many symbols as there are variables in the database, plus two additional characters: "<" at the beginning of the list and ">" at the end. Attributes are denoted by the letter a and decisions by the letter d. The first list may also contain the letter x (the corresponding variable and its values will be ignored in all subsequent computations). The second list, starting with "[" and ending with "]", contains the names of all variables. Missing attribute values may be denoted by any special symbol, e.g., by a question mark.
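As an illustration, the two-line header format described above can be parsed with a short sketch (the function name and the sample variable names are mine, not part of LERS):

```python
def parse_lers_header(lines):
    """Parse the two-line header described above (hypothetical helper).

    Line 1: variable declaration, e.g. "< a a d >", where
      a = attribute, d = decision, x = ignore this variable.
    Line 2: variable names, e.g. "[ temperature headache flu ]".
    """
    roles = lines[0].strip().strip("<>").split()
    names = lines[1].strip().strip("[]").split()
    if len(roles) != len(names):
        raise ValueError("declaration and name lists differ in length")
    return list(zip(names, roles))

header = ["< a a d >", "[ temperature headache flu ]"]
print(parse_lers_header(header))
# → [('temperature', 'a'), ('headache', 'a'), ('flu', 'd')]
```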
Since LERS is based on rough set theory, for inconsistent data it induces two sets of rules: a certain rule set and a possible rule set. The first set is computed from lower approximations of concepts, the second from upper approximations. Every rule is preceded by three numbers: specificity (the total number of attribute-value pairs on the left-hand side of the rule), strength (the total number of examples correctly classified by the rule during training), and the total number of training cases matching the left-hand side of the rule.
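The distinction between certain and possible rules rests on the two approximations, which can be sketched as follows (illustrative code, not the LERS implementation; cases with identical attribute values are indiscernible):

```python
from collections import defaultdict

def approximations(cases, concept):
    """Lower and upper approximations of a concept (set of case indices).

    cases: list of attribute-value tuples (decisions kept separately).
    A block of indiscernible cases belongs to the lower approximation only
    if it lies entirely inside the concept, and to the upper approximation
    if it intersects the concept at all.
    """
    blocks = defaultdict(set)
    for i, attrs in enumerate(cases):
        blocks[attrs].add(i)
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= concept:
            lower |= block
        if block & concept:
            upper |= block
    return lower, upper

# Cases 0 and 1 are indiscernible (same attribute values) but only case 0
# is in the concept, so the data are inconsistent for this concept.
cases = [("high", "yes"), ("high", "yes"), ("low", "no")]
lower, upper = approximations(cases, {0})
print(lower, upper)   # certain rules come from lower, possible from upper
```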
The first option available to the LERS user is a tool for checking errors in the input data. Recognized errors are: numerical values out of range, symbolic values mixed with numbers in the domain of the same attribute, and missing values not denoted by special characters.
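The three error kinds listed above amount to simple per-attribute checks; a minimal sketch of the idea (not LERS code; here an unmarked missing value is represented by None):

```python
def check_attribute_domain(values, low=None, high=None):
    """Flag the error kinds listed above for one attribute's domain."""
    errors = []
    numeric = [v for v in values if isinstance(v, (int, float))]
    symbolic = [v for v in values if isinstance(v, str)]
    if numeric and symbolic:
        errors.append("mixed symbolic and numerical values")
    if low is not None and high is not None:
        errors.extend(f"value {v} out of range"
                      for v in numeric if not low <= v <= high)
    if any(v is None for v in values):
        errors.append("missing value not denoted by a special character")
    return errors

print(check_attribute_domain([36.6, 41.0, "high", None], low=35.0, high=42.0))
```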
The next preprocessing possibility is choosing one of several approaches to handling missing attribute values. For example, cases with missing attribute values may be ignored during rule induction, missing values may be replaced by all existing values of the attribute, or missing values may be treated as special symbols.
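The three approaches just mentioned can be sketched as follows (illustrative code, not the LERS implementation; cases are tuples of attribute values and "?" marks a missing value):

```python
def handle_missing(cases, strategy, missing="?"):
    """Three approaches mentioned above (illustrative sketch).

    'ignore'  - drop cases containing a missing value;
    'expand'  - replace a missing value by every known value of that
                attribute, multiplying the case accordingly;
    'special' - keep the missing-value symbol as an ordinary value.
    """
    if strategy == "ignore":
        return [c for c in cases if missing not in c]
    if strategy == "special":
        return list(cases)
    if strategy == "expand":
        out = []
        for case in cases:
            expanded = [[]]
            for j, v in enumerate(case):
                if v == missing:
                    known = sorted({c[j] for c in cases if c[j] != missing})
                else:
                    known = [v]
                expanded = [e + [k] for e in expanded for k in known]
            out.extend(tuple(e) for e in expanded)
        return out
    raise ValueError(f"unknown strategy: {strategy}")

cases = [("high", "yes"), ("?", "no")]
print(handle_missing(cases, "expand"))
# → [('high', 'yes'), ('high', 'no')]
```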
If the input data have numerical attributes, one of many LERS tools for discretization should be used. LERS uses global approaches for discretization, i.e., all numerical attributes are discretized simultaneously. One may use a method based on cluster analysis, or global versions of methods based on equal interval width, equal frequency, or minimal entropy.
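The two simplest methods mentioned above, equal interval width and equal frequency, reduce to choosing cut points for a numerical attribute. A minimal sketch, shown per attribute for readability, whereas LERS applies such methods globally to all numerical attributes at once:

```python
def equal_width_cutpoints(values, k):
    """Cut points dividing the attribute's range into k equal-width intervals."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + i * step for i in range(1, k)]

def equal_frequency_cutpoints(values, k):
    """Cut points so each interval holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // k] for i in range(1, k)]

temps = [36.3, 36.6, 37.0, 38.2, 39.5, 41.0]
print(equal_width_cutpoints(temps, 3))
print(equal_frequency_cutpoints(temps, 3))
```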
In general, LERS uses two different approaches to rule induction: one is used in machine learning, the other in knowledge acquisition. In machine learning, or more specifically in learning from examples, the usual task is to learn a discriminant description, i.e., to learn the smallest set of minimal rules describing the concept. To accomplish this goal LERS uses two algorithms: LEM1 and LEM2 (LEM1 and LEM2 stand for Learning from Examples Module, version 1 and 2, respectively). Both algorithms are described in the literature. The LEM2 option of LERS is the most frequently used since, in most cases, it gives the best results.
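The core idea of LEM2, growing a local covering from attribute-value pairs, can be sketched as follows. This is a simplified reconstruction of the published algorithm, not the LERS source; some tie-breaking details are omitted, and the goal set is assumed to be a lower or upper approximation so that a covering always exists:

```python
def lem2(cases, goal):
    """Simplified sketch of the LEM2 local-covering idea.

    cases: list of dicts mapping attribute name -> value.
    goal:  set of case indices to be covered.
    Returns a list of rules; each rule is a set of (attribute, value)
    pairs whose block of matching cases lies entirely inside the goal.
    """
    def block(pairs):
        # indices of cases matching every (attribute, value) pair
        return {i for i, c in enumerate(cases)
                if all(c[a] == v for a, v in pairs)}

    uncovered = set(goal)
    rules = []
    while uncovered:
        conds = set()
        # grow the condition set until its block lies inside the goal
        while not conds or not block(conds) <= goal:
            relevant = block(conds) & uncovered
            candidates = {(a, cases[i][a]) for i in relevant
                          for a in cases[i]} - conds
            # prefer the pair covering most uncovered goal cases,
            # break ties by the smaller block
            conds.add(max(candidates,
                          key=lambda p: (len(block(conds | {p}) & uncovered),
                                         -len(block({p})))))
        # drop conditions that became redundant
        for p in sorted(conds):
            if len(conds) > 1 and block(conds - {p}) <= goal:
                conds.discard(p)
        rules.append(conds)
        uncovered -= block(conds)
    return rules

cases = [{"temp": "high", "headache": "yes"},
         {"temp": "high", "headache": "no"},
         {"temp": "low",  "headache": "no"}]
print(lem2(cases, {0}))   # cover the concept containing case 0 only
```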
On the other hand, if the user wants to induce rules using an approach based on knowledge acquisition, two additional options of LERS are available. The first option computes global coverings of the required size and then induces rules from all of these coverings, using ideas similar to those of LEM1. The other option is similar to LEM2 (both are local): all rules in minimal form that can be derived from the input data are computed. In both options the algorithms used are of exponential time complexity with respect to the number of attributes.
A classification system uses the rule set to classify new cases. In general, the classification system classifies testing data using the rule set induced from training data. The new classification system of LERS is a modification of the bucket brigade algorithm. The decision as to which concept an example belongs is made on the basis of four factors: strength, specificity, matching factor, and support. Support is defined as the sum of scores of all matching rules from the concept. The concept C with the largest support is the winner, and the example is classified as a member of C. LERS is fully automated. However, the user may have some input, first by selecting options of the system, and in some cases, e.g., during error recognition or inducing rules from all global coverings, by direct interaction with the system.
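A sketch of the support computation, assuming each matching rule contributes its strength times its specificity, scaled by a matching factor (the fraction of the rule's conditions matched by the case). This is an illustration of the idea, not LERS code; in particular, it does not separate complete from partial matching:

```python
def support(rules, case):
    """Score each concept for a new case.

    Each rule is (conditions, concept, specificity, strength), where
    conditions maps attribute name -> value. Rules matching the case at
    least partially contribute strength * specificity * matching_factor
    to the support of their concept.
    """
    scores = {}
    for conds, concept, specificity, strength in rules:
        matched = sum(1 for a, v in conds.items() if case.get(a) == v)
        if matched:
            factor = matched / len(conds)
            scores[concept] = (scores.get(concept, 0.0)
                               + strength * specificity * factor)
    return scores

rules = [({"headache": "yes"}, "flu", 1, 3),
         ({"temp": "low", "headache": "no"}, "healthy", 2, 2)]
print(support(rules, {"temp": "high", "headache": "yes"}))
# → {'flu': 3.0}
```

The concept with the largest score in the returned dictionary would be chosen as the winner.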
In LERS it is assumed that the rule set will be used automatically by the classification subsystem of LERS; every new case is classified by this subsystem. On the other hand, induced rules are available and comprehensible to the user. Thus, it is also possible to use the rules manually, as in other systems.
The performance of the LERS system is fully comparable with that of AQ15 and C4.5. Advantages of the system are a large family of discretization methods and many possible approaches to missing attribute values. Another advantage is a sound approach (based on rough set theory) to inconsistencies in input data. A disadvantage of the system is that, for the time being, it is dispersed among a few programs. Moreover, the code was written with a specific platform, DEC Alpha AXP machines, in mind.