Templates
Chair of Computer Science Foundations, University of Information Technology and Management, Rzeszow, POLAND


Abstract

The Templates system is a software system that supports the analysis of temporal data. During the analysis, temporal templates are discovered in the temporal input data, together with regularities between them expressed as IF THEN production rules. Both unsupervised and supervised learning processes can be simulated. A classifier is implemented to estimate the quality of the discovered knowledge.

Introduction

One aspect of the data mining process is the analysis of data that change over time, so-called temporal data. The need to discover knowledge from such data sets arises in many fields of human activity, e.g. computer science, economy, medicine, etc. The Templates system supports this kind of analysis by finding temporal templates and reasoning from them in the form of IF THEN production rules. The system is designed so that new algorithms can be implemented and their relevance verified on real-life data. The Templates system is implemented in Delphi 7.0. It runs on the IBM PC platform under the MS Windows operating system.

System features and possibilities

The Templates system supports the analysis of data collected in the form of an information table or a decision table, including temporal information tables and temporal decision tables. A temporal table is a table whose rows (containing the attribute values of one or more objects) are ordered in time. The main use of the Templates system is the exploration of temporal tables and the discovery of temporal templates or sequential dependencies between those templates. A sequential dependency describes an order of occurrence. The system also makes it possible to analyse static data. The dependencies discovered by the system have the form of IF THEN production rules. A rule-based classifier is implemented to check the usefulness of the rules in predicting new cases. The system presents the outcome of rule testing in the form of a confusion matrix. In general, the system can automatically carry out the following types of data exploration:
  • supervised learning from static data, based on discovering IF THEN decision rules in decision tables;
  • unsupervised learning from data gathered in temporal information tables, based on discovering temporal templates and IF THEN rules between them;
  • supervised learning from data gathered in temporal decision tables, by discovering templates that are characteristic of objects with a fixed decision or by discovering temporal sequences of templates;
  • verification of discovered knowledge in each of the above processes.

Process of supervised learning from static data
This kind of data exploration is a secondary feature of the system, so only a modest set of tools is available. They are limited to an algorithm for generating decision rules that draws on rough set methodology. It is an exhaustive algorithm based on Boolean reasoning. It is also possible to estimate the usefulness of the rules in predicting new objects or describing known ones. The outcome is presented in the form of a confusion matrix. In addition, decision rules can be generated for incomplete decision systems; missing values are then treated as not belonging to the domains of the attributes, and they do not occur in the discovered rules. The decision rule induction algorithm for this type of data works correctly only if the number of columns in the input file does not exceed 31. Its advantage, however, is its speed in comparison with the algorithms that generate the same set of rules in the Rosetta system or the RSES system.
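The testing step described above can be illustrated with a minimal Python sketch. It is not the system's Delphi implementation: the rule representation, the first-matching-rule strategy and the names RULES, classify and confusion_matrix are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical rule format (an assumption, not the system's own):
# (conditions, decision), where conditions maps a column index to the
# integer value required in that column.
RULES = [
    ({0: 1, 1: 0}, 1),  # IF a0 = 1 AND a1 = 0 THEN d = 1
    ({0: 0}, 0),        # IF a0 = 0 THEN d = 0
]

def classify(row, rules):
    """Return the decision of the first matching rule, or None if no rule fires."""
    for conditions, decision in rules:
        if all(row[col] == val for col, val in conditions.items()):
            return decision
    return None

def confusion_matrix(rows, rules):
    """Count (actual, predicted) pairs; the last column of each row is the decision."""
    counts = Counter()
    for row in rows:
        counts[(row[-1], classify(row[:-1], rules))] += 1
    return counts

test_rows = [
    [1, 0, 1],  # matched by the first rule, predicted correctly
    [0, 1, 0],  # matched by the second rule, predicted correctly
    [1, 1, 1],  # no rule matches, prediction is None
    [0, 0, 1],  # matched by the second rule, predicted incorrectly
]
matrix = confusion_matrix(test_rows, RULES)
```

Each key of matrix is an (actual decision, predicted decision) pair, so the diagonal entries count correctly classified objects and a predicted value of None counts objects that no rule covers.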

Process of unsupervised learning from temporal data
The kind of data analysis considered here is appropriate for data describing the attribute values of one or more objects changing in time. When there is more than one object, the recorded states of the individual objects have to be ordered according to global time, i.e., time common to all objects. This means that, for data collected in the time interval [ts, te] and concerning m objects, the first m rows of the input file contain information about each of the m objects recorded at time ts. Similarly, the next m rows contain information about the m objects recorded at time ts+1, and so on until te. As implemented in the system, the process of unsupervised learning from temporal data has two stages: discovering temporal templates in the input data and discovering knowledge about the discovered templates. The result of the first stage is a time series of templates discovered by the system in the relevant parts of the input data. The user influences the outcome of this stage by setting the values of some parameters that characterize the discovered templates. During the second stage, IF THEN rules are generated. They reflect regularities that occur in the sequence of temporal templates. At this stage the user has to decide how many consecutive terms of the time series of templates should be considered during rule induction. The system makes it possible to check the quality of the induced rules. The quality coefficient expresses their usefulness for predicting the type of template that follows a given sequence of templates. It is estimated using test data chosen by the user. The usefulness is presented in the form of a confusion matrix. Rules generated during an unsupervised learning process may predict template occurrence correctly, incorrectly or partly correctly. A prediction is regarded as partly correct if the predicted template is included in the actual template. The confusion matrix implemented for this kind of data analysis presents both correct and partly correct predictions. If the need arises, conflicts in template prediction are resolved using the support and match coefficients that characterize every rule found.
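The three prediction outcomes can be made concrete with a short Python sketch. It assumes that a template can be represented as a set of (attribute, value) descriptors and that "included" means set inclusion; both are illustrative assumptions, and the names grade, pairs and tally are hypothetical.

```python
from collections import Counter

def grade(predicted, actual):
    """Grade one prediction; templates are frozensets of (attribute, value) pairs."""
    if predicted == actual:
        return "correct"
    if predicted < actual:
        # Proper subset: every predicted descriptor also occurs in the actual template.
        return "partly correct"
    return "incorrect"

full = frozenset({("a", 1), ("b", 2)})
pairs = [
    (frozenset({("a", 1), ("b", 2)}), full),  # same descriptors
    (frozenset({("a", 1)}), full),            # predicted template included in the actual one
    (frozenset({("a", 1), ("b", 3)}), full),  # descriptor b = 3 does not occur in the actual one
]

# Tally the outcomes over all (predicted, actual) pairs.
tally = Counter(grade(predicted, actual) for predicted, actual in pairs)
```

Keeping "partly correct" as a separate outcome mirrors the description above, where the confusion matrix reports correct and partly correct predictions side by side.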

Process of supervised learning from temporal data
The input file has to contain information about objects from different decision classes. In contrast to the input file format for the many-object version of the unsupervised learning process, here the states of objects (changing in time) should be ordered in the decision table according to the local time of each object. This means that the data concerning each object have to be placed in the input file sequentially, one object after another. The penultimate column of the input file must contain the object numbers and the last column must contain the decision values. The learning process starts in the same way as unsupervised learning, i.e., with discovering temporal templates in the input data and ordering them as a time series. The system abbreviates the templates according to the following rule:
  • if a template (discovered in the input data) is characteristic of one (or several) decision classes (i.e., it does not occur for objects from other decision classes), then every template obtained by abbreviating the original is also characteristic of the same decision classes;
  • abbreviated templates are minimal (with respect to inclusion) with this property.
The user may use those templates to recognize the decision classes of new objects in the testing process. Apart from that kind of knowledge, the system offers the option of finding sequences of templates that are characteristic of one (or several) decision classes. Such sequences can be tested for their ability to predict the decisions of unknown objects. The outcome of the tests is presented in a confusion matrix.
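The local-time layout described at the start of this subsection can be sketched in Python as follows. The sketch only regroups rows by object; it does not discover templates. The function name split_by_object and the check that each object carries a single decision value are assumptions added for illustration.

```python
def split_by_object(rows):
    """Split a local-time decision table into per-object state sequences.

    Each row is a list of integers: attribute values, then the object number
    (penultimate column), then the decision (last column).
    Returns {object_number: (decision, [attribute rows in time order])}.
    """
    objects = {}
    for row in rows:
        *attributes, obj, decision = row
        stored_decision, states = objects.setdefault(obj, (decision, []))
        # Assumption: an object belongs to exactly one decision class.
        if stored_decision != decision:
            raise ValueError(f"object {obj} has conflicting decisions")
        states.append(attributes)
    return objects

table = [
    [1, 2, 1, 0],  # object 1, decision 0, first recorded state
    [1, 3, 1, 0],  # object 1, decision 0, second recorded state
    [4, 4, 2, 1],  # object 2, decision 1, first recorded state
]
sequences = split_by_object(table)
```

Because the rows of each object are stored consecutively, the order of rows in the file is also the order of states in local time, so no sorting is needed.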

GUI

Advantages of the Templates system include an intuitive and friendly interface patterned on the popular Rosetta system.

Input data

The system loads data from .txt files containing an information table or a decision table. The data have to be integers. In addition, the input files have to contain the following information, which lets the system choose the proper kind of analysis:
  • completeness or incompleteness of the data;
  • temporal or static character of the data;
  • whether the table is an information one or a decision one;
  • type of time (global or local) according to which data are ordered.
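To illustrate the global-time ordering named in the last item, the following Python sketch splits a table of m objects into time slices and regroups it into per-object series. The function names are hypothetical, and the sketch assumes that the objects appear in the same order within every slice, which the description does not state explicitly.

```python
def time_slices(rows, m):
    """Split a global-time table into consecutive slices of m rows (one row per object)."""
    if m <= 0 or len(rows) % m != 0:
        raise ValueError("row count must be a multiple of a positive m")
    return [rows[i:i + m] for i in range(0, len(rows), m)]

def object_series(rows, m):
    """Regroup a global-time table so that each object's states are listed in time order.

    Assumption: objects keep the same position within every time slice.
    """
    return [list(states) for states in zip(*time_slices(rows, m))]

# Two objects observed at three time points: rows 0-1 belong to time ts,
# rows 2-3 to ts+1, rows 4-5 to ts+2.
table = [[1], [2], [3], [4], [5], [6]]
slices = time_slices(table, 2)
series = object_series(table, 2)
```

Here series[0] holds the states of the first object and series[1] those of the second, which corresponds to the local-time ordering of the same data.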

Applications

The Templates system is primarily intended for research and education purposes.

Future plans

The Templates system will be extended with rule and template filtering to make the system more interactive. It is also planned to make it possible to work with the system in off-line mode, understood as designing a tree of consecutive algorithms and then executing it in a single run.

Acknowledgements

Development of the Templates system has been partially supported by grant No. 3 T11C 005 28 from the Ministry of Scientific Research and Information Technology of the Republic of Poland.