Demands on models for performance and reliability analysis are growing constantly, since due to the increasing complexity, experiments with real systems are either not realisable or would require an unacceptable effort. Moreover the demands on the quality of the results are rising simultaneously, so that real processes and the associated correlations have to be modelled sufficiently accurate. Existing approaches for fitting the parameters of a MAP to real traces still have some shortcomings impeding their use in practical applications. The key activities described below aim at eliminating or at least reducing these shortcomings with the objective to make parameter fitting of MAPs similarly efficient and robust as it is possible with methods developed in the recent years for the fitting of phase type distributions.
The second project term is devoted to the extension of parameter fitting methods for MAPs to RAPs. The goal is to extend the class of numerically analyzable models beyond Markovian models on the basis of known analysis techniques.
Characterisation of statistical measures for MAP fitting
Two main approaches are known for modelling real processes on the basis of traces. Either the entire trace can be used or some statistical measures can be computed from the trace that should be captured by a MAP as exactly as possible. When working with the entire trace all information available is used. EM-algorithms that maximize the likelihood-function belong to this class of techniques. Such an approach is not practicable though, as real traces from computer networks have to contain several millions of entries to capture relevant characteristics. An EM-algorithm using the entire trace has to consider all entries of the trace to compute the likelihood-function in each iteration step. Apart from numerical problems this leads to a tremendous run-time even for one iteration step. As an alternative solution only statistical measures are computed from a trace. Moments, values of the distribution function and quantiles are used for the fitting of the distribution function. For capturing correlation essentially the autocorrelation coefficient is fitted. For this type of approach the trace is only processed once in order to compute the statistical measures, but is afterwards no longer needed for fitting.
Aggregation techniques can reduce the number of entries in a trace significantly while preserving the statistical properties. Thus there is a methodology available for the fitting of the distribution that can represent the necessary information of a trace in a compact form.
For MAP fitting we have a different situation. Experiments show, that the fitting of low order autocorrelation coefficients is not sufficient. Moreover no aggregation procedure exists that preserves the correlations of the trace.
Therefore we intend to analyse available traces in an empirical study, in order to find suitable statistical measures describing the effects of correlation. For this purpose the behaviour of models with real processes as input should be analysed via trace driven simulation. Subsequently the same models are run with MAPs that capture some statistical measures of the trace. There is hope that in this way, similar to the fitting of the distribution, relevant parameters can be found and approaches for the aggregation of traces may be developed.
Algorithms for fitting parameters of a MAP
By now there are various methods to fit the parameters of a MAP to measured data. Performing the fitting of distribution function and correlation in two steps has shown to yield good results: The definition of a general optimisation problem with constraints and the sequential fitting of distribution function and coefficients of correlation, as well as the treatment of the problem as a multi-criteria optimization problem, have resulted in more efficient algorithms than used before.
These two approaches for the fitting of the distribution function and the coefficients of correlation of MAPs should be combined to one hybrid algorithm. It seems reasonable to modify in first iteration steps both matrices via evolutionary algorithms and apply other optimisation techniques for fine adjustment afterwards.
Using MAPs in Simulation Models
Up to now MAPs are used primarily for analytical models. Since MAPs can capture correlations well, it seems reasonable to use them also in simulation. Yet, the following issues remain to be solved:
Most simulation tools are limited to the specification of different distributions, allowing only for independent and identically distributed interarrival times, but do not offer the specification of stochastic processes.
Parameterisation of MAPs is complex, and specifying the processes manually is laborious.
Stochastic processes such as autoregressive processes that have been known for a long time are used for simulation by specialists. But even in common software-tools for input modelling like ExpertFit or Arena Input Analyzer no method is available for fitting parameters of such processes. MAPs as a younger area of research neither have attracted interest yet nor, to the best of our knowledge, have been compared to autoregressive models.
The main goal is to overcome the aforementioned obstacles in the application of MAPs in simulation. Moreover, the potential of MAPs should be evaluated in comparison to other processes.
Aggregation of MAPs for the Computation of Bounds
One of the strengths of MAPs is that the superposition of MAPs and also the departure process of servers with a MAP-arrival process and MAP-service time are again a MAP. Unfortunately the number of states of the resulting MAPs increases exponentially with the number of considered MAPs, or is infinite, like in case of servers with infinite queue capacity. This means that the resulting MAPs are not manageable any more. The objective is to develop a method which allows for the aggregation of the states of a MAP in such a way that the resulting MAP provides upper and lower bounds for the departure process. Using a special partial stochastic order for the states of a Markov process then starting with the stochastically largest state ensures that all transient and stationary probability measures are at least as large as the measure obtained when starting from a stochastically smaller state. If a number of states are replaced by the largest state, a stochastically larger aggregate is obtained, and by replacing all states by the smallest state one obtains a stochastically smaller aggregate. This concept was developed for general interacting Markov processes. It can be shown, that under some constraints the order is preserved for the composition of processes.
When applying this technique to MAPs some specifics have to be considered: The behaviour of a MAP is not influenced by its environment. Furthermore, only the process generated by the MAP is of interest, but not the distribution of the states of the MAP. Thus the relative restrictive conditions of the stochastic order can be softened, which results in an increased number of states being aggregated. The result of such an order preserving aggregation is a MAP that guarantees generating more/less arrivals per time unit than the original MAP. This can be proved formally by showing that the resultant departure process of the aggregated MAP is stochastically lesser/greater than the departure process of the original MAP. It remains to be shown that the system with the MAP as arrival process is monotone regarding the relevant probability measures when the number of arrivals increases.
Modelling of correlations in multiple processes
So far only single processes with autocorrelations have been examined. However, in some practical applications there are correlations between different quantities. Examples include the correlation of interarrival times and message sizes in computer networks or the correlation of the time of occurrence of a failure and the repair time. For modelling such correlations single MAPs are not sufficient, instead several MAPs have to be combined. Thus, the simplest model for considering the correlation between interarrival- und service time is a MAP/PH/1-station with a service time depending on the interarrival time. This can be realized by choosing the initial probability of the PH-distribution according to the phase of the MAP that initiated the arrival. This model can be interpreted as a system with n classes of customers, whereas every phase of a MAP from which an arrival can occur defines a new class. These kinds of models cannot be analysed by means of standard algorithms for MAP/PH/1-models. Nevertheless the special structure of the model allows for a numerical calculation of the state probabilities. The existing approach leaves a number of questions open. For example algorithmic aspects are not taken into account so far and it is not clear for which types of models the analysis can be extended as the present approach examines only simple discrete-time models.
It has to be investigated to what extend efficient algorithms for the analysis of MAP/PH/1-systems can be used for models with correlated interarrival and service times. Furthermore it should be investigated which models could possibly be analysed by well-known matrix-analytic methods. Of special interest are multiserver queueing systems and systems with finite capacity queues.
The problem of parameterisation of correlated MAPs and PH-distributions has not been treated at all in the literature yet. Traces with at least two values for each entry are necessary for fitting. For example the interarrival times and the message sizes could be measured. The parameterisation aims at generating a MAP with a good approximation of the interarrival times, while at the same time a PH-distribution (or MAP) that models the message sizes has to be found. The correlation between interarrival and service times is determined by the dependency of the initial distribution of the service time on the phase of the MAP that generated the arrival, which has to be modeled as well. Though the problem is clearly more complex than fitting a single MAP, first studies makes one sense that the methods for MAP fitting could be extended to this problem.
Approximation and Computation of Bounds for Tandem and Fork/Join-Networks
When using MAPs as processes in models the main aspect is the analysability of the resultant models. The state space of the models grows combinatorial with the order of the MAPs and PH-distributions (state-space explosion). Open systems can only be analyzed in form of single stations. Hence, there is a great demand in analyzing Queuing Networks with non-exponential service or arrival times at least approximately and there are numerous approaches dealing with this problem. Most of the applied decomposition techniques are suitable for systems without or with just slight feedback in particular. However, a general drawback is an unknown approximation error.
The main focus of research when developing advanced methods for the approximate analysis of queuing networks using MAPs and PH-distributions will be on Tandem and Fork/Join-Networks.
The focus of the second project term is on the extension of methods developed for MAPs to Rational Arrival Processes (RAPs). RAPs are a superset of MAPs which can not always be interpreted as Markov processes. Thus it is difficult to decide whether given matrices describe a valid RAP. Therefore one aim of the second project term is the characterization of RAP matrices. Based on lumpabillity and stochastic bisimulation equivalence relations and algorithms to determine a minimal representation have been developed for MAPs. An additional aim of the project is to develop similar relations and algorithms also for RAPs.
Stochastic automata networks (SANs) are often used to specify Markov models for performance and reliability analysis. SANs offer a compositional approach, since the composition of Markov processes in SANs results again in a Markov process. It has to be checked whether this also holds for RAPs so that the compositional approach for SANs can be extended to rational automata networks.
In order to increase practicality algorithms for parameter fitting for RAPs should be developed and be integrated into the Toolkit ProFiDo. Furthermore it is the intention to develop methods for the application of RAPs in simulation and numerical models.