Modeling and Simulation Systems Simulation:
The Shortest Route to Applications
Europe Site Middle East Site for Asia South Africa Site
UK Site USA East Coast USA West Coast
This site features information about discrete event system modeling and simulation. It includes discussions on descriptive simulation modeling, programming commands, techniques for sensitivity estimation, optimization and goal-seeking by simulation, and what-if analysis.
Advancements in computing power, availability of PC-based modeling and simulation, and efficient computational methodology are allowing leading-edge of prescriptive simulation modeling such as optimization to pursue investigations in systems analysis, design, and control processes that were previously beyond reach of the modelers and decision makers.
To search the site, try Edit | Find in page [Ctrl + f]. Enter a word or phrase in the dialogue box, e.g. "optimization" or "sensitivity" If the first appearance of the word/phrase is not what you are looking for, try Find Next.
MENU
- Introduction & Summary
- Statistics and Probability for Simulation
- Topics in Descriptive Simulation Modeling
- Techniques for Sensitivity Estimation
- Simulation-based Optimization Techniques
- Metamodeling and the Goal seeking Problems
- "What-if" Analysis Techniques
Companion Sites:
- JavaScript E-labs Learning Objects
- Statistics
- Excel For Statistical Data Analysis
- Topics in Statistical Data Analysis
- Time Series Analysis
- Computers and Computational Statistics
- Probabilistic Modeling
- Probability and Statistics Resources
- Optimization Resources
- Simulation Resources
Introduction & Summary
Statistics and Probability for Simulation
Statistics for Correlated Data
What Is Central Limit Theorem?
What Is a Least Squares Model?
ANOVA: Analysis of Variance
Exponential Density Function
Poisson Process Uniform Density Function
Random Number Generators
Test for Random Number Generators
Some Useful SPSS Commands
References & Further Readings
Modeling & Simulation Topics in Descriptive Simulation Modeling
Development of Systems Simulation
A Classification of Stochastic Processes
Simulation Output Data and Stochastic Processes
Techniques for the Steady State Simulation
Determination of the Warm-up Period
Determination of the Desirable Number of Simulation Runs Simulation Software Selection
Animation in Systems Simulation
SIMSCRIPT II.5
System Dynamics and Discrete Event Simulation
What Is Social Simulation?
What Is Web-based Simulation?
Parallel and Distributed Simulation
References & Further Readings
Introduction Techniques for Sensitivity Estimation
Applications of sensitivity information
Finite difference approximation
Simultaneous perturbation methods
Perturbation analysis
Score function methods
Harmonic analysis
Conclusions & Further Readings
Introduction Simulation-based Optimization Techniques
Deterministic search techniques Pattern search techniquesProbabilistic search techniques Evolutionary Techniques Stochastic approximation techniques Gradient surface method
- Conjugate direction search
- Steepest ascent (descent)
- Tabu search technique
- Hooke and Jeeves type techniques
- Simplex-based techniques
Post-solution analysis
Rare Event Simulation
Conclusions & Further Readings
Introduction Metamodeling and the Goal seeking Problems
Metamodeling
Goal seeking Problem
References and Further Readings
Introduction "What-if" Analysis Techniques
Likelihood Ratio (LR) Method
Exponential Tangential in Expectation Method
Taylor Expansion of Response Function
Interpolation Techniques
Conclusions & Further Readings
Introduction & Summary
Computer system users, administrators, and designers usually have a goal of highest performance at lowest cost. Modeling and simulation of system design trade off is good preparation for design and engineering decisions in real world jobs. In this Web site we study computer systems modeling and simulation. We need a proper knowledge of both the techniques of simulation modeling and the simulated systems themselves.
The scenario described above is but one situation where computer simulation can be effectively used. In addition to its use as a tool to better understand and optimize performance and/or reliability of systems, simulation is also extensively used to verify the correctness of designs. Most if not all digital integrated circuits manufactured today are first extensively simulated before they are manufactured to identify and correct design errors. Simulation early in the design cycle is important because the cost to repair mistakes increases dramatically the later in the product life cycle that the error is detected. Another important application of simulation is in developing "virtual environments" , e.g., for training. Analogous to the holodeck in the popular science-fiction television program Star Trek, simulations generate dynamic environments with which users can interact "as if they were really there." Such simulations are used extensively today to train military personnel for battlefield situations, at a fraction of the cost of running exercises involving real tanks, aircraft, etc.
Dynamic modeling in organizations is the collective ability to understand the implications of change over time. This skill lies at the heart of successful strategic decision process. The availability of effective visual modeling and simulation enables the analyst and the decision-maker to boost their dynamic decision by rehearsing strategy to avoid hidden pitfalls.
System Simulation is the mimicking of the operation of a real system, such as the day-to-day operation of a bank, or the value of a stock portfolio over a time period, or the running of an assembly line in a factory, or the staff assignment of a hospital or a security company, in a computer. Instead of building extensive mathematical models by experts, the readily available simulation software has made it possible to model and analyze the operation of a real system by non-experts, who are managers but not programmers.
A simulation is the execution of a model, represented by a computer program that gives information about the system being investigated. The simulation approach of analyzing a model is opposed to the analytical approach, where the method of analyzing the system is purely theoretical. As this approach is more reliable, the simulation approach gives more flexibility and convenience. The activities of the model consist of events, which are activated at certain points in time and in this way affect the overall state of the system. The points in time that an event is activated are randomized, so no input from outside the system is required. Events exist autonomously and they are discrete so between the execution of two events nothing happens. The SIMSCRIPT provides a process-based approach of writing a simulation program. With this approach, the components of the program consist of entities, which combine several related events into one process.
In the field of simulation, the concept of "principle of computational equivalence" has beneficial implications for the decision-maker. Simulated experimentation accelerates and replaces effectively the "wait and see" anxieties in discovering new insight and explanations of future behavior of the real system.
Consider the following scenario. You are the designer of a new switch for asynchronous transfer mode (ATM) networks, a new switching technology that has appeared on the marketplace in recent years. In order to help ensure the success of your product in this is a highly competitive field, it is important that you design the switch to yield the highest possible performance while maintaining a reasonable manufacturing cost. How much memory should be built into the switch? Should the memory be associated with incoming communication links to buffer messages as they arrive, or should it be associated with outgoing links to hold messages competing to use the same link? Moreover, what is the best organization of hardware components within the switch? These are but a few of the questions that you must answer in coming up with a design.
With the integration of artificial intelligence, agents and other modeling techniques, simulation has become an effective and appropriate decision support for the managers. By combining the emerging science of complexity with newly popularized simulation technology, the PricewaterhouseCoopers, Emergent Solutions Group builds a software that allows senior management to safely play out "what if" scenarios in artificial worlds. For example, in a consumer retail environment it can be used to find out how the roles of consumers and employees can be simulated to achieve peak performance.
Statistics for Correlated Data
We concern ourselves with n realizations that are related to time, that is having n correlated observations; the estimate of the mean is given by
mean = S Xi / n, where the sum is over i = 1 to n.Let
A = S [1 - j/(m + 1)] rj,x where the sum is over j = 1 to m, then the estimated variance is:
[1 + 2A ] S2 / n Where
S2 = the usual variance estimate
rj,x = the jth coefficient of autocorrelation
m = the maximum time lag for which autocorrelations are computed, such that j = 1, 2, 3, ..., m As a good rule of thumb, the maximum lag for which autocorrelations are computed should be approximately 2% of the number of n realizations, although each rj,x could be tested to determine if it is significantly different from zero.
Sample Size Determination: We can calculate the minimum sample size required by
n = [1 + 2A ] S2 t2 / (d2 mean2) Application: A pilot run was made of a model, observations numbered 150, the mean was 205.74 minutes and the variance S2 = 101, 921.54, estimate of the lag coefficients were computed as: r1,x = 0.3301 r2,x = 0.2993, and r3,x = 0.1987. Calculate the minimum sample size to assure the estimate lies within +d = 10% of the true mean with a = 0.05.
n = [(1.96)2 (101,921.54) {1 + 2 [(1-1/4) 0.3301 + (1 - 2/4) 0.2993 + (1- 3/4) 0.1987]}] / (0.1)2 (205.74)2
» 1757 You may like using Statistics for Time Series, and Testing Correlation JavaScript.
What Is Central Limit Theorem?
For practical purposes, the main idea of the central limit theorem (CLT) is that the average of a sample of observations drawn from some population with any shape-distribution is approximately distributed as a normal distribution if certain conditions are met. In theoretical statistics there are several versions of the central limit theorem depending on how these conditions are specified. These are concerned with the types of assumptions made about the distribution of the parent population (population from which the sample is drawn) and the actual sampling procedure.One of the simplest versions of the theorem says that if is a random sample of size n (say, n larger than 30) from an infinite population, finite standard deviation , then the standardized sample mean converges to a standard normal distribution or, equivalently, the sample mean approaches a normal distribution with mean equal to the population mean and standard deviation equal to standard deviation of the population divided by the square root of sample size n. In applications of the central limit theorem to practical problems in statistical inference, however, statisticians are more interested in how closely the approximate distribution of the sample mean follows a normal distribution for finite sample sizes, than the limiting distribution itself. Sufficiently close agreement with a normal distribution allows statisticians to use normal theory for making inferences about population parameters (such as the mean ) using the sample mean, irrespective of the actual form of the parent population.
It is well known that whatever the parent population is, the standardized variable will have a distribution with a mean 0 and standard deviation 1 under random sampling. Moreover, if the parent population is normal, then it is distributed exactly as a standard normal variable for any positive integer n. The central limit theorem states the remarkable result that, even when the parent population is non-normal, the standardized variable is approximately normal if the sample size is large enough (say > 30). It is generally not possible to state conditions under which the approximation given by the central limit theorem works and what sample sizes are needed before the approximation becomes good enough. As a general guideline, statisticians have used the prescription that if the parent distribution is symmetric and relatively short-tailed, then the sample mean reaches approximate normality for smaller samples than if the parent population is skewed or long-tailed.
In this lesson, we will study the behavior of the mean of samples of different sizes drawn from a variety of parent populations. Examining sampling distributions of sample means computed from samples of different sizes drawn from a variety of distributions, allow us to gain some insight into the behavior of the sample mean under those specific conditions as well as examine the validity of the guidelines mentioned above for using the central limit theorem in practice.
Under certain conditions, in large samples, the sampling distribution of the sample mean can be approximated by a normal distribution. The sample size needed for the approximation to be adequate depends strongly on the shape of the parent distribution. Symmetry (or lack thereof) is particularly important. For a symmetric parent distribution, even if very different from the shape of a normal distribution, an adequate approximation can be obtained with small samples (e.g., 10 or 12 for the uniform distribution). For symmetric short-tailed parent distributions, the sample mean reaches approximate normality for smaller samples than if the parent population is skewed and long-tailed. In some extreme cases (e.g. binomial) samples sizes far exceeding the typical guidelines (e.g., 30) are needed for an adequate approximation. For some distributions without first and second moments (e.g., Cauchy), the central limit theorem does not hold.
'' Central Limit Theorem Justification '' Preamble Define X, XBAR as a real 1-dimensional arrays Define I, J, M, N as integer variables End Main Open 3 for output, Name = "CLT.OUT" Use 3 for output LET N=50 LET M = 1000 Reserve X(*) as N Reserves XBAR(*) as M For J = 1 to M DO For I = 1 to N DO X(I) = Uniform.f(1., 0., 1) Compute XBAR (j) as the average of X(i) Loop Compute Ave as the average of XBAR(j) Compute ST as the standard deviation of XBAR(j) Loop LL = Ave -1.96*ST/SQRT.F(real.f(M)) UL = Ave + 1.96*ST/SQRT.F(real.f(M)) For J =1 to M Do IF XBAR(j) < LL AND XBAR(j) > XU Let Count = Count + 1 Always Loop Print 1 line with Count/M thus The P-value is = ***** ENDWhat Is a Least Squares Model?
Many problems in analyzing data involve describing how variables are related. The simplest of all models describing the relationship between two variables is a linear, or straight-line, model. The simplest method of fitting a linear model is to "eye-ball'' a line through the data on a plot. A more elegant, and conventional method is that of "least squares", which finds the line minimizing the sum of distances between observed points and the fitted line.Realize that fitting the "best'' line by eye is difficult, especially when there is a lot of residual variability in the data.
Know that there is a simple connection between the numerical coefficients in the regression equation and the slope and intercept of regression line.
Know that a single summary statistic like a correlation coefficient does not tell the whole story. A scatter plot is an essential complement to examining the relationship between the two variables.
ANOVA: Analysis of Variance
The tests we have learned up to this point allow us to test hypotheses that examine the difference between only two means. Analysis of Variance or ANOVA will allow us to test the difference between 2 or more means. ANOVA does this by examining the ratio of variability between two conditions and variability within each condition. For example, say we give a drug that we believe will improve memory to a group of people and give a placebo to another group of people. We might measure memory performance by the number of words recalled from a list we ask everyone to memorize. A t-test would compare the likelihood of observing the difference in the mean number of words recalled for each group. An ANOVA test, on the other hand, would compare the variability that we observe between the two conditions to the variability observed within each condition. Recall that we measure variability as the sum of the difference of each score from the mean. When we actually calculate an ANOVA we will use a short-cut formulaThus, when the variability that we predict (between the two groups) is much greater than the variability we don't predict (within each group) then we will conclude that our treatments produce different results.
Exponential Density Function
An important class of decision problems under uncertainty concerns the chance between events. For example, the chance of the length of time to next breakdown of a machine not exceeding a certain time, such as the copying machine in your office not to break during this week.Exponential distribution gives distribution of time between independent events occurring at a constant rate. Its density function is:
f(t) = l exp(-lt), where l is the average number of events per unit of time, which is a positive number.
The mean and the variance of the random variable t (time between events) are 1/ l, and 1/l2, respectively.
Applications include probabilistic assessment of the time between arrival of patients to the emergency room of a hospital, and arrival of ships to a particular port.
Comments: Special case of both Weibull and gamma distributions.
You may like using Exponential Applet to perform your computations.
You may like using the following Lilliefors Test for Exponentially to perform the goodness-of-fit test.
Poisson Process
An important class of decision problems under uncertainty is characterized by the small chance of the occurrence of a particular event, such as an accident. Gives probability of exactly x independent occurrences during a given period of time if events take place independently and at a constant rate. May also represent number of occurrences over constant areas or volumes. The following statements describe the Poisson Process:
- The occurrences of the events are independent. The occurrence of events from a set of assumptions in an interval of space or time has no effect on the probability of a second occurrence of the event in the same, or any other, interval.
- Theoretically, an infinite number of occurrences of the event must be possible in the interval.
- The probability of the single occurrence of the event in a given interval is proportional to the length of the interval.
- In any infinitesimally small portion of the interval, the probability of more than one occurrence of the event is negligible.
Poisson process are often used, for example in quality control, reliability, insurance claim, incoming number of telephone calls, and queuing theory.
An Application: One of the most useful applications of the Poisson Process is in the field of queuing theory. In many situations where queues occur it has been shown that the number of people joining the queue in a given time period follows the Poisson model. For example, if the rate of arrivals to an emergency room is l per unit of time period (say 1 hr), then:
P ( n arrivals) = ln e-l / n! The mean and variance of random variable n are both l . However if the mean and variance of a random variable having equal numerical values, then it is not necessary that its distribution is a Poisson.
Applications:
P ( 0 arrival) = e-l P ( 1 arrival) = l e-l / 1! P ( 2 arrival) = l2 e-l / 2! and so on. In general:
P ( n+1 arrivals ) = l Pr ( n arrivals ) / n.
You may like using Poisson Applet to perform your computations.
Goodness-of-Fit for Poisson
Replace the numerical example data with your up-to-14 pairs of Observed values & their frequencies, and then click the Calculate button. Blank boxes are not included in the calculations.
In entering your data to move from cell to cell in the data-matrix use the Tab key not arrow or enter keys.
For Technical Details, Back to:
Statistical Thinking for Decision Making
Uniform Density Function
Application: Gives probability that observation will occur within a particular interval when probability of occurrence within that interval is directly proportional to interval length.Example: Used to generate random numbers in sampling and Monte Carlo simulation.
Comments: Special case of beta distribution.
The mass function of geometric mean of n independent uniforms [0,1] is:
P(X = x) = n x(n - 1) (Log[1/xn])(n -1) / (n - 1)!.
zL = [UL-(1-U)L] / L is said to have Tukey's symmetrical l-distribution.
You may like using Uniform Applet to perform your computations.
Some Useful SPSS Commands
Test for Binomial: NPAR TEST BINOMIAL(p)=GENDER(0, 1) Gooness-of-fit for discrete r.v.: NPAR TEST CHISQUARE=X (1,3)/EXPECTED=20 30 50 Needed information to perform the t-test: DISCRIPTIVES X /STATISTICS= 1 2 Two population t-test T-TEST GROUPS=GENDER(1,2)/VARIABLES=X Plot x vs y: PLOT FORMAT=REGRESSION/SYMBOLS='*' /TITLE='PLOT OF Y ON X' /VERTICAL='Y' /HORIZONTAL='X' /PLOT=Y WITH X RANDOM VARIATES GENERATORS: LOOP #I = 1 to 100. (normal with mean = 0 and std = 1) COMPUTE XNORM = RV.NORMAL(0,1) (chi-square with 2 d.f.) COMPUTE XCHISQ=RV.CHISQ(2) (exponential with mean 2) COMPUTE XEXPON = RV.EXP(1/2) (binomial n = 10 and p = .50 COMPUTE XBINOM = RV.BINOM(10, 0.5). END CASE END LOOP END FILE END INOUT PROGRAM EXAMINE VARS=ALL /STATISTICS /HISTOGRAM(NORMSAL)= XNORM /HISTOGRAM(NORMSAL)= XCHISQ /HISTOGRAM(NORMSAL)= XEXPON /HISTOGRAM(NORMSAL)= XBINOM NORMAL RANOM VARIATE GENERATOR, K-S, AND RUNS TESTS: SPSS/OUTPUT=HW3.OUT TITLE 'GENERATING FROM NORMAL 0,1' INPUT PROGRAM LOOP I=1 TO 50 COMPUTE X2=NORMAL(1) END CASE END LOOP END FILE END INPUT PROGRAM VAR LABLE X2 'NORMAL VARIATE' LIST CASE CASE=50/VARIABLE=ALL// CONDESCRIPTIVE X2(ZX2) STATISTICS ALL FREQUENCIES VARIABLE=ZX2/FORMAT=NOTABLE/ HISTOGRAM MIN(-3.0) MAX(+3.0) INCREMENT(0.2)/ NPAR TESTS RUNS(MEAN)=ZX2/ NPAR TESTS K-S(NORMAL,0.0,1.0)=ZX2/ SAMPLE 10 FROM 50 LIST CASE CASE=10/VARIABLES=X2,ZX2/ FINISH K-S LILLIEFORS TEST FOR NORMALITY: $SPSS/OUTPUT=L.OUT TITLE 'K-S LILLIEFORS TEST FOR NORMALITY' DATA LIST FREE FILE='L.DAT'/X VAR LABELS X 'SAMPLE VALUES' LIST CASE CASE=20/VARIABLES=ALL CONDESCRIPTIVE X(ZX) LIST CASE CASE=20/VARIABLES=X ZX/ SORT CASES BY ZX(A) RANK VARIABLES=ZX/RFRACTION INTO CRANK/TIES=HIGH COMPUTE Y=CDFNORM(ZX) COMPUTE SPROB=CRANK COMPUTE DA=Y-SPROB COMPUTE DB=Y-LAG(SPROB,1) COMPUTE DAABS=ABS(DA) COMPUTE DBABS=ABS(DB) COMPUTE LILLSTAT=MAX(DAABS,DBABS) LIST VARIABLES=X,ZX,Y,SPROB,DA,DB LIST VARIABLES=LILLSTAT SORT CASES BY LILLSTAT(D) LIST CASES CASE=1/VARIABLES=LILLSTAT FINISH K-S TEST FOR EXPONENTIAL DATA WITH MEAN = 1 DATA LIST FREE FILE='Ex.DAT'/ X DESCRIPTIVES VARIABLES=X STATISTICS MEAN COMPUTE Y=1.-EXP(-X) NPAR TESTS K-S(UNIFORM, 0.0, 1.0) Y (For large sample size) FINISHFor more SPSS programs useful to simulation input/output analysis, visit Data Analysis Routines.
Random Number Generators
Classical uniform random number generators have some major defects, such as, short period length and lack of higher dimension uniformity. However, nowadays there are a class of rather complex generators which is as efficient as the classical generators while enjoy the property of a much longer period and of a higher dimension uniformity.Computer programs that generate "random" numbers use an algorithm. That means if you know the algorithm and the seedvalues you can predict what numbers will result. Because you can predict the numbers they are not truly random - they are pseudorandom. For statistical purposes "good" pseudorandom numbers generators are good enough.
real function random() c c Algorithm AS 183 Appl. Statist. (1982) vol.31, no.2 c c Returns a pseudo-random numbers with rectangular distribution. c between 0 and 1. The cycle length is 6.95E+12 (See page 123 c of Applied Statistics (1984) vol.33), not as claimed in the c original article. c c IX, IY and IZ should be set to integer values between 1 and c 30000 before the first entry. c c Integer arithmetic up to 30323 is required. c integer ix, iy, iz common /randc/ ix, iy, iz c ix = 171 * mod(ix, 177) - 2 * (ix / 177) iy = 172 * mod(iy, 176) - 35 * (iy / 176) iz = 170 * mod(iz, 178) - 63 * (iz / 178) c if (ix .lt. 0) ix = ix + 30269 if (iy .lt. 0) iy = iy + 30307 if (iz .lt. 0) iz = iz + 30323 c c If integer arithmetic up to 5212632 is available, the preceding c 6 statements may be replaced by: c c ix = mod(171 * ix, 30269) c iy = mod(172 * iy, 30307) c iz = mod(170 * iz, 30323) c random = mod(float(ix) / 30269. + float(iy) / 30307. + + float(iz) / 30323., 1.0) return end c c c c real function uniform() c c Generate uniformly distributed random numbers using the 32-bit c generator from figure 3 of: L'Ecuyer, P., 1988. c The cycle length is claimed to be 2.30584E+18 c Seeds can be set by calling the routine set_uniform c It is assumed that the Fortran compiler supports long variable c names, and integer*4. c integer*4 z, k, s1, s2 common /unif_seeds/ s1, s2 save /unif_seeds/ c k = s1 / 53668 s1 = 40014 * (s1 - k * 53668) - k * 12211 if (s1 .lt. 0) s1 = s1 + 2147483563 c k = s2 / 52774 s2 = 40692 * (s2 - k * 52774) - k * 3791 if (s2 .lt. 0) s2 = s2 + 2147483399 c z = s1 - s2 if (z .lt. 1) z = z + 2147483562 c uniform = z / 2147483563. return end subroutine set_uniform(seed1, seed2) c c Set seeds for the uniform random number generator. c integer*4 s1, s2, seed1, seed2 common /unif_seeds/ s1, s2 save /unif_seeds/ s1 = seed1 s2 = seed2 return endThe Random Number Generator RANECU
A FORTRAN code for a generator of uniform random numbers on [0,1]. RANECU is multiplicative linear congruential generator suitable for a 16-bit platform. It combines three simple generators, and has a period exceeding 81012.
It is constructed for more efficient use by providing for a sequence of such numbers, LEN in total, to be returned in a single call. A set of three non-zero integer seeds can be supplied, failing which a default set is employed. If supplied, these three seeds, in order, should lie in the ranges [1,32362], [1,31726] and [1,31656] respectively.
SUBROUTINE RANECU (RVEC,LEN) C Portable random number generator for 16 bit computer. C Generates a sequence of LEN pseudo-random numbers, returned in C RVEC. DIMENSION RVEC(*) SAVE ISEED1,ISEED2, ISEED3 DATA ISEED1,ISEED2,ISEED3/1234, 5678, 9876/ C Default values, used if none suppliedvia an ENTRY C call at RECUIN DO 100 I = 1,LEN K=ISEED1/206 ISEED1 = 157 * (ISEED1 - K * 206) - K * 21 IF(ISEED1.LT.0) ISEED1=ISEED1+32363 K=ISEED2/217 ISEED2 = 146 * (ISEED2 - K*217) - K* 45 IF(ISEED2.LT.O) ISEED2=ISEED2+31727 K=ISEED3/222 ISEED3 = 142 * (ISEED3 - K *222) - K * 133 IF(ISEED3.LT.0) ISEED3=ISEED3+31657 IZ=ISEED1-ISEED2 IF(IZ.GT.706)IZ = Z - 32362 IZ = 1Z+ISEED3 IF(IZ.LT.1)IZ = 1Z + 32362 RVEC(I)=REAL(IZ) * 3.0899E - 5 100 CONTINUE RETURN ENTRY RECUIN(IS1, IS2, IS3) ISEED1=IS1 ISEED2=IS2 ISEED3=IS3 RETURN ENTRY RECUUT(IS1,IS2,IS3) IS1=ISEED1 IS2=ISEED2 IS3=ISEED3 RETURN END
The Shuffling Routine in Visual Basic
Dim Ran0Y As Double Dim Ran0V(97) As Double Function RandShuffle(idum As Integer) Dim dum As Double Dim j As Integer If idum < 0 Then Randomize (-idum) For j = 1 To 97 dum = Rnd() Next For j = 1 To 97 Ran0V(j) = Rnd() Next Ran0Y = Rnd() End If Ran0Y = Rnd() j = 1 + Int(97 * Ran0Y) If (j > 97) Or (j < 1) Then MsgBox "Error" End If Ran0Y = Ran0V(j) RandShuffle = Ran0Y Ran0V(j) = Rnd() End FunctionThe Square Histogram Method
We are given a histogram, with vertical bars having heights proportional to the probability with which we want to produce a value indicated by the label at the base.A simple such histogram, layed flat, might be:
a:********************************32 b:**************************27 c:************************26 d:************12 e:***3The idea is to cut the bars into pieces then reassemble them into a square histogram, all heights equal, with each final bar having a lower part, as well as an upper part indicating where it came from. A single uniform random variable U can then be used to choose one of the final bars and to indicate whether to use the lower or upper part. There are many ways to do this cutting and reassembling; the simplest seems to be the Robin Hood Algorithm: Take from richest to bring the poorest up to average.
STEP 1: The original (horizontal) histogram, average "height" 20:
a:******************************** 32 b:************************** 27 c:************************ 26 d:************ 12 e:*** 3Take 17 from strip 'a' to bring strip 'e' up to average. Record donor and use old 'poor' level to mark lower part of donee:
a:*************** 15 b:************************** 27 c:************************ 26 d:************ 12 e:**|****************| 20 (a)Then bring 'd' up to average with donor 'b'. Record donor and use old 'poor' level to mark lower part of donee:
a:*************** 15 b:****************** 19 c:************************ 26 d:***********|*******| 20 (b) e:*****|*************| 20 (a)Then bring 'a' up to average with donor 'c'. Record donor and use old 'poor' level to mark lower part of donee:
a:***************|***| 20(c) b:******************* 19 c:********************* 21 d:***********|*******| 20(b) e:*****|*************| 20(a)Finally, bring 'b' up to average with donor 'c'. Record donor and use old 'poor' level to mark lower part of donee:
a:**************|****| 20(c) b:******************|| 20(c) c:*******************| 20 d:***********|*******| 20(b) e:*****|*************| 20(a)We now have a "squared histogram", i.e., a rectangle with 4 strips of equal area, each strip with two regions. A single uniform variate U can be used to generate a,b,c,d,e with the required probabilities, .32, .27, .26, .12 .06.
Setup: Make tables,
V[1]=a K[1]=c T[1]=0+16/20 V[2]=b K[2]=c T[2]=1+19/20 V[3]=c K[3]=c T[3]=2+20/20 V[4]=d K[4]=b T[4]=3+12/20 V[5]=e K[5]=a T[5]=4+ 6/20Generation Process:
Let j be the integer part of 1+5*U, with U uniform in (0,1). If U < T[j] return V[j], else return V[K[j]]. In many applications no V table is necessary: V[i]=i and the generating procedure becomes If U < T[j] return j, else return K[j].
For more, visit the Web site: Modeling & Simulation Resources.
References & Further Readings:
Aiello W., S. Rajagopalan, and R. Venkatesan, Design of practical and provably good random number generators, Journal of Algorithms, 29, 358-389, 1998.
Dagpunar J., Principles of Random Variate Generation, Clarendon, 1988.
Fishman G., Monte Carlo, Springer, 1996.
James, Fortran version of L'Ecuyer generator, Comput. Phys. Comm., 60, 329-344, 1990.
Knuth D., The Art of Computer Programming, Vol. 2, Addison-Wesley, 1998.
L'Ecuyer P., Efficient and portable combined random number generators, Comm. ACM, 31, 742-749, 774, 1988.
L'Ecuyer P., Uniform random number generation, Ann. Op. Res., 53, 77-120, 1994.
L'Ecuyer P., Random number generation. In Handbook on Simulation, J. Banks (ed.), Wiley, 1998.
Maurer U., A universal statistical test for random bit generators, J. Cryptology, 5, 89-105, 1992.
Sobol' I., and Y. Levitan, A pseudo-random number generator for personal computers, Computers & Mathematics with Applications, 37(4), 33-40, 1999.
Tsang W-W., A decision tree algorithm for squaring the histogram in random number generation, Ars Combinatoria, 23A, 291-301, 1987
Test for Randomness
We need to test for both randomness as well as uniformity. The tests can be classified in 2 categories: Empirical or statistical tests, and theoretical tests.
Theoretical tests deal with the properties of the generator used to create the realization with desired distribution, and do not look at the number generated at all. For example, we would not use a generator with poor qualities to generate random numbers.
Statistical tests are based solely on the random observations produced.Test for Randomness:
A. Test for independence:
Plot the xi realization vs xi+1. If there is independence, the graph will not show any distinctive patterns at all, but will be perfectly scattered.B. Runs tests.(run-ups, run-downs):
This is a direct test of the independence assumption. There are two test statistics to consider: one based on a normal approximation and another using numerical approximations.Test based on Normal approximation:
Suppose you have N random realizations. Let a be the total number of runs in a sequence. If the number of positive and negative runs are greater than say 20, the distribution of a is reasonably approximated by a Normal distribution with mean (2N - 1) /3 and (16N - 29) / 90. Reject the hypothesis of independence or existence of runs if | Zo| > Z(1-alpha/2) where Zo is the Z score.C. Correlation tests:
Do the random numbers exhibit discernible correlation? Compute the sample Autcorrelation Function.Frequency or Uniform Distribution Test:
Use Kolmogorov-Smirimov test to determine if the realizations follow a U(0,1)References & Further Readings:
Headrick T., Fast fifth-order polynomial transforms for generating univariate and multivariate nonnormal distributions, Computational Statistics and Data Analysis, 40 (4), 685-711, 2002.
Karian Z., and E. Dudewicz, Modern Statistical Systems and GPSS Simulation, CRC Press, 1998.
Kleijnen J., and W. van Groenendaal, Simulation: A Statistical Perspective, Wiley, Chichester, 1992
Korn G., Real statistical experiments can use simulation-package software, Simulation Modelling Practice and Theory, 13(1), 39-54, 2005.
Lewis P., and E. Orav, Simulation Methodology for Statisticians, Operations Analysts, and Engineers, Wadsworth Inc., 1989
Madu Ch., and Ch-H. Kuei, Experimental Statistical Designs and Analysis in Simulation Modeling, Greenwood Publishing Group, 1993.
Pang K., Z. Yang, S. Hou, and P. Leung, Non-uniform random variate generation by the vertical strip method, European Journal of Operational Research, 142(3), 595-609, 2002.
Robert C., and G. Casella, Monte Carlo Statistical Methods, Springer, 1999.
Modeling & Simulation
Simulation in general is to pretend that one deals with a real thing while really working with an imitation. In operations research the imitation is a computer model of the simulated reality. A flight simulator on a PC is also a computer model of some aspects of the flight: it shows on the screen the controls and what the "pilot" (the youngster who operates it) is supposed to see from the "cockpit" (his armchair).
Why to use models? To fly a simulator is safer and cheaper than the real airplane. For precisely this reason, models are used in industry commerce and military: it is very costly, dangerous and often impossible to make experiments with real systems. Provided that models are adequate descriptions of reality (they are valid), experimenting with them can save money, suffering and even time.
When to use simulations? Systems that change with time, such as a gas station where cars come and go (called dynamic systems) and involve randomness. Nobody can guess at exactly which time the next car should arrive at the station, are good candidates for simulation. Modeling complex dynamic systems theoretically need too many simplifications and the emerging models may not be therefore valid. Simulation does not require that many simplifying assumptions, making it the only tool even in absence of randomness.
How to simulate? Suppose we are interested in a gas station. We may describe the behavior of this system graphically by plotting the number of cars in the station; the state of the system. Every time a car arrives the graph increases by one unit while a departing car causes the graph to drop one unit. This graph (called sample path), could be obtained from observation of a real station, but could also be artificially constructed. Such artificial construction and the analysis of the resulting sample path (or more sample paths in more complex cases) consists of the simulation.
Types of simulations: Discrete event. The above sample path consisted of only horizontal and vertical lines, as car arrivals and departures occurred at distinct points of time, what we refer to as events. Between two consecutive events, nothing happens - the graph is horizontal. When the number of events are finite, we call the simulation "discrete event."
In some systems the state changes all the time, not just at the time of some discrete events. For example, the water level in a reservoir with given in and outflows may change all the time. In such cases "continuous simulation" is more appropriate, although discrete event simulation can serve as an approximation.
Further consideration of discrete event simulations.
How is simulation performed? Simulations may be performed manually. Most often, however, the system model is written either as a computer program (for an example click here) or as some kind of input into simulator software.
System terminology:
State: A variable characterizing an attribute in the system such as level of stock in inventory or number of jobs waiting for processing.
Event: An occurrence at a point in time which may change the state of the system, such as arrival of a customer or start of work on a job.
Entity: An object that passes through the system, such as cars in an intersection or orders in a factory. Often an event (e.g., arrival) is associated with an entity (e.g., customer).
Queue: A queue is not only a physical queue of people, it can also be a task list, a buffer of finished goods waiting for transportation or any place where entities are waiting for something to happen for any reason.
Creating: Creating is causing an arrival of a new entity to the system at some point in time.
Scheduling: Scheduling is the act of assigning a new future event to an existing entity.
Random variable: A random variable is a quantity that is uncertain, such as interarrival time between two incoming flights or number of defective parts in a shipment.
Random variate: A random variate is an artificially generated random variable.
Distribution: A distribution is the mathematical law which governs the probabilistic features of a random variable.
A Simple Example: Building a simulation gas station with a single pump served by a single service man. Assume that arrival of cars as well their service times are random. At first identify the:
states: number of cars waiting for service and number of cars served at any moment
events: arrival of cars, start of service, end of service
entities: these are the cars
queue: the queue of cars in front of the pump, waiting for service
random realizations: interarrival times, service times
distributions: we shall assume exponential distributions for both the interarrival time and service time.
Next, specify what to do at each event. The above example would look like this: At event of entity arrival: Create next arrival. If the server is free, send entity for start of service. Otherwise it joins the queue. At event of service start: Server becomes occupied. Schedule end of service for this entity. At event of service end: Server becomes free. If any entities waiting in queue: remove first entity from the queue; send it for start of service.
Some initiation is still required, for example, the creation of the first arrival. Lastly, the above is translated into code. This is easy with an appropriate library which has subroutines for creation, scheduling, proper timing of events, queue manipulations, random variate generation and statistics collection.
How to simulate? Besides the above, the program records the number of cars in the system before and after every change, together with the length of each event.
Development of Systems Simulation
Discrete event systems (DES) are dynamic systems which evolve in time by the occurrence of events at possibly irregular time intervals. DES abound in real-world applications. Examples include traffic systems, flexible manufacturing systems, computer-communications systems, production lines, coherent lifetime systems, and flow networks. Most of these systems can be modeled in terms of discrete events whose occurrence causes the system to change from one state to another. In designing, analyzing and operating such complex systems, one is interested not only in performance evaluation but also in sensitivity analysis and optimization.A typical stochastic system has a large number of control parameters that can have a significant impact on the performance of the system. To establish a basic knowledge of the behavior of a system under variation of input parameter values and to estimate the relative importance of the input parameters, sensitivity analysis applies small changes to the nominal values of input parameters. For systems simulation, variations of the input parameter values cannot be made infinitely small. The sensitivity of the performance measure with respect to an input parameter is therefore defined as (partial) derivative.
Sensitivity analysis is concerned with evaluating sensitivities (gradients, Hessian, etc.) of performance measures with respect to parameters of interest. It provides guidance for design and operational decisions and plays a pivotal role in identifying the most significant system parameters, as well as bottleneck subsystems. I have carried out research in the fields of sensitivity analysis and stochastic optimization of discrete event systems with an emphasis on computer simulation models. This part of lecture is dedicated to the estimation of an entire response surface of complex discrete event systems (DES) from a single sample path (simulation), such as the expected waiting time of a customer in a queuing network, with respect to the controllable parameters of the system, such as service rates, buffer sizes and routing probabilities. With the response surfaces at hand, we are able to perform sensitivity analysis and optimization of a DES from a single simulation, that is, to find the optimal parameters of the system and their sensitivities (derivatives), with respect to uncontrollable system parameters, such as arrival rates in a queuing network. We identified three distinct processes. Descriptive Analysis includes: Problem Identification & Formulation, Data Collection and Analysis, Computer Simulation Model Development, Validation, Verification and Calibration, and finally Performance Evaluation. Prescriptive Analysis: Optimization or Goal Seeking. These are necessary components for Post-prescriptive Analysis: Sensitivity, and What-If Analysis. The prescriptive simulation attempts to use simulation to prescribe decisions required to obtain specified results. It is subdivided into two topics- Goal Seeking and Optimization. Recent developments on "single-run" algorithms for the needed sensitivities (i.e. gradient, Hessian, etc.) make the prescriptive simulation feasible.
Click on the image to enlarge it and THEN print it.Problem Formulation: Identify controllable and uncontrollable inputs. Identify constraints on the decision variables. Define measure of system performance and an objective function. Develop a preliminary model structure to interrelate the inputs and the measure of performance.
Click on the image to enlarge it and THEN print it.Data Collection and Analysis: Regardless of the method used to collect the data, the decision of how much to collect is a trade-off between cost and accuracy.
Simulation Model Development: Acquiring sufficient understanding of the system to develop an appropriate conceptual, logical and then simulation model is one of the most difficult tasks in simulation analysis.
Model Validation, Verification and Calibration: In general, verification focuses on the internal consistency of a model, while validation is concerned with the correspondence between the model and the reality. The term validation is applied to those processes which seek to determine whether or not a simulation is correct with respect to the "real" system. More prosaically, validation is concerned with the question "Are we building the right system?". Verification, on the other hand, seeks to answer the question "Are we building the system right?" Verification checks that the implementation of the simulation model (program) corresponds to the model. Validation checks that the model corresponds to reality. Calibration checks that the data generated by the simulation matches real (observed) data.
Validation: The process of comparing the model's output with the behavior of the phenomenon. In other words: comparing model execution to reality (physical or otherwise)
Verification: The process of comparing the computer code with the model to ensure that the code is a correct implementation of the model.
Calibration: The process of parameter estimation for a model. Calibration is a tweaking/tuning of existing parameters and usually does not involve the introduction of new ones, changing the model structure. In the context of optimization, calibration is an optimization procedure involved in system identification or during experimental design.Input and Output Analysis: Discrete-event simulation models typically have stochastic components that mimic the probabilistic nature of the system under consideration. Successful input modeling requires a close match between the input model and the true underlying probabilistic mechanism associated with the system. The input data analysis is to model an element (e.g., arrival process, service times) in a discrete-event simulation given a data set collected on the element of interest. This stage performs intensive error checking on the input data, including external, policy, random and deterministic variables. System simulation experiment is to learn about its behavior. Careful planning, or designing, of simulation experiments is generally a great help, saving time and effort by providing efficient ways to estimate the effects of changes in the model's inputs on its outputs. Statistical experimental-design methods are mostly used in the context of simulation experiments.
Performance Evaluation and What-If Analysis: The `what-if' analysis is at the very heart of simulation models.
Sensitivity Estimation: Users must be provided with affordable techniques for sensitivity analysis if they are to understand which relationships are meaningful in complicated models.
Optimization: Traditional optimization techniques require gradient estimation. As with sensitivity analysis, the current approach for optimization requires intensive simulation to construct an approximate surface response function. Incorporating gradient estimation techniques into convergent algorithms such as Robbins-Monroe type algorithms for optimization purposes, will be considered.
Gradient Estimation Applications: There are a number of applications which measure sensitivity information, (i.e., the gradient, Hessian, etc.), Local information, Structural properties, Response surface generation, Goal-seeking problem, Optimization, What-if Problem, and Meta-modelling
Report Generating: Report generation is a critical link in the communication process between the model and the end user.
A Classification of Stochastic Processes
A stochastic process is a probabilistic model of a system that evolves randomly in time and space. Formally, a stochastic process is a collection of random variables {X(t), t Î T} all defined on a common sample (probability) space. The X(t) is the state while (time) t is the index that is a member of set T.Examples are the delay {D(i), i = 1, 2, ...} of the ith customer and number of customers {Q(t), T ³ 0} in the queue at time t in an M/M/1 queue. In the first example, we have a discrete- time, continuous state, while in the second example the state is discrete and time in continuous.
The following table is a classification of various stochastic processes. The man made systems have mostly discrete state. Monte Carlo simulation deals with discrete time while in discrete even system simulation the time dimension is continuous, which is at the heart of this site.
Change in the States of the System Continuous Discrete Time Continuous Level of water
behind a damNumber of
customers in a bankDiscrete Weekdays' range
of temperatureSales at the
end of the dayA Classification of Stochastic Processes
Simulation Output Data and Stochastic Processes
To perform statistical analysis of the simulation output we need to establish some conditions, e.g. output data must be a covariance stationary process (e.g. the data collected over n simulation runs).Stationary Process (strictly stationary): A stationary stochastic process is a stochastic process {X(t), t Î T} with the property that the joint distribution all vectors of h dimension remain the same for any fixed h.
First Order Stationary: A stochastic process is a first order stationary if expected of X(t) remains the same for all t.
For example in economic time series, a process is first order stationary when we remove any kinds of trend by some mechanisms such as differencing.
Second Order Stationary: A stochastic process is a second order stationary if it is first order stationary and covariance between X(t) and X(s) is function of t-s only.
Again, in economic time series, a process is second order stationary when we stabilize also its variance by some kind of transformations such as taking square root.
Clearly, a stationary process is a second order stationary, however the reverse may not hold.
In simulation output statistical analysis we are satisfied if the output is covariance stationary.
Covariance Stationary: A covariance stationary process is a stochastic process {X(t), t Î T} having finite second moments, i.e. expected of [X(t)]2 be finite.
Clearly, any stationary process with finite second moment is covariance stationary. A stationary process may have no finite moment whatsoever.
Since a Gaussian process needs a mean and covariance matrix only, it is stationary (strictly) if it is covariance stationary.
Two Contrasting Stationary Process:
Consider the following two extreme stochastic processes:
- A sequence Y0, Y1,....., of independent identically distributed, random-value sequence is a stationary process, if its common distribution has a finite variance then the process is covariance stationary.
- Let Z be a single random variable with known distribution function, and set Z0 = Z1 = ....Z. Note that in a realization of this process, the first element, Z0, may be random but after that there is no randomness. The process {Zi, i = 0, 1, 2, ..} is stationary if Z has a finite variance.
Output data in simulation fall between these two type of process. Simulation outputs are identical, and mildly correlated (how mild? It depends on e.g. in a queueing system how large is the traffic intensity r). An example could be the delay process of the customers in a queueing system.
Techniques for the Steady State Simulation
Unlike in queuing theory where steady state results for some models are easily obtainable, the steady state simulation is not an easy task. The opposite is true for obtaining results for the transient period (i.e., the warm-up period).Gather steady state simulation output requires statistical assurance that the simulation model reached the steady state. The main difficulty is to obtain independent simulation runs with exclusion of the transient period . The two technique commonly used for steady state simulation are the Method of Batch means, and the Independent Replication.
None of these two methods is superior to the other in all cases. Their performance depend on the magnitude of the traffic intensity. The other available technique is the Regenerative Method, which is mostly used for its theoretical nice properties, however it is rarely applied in actual simulation for obtaining the steady state output numerical results.
![]()
Suppose you have a regenerative simulation consisting of m cycles of size n1, n2,…nm, respectively. The cycle sums is:
yi = S xij / ni, the sum is over j=1, 2, ..,ni The overall estimate is:
Estimate = Syi / S ni, the sums are over i=1, 2, ..,m The 100(1-a/2)% confidence interval using the Z-table (or T-table, for m less than, say 30), is:Estimate ± Z. S/ (n. m½) where,n = S ni /m, the sum is over i=1, 2, ..,m and the variance is:S2 = S (yi - ni . Estimate)2/(m-1), the sum is over i=1, 2, ..,m
Method of Batch Means: This method involves only one very long simulation run which is suitably subdivided into an initial transient period and n batches. Each of the batch is then treated as an independent run of the simulation experiment while no observation are made during the transient period which is treated as warm-up interval. Choosing a large batch interval size would effectively lead to independent batches and hence, independent runs of the simulation, however since number of batches are few on cannot invoke the central limit theorem to construct the needed confidence interval. On the other hand, choosing a small batch interval size would effectively lead to significant correlation between successive batches therefore cannot apply the results in constructing an accurate confidence interval.
![]()
Suppose you have n equal batches of m observations each. The means of each batch is:
meani = S xij / m, the sum is over j=1, 2, ..,m The overall estimate is:Estimate = Smeani / n, the sum is over i=1, 2, ..,n The 100(1-a/2)% confidence interval using the Z-table (or T-table, for n less than, say 30), is:
Estimate ± Z. S where the variance is:S2 = S (meani - Estimate)2/(n-1), the sum is over i=1, 2, ..,n Method of Independent Replications: This method is the most popularly used for systems with short transient period. This method requires independent runs of the simulation experiment different initial random seeds for the simulators' random number generator. For each independent replications of the simulation run it transient period is removed. For the observed intervals after the transient period data is collected and processed for the point estimates of the performance measure and for its subsequent confidence interval.
![]()
Suppose you have n replications with of m observations each. The means of each replication is:
meani = S xij / m, the sum is over j=1, 2, ..,m The overall estimate is:Estimate = Smeani / n, the sum is over i=1, 2, ..,n The 100(1-a/2)% confidence interval using the Z-table (or T-table, for n less than, say 30), is:Estimate ± Z. S where the variance is:
S2 = S (meani - Estimate)2/(n-1), the sum is over i=1, 2, ..,n
Further Reading:
Sherman M., and D. Goldsman, Large-sample normality of the batch-means variance estimator, Operations Research Letters, 30, 319-326, 2002.
Whitt W., The efficiency of one long run versus independent replications in steady-state simulation, Management Science, 37(6), 645-666, 1991.
Determination of the Warm-up Period
To estimate the long-term performance measure of the system, there are several methods such as Batch Means, Independent Replications and Regenerative Method.Batch Means is a method of estimating the steady-state characteristic from a single-run simulation. The single run is partitioned into equal size batches large enough for estimates obtained from different batches to be approximately independent. In the method of Batch Means, it is important to ensure that the bias due to initial conditions is removed to achieve at least a covariance stationary waiting time process. An obvious remedy is to run the simulation for a period large enough to remove the effect of the initial bias. During this warm-up period, no attempt is made to record the output of the simulation. The results are thrown away. At the end of this warm-up period, the waiting time of customers are collected for analysis. The practical question is "How long should the warm-up period be?". Abate and Whitt provided a relatively simple and nice expression for the time required (tp) for an M/M/1/ queue system (with traffic intensity r) starting at the origin (empty) to reach and remain within 100p% of the steady- state limit as follows:
tp(r) = 2C(r) Ln {1/[(1-p)(1+2C(r))]}/(1-r)2 where
C(r)=[2+ r + ( r2 + 4r )½] / 4. Some notions of tp(r) as a function of r and p, are given in following table:
Traffic
Intensity100p r 95.0 99.0 99.9 99.99 0.10 3.61 6.33 10.23 14.12 0.20 5.01 8.93 14.53 20.14 0.30 7.00 12.64 20.71 28.79 0.40 10.06 18.39 30.31 42.23 0.50 15.18 28.05 46.47 64.89 0.60 24.70 46.13 76.79 107.45 0.70 45.51 85.87 143.61 201.36 0.80 105.78 201.53 338.52 475.51 0.90 435.74 838.10 1413.70 1989.40 Time (tp) required for an M/M/1 queue to reach and
remain with 100p% limits of the steady-state value.Although this result is developed for M/M/1 queues, it has already been established that it can serve as an approximation for more general; i.e., GI/G/1 queues.
Further Reading:
Abate J., and W. Whitt, Transient behavior of regular Brownian motion, Advance Applied Probability, 19, 560-631, 1987.
Chen E., and W. Kelton, Determining simulation run length with the runs test, Simulation Modelling Practice and Theory, 11, 237-250, 2003.
Determination of the Desirable Number of Simulation Runs
The two widely used methods for experimentation on simulation models are method of bath means, and independent replications. Intuitively one may say the method of independent replication is superior in producing statistically a "good" estimate for the system's performance measure. In fact, not one method is superior in all cases and it all depends on the traffic intensity r.After deciding what method is more suitable to apply, the main question is determination of number of runs. That is, at the planning stage of a simulation investigation of the question of number of simulation runs (n) is critical.
The confidence level of simulation output drawn from a set of simulation runs depends on the size of data set. The larger the number of runs, the higher is the associated confidence. However, more simulation runs also require more effort and resources for large systems. Thus, the main goal must be in finding the smallest number of simulation runs that will provide the desirable confidence.
Pilot Studies: When the needed statistics for number of simulation runs calculation is not available from existing database, a pilot simulation is needed.
For large pilot simulation runs (n), say over 30, the simplest number of runs determinate is:
[(Za/2)2 S2] / d2 where d is the desirable margin of error (i.e., the absolute error), which is the half-length of the confidence interval with 100(1- a)% confidence interval. S2 is the variance obtained from the pilot run.
One may use the following sample size determinate for a desirable relative error D in %, which requires an estimate of the coefficient of variation (C.V. in %) from a pilot run with n over 30:
[(Za/2)2 (C.V.)2] / D2
These sample size determinates could also be used for simulation output estimation of unimodal output populations, with discrete or continuous random variables provided the pilot run size (n) is larger than (say) 30.
The aim of applying any one of the above number of runs determinates is at improving your pilot estimates at feasible costs.
You may like using the following Applet for determination of number of runs.
Further Reading:
Díaz-Emparanza I, Is a small Monte Carlo analysis a good analysis? Checking the size power and consistency of a simulation-based test, Statistical Papers, 43(4), 567-577, 2002.
Whitt W., The efficiency of one long run versus independent replications in steady-state simulation, Management Science, 37(6), 645-666, 1991.
Determination of Simulation Runs' Size
At the planning stage of a simulation modeling the question of number of simulation runs (n) is critical. The following Java applets compute the needed Runs Size based on current avialable information ontained from a pilot simulation run, to achieve an acceptable accuracy and/or risk.
Enter the needed information, and then click the Calculate button.
The aim of applying any one of the following number of simulation runs determinates is at improving your pilot estimates at a feasible cost.
Notes: The normality condition might be relaxed for number of simulation runs over, say 30. Moreover, determination of number of simulation runs for mean could also be used for other unimodal simulation output distributions including those with discrete random variables, such as proportion, provided the pilot run is sufficiently large (say, over 30).
Runs' Size with Acceptable
Absolute Precision
Simulation Software Selection
The vast amount of simulation software available can be overwhelming for the new users. The following are only a random sample of software in the market today:ACSL, APROS, ARTIFEX, Arena, AutoMod, C++SIM, CSIM, Call$im, FluidFlow, GPSS, Gepasi, JavSim, MJX, MedModel, Mesquite, Multiverse, NETWORK, OPNET Modeler, POSES++, Simulat8, Powersim, QUEST, REAL, SHIFT, SIMPLE++, SIMSCRIPT, SLAM, SMPL, SimBank, SimPlusPlus, TIERRA, Witness, SIMNON, VISSIM, and javasim.
There are several things that make an ideal simulation package. Some are properties of the package, such as support, reactivity to bug notification, interface, etc. Some are properties of the user, such as their needs, their level of expertise, etc. For these reasons asking which package is best is a sudden failure of judgment. The first question to ask is for what purpose you need the software? Is it for education, teaching, student-projects or research?
The main question is: What are the important aspects to look for in a package? The answer depends on specific applications. However some general criteria are: Input facilities, Processing that allows some programming, Optimization capability, Output facilities, Environment including training and support services, Input-output statistical data analysis capability, and certainly the Cost factor.
You must know which features are appropriate for your situation, although, this is not based on a "Yes" or "No" judgment.
For description of available simulation software, visit Simulation Software Survey.
Reference & Further Reading:
Nikoukaran J., Software selection for simulation in manufacturing: A review, Simulation Practice and Theory, 7(1), 1-14, 1999.
Animation in Systems Simulation
Animation in systems simulation is a useful tool. Most graphically based software packages have default animation. This is quite useful for model debugging, validation, and verification. This type of animation comes with little or no additional effort and gives the modeler additional insight into how the model. This type of animation comes with little or no additional effort and gives the modeler additional insight into how the model works. However, it augments the modeling tools available. The more realistic animation presents qualities which intend to be useful to the decision-maker in implementing the developed simulation model. There are also, good model management tools. Some tools have been developed which combined a database with simulation to store models, data, results, and animations. However, there is not one product that provides all of those capabilities.
Without computer one cannot perform any realistic dynamic systems simulation.
SIMSCRIPT II.5 is a powerful, free-format, English-like simulation language designed to greatly simplify writing programs for simulation modelling. Programs written in SIMSCRIPT II.5 are easily read and maintained. They are accurate, efficient, and generate results which are acceptable to users. Unlike other simulation programming languages, SIMSCRIPT II.5 requires no coding in other languages. SIMSCRIPT II.5 has been fully supported for over 33 years. Contributing to the wide acceptance and success of SIMSCRIPT II.5 modelling are:
DESIGN:
A powerful worldview, consisting of Entities and Processes, provides a natural conceptual framework with which to relate real objects to the model.
PROGRAMMING:
SIMSCRIPT II.5 is a modern, free-form language with structured programming constructs and all the built-in facilities needed for model development. Model components can be programmed so they clearly reflect the organization and logic of the modeled system. The amount of program needed to model a system is typically 75% less than its FORTRAN or C counterpart.
DEBUGGER:
A well designed package of program debug facilities is provided. The required tools are available to detect errors in a complex computer program without resorting an error. Simulation status information is provided, and control is optionally transferred to a user program for additional analysis and output.
EVOLUTION:
This structure allows the model to evolve easily and naturally from simple to detailed formulation as data becomes available. Many modifications, such as the choice of set disciplines and statistics are simply specified in the Preamble.
DOCUMENTATION:
You get a powerful, English-like language supporting a modular implementation. Because each model component is readable and self-contained, the model documentation is the model listing; it is never obsolete or inaccurate.
For more information contact SIMSCRIPT
Guidelines for Running SIMSCRIPT on the VAX System
Network Access/Utilities Connect UBE Username: Password: Get $ sign Step 1. Create and edit your source program. Type in $EDT PROG.SIM Step 2. Attach SIMSCRIPT. Type in $SIMSCRIPT Step 3. Compile the source program file. Type in $SIMSCOMP PROG.SIM Check for compilation errors. To locate the errors, type in $EDT PROG.LIST Step 4. Link the object file. Type in $SIMLINK PROG now you have a file containing your program in an
executable format. Step 5. To execute the program type in $RUN PROG Step 6. To get your hard copy print, type $PRINT/NAME=your own name PROG.OUT, PROG.LIS Alternatively, use submit command to place the command procedure
in the batch job que. Create a command file say PROG.COM containing: $SIMSCRIPT $SIMSCOMP PROG.SIM $SIMLINK PROG $RUN PROG Then, submit $Submit PROG.COM An Example: '' Solving an analytic equation arising from optimization '' of a coherent reliability system with 3 homogeneous components
Preamble Define V as a real 1-dimensional arrays Define I, N as integer variables End Main Open 3 for output, Name = "PROG.OUT" Use 3 for ouTPUT LET N=50 Reserve V(*) as N LET V(1)=1. For I = 1 to N-1 DO LET U=V(I) LET PPRAM=-9./((1+U)**2) + 9./((2.+U)**2) + 1./U**2 LET V(I+1)=V(I)+PPRAM/I LOOP PRINT 1 LINE WITH V(N) AND PPRAM THUS OPTIMAL RATE IS ****.*****, DERIVATIVE IS ***.***** END The output OPTIMAL RATE IS 0.76350, DERIVATIVE IS -0.00000
System Dynamics and Discrete Event Simulation
The modeling techniques used by system dynamics and discrete event simulations are often different at two levels: The modeler way of representing systems might be different, the underlying simulators' algorithms are also different. Each technique is well tuned to the purpose it is intended. However, one may use a discrete event approach to do system dynamics and vice versa.
Traditionally, the most important distinction is the purpose of the modeling. The discrete event approach is to find, e.g., how many resources the decision maker needs such as how many trucks, and how to arrange the resources to avoid bottlenecks, i.e., excessive of waiting lines, waiting times, or inventories. While the system dynamics approach is to prescribe for the decision making to, e.g., timely respond to any changes, and how to change the physical structure, e.g., physical shipping delay time, so that inventories, sales, production, etc.
System dynamics is the rigorous study of problems in system behavior using the principles of feedback, dynamics and simulation. In more words system dynamics is characterized by:
- Searching for useful solutions to real problems, especially in social systems (businesses, schools, governments,...) and the environment.
- Using computer simulation models to understand and improve such systems.
- Basing the simulation models on mental models, qualitative knowledge and numerical information.
- Using methods and insights from feedback control engineering and other scientific disciplines to assess and improve the quality of models.
- Seeking improved ways to translate scientific results into achieved implemented improvement.
- Systems dynamics approach looks at systems at a very high level so is more suited to strategic analysis. Discrete event approach may look at subsystems for a detailed analysis and is more suited, e.g., to process re-engineering problems.
- Systems dynamics is indicative, i.e., helps us understand the direction and magnitude of effects (i.e., where in the system do we need to make the changes), whereas discrete event approach is predictive (i.e., how many resources do we need to achieve a certain goal of throughout).
- Systems dynamics analysis is continuous in time and it uses mostly deterministic analysis, whereas discrete event process deals with analysis in a specific time horizon and uses stochastic analysis.
Some interesting and useful areas of system dynamics modeling approach are:
- Short-term and long term forecasting of agricultural produce with special reference to field crops and perennial fruits such as grapes, which have significant processing sectors of different proportions of total output where both demand and supply side perspectives are being considered.
- Long term relationship between the financial statements of balance sheet, income statement and cash flow statement balanced against scenarios of the stock market's need to seek a stable/growing share price combined with a satisfactory dividend and related return on shareholder funds policy.
- Managerial applications include the development and evaluation of short-term and long-term strategic plans, budget analysis and assessment, business audits and benchmarking.
A modeler must consider both as complementary tools to each other. Systems dynamic to look at the high level problem and identify areas which need more detailed analysis. Then, use discrete event modeling tools to analyze (and predict) the specific areas of interest.
What Is Social Simulation?
Social scientists have always constructed models of social phenomena. Simulation is an important method for modeling social and economic processes. In particular, it provides a "middle way" between the richness of discursive theorizing and rigorous but restrictive mathematical models. There are different types of computer simulation and their application to social scientific problems.Faster hardware and improved software have made building complex simulations easier. Computer simulation methods can be effective for the development of theories as well as for prediction. For example, macro-economic models have been used to simulate future changes in the economy; and simulations have been used in psychology to study cognitive mechanisms.
The field of 'social simulation' seems to be following an interesting line of inquiry. As a general approach in the field, a 'world' is specified with much computational detail. Then the 'world' is simulated (using computers) to reveal some of the 'non-trivial' implications (or 'emergent properties') of the 'world'. When these 'non trivial' implications are made known (fed back) in world, apparently it constitutes some 'added values'.
Artificial Life is an interdisciplinary study enterprise aimed at understanding life-as-it-is and life-as-it-could-be, and at synthesizing life-like phenomena in chemical, electronic, software, and other artificial media. Artificial Life redefines the concepts of artificial and natural, blurring the borders between traditional disciplines and providing new media and new insights into the origin and principles of life.
Simulation allows the social scientist to experiment with ‘artificial societies' and explore the implications of theories in ways not otherwise possible.
Reference and Further Readings:
Gilbert N., and K. Troitzsch, Simulation for the Social Scientist, Open University Press, Buckingham, UK, 1999.
Sichman J., R. Conte, and N. Gilbert, (eds,), Multi-Agent Systems and Agent-Based Simulation, Berlin, Springer-Verlag, 1998.
What Is Web-based Simulation?
Web-based simulation is quickly emerging as an area of significant interest for both simulation researchers and simulation practitioners. This interest in web-based simulation is a natural outgrowth of the proliferation of the World-Wide Web and its attendant technologies, e.g. HTML, HTTP, CGI, etc. Also the surging popularity of, and reliance upon, computer simulation as a problem solving and decision support systems tools.The appearance of the network-friendly programming language, Java, and of distributed object technologies like the Common Object Request Broker Architecture (CORBA) and the Object Linking and Embedding / Component Object Model (OLE/COM) have had particularly acute effects on the state of simulation practice.
Currently, the researchers in the field of web-based simulation are interested in dealing with topics such as methodologies for web-based model development, collaborative model development over the Internet, Java-based modeling and simulation, distributed modeling and simulation using web technologies, and new applications.
Parallel and Distributed Simulation
The increasing size of the systems and designs requires more efficient simulation strategies to accelerate the simulation process. Parallel and distributed simulation approaches seem to be a promising approach in this direction. Current topics under extensive research are:Synchronization, scheduling, memory management, randomized and reactive/adaptive algorithms, partitioning and load balancing.
Synchronization in multi-user distributed simulation, virtual reality environments, HLA, and interoperability.
System modeling for parallel simulation, specification, re-use of models/code, and parallelizing existing simulations.
Language and implementation issues, models of parallel simulation, execution environments, and libraries.
Theoretical and empirical studies, prediction and analysis, cost models, benchmarks, and comparative studies.
Computer architectures, VLSI, telecommunication networks, manufacturing, dynamic systems, and biological/social systems.
Web based distributed simulation such as multimedia and real time applications, fault tolerance, implementation issues, use of Java, and CORBA.
References & Further Readings:
Bossel H., Modeling & Simulation, A. K. Peters Pub., 1994.
Delaney W., and E. Vaccari, Dynamic Models and Discrete Event Simulation, Dekker, 1989.
Fishman G., Discrete-Event Simulation: Modeling, Programming and Analysis, Springer-Verlag, Berlin, 2001.
Fishwick P., Simulation Model Design and Execution: Building Digital Worlds, Prentice-Hall, Englewood Cliffs, 1995.
Ghosh S., and T. Lee, Modeling & Asynchronous Distributed Simulation: Analyzing Complex Systems, IEEE Publications, 2000.
Gimblett R., Integrating Geographic Information Systems and Agent-Based Modeling: Techniques for Simulating Social and Ecological Processes, Oxford University Press, 2002.
Harrington J., and K. Tumay, Simulation Modeling Methods: An Interactive Guide to Results-Based Decision, McGraw-Hill, 1998.
Haas P., Stochastic Petri Net Models Modeling and Simulation, Springer Verlag, 2002.
Hill D., Object-Oriented Analysis and Simulation Modeling, Addison-Wesley, 1996.
Kouikoglou V., and Y. Phillis, Hybrid Simulation Models of Production Networks, Kluwer Pub., 2001.
Law A., and W. Kelton, Simulation Modeling and Analysis, McGraw-Hill, 2000.
Nelson B., Stochastic Modeling: Analysis & Simulation, McGraw-Hill, 1995.
Oakshott L., Business Modelling and Simulation, Pitman Publishing, London, 1997.
Pidd M., Computer Simulation in Management Science, Wiley, 1998.
Rubinstein R., and B. Melamed, Modern Simulation and Modeling, Wiley, 1998.
Severance F., System Modeling and Simulation: An Introduction, Wiley, 2001.
Van den Bosch, P. and A. Van der Klauw, Modeling, Identification & Simulation of Dynamical Systems, CRC Press, 1994.
Woods R., and K. Lawrence, Modeling and Simulation of Dynamic Systems, Prentice Hall, 1997.
Techniques for Sensitivity Estimation
Simulation continues to be the primary method by which engineers and managers obtain information about complex stochastic systems, such as telecommunication networks, health service, corporate planning, financial modeling, production assembly lines, and flexible manufacturing systems. These systems are driven by the occurrence of discrete events; and complex interactions within these discrete events occur over time. For most discrete event systems (DES) no analytical methods are available, so DES must be studied via simulation. DES are studied to understand their performance, and to determine the best ways to improve their performance. In particular, one is often interested in how system performance depends on the system's parameter v, which could be a vector.DES's system performance is often measured as an expected value. Consider a system with continuous parameter v Î V Í Rn, where V is an open set. Let
J(v) = EY | v [Z (Y)] be the steady state expected performance measure, where Y is a random vector with known probability density function (pdf), f(y; v) depends on v, and Z is the performance measure.
In discrete event systems, Monte Carlo simulation is usually needed to estimate J(v) for a given value v = v0. By the law of large numbers
J(v0) = 1/n S Z (y i) converges to the true value, where yi, i = 1, 2,..., n are independent, identically distributed, random vector realizations of Y from f (y; v 0), and n is the number of independent replications.
We are interested in sensitivities estimation of J(v) with respect to v.
Applications of sensitivity information
There are a number of areas where sensitivity information (the gradient, Hessian, etc.) of a performance measure J(v) or some estimate of it, is used for the purpose of analysis and control. In what follows, we single out a few such areas and briefly discuss them.Local information: An estimate for dJ/dv is a good local measure of the effect of on performance. For example, simply knowing the sign of the derivative dJ/dv at some point v immediately gives us the direction in which v should be changed. The magnitude of dJ/d? also provides useful information in an initial design process: If dJ/dv is small, we conclude that J is not very sensitive to changes in , and hence focusing concentration on other parameters may improve performance.
Structural properties: Often sensitivity analysis provides not only a numerical value for the sample derivative, but also an expression which captures the nature of the dependence of a performance measure on the parameter v . The simplest case arises when dJ/dv can be seen to be always positive (or always negative) for any sample path; we may not be able to tell if the value of J(v) is monotonically increasing (or decreasing) in v . This information in itself is very useful in design and analysis. More generally, the form of dJ/dv can reveal interesting structural properties of the DES (e.g., monotonicity, convexity). Such properties must be exploited in order to determine optimal operating policies for some systems.
Response surface generation: Often our ultimate goal is to obtain the function J(v), i.e., a curve describing how the system responds to different values of v . Since J(v) is unknown, one alternative is to obtain estimates of J(v) for as many values of v as possible. This is clearly a prohibitively difficult task. Derivative information, however may include not only first-order but also higher derivatives which can be used to approximate J(v). If such derivative information can be easily and accurately obtained, the task of response surface generation may be accomplished as well.
Goal-seeking and What-if problems: Stochastic models typically depend upon various uncertain parameters that must be estimated from existing data sets. Statistical questions of how input parameter uncertainty propagates through the model into output parameter uncertainty is the so-called "what-if" analysis. A good answer to this question often requires sensitivity estimates. The ordinary simulation output results are the solution of a direct problem: Given the underlying pdf with a particular parameter value v , we may estimate the output function J(v). Now we pose the goal-seeking problem: given a target output value J0 of the system and a parameterized pdf family, find an input value for the parameter, which generates such an output. There are strong motivations for both problems. When v is any controllable or uncontrollable parameter [the decision maker is, for example, interested in estimating J(v) for a small change in v ], the so called "what-if" problem, which is a "direct problem" and can be solved by incorporating sensitivity information in the Taylor's expansion of J(v) in the neighborhood of v . However, when v is a controllable input, the decision maker may be interested in the goal-seeking problem: what change in the input parameter will achieve a desired change in output value J(v). Another application of goal-seeking arises when we want to adapt a model to satisfy a new equality constraint (condition) for some stochastic function. The solution to the goal-seeking problem is to estimate the derivative of the output function with respect to the input parameter for the nominal system; use this estimate in a Taylor's expansion of the output function in the neighborhood of the parameter; and finally, use Robbins-Monro (R-M) type of stochastic approximation algorithm to estimate the necessary controllable input parameter value within the desired accuracy.
Optimization: Discrete-event simulation is the primary analysis tool for designing complex systems. However, simulation must be linked with a mathematical optimization technique to be effectively used for systems design. The sensitivity dJ/dv can be used in conjunction with various optimization algorithms whose function is to gradually adjust v until a point is reached where J(v) is maximized (or minimized). If no other constraints on v are imposed, we expect dJ/dv = 0 at this point .