Importance sampling

Importance sampling is a general technique for estimating the properties of a particular distribution, while only having samples generated from a different distribution than the distribution of interest. Depending on the application, the term may refer to the process of sampling from this alternative distribution, the process of inference, or both.
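Basic theory

More formally, let X be a random variable in S. Let p be a probability measure on S, and f some function on S. Then the expectation of f under p can be written as

$$E[f(X) \mid p] = \int_S f(x)\, p(x)\, dx.$$

If we have random samples $x_1, \ldots, x_n$ generated according to p, we can easily obtain the Monte Carlo empirical estimate of $E[f(X) \mid p]$:

$$\hat{E}_n[f(X) \mid p] = \frac{1}{n} \sum_{i=1}^{n} f(x_i).$$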
Unfortunately, when the samples are generated from a distribution other than the one we are interested in, we can no longer use this straightforward estimate. However, we may use the importance sampling technique, which places a different importance on each sample, depending on how likely it was to have been generated by the distribution of interest, p, rather than by the actual sampling distribution, q. More formally, consider another probability measure, q, with the same support as p. From the definition of the expectation given above, we have

$$E[f(X) \mid p] = \int_S f(x)\, \frac{p(x)}{q(x)}\, q(x)\, dx = E[f(X)\, w(X) \mid q],$$

where $w(x) = p(x)/q(x)$ is the importance weight. Given random samples $x_1, \ldots, x_n$ generated according to q, an empirical estimate of $E[f(X) \mid p]$ is therefore

$$\hat{E}_n[f(X) \mid p] = \frac{1}{n} \sum_{i=1}^{n} f(x_i)\, w(x_i).$$
The technique is completely general, and the above analysis can be repeated essentially unchanged for other choices of p, for example when it represents a conditional distribution. Note that when p is the uniform distribution, we are just estimating the (scaled) integral of f over S, so the method can also be used for estimating simple integrals.

There are two main applications of importance sampling methods, which, naturally, are interrelated. While the aim of both is to estimate statistics of random variables, the field of probabilistic inference focuses more on the estimation of p or related statistics, while the field of simulation focuses more on the choice of the distribution q. Nevertheless, the basic theory and tools are identical.

Application to probabilistic inference

Such methods are frequently used to estimate posterior densities or expectations in state and/or parameter estimation problems in probabilistic models that are too hard to treat analytically.

Application to simulation

Importance sampling (IS) is a variance reduction technique that can be used in the Monte Carlo method. The idea behind IS is that certain values of the input random variables in a simulation have more impact on the parameter being estimated than others. If these "important" values are emphasized by sampling them more frequently, then the estimator variance can be reduced. Hence, the basic methodology in IS is to choose a distribution which "encourages" the important values. Using such a "biased" distribution directly in the simulation would result in a biased estimator. However, the simulation outputs are weighted to correct for the use of the biased distribution, and this ensures that the new IS estimator is unbiased. The weight is given by the likelihood ratio, that is, the Radon-Nikodym derivative of the true underlying distribution with respect to the biased simulation distribution.
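The weighting idea described above can be sketched in a few lines of Python. This is only an illustration under assumed choices that do not come from the article: the target p is a standard normal, the sampling distribution q is a normal with standard deviation 2, and f(x) = x^2, so the true expectation under p is 1. The names `normal_pdf` and `importance_estimate` are hypothetical helpers.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def importance_estimate(f, p_pdf, q_pdf, q_sampler, n, rng):
    """Estimate E_p[f(X)] from n samples drawn from q, weighting each by p/q."""
    total = 0.0
    for _ in range(n):
        x = q_sampler(rng)          # sample from the proposal q, not from p
        w = p_pdf(x) / q_pdf(x)     # importance weight w(x) = p(x) / q(x)
        total += f(x) * w
    return total / n


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed setup: target p = N(0, 1), proposal q = N(0, 2^2), f(x) = x^2.
estimate = importance_estimate(
    f=lambda x: x * x,
    p_pdf=lambda x: normal_pdf(x, 0.0, 1.0),
    q_pdf=lambda x: normal_pdf(x, 0.0, 2.0),
    q_sampler=lambda r: r.gauss(0.0, 2.0),
    n=100_000,
    rng=rng,
)
print("IS estimate of E_p[X^2]:", estimate)
```

The proposal here is deliberately wider than the target, so the weights p/q stay bounded; a proposal with lighter tails than the target could give weights with very large variance.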
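The fundamental issue in implementing IS simulation is the choice of the biased distribution which encourages the important regions of the input variables. Choosing or designing a good biased distribution is the "art" of IS. The reward for a good distribution can be huge run-time savings; the penalty for a bad distribution can be longer run times than for a general Monte Carlo simulation without importance sampling.

Mathematical Approach

Consider estimating by simulation the probability $p_t$ of an event $X \ge t$, where X is a random variable with probability density function $f$. Importance sampling is concerned with the determination and use of an alternate density function $f_*$ (for X), usually referred to as a biasing density, for the simulation experiment. From the definition of $p_t$, we can introduce $f_*$ as follows:

$$p_t = E[1(X \ge t)] = \int 1(x \ge t)\, \frac{f(x)}{f_*(x)}\, f_*(x)\, dx = E_*[1(X \ge t)\, W(X)],$$

where $W(x) = f(x)/f_*(x)$ is a likelihood ratio and is referred to as the weighting function. The last equality in the above equation motivates the estimator

$$\hat{p}_t = \frac{1}{K} \sum_{i=1}^{K} 1(X_i \ge t)\, W(X_i), \qquad X_i \sim f_*.$$

This is the IS estimator of $p_t$, computed from K independent samples drawn from $f_*$. The IS problem then focuses on finding a biasing density $f_*$ such that the variance of the IS estimator is less than the variance of the general Monte Carlo estimate.

Conventional biasing methods

Although there are many kinds of biasing methods, the following two are the most widely used in applications of IS.

Scaling

Shifting probability mass into the event region $X \ge t$ by positive scaling of the random variable X with a number greater than unity gives the density a heavier tail, which increases the event probability. In IS by scaling, the simulation density is chosen as the density function of the scaled random variable $aX$, where usually $a > 1$ for tail probability estimation. By transformation,

$$f_*(x) = \frac{1}{a}\, f\!\left(\frac{x}{a}\right),$$

and the weighting function is

$$W(x) = a\, \frac{f(x)}{f(x/a)}.$$

While scaling shifts probability mass into the desired event region, it also pushes mass into the complementary region $X < t$, which is undesirable. If X is a sum of n random variables, the spreading of mass takes place in an n-dimensional space, so the IS gain decreases as n increases. This is called the dimensionality effect.

Translation

Another simple and effective biasing technique employs translation of the density function (and hence of the random variable) to place much of its probability mass in the rare event region. Translation does not suffer from a dimensionality effect and has been successfully used in several applications relating to simulation of digital communication systems. It often provides better simulation gains than scaling. In biasing by translation, the simulation density is given by

$$f_*(x) = f(x - c), \qquad c > 0,$$

where c is the amount of shift.

Effects of System Complexity

The fundamental problem with IS is that designing good biased distributions becomes more complicated as the system complexity increases.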
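Complex systems here are systems with long memory, since complex processing of a few inputs is much easier to handle. This dimensionality or memory can cause problems in three ways:
- long memory (severe intersymbol interference (ISI))
- unknown memory (Viterbi decoders)
- possibly infinite memory (adaptive equalizers)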
In principle, the IS ideas remain the same in these situations, but the design becomes much harder. A successful approach to combat this problem is essentially to break a simulation down into several smaller, more sharply defined subproblems. IS strategies are then used to target each of the simpler subproblems. Examples of techniques to break the simulation down are conditioning, error-event simulation (EES), and regenerative simulation.

Evaluation of IS

In order to identify successful IS techniques, it is useful to be able to quantify the run-time savings due to the use of the IS approach. The performance measure commonly used is $\sigma^2_{MC} / \sigma^2_{IS}$, the ratio of the variance of the plain Monte Carlo estimator to that of the IS estimator. It can be interpreted as the speed-up factor by which the IS estimator achieves the same precision as the Monte Carlo estimator.

Variance Cost Function

Variance is not the only possible cost function for a simulation, and other cost functions, such as the mean absolute deviation, are used in various statistical applications. Nevertheless, the variance is the primary cost function addressed in the literature, probably due to the use of variances in confidence intervals and in the performance measure $\sigma^2_{MC} / \sigma^2_{IS}$.

An associated issue is the fact that the ratio $\sigma^2_{MC} / \sigma^2_{IS}$ overestimates the run-time savings due to importance sampling, since it does not include the extra computing time required to compute the weight function.
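As a rough illustration of how the variance ratio between plain Monte Carlo and IS might be measured empirically, the following Python sketch estimates a tail probability P(X >= t) using the two conventional biasing methods, scaling and translation. Everything specific here is an assumption made for the example, not something stated in the article: X is standard normal, t = 4, the scale factor is a = 3, the shift is c = t, and `tail_is` is a hypothetical helper. For a standard normal density the weighting functions reduce to closed forms, which are noted in the comments.

```python
import math
import random


def tail_is(t, K, sampler, weight, rng):
    """IS estimate of P(X >= t) plus the per-sample variance of 1(X >= t) W(X)."""
    s = 0.0
    s2 = 0.0
    for _ in range(K):
        x = sampler(rng)                      # draw from the biasing density
        v = weight(x) if x >= t else 0.0      # weighted indicator
        s += v
        s2 += v * v
    mean = s / K
    return mean, s2 / K - mean * mean


t = 4.0
K = 200_000
rng = random.Random(1)  # fixed seed so the run is repeatable

# Exact tail probability of a standard normal, used only as a reference.
p_exact = 0.5 * math.erfc(t / math.sqrt(2.0))

# Translation: sample from f(x - c); for a standard normal f,
# W(x) = f(x) / f(x - c) = exp(-c x + c^2 / 2).
c = t
p_trans, var_trans = tail_is(
    t, K,
    sampler=lambda r: r.gauss(c, 1.0),
    weight=lambda x: math.exp(-c * x + 0.5 * c * c),
    rng=rng,
)

# Scaling: sample aX, i.e. N(0, a^2); for a standard normal f,
# W(x) = a f(x) / f(x / a) = a exp(-x^2 (1 - 1/a^2) / 2).
a = 3.0
p_scale, var_scale = tail_is(
    t, K,
    sampler=lambda r: r.gauss(0.0, a),
    weight=lambda x: a * math.exp(-0.5 * x * x * (1.0 - 1.0 / (a * a))),
    rng=rng,
)

# Per-sample variance of the plain Monte Carlo indicator 1(X >= t).
var_mc = p_exact * (1.0 - p_exact)
gain_trans = var_mc / var_trans
gain_scale = var_mc / var_scale

print("exact:", p_exact)
print("translation:", p_trans, "variance ratio:", gain_trans)
print("scaling:", p_scale, "variance ratio:", gain_scale)
```

Because both estimators use the same number of samples K, the ratio of per-sample variances equals the ratio of estimator variances. As the text notes, this figure ignores the extra cost of evaluating the weight function, so it should be read as an upper bound on the actual run-time saving.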
This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Importance_sampling". A list of authors is available in Wikipedia.