tmaterna.com

4. THOMATE

4.1 The theory

We are going to minimize the χ²:

(12)

Using (9), the χ² can be written as a function of the source variables:

(13)

The errors appearing in (13) are the statistical errors. In most cases, they depend only on the experiment, since the errors on the simulated quantities can be made negligible by generating enough events in the Monte-Carlo simulation.

It would be too good to be true if it were really that simple. Unfortunately, we are dealing with distributions and hence, we must have

(14)

because the distribution cannot have negative values, e.g. one cannot have -20 customers earning more than $50.000, or -200 events emitting 1 neutron.
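As a compact summary, the constrained problem can be written in an assumed notation. The symbols below are illustrative and not necessarily those of equations (9) to (14): $o_i$ stands for the measured observable distribution, $\sigma_i$ for its statistical errors, $s_j$ for the source variables, and $M_{ij}$ for the model of (9), assumed here to be linear and obtained from the Monte-Carlo simulation.

```latex
\chi^2(s) = \sum_i \frac{\bigl(o_i - \sum_j M_{ij}\, s_j\bigr)^2}{\sigma_i^2},
\qquad s_j \ge 0 \quad \text{for all } j .
```

Under these assumptions the χ² is quadratic in the $s_j$, and the inequality plays the role of (14).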

4.2 The implementation

An exact resolution of the problem is not trivial. One should set the gradient of the χ² to zero and solve the resulting system, and if the solution doesn't satisfy (14), one must then search for a solution on the boundaries of the allowed domain, i.e. with the source variable set to 0 for a certain number of indices. Since the dimension of the space (the number of source variables) easily reaches 10³ in practical cases, this method requires huge amounts of time.

To avoid this limitation, we are going to use an iterative method that satisfies (14) at each step. The χ² is a paraboloid (see equation (13)), and hence we can start from any point satisfying (14) and follow the slope towards the minimum, taking care to remain in the valid domain. For that, we have to calculate the gradient:

   with

(15)

   where we have defined

(16)
(17)
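A minimal sketch of the gradient computation, assuming a linear model `M` (observables by sources), measured values `o` and statistical errors `sigma`. These names, and the intermediate quantities `A` and `B`, are hypothetical and chosen for illustration; they are not necessarily the quantities defined in (16) and (17). The point is that everything depending only on the data can be computed once, so each gradient evaluation reduces to one matrix-vector product.

```python
import numpy as np

def precompute(M, o, sigma):
    """Quantities that do not change from one iteration to the next (assumed notation)."""
    w = 1.0 / sigma**2                 # statistical weights 1 / sigma_i^2
    A = M.T @ (M * w[:, None])         # A_jk = sum_i M_ij M_ik / sigma_i^2
    B = M.T @ (o * w)                  # B_j  = sum_i M_ij o_i  / sigma_i^2
    return A, B

def gradient(A, B, s):
    """Gradient of the chi-square with respect to the source variables s."""
    return 2.0 * (A @ s - B)

def chi2(M, o, sigma, s):
    """Chi-square between the measured observables and the model prediction M @ s."""
    return float(np.sum(((o - M @ s) / sigma) ** 2))

# Small made-up example: 2 observables, 2 source variables
M = np.array([[1.0, 2.0], [3.0, 4.0]])
o = np.array([1.0, 2.0])
sigma = np.array([1.0, 2.0])
A, B = precompute(M, o, sigma)
print(gradient(A, B, np.array([0.5, 0.25])))
```

A finite-difference comparison against `chi2` is a cheap way to check such a gradient before using it in the iteration.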

Assume we start from a point satisfying (14) and look at a cut in the χ² along the direction of the gradient.

It is simply a parabola (the cut of a paraboloid by a plane parallel to its axis), whose minimum can be found from its value at 3 points, e.g. for three values of the step taken along that direction. One then has

The minimum corresponds to

Note that the curvature of this parabola cannot vanish: otherwise, the gradient would have been zero, meaning that we were already at the minimum.
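The three-point construction can be sketched as follows. The sample points -1, 0 and +1 along the search direction are an illustrative assumption, and the function name `parabola_minimum` is hypothetical.

```python
def parabola_minimum(f_minus, f_zero, f_plus):
    """Abscissa of the vertex of the parabola through (-1, f_minus), (0, f_zero), (1, f_plus)."""
    a = 0.5 * (f_plus + f_minus) - f_zero   # coefficient of lam**2 (curvature)
    b = 0.5 * (f_plus - f_minus)            # coefficient of lam (slope at lam = 0)
    if a <= 0.0:
        raise ValueError("no curvature along this direction: no finite minimum")
    return -b / (2.0 * a)

# Example: f(lam) = 10*lam**2 - 10*lam + 2.5 has its minimum at lam = 0.5
f = lambda lam: 10.0 * lam**2 - 10.0 * lam + 2.5
print(parabola_minimum(f(-1.0), f(0.0), f(1.0)))  # 0.5
```

Since the cut is exactly quadratic, three evaluations determine the minimum without any further search along the line.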

One takes this minimum as the next point. Before moving to the next iteration, one applies to it a transformation that sets all its negative components to 0:

By iterating until the point no longer changes from one iteration to the next, one finds the solution.
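The whole iteration can be sketched in a few lines. This is an illustration under stated assumptions, not the original implementation: the model is taken to be linear (matrix `M`), the parabola is sampled at steps -1, 0 and +1, and the names `minimize_nonnegative`, `max_iter` and `tol` are hypothetical.

```python
import numpy as np

def chi2(M, o, sigma, s):
    """Chi-square between the measured observables o and the prediction M @ s."""
    return float(np.sum(((o - M @ s) / sigma) ** 2))

def minimize_nonnegative(M, o, sigma, s0, max_iter=10000, tol=1e-12):
    """Follow the slope, minimize along it, then set negative components to 0."""
    s = np.clip(np.asarray(s0, dtype=float), 0.0, None)   # start inside the valid domain
    w = 1.0 / sigma**2
    for _ in range(max_iter):
        grad = -2.0 * M.T @ (w * (o - M @ s))
        d = -grad                                         # downhill direction
        if np.linalg.norm(d) <= tol:                      # zero gradient: at the minimum
            break
        f0 = chi2(M, o, sigma, s)
        fp = chi2(M, o, sigma, s + d)
        fm = chi2(M, o, sigma, s - d)
        a = 0.5 * (fp + fm) - f0                          # curvature of the cut
        if a <= 0.0:
            break
        lam = -0.5 * (fp - fm) / (2.0 * a)                # vertex of the parabola
        s_new = np.clip(s + lam * d, 0.0, None)           # negative components set to 0
        converged = np.max(np.abs(s_new - s)) <= tol
        s = s_new
        if converged:
            break
    return s

# Made-up example: the unconstrained minimum (1, -1) is not allowed
M = np.eye(2)
o = np.array([1.0, -1.0])
sigma = np.ones(2)
print(minimize_nonnegative(M, o, sigma, s0=[0.5, 0.5]))
```

In this example the unconstrained minimum has a negative component, so the iteration ends on the boundary of the valid domain, at (1, 0).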

An improvement can easily be made to this method: instead of using the gradient to look for the minimum, one uses a more efficient vector

(18)

which simply expresses the fact that if a coordinate is 0 and the slope doesn't tend to make it grow, the χ² could only go down further by making this coordinate negative, and it is not worth looking for the minimum there since we would have to set this coordinate back to 0. The improvement is illustrated in the figure below. One easily sees that the newly defined vector reduces the number of iterations.
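A sketch of such a modified search vector, written here as a descent direction (minus the gradient). The sign convention and the name `search_direction` are assumptions made for illustration and may differ from (18).

```python
import numpy as np

def search_direction(s, grad):
    """Downhill direction, with components that would push a zero coordinate negative removed."""
    d = -np.asarray(grad, dtype=float)
    blocked = (np.asarray(s) <= 0.0) & (d < 0.0)   # coordinate at 0 and pushed downwards
    d[blocked] = 0.0
    return d

s = np.array([0.0, 2.0, 0.0])
grad = np.array([3.0, 1.0, -4.0])
# The first coordinate is at 0 and would be pushed negative, so it is frozen;
# the third is at 0 but tends to grow, so it is kept: components (0, -1, 4).
print(search_direction(s, grad))
```

When every component of this vector is 0, no allowed move decreases the χ² to first order, which gives a natural stopping test.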

4.3 Notes on the solutions

  1. By using this iterative procedure, the solution depends on the initial guess.
  2. There is always convergence towards a solution, and the minimum reached is automatically an absolute minimum.
  3. The statistical errors are now taken into account. The method needs much less statistics to obtain significant results.
  4. If the kernel of the model is reduced to the null vector (meaning that the only source distribution making the model predict the null observable distribution is the null distribution), the solution is unique. If not, all the solutions are known and can be obtained by adding a vector of this kernel to the solution found.

    It is interesting to note here that this kernel depends on the experimental setup; one can hence prepare the setup in such a way as to reduce its dimension, thereby improving the significance of the information collected. This can be a valuable tool when preparing an experiment.
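The uniqueness condition of note 4 can be checked numerically for a given setup. The sketch below assumes the model is available as a matrix `M` (observables by sources) and computes a basis of its kernel from a singular value decomposition; the name `kernel_basis` and the tolerance `rtol` are hypothetical. The statistical weights do not change the kernel as long as all errors are strictly positive.

```python
import numpy as np

def kernel_basis(M, rtol=1e-10):
    """Orthonormal basis (as columns) of the kernel of M, computed from its SVD."""
    M = np.atleast_2d(np.asarray(M, dtype=float))
    _, sv, vh = np.linalg.svd(M)                     # full_matrices=True by default
    rank = int(np.sum(sv > rtol * sv.max())) if sv.size else 0
    return vh[rank:].T                               # shape: (n_sources, kernel dimension)

# Made-up setup: the first two sources always contribute to the same observable,
# so the measurement cannot tell them apart.
M = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
K = kernel_basis(M)
print(K.shape[1])   # dimension of the kernel: 1
```

A kernel of dimension 0 means the solution is unique; otherwise the columns of `K` span the directions along which the measurement carries no information.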


tmaterna.com is maintained by Thomas Materna
thomas@tmaterna.com