







      
           MM      MM    LL        DDDDDD       CCCCCC
          MMM    MMM    LL        DD   DD     CC     CC
          MMMM  MMMM    LL        DD    DD   CC
          MM MMMM MM    LL        DD    DD   CC
          MM  MM  MM    LL        DD   DD     CC     CC
          MM  MM  MM    LLLLLLLL  DDDDDD       CCCCCC
  


             MLDC/94:  Modified LAOCOON with Dipolar Couplings
                     PERCHIT: PERCH version of NUMARIT
                    DORES: DOuble REsonance Simulation 
                                      
                                      
                                     by
           Reino Laatikainen, Ursula Weber* and Matthias Niemitz
                    Kuopio University NMR Research Group
                          Department of Chemistry
                     University of Kuopio, P.O.Box 1627
                              SF-70211 Kuopio
                                  Finland



                         Version January 15th, 1994
                   * University of Duesseldorf, Germany 
         1994 PERCH Project, University of Kuopio, Kuopio, Finland.
                            All rights reserved.CONTENTS

1.   INTRODUCTION

2.   SETUP

3.   EXECUTION

4.   INPUT
     4.1  Spectral parameters file ([name].PMS)
     4.2  Experimental frequency file ([name].DAT)
     4.3  Output ([name].OUT and MLDOC)

5.   MODES AND OPTIONS
     5.1  Run-time control parameters

6.   THE FLOW OF ANALYSIS
     6.1  Simulation:
     6.2  Assignment:
     6.3  Optimization of spectral parameters:
     6.4  Further examination of the solution

REFERENCES



1.   INTRODUCTION

This manual contains instructions for use of the programs
MLDC/94, PERCHIT and DORES. MLDC/94 is able to handle up to 10
spins but cannot take advantage of the X-approximation. The
quantum mechanical calculations of PERCHIT are based on the
algorithms used in the program NUMARIT [see J.S.Martin and
A.R.Quirt, J.Magn.Reson., 5, 318 (1971)], and the present
version of the program is able to handle up to 8 nuclei
(symmetric pairs or spins up to 9/2; a bigger version will be
released soon), taking advantage of symmetry and the
X-approximation. DORES [see G.Govil and D.H.Whiffen, Mol.Phys.,
12, 449 (1967)] is meant for simulation of double resonance
experiments. MLDC and DORES have identical input formats; the
nucleus indexes and symmetry are given in a slightly different
way for PERCHIT.

The present versions (1/1994) of DORES and PERCHIT are under
testing and have no release status yet. They are not able to
include dipolar couplings. Any suggestions, comments and reports
are highly appreciated.

In MLDC (Modified LAOCOON with Dipolar Couplings) and PERCHIT
the normal phases of traditional spectral analysis (simulation,
assignment, reassignment and examination of the dependence of
the spectral parameters on the overlap of lines) can be
incorporated into one run. The program also contains an
integral-transform-fitting mode that allows total-line-shape
analysis.

The full menu consists of two phases:

A.   The refinement of spectral parameters using integral
     transforms (IT's) or a minimal number of assigned lines or
     both.

B.   The peak-top-fitting: the refinement of the result taking
     the overlap of lines into account and the examination of
     the dependence of the result on the line-width and
     line-shape parameters.

Although the IT-fitting gives in principle as accurate spectral
parameters as the frequency fitting, the result may be less
reliable in practice because it is also based on intensity
information, which is much more vulnerable to artifacts. The
refinement of the result using the peak-top-fitting is thus
appropriate after the IT-fitting. The IT-strategy has been
described in detail in ref. 1. Detailed instructions for
preparation of a spectrum for the IT-fitting are given with
example 5 in the PERCH manual.

NOTE: The present version is meant to be used with the PERCH
software.



2.   SETUP

MLDC, PERCHIT (PER) and DORES (DOR) are delivered and set up
with the PERCH software package. The disks contain the following
model files:

EX1.OBS      An experimental spectrum.
EX1.PMS      A spectral parameter file for MLDC and DORES.
EX1it.PMS    A spectral parameter file for PERCHIT.
EX1.DAT      An experimental frequency file.
EX2.PMS      Another trial parameter file for the spectrum in
             EX1.OBS.
BENZENE.PMS  An input for the benzene 13C-coupled 1H NMR system
             showing how to define dipolar couplings for the
             program MLDC.
BENZit.PMS   Shows how the symmetry of benzene is given for
             PERCHIT.
A2B2X2.PMS   Shows how the symmetry of 1,2-difluorobenzene is
             given for PERCHIT.



3.   EXECUTION

MLDC is started by the command MLD, PERCHIT by PER and DORES by
DOR. MLDC can also be started via the batch file MLB.BAT,
provided that the control parameters for MLDC are stored in the
file [name].PAR.



4.   INPUT

The spectral parameters are given in the file [name].PMS,
experimental frequencies are in [name].DAT and the digital
spectrum used by the IT-modes is in [name].OBS.


4.1  Spectral parameters file ([name].PMS)

The best way to prepare the input is to edit an old PMS-type
file. The command PMS opens [name].PMS in the DOS screen editor
(EDIT.COM). The present input format contains 5 sections:

1.   Title of the case
2.   Chemical shifts
3.   Coupling constants
4.   Other parameters
5.   Linear constraints for parameters (optional)

The sections are separated by an empty line (card) and the first
line of each section may contain any text. Section 2 may contain
several modules, one for each nucleus. Formats are given in
FORTRAN notation (A = character, I = integer numbers, F = real
numbers, etc.).


1.   Title of the case

Format    Up to 60 characters.
Itext     contains the title of the case


2.   Chemical shifts
    
TYPE 1a card: defines the name and the spin of the nucleus
Format    LTYP, NSPIN, POWER
LTYP      is the type of the species (8 characters are
          accepted). For example, give LTYP = 'proton' for
          protons, 'carbon' for carbons, etc. If you want to
          apply the X-approximation for protons, use 'formyl'
          for one type of protons and 'aromatic' (for example)
          for the other, etc. If LTYP is not used for
          heteronuclear spectra, the chemical shift difference
          of the different types of nuclei must be very large
          to avoid small second-order effects.
NSPIN     2 x the spin of the nucleus, default = 1 (spin = 1/2)
          (for PERCHIT only).
POWER     The decoupling power in Hz (for DORES only).

TYPE 1b card: the values of the chemical shifts
Format    ITH(L),PM(L),(IA(L,K),K=1,10)
          Give ITH in columns 1-4, otherwise the format is free:
          the numbers are separated by spaces or commas.

ITH       The index of the parameter (0 => not optimized,
          negative => not optimized with option 2).
PM        The value of the chemical shift.

MLDC,DORES:
IA        Defines the indexes of the nuclei having the same
          chemical shift.

PERCHIT:
IA(L,1)   2 if chemically equivalent pair (AA'), otherwise not
          given.
IA(L,2)   Number of magnetically equivalent nuclei. For example,
          A3 A3' is defined by 2*3.
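
As an illustration of the TYPE 1b layout described above, the
following Python sketch splits one card into ITH, PM and the IA
list. It is not part of the PERCH software: the function name
parse_shift_card and the sample card are hypothetical, and
treating a blank ITH field as 0 is an assumption. The meaning of
the IA values (nucleus indexes for MLDC and DORES, equivalence
counts for PERCHIT) is left to the caller.

```python
import re


def parse_shift_card(card: str):
    """Split a TYPE 1b card into (ITH, PM, IA list).

    ITH is taken from columns 1-4; the remaining fields are free
    format, separated by spaces or commas.
    """
    ith_field = card[:4].strip()
    # Assumption: a blank ITH field is read as 0 (not optimized).
    ith = int(ith_field) if ith_field else 0
    tokens = [t for t in re.split(r"[,\s]+", card[4:].strip()) if t]
    pm = float(tokens[0])                # chemical shift value
    ia = [int(t) for t in tokens[1:]]    # up to 10 IA entries
    return ith, pm, ia


if __name__ == "__main__":
    # Hypothetical card: parameter 1, shift 1234.56, nuclei 1 and 2.
    print(parse_shift_card("   1  1234.56 1,2"))
```

A negative ITH in columns 1-4 is returned unchanged, so the
caller can decide whether option 2 applies.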

3.   Coupling constants    

Format    ITH(L),JTYP,PM(L),(IA(L,K),IB(L,K),K=1,10)
          Give ITH in columns 1-4, otherwise the format is free;
          IA and IB must be separated by spaces or commas.

ITH       The index of the parameter (0 => not optimized,
          negative => not optimized with option 2).
JTYP      Use 'J' for isotropic couplings and 'D' for dipolar
          couplings (for MLDC only). Default = 'J'.
PM        The value of the coupling.
IA,IB     Define the coupled nuclei. For example, IA=1, IB=2
          and JTYP = 'J' give the value PM for the coupling
          constant J(1,2).

PERCHIT:
          The couplings between chemically equivalent nuclei of,
          for example, an AA'BB' system are given by 1,1 and
          2,2. The J(AB) coupling of the AA'BB' system is
          defined by 1,2 and J(AB') is defined by 2,1.

NOTE:     If a parameter is not defined in a card, the parameter
          is set to zero.


4.   Other parameters

AMIN,DIAG,DOWN,UP,RESOL,DATAR,XGAU,ASYM,DEC
Format    each parameter has its own card, the value is given in
          the columns 1-12, the rest of the card is reserved for
          the text.

     NOTE:   If RESOL, DATAR, XGAU and ASYM are set to zero, they
             are read from [name].INF.

AMIN      is the minimum intensity of the transitions included.
          For systems of fewer than 6 spins, one can use any
          value. A suitable value is often 0.001-0.01. For
          larger systems an AMIN of 0.01 is usually large
          enough, and lines with intensities < 0.01 are not
          observable.
DIAG      is the diagonalization criterion for the Hamiltonian.
          DIAG should be the accuracy within which the
          theoretical lines are to be computed. 0.0001 is small
          enough for almost any case; 0.01 or 0.001 is
          sufficient in most cases. DIAG affects the time needed
          in the calculation of the spectrum. For a rough
          simulation or for analysis of very large and
          asymmetric spin systems, a DIAG of 0.01 can be
          advantageous.
DOWN      is the lower limit of the frequency range, within
          which the transitions are computed.
UP        is the upper limit of the frequency range.
RESOL     is the half-height line-width required by the modes P,
          D and I. Also some default values of the mode A are
          based on the given line-width.
DATAR     is the data-point-resolution for computing the
          peak-tops by mode P. DATAR is not the accuracy with
          which the peak-tops are obtained, because the
          peak-tops are interpolated after the maxima are
          located between two frequencies separated by DATAR.
          DATAR should be the same as used in the measurement of
          the spectrum. A good value for DATAR is 0.05. DATAR
          must be bigger than 0.01 * RESOL.
XGAU      is the contribution (in %) of the Gaussian distribution
          function to the line-shape. The sum of the Gaussian
          and Lorentzian contributions is 100%; XGAU can be
          negative or bigger than 100%.
ASYM      is the asymmetry factor of the line-shape (as given by
          PIC). For symmetric lines ASYM = 1.0.
DEC       is the decoupling frequency; the decoupling power is
          given on the TYPE 1a card.


5.   Constraining equations

(EQUP(I,K),IEQUP(I,K),K=1,5)
Format    for example as follows:  1.234*11 = 1.457*12 - 0.1*13

EQUP(I,K)    The coefficient of the K-th term of equation I.
IEQUP(I,K)   The index of the parameter in the K-th term of
             equation I.

          The above equation means that
             1.234 * P11 = 1.457 * P12 - 0.1 * P13
          where P11 is the parameter given the index 11 by ITH
          above. The equation is taken into account during the
          iteration in the least-squares sense. To give the
          equation a larger weight, multiply the coefficients by
          5 or 10. A typical application is the constraining of
          dipolar couplings on the basis of geometry. For
          standard geometry
                    1.53 *  3D(ortho)  =   7.95  * 4D(meta)
                    1.53 *  3D(ortho)  =  12.24  * 5D(para)

4.2  Experimental frequency file ([name].DAT)

The experimental frequencies are fed from the digital spectrum
into the file [name].DAT using the command PIC or by editing
[name].DAT manually (command DAT). A useful approach when
analysing many spectra of a similar type is to copy the
simulated spectrum [name].OUT into [name].DAT and change the
frequencies using the editor.

INDL,OBSL,HEIGHT
Format    I10,2F12.4
INDL      is the assignment of the observed peak-top frequency
          at OBSL. If INDL is not known (or the spectrum is to
          be simulated first or mode I is to be used), set INDL
          = 0. For example, INDL = 1122 means the transition
          between states 11 and 22. Lines with negative INDL are
          ignored.
OBSL      is the observed transition frequency (in Hz), given
          preferably in descending order. 
HEIGHT    is the intensity of the observed line, actually the
          total area of the line. The intensities are required
          in mode I. The height of the line is not a good
          estimate of its intensity if the line is composed of
          a few non-degenerate lines or if the line overlaps
          with other lines. In most cases it is, however,
          sufficient to give just the peak-heights or to use PIC
          to remove the effect of the overlap using a simple
          method. If the line-intensities are obtained by using
          a deconvolution procedure (TLS), set the line-width
          very small for the integral transform analysis. See
          also option 15.
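
When [name].DAT is written by a script instead of PIC or the
editor, the records must follow the fixed-width format
I10,2F12.4. The Python sketch below shows one way to write and
read such a record. The function names are hypothetical, the
sample values are invented, and the reader assumes that all
three fields are present.

```python
def format_dat_record(indl: int, obsl: float, height: float) -> str:
    """Write INDL, OBSL and HEIGHT in the layout I10,2F12.4."""
    return f"{indl:10d}{obsl:12.4f}{height:12.4f}"


def parse_dat_record(line: str):
    """Read INDL (cols 1-10), OBSL (11-22) and HEIGHT (23-34)."""
    return int(line[0:10]), float(line[10:22]), float(line[22:34])


if __name__ == "__main__":
    # Invented example: an unassigned line (INDL = 0) at 1234.5678 Hz.
    record = format_dat_record(0, 1234.5678, 1.0)
    print(repr(record))
    print(parse_dat_record(record))
```

Right-justifying each number in its field keeps the columns
aligned as the FORTRAN format expects.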


4.3  Output ([name].OUT and MLDOC)

IND(L),HZ(L),HEI(L),NUC,OBS(L),DEV,TOP(L),DTOP,IFUSED
Format    I10,F12.4,F8.3,I4,2(F12.4,F8.4),A1

IND(L)    The index of the line L. For example, the line 1122
          originates from the transition between the energy
          levels 11 and 22.
HZ(L)     The calculated transition frequency.
HEI(L)    The intensity of the transition.
NUC       The origin of the transition. 99 means a combination
          line or that the system is of strongly second-order
          type and that the suggested origin of the transition
          is not reliable.
OBS(L)    The observed transition frequency.
DEV       HZ(L)-OBS(L)
TOP(L)    The peak-top frequency given by mode P.
DTOP      HZ(L)-TOP(L)
IFUSED    Give * to include the peak-top in the optimization of
          the spectral parameters or give - to reject.



5.   MODES AND OPTIONS

The mode and options are defined in the 'CONTROL SECTION'.  For
example, the options given with
                                             1,2<enter>
mean that the trial spectral parameters are taken from
[name].sav and that the parameters marked with negative indexes
(in [name].pms) are not iterated. When the program returns to
the control section, all the options are cancelled and MODE is
set to 0.

Modes:
     0 =  None; much like the traditional LAOCOON3-type
          analysis.
     C =  Checking mode: the traditional LAOCOON3 procedure with
          a possibility to reject bad assignments.
     A =  Automatic Assignment procedure. See also option 2.
     P =  Peak-top-fitting. The theoretical peak-tops are
          computed by using the given line-width and line-shape,
          which have been estimated from the spectrum or which
          are varied to give a minimum rms-value. For computing
          the exact peak-top frequencies, the program also
          requires the parameter DATAR corresponding to the
          data-resolution of the observed spectrum. This
          procedure is rather useless if the spectrum has been
          recorded with a system which is unable to interpolate
          the peak-top frequencies.
     I =  Integral-transform-fitting: the optimization is based
          on integral transforms of the spectrum. Intensities of
          the signals must be given in the input.
     D =  Integral-transform-fitting when the experimental
          spectrum is given in digital form. The mode with a
          small SPAN effectively performs the TLS-fitting.
     S =  Simulation using parameters in [name].PMS
     T =  Real total-line-shape fitting [under testing] 

Options:

     1 =  Spectral parameters are read from the file [name].SAV
          and eigenvectors are read from VECTORS. The
          eigenvectors are used in prediagonalization of the
          Hamiltonian.
     2 =  The parameters with negative indexes are not iterated.
     3 =  Default values are not used for the control
          parameters. This option is useful when the effects of
          the rejection criterion and the line-shape parameters
          on the result are examined.
     4 =  Each observation, with ADEV = the absolute value of
          the difference between an observed and calculated
          frequency, is weighted by
               ADEV/(RMS+0.1*ADEV)
          in solving the adjustments to the spectral parameters.
          The 98% confidence limit is used in rejecting bad
          assignments, instead of the 95% used otherwise. This
          option is an alternative to a small OUTCRI, especially
          when the number of spectral lines is small in
          comparison with the number of the parameters to be
          optimized.
     5 =  Short output.
     10 = Long output.
     11 = Each signal is weighted by SQRT(HEI); HEI = the
          intensity of the signal. The default rejection
          criterion is increased to 3*RMS.
     33 = Assigned lines are taken into account in the
          IT-modes.

Weighting procedures for mode=P (see also options 4 and 11):

     13 = Each signal is weighted by 1/SQRT(NTOP), where NTOP =
          no. of lines within +/-RESOL from the peak-top.
     14 = The weight = 1/NTOP (see above).
     17 = The weight = SQRT(HEI), where HEI is the intensity
          of the signal.
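
The three weighting rules for mode P can be written out as a
short Python sketch. It is an illustration only: the function
names are hypothetical, and counting the lines in a symmetric
window of half-width RESOL around the peak-top, with the limits
included, is an assumption about how NTOP is obtained.

```python
import math


def count_ntop(peak_top, line_freqs, resol):
    """Number of calculated lines within RESOL of the peak-top.

    Assumption: the window is symmetric and includes its limits.
    """
    return sum(1 for f in line_freqs if abs(f - peak_top) <= resol)


def peak_weight(option, ntop, hei):
    """Weight of one peak-top for options 13, 14 and 17."""
    if option == 13:
        return 1.0 / math.sqrt(ntop)
    if option == 14:
        return 1.0 / ntop
    if option == 17:
        return math.sqrt(hei)
    raise ValueError("only options 13, 14 and 17 are handled here")


if __name__ == "__main__":
    # Invented line positions (Hz) around a peak-top at 100.0 Hz.
    ntop = count_ntop(100.0, [99.8, 100.1, 101.0], resol=0.5)
    print(ntop, peak_weight(13, ntop, 1.0), peak_weight(14, ntop, 1.0))
```

Options 13 and 14 reduce the weight of peak-tops built from
several lines, option 14 more strongly than option 13.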

Iterative procedure:

     18 = Standard regression analysis used instead of the
          principal component regression analysis.
     21 = No test for convergence. Useful in batch operation
          when option 2 is used.

Options for the IT procedure:

     25 = The values of the IT's are printed.
     38 = An option meant for eliminating impurity signals and
          poor estimates of intensities of lines in modes D and
          I.
     33 = See above.
     34 = Use stick spectra for forming IT's till SPAN is close
          to line-width.

     NOTE:   20-40 may contain also options for testing the
             program. 


5.1  Run-time control parameters

One can redefine the following control parameters during the run
from the terminal or on the control cards in batch operation.
The control parameters have default values, which should work in
most cases.

In mode A 

REJCRI =  the criterion for bad assignments. If the default
          value is not approved or if the distribution of the
          observed-calculated differences is examined, REJCRI
          can be redefined.

          REJCRI is given in the control section for the whole
          following phase. By default the bad assignments are
          found by using the automatic procedure (see below).
          If a positive REJCRI is given, the bad assignments
          are rejected by using that value. When a negative
          REJCRI is given, the assigning is done as above and
          the 'default REJCRI' is used unless it is smaller than
          the 'given ABS(REJCRI)'. If one uses a small positive
          REJCRI, there is a danger that correct assignments are
          also rejected; a recommendable way is to use option 4
          with a positive REJCRI, which can thus be set larger.
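
The sign convention of REJCRI can be summarized in a few lines
of Python. The sketch reflects one reading of the rule above,
namely that a negative REJCRI acts as a lower bound on the
automatically determined default; the function name and the
handling of REJCRI = 0 as "use the default" are assumptions.

```python
def effective_rejcri(rejcri: float, default_rejcri: float) -> float:
    """Rejection limit actually applied in a phase."""
    if rejcri > 0.0:
        # A positive value replaces the automatic default.
        return rejcri
    if rejcri < 0.0:
        # The default is used unless it is smaller than ABS(REJCRI).
        return max(default_rejcri, abs(rejcri))
    # Assumption: zero (nothing given) keeps the automatic default.
    return default_rejcri


if __name__ == "__main__":
    for given in (0.05, -0.05, -0.3, 0.0):
        print(given, effective_rejcri(given, default_rejcri=0.2))
```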


The default for the automatic rejection of bad assignments is
obtained as follows (in the subroutine NEWRMS):

     -    The rms-value is computed including all the
          assignments.
     -    The observed-calculated values bigger than t95% * rms
          (or t98% * rms with option 4) are picked up and the new
          rms is computed ignoring these assignments.
     -    The process is repeated till the number of bad
          assignments is not changed or till the rms is not
          reduced significantly (as tested by F-test). For
          option 4 or if the number of the bad assignments
          exceeds 0.2 * NOL( = number of lines after the first
          rejection cycle), the risk limit is increased to 98%
          and if it exceeds 0.5 * NOL, to 99%.
     
     t95% = the Student t-factor for 95% confidence level.
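
The rejection cycle can be sketched in Python as follows. This is
a simplified illustration, not the code of the subroutine NEWRMS:
the t-factor is passed in as a constant (1.96, the large-sample
value for 95%, is an assumption), and the F-test and the raising
of the risk limit to 98% or 99% are left out. The function names
are hypothetical.

```python
import math


def rms(values):
    """Root-mean-square of a list of deviations."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def reject_bad_assignments(devs, t_factor=1.96, max_cycles=50):
    """Return (indices of rejected lines, rms of the kept lines).

    devs are the observed-calculated differences of all lines.
    """
    rejected = set()
    current = rms(devs)              # rms including all assignments
    for _ in range(max_cycles):
        limit = t_factor * current
        new_rejected = {i for i, d in enumerate(devs) if abs(d) > limit}
        if new_rejected == rejected:
            break                    # number of bad lines unchanged
        rejected = new_rejected
        kept = [d for i, d in enumerate(devs) if i not in rejected]
        if not kept:
            break
        current = rms(kept)          # new rms ignoring the bad lines
    return sorted(rejected), current


if __name__ == "__main__":
    # Invented deviations in Hz; the last line is misassigned.
    devs = [0.01, -0.02, 0.015, -0.01, 0.02, 1.0]
    print(reject_bad_assignments(devs))
```

In the invented example the single large deviation dominates the
first rms, is rejected in the first cycle, and the second cycle
confirms that no further lines exceed the new limit.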


In mode P

REJCRI    =  the rejection criterion for excluding peaks from
             the optimization of the parameters. The default
             value is obtained as described above.
RESOL     =  the new line-width.
DATAR     =  the new data-resolution.
XGAU      =  the new value of the Gaussian contribution (in %)
             to the line-shape. Because an entry of 0.0 keeps
             the old value, XGAU cannot be set to zero in the
             CONTROL SECTION; however, it can be set to 0.01,
             for example.

For option 4, see above.


In modes D, I and T (T will be released during 1994)

SMAX      =  The starting value of the SPAN (= the width) of the
             functions used in the IT fitting (1). The default
             value is twice the largest sum of coupling
             constants of one nucleus. The width is gradually
             narrowed until it reaches the value of SMIN.
SMIN      =  See above. As a rule, SMIN should be 1-2 times the
             smallest (J+D) coupling constant to be iterated or
             bigger than the line-width. The default value is
             the smallest difference of two peaks in the
             observed spectrum.
GRAD      =  The factor by which SPAN is multiplied at each
             step.

THRES     =  The threshold percentage. THRES = 100.0 means that
             the principal components (PC) with d(RMS)/d(PC)
             smaller than 100 % of the (weighted) RMS are
             excluded in the principal component regression.
             By keeping THRES larger than the default, one may
             try to find a solution starting from a poor guess.
             When a good minimum is approached, THRES can also
             be smaller than the default.

     NOTE:   Modes D, I and T require also the resolution
             (RESOL) and Gaussian % (XGAU) because they are used
             in the overlap correction of line intensities.
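
The narrowing of SPAN from SMAX to SMIN can be pictured with the
following Python sketch. It only lists the SPAN values of the
successive steps; the function name and the sample numbers are
hypothetical, and ending the sequence exactly at SMIN (rather
than at the last value above it) is an assumption.

```python
def span_schedule(smax: float, smin: float, grad: float):
    """SPAN values used in successive IT-fitting steps."""
    if not 0.0 < grad < 1.0:
        raise ValueError("GRAD must lie between 0 and 1 to narrow SPAN")
    spans = []
    span = smax
    while span > smin:
        spans.append(span)
        span *= grad          # SPAN is multiplied by GRAD at each step
    spans.append(smin)        # assumption: the last step uses SMIN
    return spans


if __name__ == "__main__":
    # Invented values: SMAX = 20 Hz, SMIN = 2 Hz, GRAD = 0.5.
    print(span_schedule(20.0, 2.0, 0.5))
```

A GRAD close to 1 gives many small steps, whereas a small GRAD
reaches SMIN in a few coarse steps.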




6.   THE FLOW OF ANALYSIS


6.1  Simulation:

The trial spectral parameters are obtained from the spectrum,
for example by a rough first-order analysis, and given in
[name].PMS.


6.2  Assignment:

Assign as many lines as necessary to fix every spectral
parameter that you want to optimize. This rule is not strict
because the principal component analysis prevents divergence
when the number of independent parameters (= RANK) is smaller
than the number of parameters to be adjusted.

As a rule, to be able to optimize a chemical shift, one must
assign at least one line arising from the nucleus. To be able to
optimize a coupling constant, one must assign at least one
doublet arising from the coupling. If each nucleus of the spin
system gives a well-defined multiplet, the outermost lines of
the multiplets are usually easy to assign. For example, to make
all the 10 spectral parameters of an AMRX spin-system
well-defined, one must assign 10 lines, 8 of which can be
outermost lines of the multiplets.
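
The count of 10 parameters for the AMRX system follows from one
chemical shift per nucleus plus one coupling per pair of nuclei.
The small Python sketch below does this bookkeeping for n
non-equivalent nuclei; it assumes isotropic couplings only (no
dipolar couplings and no symmetry), and the function name is
hypothetical.

```python
def count_parameters(n_spins: int):
    """Return (number of shifts, number of pairwise couplings)."""
    shifts = n_spins
    couplings = n_spins * (n_spins - 1) // 2
    return shifts, couplings


if __name__ == "__main__":
    shifts, couplings = count_parameters(4)      # AMRX
    print(shifts, couplings, shifts + couplings)
```

For AMRX this gives 4 shifts and 6 couplings, hence the 10 lines
that must be assigned.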

The simulation and assignment phases are not necessary if the
analysis is started with the integral-transform-fitting (modes
D, I and T). However, a careful baseline correction and
'scaling' of the observed spectrum (see A10/PERCH manual) should
be done if the D or T mode is applied to a tightly coupled
and/or strongly resolution-enhanced spectrum (see example
5/PERCH manual).


6.3  Optimization of spectral parameters: 

Phase 1

The purpose of this phase is to bring the observed and
calculated spectra so close to each other that the automatic
assignment procedure is able to assign most of the lines
correctly.

Start the run with mode 0 (default), C, D or I. In mode 0 the
program behaves like the program LAOCOON3. If there is a
possibility of mistakes in the manual assignments, start with
mode C. If you are going to refine the result in mode A or P,
give 1-3 for the number of iterations because a complete
convergence of the result is not needed.

If you start with modes D or I, give 1 or 2 for the number of
the (grand) iteration cycles. If some spectral parameters are
known relatively well, they can be left out of the iteration for
the mode by giving them a negative index and using option 2.
When the rms has converged, being numerically clearly smaller
than SPAN in the absence of artifacts in the spectrum, go on to
the following phase. If the spectrum contains impurity signals
or the line-intensities are biased, it may be worth decreasing
SMAX to about the size of an average coupling constant for the
following cycle.

Phases 2 -

The purpose of this phase is to refine the result.

As a rule, if there are overlapping or nearly degenerate lines,
the final refinement of the spectral parameters is started by
giving mode P. If there is little or no overlap between the
spectral lines, one may use mode A. The program now assigns all
the lines and rejects the bad assignments automatically.
Continue the iteration until the spectral parameters and the rms
converge. The default convergence test is rather strict and is
occasionally not fulfilled, for example, if the spectral
parameters start oscillating between two close solutions.

If there are plenty of overlapping or nearly degenerate lines,
use option 4 to decrease the importance of the lines with large
observed-calculated differences. If the default rejection
criterion differs clearly from the expected standard deviation
of the observed frequencies multiplied by 2-3, or if the
automatic procedure tends to reject too many lines, you may vary
the rejection criterion.

The effects of the rejection criterion, the line-width and the
line-shape on the result can be examined conveniently by
redefining them in a new phase. In an ideal case, the result is
insensitive to these parameters. If the result depends on them,
choose the solution giving the best fit, provided that the
number of rejected lines and the line-shape parameters are
sensible, and take the sensitivity of the spectral parameters
into account in your conclusions.

A real TLS mode (T), suitable for the last refinement of the
result, will be released during 1994.


6.4  Further examination of the solution

The optimized spectral parameters are stored in [name].SAV at
the end of the run. To try another line-width (etc.) later, the
last spectral parameters can be read from [name].SAV by using
option 1. Normally one can then start the optimization directly
in mode P. In the same way one can test the effect of the sign
of a coupling: after finding a good solution with one sign
combination, change the sign of the coupling in [name].SAV and
start the analysis directly in mode P (A, D or I) with option 1.

The final optimized parameters may depend on the line-shape and
control parameters used, on the options, on the lines included,
on the computer and, in the case of the peak-top-fitting mode,
even on the trial parameters if there are signals composed of
several non-degenerate lines. However, this variation should
normally occur well within the 90% confidence limits given by
the program. For example, when spectral parameters arising from
different chemical conditions are compared and very high
accuracy is desired, the spectral analyses should be performed
using procedures and rejection criteria that are as similar as
possible.

REFERENCES

1.   R. Laatikainen, J.Magn.Reson., 92, 1 (1991).
2.   R. Laatikainen, J.Magn.Reson., 78, 127 (1988).
3.   R. Laatikainen, Magn.Reson.Chem., 24, 588 (1986).

