#2. Significance testing

MERLIN can perform many statistical tests, but the one we get asked about most is how to flag significant differences between cells in the same row with letters – something like this…

The letter highlighted in yellow denotes that the percentage above it is significantly greater than the percentage in column B of the same row at the 99% level – and the letter highlighted in green denotes that the mean score above it is significantly greater than the mean score in column G of the same row at the 95% level.

In the table above, we have set formats TTS3/SHG1, but this article describes the many variations available.

Which values are to be tested?

This is determined by format TTS, which can be set as follows:
TTS0 = no tests
TTS1 = test mean scores only (using a t-test)
TTS2 = test column percentages only (using a z-test)
TTS3 = test both
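
For readers who want to see the arithmetic behind these two tests, here is a minimal Python sketch of the textbook versions. It is not MERLIN's own code: the pooled form of the z-test, the Welch (unequal variance) form of the t-test, and all function names are assumptions made for illustration, and MERLIN's exact formulas (for example its handling of weighting) may differ.

```python
from math import sqrt
from statistics import NormalDist


def z_stat_proportions(p1, n1, p2, n2):
    """Pooled two-proportion z statistic (textbook form, assumed here).

    p1, p2 are column proportions (0..1); n1, n2 are the column bases.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se if se > 0 else 0.0


def t_stat_means(m1, s1, n1, m2, s2, n2):
    """Welch t statistic for two independent means (assumed form).

    m = mean score, s = standard deviation, n = base.
    """
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (m1 - m2) / se if se > 0 else 0.0


def two_sided_p_from_z(z):
    """Two-sided p-value for a z statistic under the normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))
```

A positive statistic means the first column's value is the greater one, which is the direction the letters in the table report.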

Which columns are to be tested against which?

This is determined by format SHG:
SHG-2 = test each column against the remainder (i.e. total column minus current column)
SHG-1 = test each column against the total column
SHG0 = test all columns against each other (the default setting)
SHG1 = test within each 1st level header group, i.e. REGION, AGE, GENDER
SHG2 = test within each 2nd level header group (known in MERLIN as ‘overheaders’)
SHG3 = test within each 3rd level header group (known in MERLIN as ‘superheaders’)

Note that SHG-1 is a special case that should not be used unless the data has been changed with MANIP.
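
To make the difference between these settings concrete, the following Python sketch builds the comparisons implied by SHG0, SHG1 and SHG-2. It only illustrates the pairing logic described above; the function names and the column letters in the usage note are hypothetical, and it says nothing about how MERLIN implements this internally.

```python
from itertools import combinations


def pairs_all(columns):
    """SHG0-style: every column is compared with every other column."""
    return list(combinations(columns, 2))


def pairs_within_groups(groups):
    """SHG1-style: columns are compared only within their own header group.

    groups maps a header group name to the list of its columns.
    """
    pairs = []
    for members in groups.values():
        pairs.extend(combinations(members, 2))
    return pairs


def remainder(total_count, total_base, col_count, col_base):
    """SHG-2-style: the comparison group is the total minus the current column.

    Returns (proportion in the remainder, base of the remainder).
    """
    rest_base = total_base - col_base
    if rest_base <= 0:
        raise ValueError("column base must be smaller than the total base")
    return (total_count - col_count) / rest_base, rest_base
```

For example, with hypothetical columns A, B, C under REGION and D, E under GENDER, pairs_within_groups yields A/B, A/C, B/C and D/E, but never A/D.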

In the table above, we have used SHG1 – as indicated by the footnote which is generated automatically (although its appearance and position can be varied with special text %SHG and formats CGI, CGS and PSF).

Clients sometimes wish to test columns within header groups (SHG1) and against the total minus each column (SHG-2) on the same table, and this can be achieved using formats TTS3/SHG1/SGX3. SGX does the same test as TTS3/SHG-2, but flags cells with one or two asterisks (depending on the significance level) instead of letters, making it easy to distinguish between the two tests. By default, the asterisks are shown in the same cell as the values, but format FBC allows you to move them below, and format SMS allows you to show + or - instead of *. Special text %SGX allows you to add text to the footnote.

Finally, you can specify your own pattern of testing with the SELECT SHG statement, e.g. SELECT SHG (1.4) (2.5) (3.6) tests column 1 against column 4, column 2 against column 5, and column 3 against column 6.
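
As a way of reading that notation, here is a small Python sketch that turns such a statement into a list of column groups. It is based only on the single example above: the assumption is that each bracketed item is a dot-separated list of column numbers, and parse_select_shg is a hypothetical helper, not part of MERLIN.

```python
import re


def parse_select_shg(statement):
    """Extract the bracketed column groups from a SELECT SHG statement.

    Assumes each group looks like "(1.4)", i.e. column numbers joined by dots.
    """
    groups = []
    for body in re.findall(r"\(([^()]*)\)", statement):
        groups.append(tuple(int(part) for part in body.split(".")))
    return groups
```

Applied to the statement above, it gives the groups (1, 4), (2, 5) and (3, 6).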

Which levels of testing are to be used?

MERLIN allows testing at two significance levels, determined by formats SLA and SLB. By default these are set to 95 and 99 respectively, but they can be given any value from 60 to 99.99999, with up to 5 decimal places. If SLA and SLB have the same value, only one level of testing will be done.
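
The sketch below shows, in Python, how two significance levels can translate into critical values and into a lower or upper case flag. It assumes a two-sided test against the normal distribution, which this article does not confirm for MERLIN, and critical_z and flag_for are hypothetical names.

```python
from statistics import NormalDist


def critical_z(level_percent):
    """Two-sided critical z value for a level given in percent (assumed form)."""
    alpha = 1 - level_percent / 100
    return NormalDist().inv_cdf(1 - alpha / 2)


def flag_for(z, letter, sla=95.0, slb=99.0):
    """Return the flag for a test statistic z against the column 'letter'.

    Upper case at the SLB level, lower case at the SLA level, '' otherwise.
    Only a positive z is flagged, i.e. the tested value is the greater one.
    If sla == slb this sketch returns upper case; which case MERLIN shows
    when only one level is tested is not stated in the article.
    """
    if z >= critical_z(slb):
        return letter.upper()
    if z >= critical_z(sla):
        return letter.lower()
    return ""
```

With the default levels, the critical values are roughly 1.96 and 2.58, so a statistic of 2.0 against column B would be flagged b and a statistic of 3.0 would be flagged B.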

Which flag characters are to be used?

By default, the columns on each page will be lettered from A-Z (lower case for SLA level and UPPER case for SLB level). This means that you cannot test more than 26 columns, which can be a problem when outputting tables to Excel with no ‘page breaks’ in the banners – but there are two ways around this. First, format TTL may be set to:
TTL0 = assign letters across all columns (the default setting)
TTL1 = re-start lettering on each test group, as determined by format SHG. So, if F=SHG1, the lettering will re-start on the 1st column within each header group
TTL2 = same as TTL1, but omitting the first column in each test group (typically used when this contains a sub-total). So, if F=SHG2, the lettering will re-start on the 2nd column within each overheader group.
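
The effect of the three TTL settings can be sketched in Python as follows. This reflects one reading of the descriptions above (in particular, that under TTL2 the second column of each group receives the first letter); assign_letters and the column names in the usage note are hypothetical.

```python
from string import ascii_lowercase


def assign_letters(groups, ttl=0, alphabet=ascii_lowercase):
    """Map column names to flag letters.

    groups: list of lists of column names, one inner list per test group.
    ttl=0: letter straight across all columns.
    ttl=1: restart at the first letter for each test group.
    ttl=2: as ttl=1, but leave the first column of each group unlettered.
    """
    letters = {}
    position = 0
    for members in groups:
        if ttl in (1, 2):
            position = 0  # restart lettering for this group
        for index, column in enumerate(members):
            if ttl == 2 and index == 0:
                letters[column] = ""  # e.g. a sub-total column
                continue
            if position >= len(alphabet):
                raise ValueError("ran out of flag characters")
            letters[column] = alphabet[position]
            position += 1
    return letters
```

With two groups of 20 columns each, ttl=0 raises an error on the 27th column, while ttl=1 letters each group a to t. Passing a longer alphabet plays the same role as extending the string of flag characters.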

Second, you may re-define and/or extend the string of characters used, in special texts %FCA and %FCB, e.g.
%FCA = ‘abcdefghijklmnopqrstuvwxyzàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ’,
If you are only testing at one level, you could redefine %FCA to include both upper and lower case letters, allowing a large number of columns to be tested against each other.

Finally, you can assign flag characters manually using label control <T=SMAx/SMBy>, and this method allows you to exclude columns from testing by not assigning any characters to them.

The table below was produced using formats TTS3/SHG1/SGX3/FBC/SLB95/TTL1.

How are low bases managed?

If the unweighted base for a column is lower than the value specified by format TMA, it will be marked with a flag character and, if it is lower than the value specified by format TMB, it will be flagged and excluded from significance testing. The flag characters are controlled by formats SBA and SBB.

By default, columns will be flagged with * and excluded from testing if the base is lower than 30.
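
The two thresholds can be read as a simple three-way rule, sketched here in Python. The default of 30 for both thresholds is inferred from the default behaviour just described rather than taken from MERLIN documentation, and low_base_status is a hypothetical name.

```python
def low_base_status(unweighted_base, tma=30, tmb=30):
    """Classify a column by its unweighted base.

    Below tmb: flagged and excluded from significance testing.
    Below tma (but not below tmb): flagged only.
    Otherwise: tested normally.
    """
    if unweighted_base < tmb:
        return "flagged and excluded"
    if unweighted_base < tma:
        return "flagged"
    return "tested normally"
```

For instance, with tma=50 and tmb=30, a base of 40 is flagged but still tested, while a base of 20 is flagged and excluded.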

What else?

Here are some other significance tests that can be done in MERLIN, using the formats shown:

CHI = chi-square test
CHS = single sample chi-square test
DEP = dependent t-test
KST = Kolmogorov-Smirnov test
LSD = least significant difference
MWW = Mann-Whitney-Wilcoxon test
TTF = F-test

