Classical Cryptography Course,
Volumes I and II from Aegean Park Press

By Randy Nichols (LANAKI)
President of the American Cryptogram Association from 1994-1996.
Executive Vice President from 1992-1994

    CLASSICAL CRYPTOGRAPHY COURSE


    BY LANAKI

    December 05, 1995


    LECTURE 4
    SUBSTITUTION WITH VARIANTS
    Part III
    MULTILITERAL SUBSTITUTION

    SUMMARY

    Welcome back from the Thanksgiving holiday break. The good news is that this lecture will reach you about Christmas; therefore, no homework. The not-so-good news is that this concluding Lecture 4 on Substitution with Variants covers some difficult material of wide practicality in the field.

    In Lecture 4, we complete our look into English monoalphabetic substitution ciphers by describing multiliteral substitution with difficult variants. The Homophonic and GrandPre Ciphers will be covered. The use of isologs is demonstrated. A synoptic diagram of the substitution ciphers described in Lectures 1-4 will be presented.

    MULTILITERAL SUBSTITUTION WITH MULTIPLE-EQUIVALENT CIPHER ALPHABETS - aka "MONOALPHABETIC SUBSTITUTION WITH VARIANTS"

    Each English letter in plain text has a characteristic frequency which affords definite clues in the solution of simple monoalphabetic ciphers. Associations which individual letters form in combining to make up words, and the peculiarities which certain of them manifest in plain text, afford further direct clues by means of which ordinary monoalphabetic substitution encipherments of such plain text may be readily solved. [FR1]

    Cryptographers have devised methods for disguising, suppressing, or eliminating the foregoing characteristics in the cryptograms produced by the methods described in Lectures 1-3. One category of methods, called "variants" or "variant values", is that in which the letters of the plain component of a cipher alphabet are assigned two or more cipher equivalents.

    Systems involving variants are generally multiliteral. In such systems, a large number of equivalents is made available by combinations and permutations of a limited number of elements, so each letter of the plain text may be represented by several multiliteral cipher equivalents, which may be selected at random. For example, if 3-letter combinations are employed as multiliteral equivalents, there are 26^3 or 17,576 available equivalents for the 26 letters of the plain text.

    They may be assigned in equal numbers to the 26 letters, in which case each letter would be representable by 676 different 3-letter equivalents, or they may be assigned on some other basis, for example in proportion to the relative frequencies of the plain text letters. [FR1]
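
    The arithmetic above can be checked in a few lines. This is only an illustrative sketch; the variable names are ours and do not come from the references.

```python
# Count the 3-letter groups available as multiliteral equivalents.
ALPHABET_SIZE = 26   # letters available for building cipher groups
GROUP_LENGTH = 3     # letters per cipher equivalent

total_equivalents = ALPHABET_SIZE ** GROUP_LENGTH   # 26^3
per_letter = total_equivalents // ALPHABET_SIZE     # equal share for each plain letter

print(total_equivalents)  # 17576
print(per_letter)         # 676
```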

    The primary object of substitution with variants is again to provide several values which may be employed at random in a simple substitution of cipher equivalents for the plain text letters.

    As a slight diversion, the reader may ask about uniliteral substitution with variants. It is possible but not very practical. Note the following cipher alphabet constructed in French by Captain Roger Baudouin in reference [BAUD]:

    Plain:     a b c d e f g h i l m n o p q r s t u v x z
    Cipher:    L G O R F Q A H C M B T I D N P U S Y E W J
    Variants:  K X Z V

    (Note that the Captain was not an ACA member. The H=H combination is not allowed.)

    Baudouin proposed that J plain and Y plain be replaced by I plain, K plain by C plain or Q plain, and W plain by VV plain. Four cipher letters would then be available as variants for the high-frequency plain text letters in French.

    Mixed alphabets formed by including all repeated letters of the key word or key phrase in the cipher component were common in Edgar Allan Poe's day but are impractical because they are ambiguous, making decipherment difficult; for example:

    Enciphering Alphabet:

    Plain:   a b c d e f g h i j k l m n o p q r s t u v w x y z
    Cipher:  N O W I S T H E T I M E F O R A L L G O O D M E N T

    Inverse form for deciphering:

    Cipher:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    Plain:   p . . v h m s g d . . q k a b . . o e f . . c . . .
                     l       j     r w y n         i
                     x                   t         z
                                         u

    The average cipher clerk would have difficulty in decrypting a cipher group such as TOOET, each letter having 3 or more equivalents, from which plain text fragments (n)inth, ft thi(s), it thi, etc. can be formed on decipherment. [FR1]
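
    The ambiguity can be made concrete by building the inverse table mechanically and listing every reading of a cipher group. The sketch below is ours, not part of [FR1]; the function name candidate_plaintexts is hypothetical.

```python
from itertools import product

PLAIN = "abcdefghijklmnopqrstuvwxyz"
CIPHER = "NOWISTHETIMEFORALLGOODMENT"  # key phrase with repeated letters kept

# Inverse (deciphering) table: each cipher letter maps to every
# plain letter it may stand for.
inverse = {}
for p, c in zip(PLAIN, CIPHER):
    inverse.setdefault(c, []).append(p)

def candidate_plaintexts(cipher_group):
    """Return every possible plain-text reading of a cipher group."""
    choices = [inverse[c] for c in cipher_group]
    return ["".join(letters) for letters in product(*choices)]

readings = candidate_plaintexts("TOOET")
print(inverse["T"], inverse["O"], inverse["E"])
print(len(readings))        # 3 * 4 * 4 * 3 * 3 = 432 candidate readings
print("inthi" in readings)  # True
```

    The clerk must pick the one sensible reading out of 432 for a single five-letter group, which is why such alphabets fell out of use.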

    THEORETICAL DISTINCTIONS

    In simple or single-equivalent monoalphabetic substitution with variants, two points are evident:

    In multiliteral-equivalent monoalphabetic substitution with variants, two points are also evident:

    SIMPLE TYPES OF CIPHER ALPHABETS WITH VARIANTS

    Figure 4-1
          6  7  8  9  0
          1  2  3  4  5
        +--------------
    6 1 | A  B  C  D  E
    7 2 | F  G  H  IJ K
    8 3 | L  M  N  O  P
    9 4 | Q  R  S  T  U
    0 5 | V  W  X  Y  Z

    Figure 4-2
            V  W  X  Y  Z
            Q  R  S  T  U
          +--------------
    L F A | A  B  C  D  E
    M G B | F  G  H  IJ K
    N H C | L  M  N  O  P
    O I D | Q  R  S  T  U
    P K E | V  W  X  Y  Z

    Figure 4-3
              A  E  I  O  U
            +--------------
    T N H B | A  B  C  D  E
    V P J C | F  G  H  IJ K
    W Q K D | L  M  N  O  P
    X R L F | Q  R  S  T  U
    Z S M G | V  W  X  Y  Z

    Figure 4-4
                V  W  X  Y  Z
                Q  R  S  T  U
                L  M  N  O  P
                F  G  H  I  K
                A  B  C  D  E
              +--------------
    V Q L F A | A  B  C  D  E
    W R M G B | F  G  H  IJ K
    X N S H C | L  M  N  O  P
    Y T O I D | Q  R  S  T  U
    Z U P K E | V  W  X  Y  Z

    Figure 4-5
                O
                M  N
                J  K  L
                F  G  H  I
                A  B  C  D  E
              +--------------
    O M J F A | E  N  A  L  U
      N K G B | T  R  S  F  W
        L H C | O  IJ H  Y  X
          I D | D  C  M  V  K
            E | P  G  B  Q  Z

    Figure 4-6
              Z
              W  X  Y
              S  T  U  V
              N  O  P  Q  R
            +--------------
    M J F A | E  N  A  L  U
      K G B | T  R  S  F  W
      L H C | O  IJ H  Y  X
        I D | D  C  M  V  K
          E | P  G  B  Q  Z

    Figure 4-7
            1  2  3  4  5  6  7  8  9  0
          +-----------------------------
    7 4 1 | A  B  C  D  E  F  G  H  I  J
    8 5 2 | K  L  M  N  O  P  Q  R  S  T
    9 6 3 | U  V  W  X  Y  Z  .  ,  :  ;

    Figure 4-8
            1  2  3  4  5  6  7  8  9
          +--------------------------
    7 4 1 | A  B  C  D  E  F  G  H  I
    8 5 2 | J  K  L  M  N  O  P  Q  R
    9 6 3 | S  T  U  V  W  X  Y  Z  *

    Figure 4-9
          1  2  3  4  5  6  7  8  9
        +--------------------------
    5 1 | A  B  C  D  E  F  G  H  I
    6 2 | J  K  L  M  N  O  P  Q  R
    7 3 | S  T  U  V  W  X  Y  Z  1
    8 4 | 2  3  4  5  6  7  8  9  0

    Figure 4-10
              1  2  3  4  5  6  7  8  9
            +--------------------------
    0 8 5 1 | T  E  R  M  I  N  A  L  S
      9 6 2 | B  C  D  F  G  H  J  K  O
        7 3 | P  Q  U  V  W  X  Y  Z  1
          4 | 2  3  4  5  6  7  8  9  0

    The matrices in Figures 4-1 to 4-10 represent some of the simpler means for accomplishing monoalphabetic substitution with variants. The matrices are extensions of the basic ideas of multiliteral substitution presented in Lecture 3.

    The variant equivalents for any plain text letter may be chosen at will; thus, in Figure 4-1, e= 10, 15, 60, or 65; in Figure 4-2, e= AU, AZ, FU, FZ, LU or LZ.
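
    As a minimal sketch of how such a matrix is used, the Figure 4-1 square can be coded directly. The function names and the use of Python's random module to pick a variant are our assumptions; the coordinates and the square are taken from the figure.

```python
import random

# Figure 4-1: every row and every column carries two coordinate digits.
ROW_COORDS = ["61", "72", "83", "94", "05"]
COL_COORDS = ["61", "72", "83", "94", "05"]
SQUARE = ["ABCDE", "FGHIK", "LMNOP", "QRSTU", "VWXYZ"]  # I and J share a cell

def variants(letter):
    """All dinomes (row digit, then column digit) for one plain letter."""
    letter = "I" if letter == "J" else letter
    for r, row in enumerate(SQUARE):
        c = row.find(letter)
        if c >= 0:
            return [rd + cd for rd in ROW_COORDS[r] for cd in COL_COORDS[c]]
    raise ValueError(f"no cell for {letter!r}")

def encipher(plain, rng=random):
    """Replace each letter by one of its variants chosen at random."""
    return " ".join(rng.choice(variants(ch)) for ch in plain.upper() if ch.isalpha())

def decipher(cipher):
    """Read each dinome as row digit followed by column digit."""
    out = []
    for dinome in cipher.split():
        r = next(i for i, d in enumerate(ROW_COORDS) if dinome[0] in d)
        c = next(i for i, d in enumerate(COL_COORDS) if dinome[1] in d)
        out.append(SQUARE[r][c])
    return "".join(out)

print(sorted(variants("E")))           # ['10', '15', '60', '65']
print(decipher(encipher("variant")))   # VARIANT
```

    Two encipherments of the same word will usually differ, yet both decipher to the same plain text.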

    Encipherment by means of the matrices shown in Figures 4-2, 4-3, and 4-6 is commutative: the coordinates may be read row by column or vice versa without cryptographic ambiguity. The remaining matrices are noncommutative; the general convention is to read row by column.
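
    One way to see why some matrices are commutative is to check whether any symbol serves as both a row and a column coordinate. The test below is our own reading of the figures, offered as a sketch rather than as the criterion stated in [FR1]; the coordinate lists are transcribed from Figures 4-1, 4-2, 4-3, and 4-6.

```python
def is_commutative(row_coords, col_coords):
    """True when no symbol is used as both a row and a column coordinate."""
    return set("".join(row_coords)).isdisjoint("".join(col_coords))

# (row coordinates, column coordinates), one string per row or column
fig_4_1 = (["61", "72", "83", "94", "05"], ["61", "72", "83", "94", "05"])
fig_4_2 = (["LFA", "MGB", "NHC", "OID", "PKE"], ["VQ", "WR", "XS", "YT", "ZU"])
fig_4_3 = (["TNHB", "VPJC", "WQKD", "XRLF", "ZSMG"], ["A", "E", "I", "O", "U"])
fig_4_6 = (["MJFA", "KGB", "LHC", "ID", "E"], ["ZWSN", "XTO", "YUP", "VQ", "R"])

for name, (rows, cols) in [("4-1", fig_4_1), ("4-2", fig_4_2),
                           ("4-3", fig_4_3), ("4-6", fig_4_6)]:
    print(name, is_commutative(rows, cols))
```

    When the two coordinate sets are disjoint, a pair such as UA or AU can only mean one cell, whichever order it is written in.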

    In Figures 4-5 and 4-6, the letters in the square have been inscribed in such a manner that, coupled with the particular arrangement of the row and column coordinates, the number of variants available for each plain text letter is roughly proportional to the frequencies of the letters in the plain text. Figure 35 incorporates a keyword on top of this idea. [FR1]
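
    The number of variants for a letter is simply the number of row coordinates times the number of column coordinates for its cell. The sketch below counts them for Figure 4-5, assuming the column coordinates attach to the columns from left to right in groups of 5, 4, 3, 2, and 1, mirroring the row coordinates.

```python
# Figure 4-5 (assumed layout): coordinates per row and per column.
ROW_COORDS = ["OMJFA", "NKGB", "LHC", "ID", "E"]
COL_COORDS = ["OMJFA", "NKGB", "LHC", "ID", "E"]
SQUARE = ["ENALU", "TRSFW", "OIHYX", "DCMVK", "PGBQZ"]  # I stands for the IJ cell

# Variants per letter = (row coordinates) x (column coordinates) of its cell.
variant_counts = {
    letter: len(ROW_COORDS[r]) * len(COL_COORDS[c])
    for r, row in enumerate(SQUARE)
    for c, letter in enumerate(row)
}

for letter, count in sorted(variant_counts.items(), key=lambda kv: -kv[1]):
    print(letter, count)
```

    Under this layout E receives 25 variants, T and N receive 20 each, and Z only one, which follows the frequency order of English reasonably well.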

    HOMOPHONIC

    The Homophonic Cipher is a simple variant system. It is a dinome cipher with four levels (alphabets). Consider Figure 4-11.

    Figure 4-11
    A  B  C  D  E  F  G  H  IJ K  L  M  N
    08 09 10 11 12 13 14 15 16 17 18 19 20
    35 36 37 38 39 40 41 42 43 44 45 46 47
    68 69 70 71 72 73 74 75 51 52 53 54 55
    87 88 89 90 91 92 93 94 95 96 97 98 99

    O  P  Q  R  S  T  U  V  W  X  Y  Z
    21 22 23 24 25 01 02 03 04 05 06 07
    48 49 50 26 27 28 29 30 31 32 33 34
    56 57 58 59 60 61 62 63 64 65 66 67
    00 76 77 78 79 80 81 82 83 84 85 86

    The keyword TRIP is found by inspecting dinomes 01, 26, 51, and 76. (The lowest number in each of the four sequences.) [FR1] [FR5]
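
    The construction can be sketched in code: each keyword letter fixes where one run of 25 consecutive dinomes begins. The function names below are hypothetical, and choosing among the four variants with Python's random module is our assumption about how a clerk would vary the equivalents.

```python
import random

ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # 25 letters, J merged with I

def build_table(keyword):
    """Map each letter to its four dinomes, one from each 25-number sequence."""
    table = {letter: [] for letter in ALPHABET}
    for level, key_letter in enumerate(keyword.upper().replace("J", "I")):
        start = ALPHABET.index(key_letter)
        for offset in range(25):
            letter = ALPHABET[(start + offset) % 25]
            number = (25 * level + offset + 1) % 100   # 100 is written 00
            table[letter].append(f"{number:02d}")
    return table

def encipher(plain, table, rng=random):
    letters = [ch for ch in plain.upper().replace("J", "I") if ch in table]
    return " ".join(rng.choice(table[ch]) for ch in letters)

def decipher(cipher, table):
    reverse = {d: letter for letter, ds in table.items() for d in ds}
    return "".join(reverse[d] for d in cipher.split())

table = build_table("TRIP")
print(table["E"])                                    # ['12', '39', '72', '91']
print(decipher(encipher("homophonic", table), table))  # HOMOPHONIC
```

    The printed equivalents for E agree with the E column of Figure 4-11.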

    The Russians added an interesting gimmick called the Disruption Area. Consider Figure 4-12 and note the slashes under U - X for the fourth level of dinomes. The famous VIC cipher used this feature very effectively. [NIC4]
    Figure 4-12
    A  B  C  D  E  F  G  H  I  J  K  L  M
    14 15 16 17 18 19 20 21 22 23 24 25 26
    27 28 29 30 31 32 33 34 35 36 37 38 39
    58 59 60 61 62 63 64 65 66 67 68 69 70
    81 82 83 84 85 86 87 88 89 90 91 92 93

    N  O  P  Q  R  S  T  U  V  W  X  Y  Z
    01 02 03 04 05 06 07 08 09 10 11 12 13
    40 41 42 43 44 45 46 47 48 49 50 51 52
    71 72 73 74 75 76 77 78 53 54 55 56 57
    94 95 96 97 98 99 00 // // // // 79 80

    The keyword NAVY is represented by dinomes 01, 27, 53, and 79.
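
    The 26-letter layout of Figure 4-12 can be generated the same way, with the last sequence cut short so that four letters receive no fourth-level dinome. This is a sketch under the assumption that the sequence lengths are 26, 26, 26, and 22, as read off the figure; build_rows is a hypothetical name.

```python
import string

LETTERS = string.ascii_uppercase  # full 26-letter alphabet

def build_rows(keyword, lengths=(26, 26, 26, 22)):
    """Return one dict per level mapping letter -> dinome.

    Each sequence starts under its keyword letter and runs forward,
    wrapping from Z to A. A short last sequence leaves a disruption area.
    """
    rows, first = [], 1
    for key_letter, length in zip(keyword.upper(), lengths):
        start = LETTERS.index(key_letter)
        row = {}
        for offset in range(length):
            letter = LETTERS[(start + offset) % 26]
            row[letter] = f"{(first + offset) % 100:02d}"  # 100 is written 00
        rows.append(row)
        first += length
    return rows

rows = build_rows("NAVY")
disrupted = sorted(set(LETTERS) - set(rows[3]))
print(rows[0]["N"], rows[1]["A"], rows[2]["V"], rows[3]["Y"])  # 01 27 53 79
print(disrupted)                                               # ['U', 'V', 'W', 'X']
```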

    Security for Homophonic systems is greatly improved if the dinomes and the four sequences are assigned randomly. However, the easy mnemonic feature of the keyworded four sequences is lost.

    The Mexican Cipher device is a Homophonic consisting of five concentric disks, the outer disk bearing the 26 letters and the other four bearing the sequences 01-26, 27-52, 53-78, and 79-00. The cipher disk facilitates frequent key changes. Figure 4-12 shows the matrix without the disruption area. [FR5] [NIC4]

    HOMOPHONIC CRYPTANALYSIS

    Let's solve the following cryptogram.

    68321   09022   48057   65111   88648   42036   45235   09144
    05764   22684   00225   57003   97357   14074   82524   40768
    51058   93074   92188   47264   09328   04255   06186   79882
    85144   45886   32574   55136   56019   45722   76844   68350
    45219   71649   90528   65106   11886   44044   89669   70553
    18491   06985   48579   33684   50957   70612   09795   29148
    56109   08546   62062   65509   32800   32568   97216   44282
    34031   84989   68564   53789   12530   77401   68494   38544
    11368   87616   56905   20710   58864   67472   22490   09136
    62851   24551   35180   14230   50886   44084   06231   12876
    05579   58980   29503   99713   32720   36433   82689   04516
    52263   21175   06445   72255   68951   86957   76095   67215
    53049   08567   9730
    
    
    
    Assuming we did not know that the above cryptogram was a HOMOPHONIC, we might make a preliminary analysis to see whether we are dealing with a cipher or a code. We will cover code systems later in the course, but a few introductory remarks are in order. The five-digit groups could indicate either a cipher or a code.

    If the cryptogram contains an even number of digits, as for example 494 in the previous message, this leaves open the possibility that the message is a cipher containing 247 pairs of digits; were the number of digits an exact odd multiple of five, such as 125, 135, etc., the possibility that the cryptogram is in code of the 5-figure group type must be considered.
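
    The digit count is easy to verify mechanically. The sketch below uses the cryptogram as printed above; the function name length_clues and its dictionary keys are our own.

```python
CRYPTOGRAM = """
68321 09022 48057 65111 88648 42036 45235 09144
05764 22684 00225 57003 97357 14074 82524 40768
51058 93074 92188 47264 09328 04255 06186 79882
85144 45886 32574 55136 56019 45722 76844 68350
45219 71649 90528 65106 11886 44044 89669 70553
18491 06985 48579 33684 50957 70612 09795 29148
56109 08546 62062 65509 32800 32568 97216 44282
34031 84989 68564 53789 12530 77401 68494 38544
11368 87616 56905 20710 58864 67472 22490 09136
62851 24551 35180 14230 50886 44084 06231 12876
05579 58980 29503 99713 32720 36433 82689 04516
52263 21175 06445 72255 68951 86957 76095 67215
53049 08567 9730
"""

def length_clues(text):
    """Report what the digit count alone allows us to suppose."""
    digits = "".join(text.split())
    return {
        "digits": len(digits),
        "pairs": len(digits) // 2,
        "even_count": len(digits) % 2 == 0,        # whole dinomes possible
        "multiple_of_five": len(digits) % 5 == 0,  # whole 5-figure groups possible
    }

clues = length_clues(CRYPTOGRAM)
print(clues)
```

    Here the count is even and not a multiple of five, which points toward a dinome cipher rather than a 5-figure code.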

    We next study the repetitions in the message and their characteristics. If the cipher text is of the 5-figure code type, then such repetitions as appear should generally be in whole groups of five digits, and they should be visible in the text just as the message stands, unless the code message has been superenciphered. If the cryptogram is a cipher, then repetitions should extend beyond the 5-digit groupings; if they conform to any definite pattern at all, they should for the most part contain even numbers of digits, since each letter is probably represented by a pair (dinome) of digits.

    We start with a 4-part frequency distribution, assuming four 25-character alphabets that together cover the dinomes 01-00. This is the common scheme for drawing up the alphabets. Breaking the text into dinomes (2-digit pairs) and tallying each yields:

    01 ///       26 ///       51 /////     76 //////
    02           27           52 /////     77 /
    03 ////      28 /         53 ///       78
    04 /         29 /         54           79 /
    05 /////     30 ///       55 ////      80 ///
    06 //////    31           56 /////     81
    07 ///       32 //////    57 //////    82 ////
    08           33 /         58 //        83 /
    09 ////      34 /         59           84 //////
    10 ////      35 //        60           85 //////
    11 /////     36 /////     61           86 ///
    12 ///       37 /         62 //        87
    13 /         38           63           88 ////
    14 /         39 /         64 //////    89 /////
    15 /         40 ///       65           90 //////
    16 ///       41           66 /         91 ///
    17           42 ////      67 //        92 /
    18 //////    43 /         68 ///////   93 /
    19           44 //////    69 //        94 /
    20 /         45 //////    70 /         95 ///
    21 //        46 ///       71 /         96
    22 /////     47           72 ////      97 //////
    23 //        48 ///       73           98 /
    24           49 /////     74 ////      99
    25 /         50 /////     75 /         00 //
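
    The tally can be reproduced with a short script. This is a sketch of the bookkeeping only, using the cryptogram as printed above; it prints the four 25-number sequences side by side so that corresponding positions line up.

```python
from collections import Counter

CRYPTOGRAM = """
68321 09022 48057 65111 88648 42036 45235 09144
05764 22684 00225 57003 97357 14074 82524 40768
51058 93074 92188 47264 09328 04255 06186 79882
85144 45886 32574 55136 56019 45722 76844 68350
45219 71649 90528 65106 11886 44044 89669 70553
18491 06985 48579 33684 50957 70612 09795 29148
56109 08546 62062 65509 32800 32568 97216 44282
34031 84989 68564 53789 12530 77401 68494 38544
11368 87616 56905 20710 58864 67472 22490 09136
62851 24551 35180 14230 50886 44084 06231 12876
05579 58980 29503 99713 32720 36433 82689 04516
52263 21175 06445 72255 68951 86957 76095 67215
53049 08567 9730
"""

# Ignore the 5-digit grouping and cut the digit stream into dinomes.
digits = "".join(CRYPTOGRAM.split())
dinomes = [digits[i:i + 2] for i in range(0, len(digits), 2)]
counts = Counter(dinomes)

# One line per position 1..25; columns are the sequences 01-25, 26-50, 51-75, 76-00.
for position in range(1, 26):
    cells = []
    for level in range(4):
        label = f"{(25 * level + position) % 100:02d}"
        cells.append(f"{label} {'/' * counts[label]:<8}")
    print("  ".join(cells).rstrip())
```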

    What we have before us are four simple, monoalphabetic frequency distributions similar to those involved in a monoalphabetic substitution cipher using standard cipher alphabets. The next step is to fit the distribution to the normal. Since I=J for the 25 letter alphabet, we find that the Keyword is JUNE and the following alphabets result:

     

    01 IJ 26 U 51 N 76 E

    02 K 27 V 52 O 77 F

    03 L 28 W 53 P 78 G

    04 M 29 X 54 Q 79 H

    05 N 30 Y 55 R 80 IJ

    06 O 31 Z 56 S 81 K

    07 P 32 A 57 T 82 L

    08 Q 33 B 58 U 83 M

    09 R 34 C 59 V 84 N

    10 S 35 D 60 W 85 O

    11 T 36 E 61 X 86 P

    12 U 37 F 62 Y 87 Q

    13 V 38 G 63 Z 88 R

    14 W 39 H 64 A 89 S

    15 X 40 IJ 65 B 90 T

    16 Y 41 K 66 C 91 U

    17 Z 42 L 67 D 92 V

    18 A