Appendix B. Nucleotide Scoring Schemes
Nucleotide scoring schemes are often
summarized by their target frequency, which is the expected frequency
of nucleotide pairs. This frequency is usually expressed as the
expected percent identity. For example, the +1/-1 match/mismatch
values have a target frequency of 75 percent identity. This holds
only for ungapped alignments between sequences of infinite length,
however: short sequences and gapped alignments shift the effective
target frequency. In the following table, the target frequencies for a
variety of match (+), mismatch (-), and simple gap costs (gap) are
calculated for pairs of sequences of length 100, 500, and 1,000 by
performing local alignments of random nucleotide sequences of
unbiased composition. The theoretical target frequency (TF) is
included for comparison.
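The theoretical value can be sketched from the scores alone. The snippet below assumes the standard ungapped scoring theory with equal base frequencies: it solves 0.25·exp(λ·match) + 0.75·exp(-λ·mismatch) = 1 for the positive root λ, then reports 0.25·exp(λ·match) as the expected identity. The function name `target_identity` is illustrative, and nothing here confirms that the TF column was produced by this exact procedure.

```python
import math


def target_identity(match, mismatch):
    """Expected percent identity for a +match/-mismatch scheme.

    Assumptions: ungapped alignment, infinitely long sequences, and
    unbiased composition (each base has frequency 0.25).
    """
    p_same = 0.25   # chance that two random bases are identical
    p_diff = 0.75   # chance that they differ

    # A local scoring scheme needs a negative expected score per column.
    if p_same * match - p_diff * mismatch >= 0:
        raise ValueError("expected score must be negative")

    def f(lam):
        return (p_same * math.exp(lam * match)
                + p_diff * math.exp(-lam * mismatch) - 1.0)

    # f(0) = 0 is the trivial root. f is convex, negative just above 0,
    # and positive for large lambda, so bracket the positive root.
    lo, hi = 1e-6, 1.0
    while f(hi) < 0:
        hi *= 2.0

    # Plain bisection on the bracket [lo, hi].
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid

    lam = (lo + hi) / 2.0
    return 100.0 * p_same * math.exp(lam * match)


if __name__ == "__main__":
    for m, n in [(1, 1), (1, 2), (1, 3), (5, 4)]:
        print(f"+{m}/-{n}: {target_identity(m, n):.1f} percent identity")
```

For +1/-1 the positive root is λ = ln 3, which gives the 75 percent identity quoted above.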
| Match (+) | Mismatch (-) | Gap | TF | 100 | 500 | 1,000 |
|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 75 | 55 | 49 | 49 |
| 1 | 1 | 2 | 75 | 79 | 70 | 69 |
| 1 | 1 | 3 | 75 | 85 | 79 | 79 |
| 1 | 2 | 2 | 95 | 93 | 89 | 88 |
| 1 | 2 | 3 | 95 | 98 | 96 | 96 |
| 1 | 2 | 4 | 95 | 98 | 97 | 97 |
| 1 | 3 | 3 | 99 | 99 | 99 | 98 |
| 5 | 4 | 4 | 65 | 51 | 48 | 48 |
| 5 | 4 | 5 | 65 | 53 | 49 | 49 |
| 5 | 4 | 6 | 65 | 55 | 50 | 49 |
| 5 | 4 | 7 | 65 | 59 | 51 | 50 |
| 5 | 4 | 8 | 65 | 62 | 52 | 50 |
| 5 | 4 | 9 | 65 | 64 | 55 | 53 |
| 5 | 4 | 10 | 65 | 67 | 59 | 57 |
| 5 | 4 | 11 | 65 | 69 | 61 | 60 |
| 5 | 4 | 12 | 65 | 71 | 63 | 62 |
| 5 | 5 | 5 | 75 | 55 | 49 | 49 |
| 5 | 5 | 6 | 75 | 59 | 51 | 50 |
| 5 | 5 | 7 | 75 | 64 | 55 | 53 |
| 5 | 5 | 8 | 75 | 70 | 61 | 59 |
| 5 | 5 | 9 | 75 | 72 | 65 | 64 |
| 5 | 5 | 10 | 75 | 79 | 70 | 69 |
| 5 | 5 | 11 | 75 | 80 | 73 | 71 |
| 5 | 5 | 12 | 75 | 81 | 75 | 74 |
| 5 | 5 | 13 | 75 | 82 | 76 | 76 |
| 5 | 5 | 14 | 75 | 82 | 77 | 77 |
| 5 | 5 | 15 | 75 | 85 | 79 | 79 |
| 5 | 6 | 6 | 82 | 62 | 53 | 51 |
| 5 | 6 | 7 | 82 | 69 | 60 | 58 |
| 5 | 6 | 8 | 82 | 75 | 67 | 65 |
| 5 | 6 | 9 | 82 | 79 | 73 | 71 |
| 5 | 6 | 10 | 82 | 83 | 77 | 75 |
| 5 | 6 | 11 | 82 | 85 | 79 | 79 |
| 5 | 6 | 12 | 82 | 87 | 81 | 81 |
| 5 | 6 | 15 | 82 | 90 | 85 | 84 |
| 5 | 6 | 18 | 82 | 90 | 87 | 86 |
| 5 | 7 | 7 | 87 | 73 | 64 | 63 |
| 5 | 7 | 8 | 87 | 78 | 72 | 70 |
| 5 | 7 | 9 | 87 | 83 | 77 | 76 |
| 5 | 7 | 10 | 87 | 87 | 82 | 81 |
| 5 | 7 | 11 | 87 | 89 | 84 | 83 |
| 5 | 7 | 12 | 87 | 90 | 86 | 85 |
| 5 | 7 | 13 | 87 | 91 | 88 | 87 |
| 5 | 7 | 14 | 87 | 91 | 88 | 87 |
| 5 | 7 | 21 | 87 | 93 | 91 | 90 |
| 5 | 8 | 8 | 90 | 81 | 75 | 73 |
| 5 | 8 | 9 | 90 | 85 | 80 | 79 |
| 5 | 8 | 10 | 90 | 89 | 85 | 84 |
| 5 | 8 | 11 | 90 | 91 | 87 | 86 |
| 5 | 8 | 12 | 90 | 92 | 89 | 88 |
| 5 | 8 | 13 | 90 | 93 | 90 | 89 |
| 5 | 8 | 14 | 90 | 93 | 91 | 90 |
| 5 | 8 | 15 | 90 | 94 | 92 | 91 |
| 5 | 8 | 16 | 90 | 94 | 93 | 92 |
| 5 | 8 | 24 | 90 | 95 | 94 | 93 |
| 5 | 9 | 9 | 93 | 86 | 82 | 81 |
| 5 | 9 | 10 | 93 | 90 | 86 | 85 |
| 5 | 9 | 11 | 93 | 92 | 89 | 89 |
| 5 | 9 | 12 | 93 | 93 | 91 | 90 |
| 5 | 9 | 13 | 93 | 93 | 92 | 91 |
| 5 | 9 | 14 | 93 | 94 | 92 | 91 |
| 5 | 9 | 15 | 93 | 95 | 93 | 92 |
| 5 | 9 | 16 | 93 | 95 | 94 | 93 |
| 5 | 9 | 17 | 93 | 95 | 94 | 93 |
| 5 | 9 | 18 | 93 | 95 | 94 | 94 |
| 5 | 9 | 27 | 93 | 96 | 95 | 94 |
| 5 | 10 | 10 | 95 | 93 | 89 | 88 |
| 5 | 10 | 11 | 95 | 94 | 92 | 90 |
| 5 | 10 | 12 | 95 | 95 | 93 | 91 |
| 5 | 10 | 13 | 95 | 95 | 94 | 93 |
| 5 | 10 | 14 | 95 | 95 | 94 | 96 |
| 5 | 10 | 15 | 95 | 98 | 96 | 96 |
| 5 | 10 | 20 | 95 | 98 | 97 | 97 |
| 5 | 10 | 30 | 95 | 98 | 98 | 97 |
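The length-dependent columns come from local alignments of random sequences, and that kind of experiment can be sketched as follows. This is a minimal illustration, not the procedure actually used for the table: it assumes a Smith-Waterman alignment in which every gapped position costs `gap`, counts identity over all alignment columns (gaps included), and averages over an arbitrary number of trials. The names `local_identity` and `simulate`, the trial count, and the seed are all hypothetical.

```python
import random


def local_identity(a, b, match, mismatch, gap):
    """Best local alignment score and its percent identity.

    Assumes a linear gap cost: each gapped position costs `gap`.
    Returns (0, None) when no positive-scoring alignment exists.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else -mismatch
            s = max(0,
                    score[i - 1][j - 1] + sub,   # align two bases
                    score[i - 1][j] - gap,       # gap in b
                    score[i][j - 1] - gap)       # gap in a
            score[i][j] = s
            if s > best:
                best, best_pos = s, (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    identical = columns = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else -mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
        columns += 1

    if columns == 0:
        return 0, None
    return best, 100.0 * identical / columns


def simulate(length, match, mismatch, gap, trials=20, seed=0):
    """Mean percent identity over random sequence pairs of equal length."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        a = "".join(rng.choice("ACGT") for _ in range(length))
        b = "".join(rng.choice("ACGT") for _ in range(length))
        _, identity = local_identity(a, b, match, mismatch, gap)
        if identity is not None:
            values.append(identity)
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    # Pure Python is slow for long sequences; length 100 runs quickly.
    print(simulate(100, match=1, mismatch=1, gap=2))
```

Because the trial count, the identity definition, and the averaging are assumptions, the output should be expected to show the same trends as the table rather than to reproduce its figures.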