Table 2. Contingency table of orthographic labels (Ortho, columns) against pronounced phonetic symbols (Prono, rows) in the Buckeye Corpus

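| Prono \ Ortho | aa | ae | ah | ao | aw | ay | eh | er | ey | ih | iy | ow | oy | uh | uw | null | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| aa | 10,920 | 47 | 1,197 | 4,821 | 50 | 35 | 108 | 487 | 5 | 67 | 12 | 113 | 12 | 20 | 4 | 48 | 17,946 |
| ae | 204 | 18,037 | 2,660 | 6 | 17 | 34 | 8,804 | 24 | 157 | 3,307 | 25 | 10 | 1 | 94 | 13 | 311 | 33,704 |
| ah | 475 | 90 | 38,258 | 283 | 16 | 76 | 2,234 | 1,091 | 88 | 8,150 | 79 | 113 | 5 | 963 | 136 | 3,726 | 55,783 |
| aw | 699 | 150 | 145 | 16 | 4,826 | 8 | 13 |  | 1 | 3 |  | 37 | 99 |  |  |  | 5,997 |
| ay | 2,011 | 352 | 2,602 | 23 | 25 | 23,019 | 757 | 3 | 112 | 167 | 119 | 3 | 5 | 8 | 1 | 63 | 29,270 |
| eh | 31 | 67 | 2,487 | 11 | 4 | 22 | 19,532 | 834 | 176 | 2,052 | 40 | 19 | 2 | 66 | 17 | 250 | 25,610 |
| er | 9 |  | 1,152 | 11 |  |  | 125 | 10,766 | 11 | 275 | 33 | 9 |  | 50 | 6 | 248 | 12,695 |
| ey | 142 | 32 | 3,947 | 4 |  | 42 | 1,630 | 53 | 13,883 | 1,224 | 143 | 17 | 8 | 63 | 14 | 199 | 21,401 |
| ih | 17 | 24 | 4,488 | 7 | 1 | 81 | 2,144 | 289 | 282 | 36,685 | 3,257 | 7 | 2 | 484 | 63 | 63 | 47,894 |
| iy | 15 | 17 | 4,889 | 4 | 1 | 76 | 436 | 300 | 221 | 4,825 | 29,070 | 20 | 1 | 172 | 87 | 602 | 40,736 |
| ow | 260 | 16 | 3,101 | 3,246 | 47 | 5 | 165 | 2,692 | 11 | 235 | 13 | 17,808 | 34 | 207 | 74 |  | 27,914 |
| oy | 1 |  | 6 | 1 |  |  |  |  |  |  |  | 5 | 480 |  |  |  | 493 |
| uh | 2 |  | 274 | 3 |  | 2 | 12 | 24 | 1 | 409 | 21 | 9 |  | 3,165 | 133 | 72 | 4,127 |
| uw | 16 | 16 | 4,014 | 16 | 3 | 12 | 475 | 840 | 18 | 5,697 | 702 | 44 |  | 625 | 10,838 | 907 | 24,223 |
| Sum | 14,802 | 18,848 | 69,220 | 8,452 | 4,990 | 23,412 | 36,435 | 17,403 | 14,966 | 63,096 | 33,514 | 18,214 | 649 | 5,917 | 11,386 | 6,489 | 347,793 |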
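
One way to read Table 2 is to compare each diagonal cell (pronounced symbol equal to the orthographic label) with its row total. The sketch below is only an illustration: the names `diagonal`, `row_total`, `overall_agreement`, and `per_symbol_agreement` are hypothetical, the term "agreement" is our shorthand rather than the table's, and the numbers are transcribed from the diagonal cells and the Total column of the table.

```python
# Diagonal cells of Table 2: tokens whose pronounced symbol (row)
# equals the orthographic label (column). Transcribed from the table.
diagonal = {
    "aa": 10920, "ae": 18037, "ah": 38258, "aw": 4826, "ay": 23019,
    "eh": 19532, "er": 10766, "ey": 13883, "ih": 36685, "iy": 29070,
    "ow": 17808, "oy": 480, "uh": 3165, "uw": 10838,
}

# Row totals from the "Total" column: all tokens per pronounced symbol.
row_total = {
    "aa": 17946, "ae": 33704, "ah": 55783, "aw": 5997, "ay": 29270,
    "eh": 25610, "er": 12695, "ey": 21401, "ih": 47894, "iy": 40736,
    "ow": 27914, "oy": 493, "uh": 4127, "uw": 24223,
}

# Grand total from the bottom-right cell of the table.
grand_total = 347793

# Share of all tokens that fall on the diagonal.
overall_agreement = sum(diagonal.values()) / grand_total

# For each pronounced symbol, the share of its tokens whose
# orthographic label is the same symbol.
per_symbol_agreement = {
    symbol: diagonal[symbol] / row_total[symbol] for symbol in diagonal
}

if __name__ == "__main__":
    print(f"overall: {overall_agreement:.3f}")
    for symbol, rate in sorted(per_symbol_agreement.items()):
        print(f"{symbol}: {rate:.3f}")
```

Dividing by the column sums instead of the row totals would give the complementary view: for each orthographic label, the share of tokens actually pronounced as that symbol.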