Table 2. Contingency table of orthographic labels (Ortho, columns) against pronounced phonetic symbols (Prono, rows) in the Buckeye Corpus
Prono\Ortho | aa | ae | ah | ao | aw | ay | eh | er | ey | ih | iy | ow | oy | uh | uw | null | Total |
aa | 10,920 | 47 | 1,197 | 4,821 | 50 | 35 | 108 | 487 | 5 | 67 | 12 | 113 | 12 | 20 | 4 | 48 | 17,946 |
ae | 204 | 18,037 | 2,660 | 6 | 17 | 34 | 8,804 | 24 | 157 | 3,307 | 25 | 10 | 1 | 94 | 13 | 311 | 33,704 |
ah | 475 | 90 | 38,258 | 283 | 16 | 76 | 2,234 | 1,091 | 88 | 8,150 | 79 | 113 | 5 | 963 | 136 | 3,726 | 55,783 |
aw | 699 | 150 | 145 | 16 | 4,826 | 8 | 13 | | 1 | 3 | | 37 | 99 | | | | 5,997 |
ay | 2,011 | 352 | 2,602 | 23 | 25 | 23,019 | 757 | 3 | 112 | 167 | 119 | 3 | 5 | 8 | 1 | 63 | 29,270 |
eh | 31 | 67 | 2,487 | 11 | 4 | 22 | 19,532 | 834 | 176 | 2,052 | 40 | 19 | 2 | 66 | 17 | 250 | 25,610 |
er | 9 | | 1,152 | 11 | | | 125 | 10,766 | 11 | 275 | 33 | 9 | | 50 | 6 | 248 | 12,695 |
ey | 142 | 32 | 3,947 | 4 | | 42 | 1,630 | 53 | 13,883 | 1,224 | 143 | 17 | 8 | 63 | 14 | 199 | 21,401 |
ih | 17 | 24 | 4,488 | 7 | 1 | 81 | 2,144 | 289 | 282 | 36,685 | 3,257 | 7 | 2 | 484 | 63 | 63 | 47,894 |
iy | 15 | 17 | 4,889 | 4 | 1 | 76 | 436 | 300 | 221 | 4,825 | 29,070 | 20 | 1 | 172 | 87 | 602 | 40,736 |
ow | 260 | 16 | 3,101 | 3,246 | 47 | 5 | 165 | 2,692 | 11 | 235 | 13 | 17,808 | 34 | 207 | 74 | | 27,914 |
oy | 1 | | 6 | 1 | | | | | | | | 5 | 480 | | | | 493 |
uh | 2 | | 274 | 3 | | 2 | 12 | 24 | 1 | 409 | 21 | 9 | | 3,165 | 133 | 72 | 4,127 |
uw | 16 | 16 | 4,014 | 16 | 3 | 12 | 475 | 840 | 18 | 5,697 | 702 | 44 | | 625 | 10,838 | 907 | 24,223 |
Sum | 14,802 | 18,848 | 69,220 | 8,452 | 4,990 | 23,412 | 36,435 | 17,403 | 14,966 | 63,096 | 33,514 | 18,214 | 649 | 5,917 | 11,386 | 6,489 | 347,793 |
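
The margins of a table like this can be recomputed from its cells as a consistency check. The following is a minimal sketch, not part of the original analysis. It assumes rows are pipe-delimited as printed above, that commas are thousands separators, and that a blank cell stands for a count of zero; `parse_row` is a hypothetical helper introduced only for illustration. The two rows used are copied from the table (`aw` and `oy`), chosen because they contain blank cells.

```python
def parse_row(line):
    """Split one pipe-delimited row into (label, cell counts, stated total)."""
    # Drop the trailing pipe, then split into fields and trim whitespace.
    fields = [f.strip() for f in line.strip().rstrip("|").split("|")]
    label, *cells, total = fields
    # Assumption: an empty cell means a count of zero.
    counts = [int(c.replace(",", "")) if c else 0 for c in cells]
    return label, counts, int(total.replace(",", ""))


# Two rows copied verbatim from the table above.
rows = [
    "aw | 699 | 150 | 145 | 16 | 4,826 | 8 | 13 | | 1 | 3 | | 37 | 99 | | | | 5,997 |",
    "oy | 1 | | 6 | 1 | | | | | | | | 5 | 480 | | | | 493 |",
]

for line in rows:
    label, counts, total = parse_row(line)
    # Report whether the recomputed row sum matches the printed Total.
    print(label, sum(counts), total, sum(counts) == total)
```

The same check can be applied column-wise against the Sum row by accumulating `counts` across all parsed rows.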