Pronunciator
The Moby Pronunciator II contains 177,267 words with corresponding pronunciations. The Project Gutenberg distribution also contains a copy of cmudict v0.3. Each entry in the file follows the format word[/part-of-speech] pronunciation. The optional part-of-speech field disambiguates 770 words whose pronunciation depends on their part of speech. For example, for the words spelled close, the verb is pronounced /ˈkloʊz/, whereas the adjective is pronounced /ˈkloʊs/. The parts of speech are assigned the following codes:
Part-of-speech | Code |
---|---|
Noun | n |
Verb | v |
Adjective | aj |
Adverb | av |
Interjection | interj |
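As a rough illustration of the entry layout described above, the following Python sketch splits one line into its word, optional part-of-speech code, and pronunciation. It assumes that the code is attached to the word with a slash and that a single space precedes the pronunciation; the function name `parse_entry` and the sample lines are hypothetical, not taken from the distributed file.

```python
# Part-of-speech codes from the table above.
POS_CODES = {
    "n": "noun",
    "v": "verb",
    "aj": "adjective",
    "av": "adverb",
    "interj": "interjection",
}

def parse_entry(line):
    """Split one entry into (word, pos_code_or_None, pronunciation).

    Assumed layout: word[/part-of-speech] pronunciation, with a single
    space between the head and the pronunciation field.
    """
    head, _, pron = line.strip().partition(" ")
    word, sep, pos = head.rpartition("/")
    if sep and pos in POS_CODES:
        return word, pos, pron
    # No recognised code: treat the whole head as the word.
    return head, None, pron

# Hypothetical sample lines, for illustration only.
print(parse_entry("close/v 'k/l/oU/z"))
print(parse_entry("close/aj 'k/l/oU/s"))
```

Checking the suffix against the known codes, rather than splitting on any slash, keeps a headword that happens to contain a slash from being misread as carrying a part-of-speech field.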
The pronunciation field follows the word and its optional part-of-speech code. It uses several special symbols:
Symbol | Meaning |
---|---|
/ | Used to separate phonemes |
_ | Used to separate words |
' | Primary stress on the following syllable |
, | Secondary stress on the following syllable |
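The separators and stress marks listed above can be handled with a small tokenizer. The Python sketch below is one possible approach, not the project's own tooling: it splits a pronunciation into words on `_` and into phoneme symbols on `/`, and records a stress mark on the symbol that follows it. Attaching the stress to a single symbol rather than to a whole syllable is a simplification, and the name `tokenize` and the sample string are hypothetical.

```python
STRESS = {"'": "primary", ",": "secondary"}

def tokenize(pron):
    """Return a list of words, each a list of (stress, symbol) pairs."""
    words = []
    for chunk in pron.split("_"):          # "_" separates words
        phonemes = []
        pending = None                     # stress waiting for the next symbol
        for token in chunk.split("/"):     # "/" separates phonemes
            while token and token[0] in STRESS:
                pending = STRESS[token[0]]
                token = token[1:]
            if token:
                phonemes.append((pending, token))
                pending = None
        words.append(phonemes)
    return words

# Hypothetical two-word pronunciation, for illustration only.
for word in tokenize("h/@/'l/oU_,w/@r/l/d"):
    print(word)
```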
The remaining symbols, which are case-sensitive, represent IPA characters according to the following table:
Symbol | IPA |
---|---|
& | æ |
- | ə |
@ | ʌ, ə |
@r | ɜr, ər |
A | ɑː |
aI | aɪ |
Ar | ɑr |
AU | aʊ |
b | b |
d | d |
D | ð |
dZ | dʒ |
E | ɛ |
eI | eɪ |
f | f |
g | ɡ |
h | h |
hw | hw |
i | iː |
I | ɪ |
j | j |
k | k |
l | l |
m | m |
n | n |
N | ŋ |
O | ɔː |
Oi | ɔɪ |
oU | oʊ |
p | p |
r | r |
s | s |
S | ʃ |
t | t |
T | θ |
tS | tʃ |
u | uː |
U | ʊ |
v | v |
w | w |
z | z |
Z | ʒ |
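The tables above are enough to sketch a converter from the ASCII notation to IPA. The Python example below is a minimal illustration under stated assumptions: where the table lists two IPA values (`@` and `@r`), it arbitrarily uses the first; phoneme separators are dropped; word separators become spaces; and stress marks become the IPA marks ˈ and ˌ. The name `to_ipa` is hypothetical, and the sample input is constructed to match the "close" example rather than copied from the file.

```python
# Symbol table from the section above. For "@" and "@r" the first listed
# IPA value is used, which is an arbitrary choice made for this sketch.
IPA = {
    "&": "æ", "-": "ə", "@": "ʌ", "@r": "ɜr", "A": "ɑː", "aI": "aɪ",
    "Ar": "ɑr", "AU": "aʊ", "b": "b", "d": "d", "D": "ð", "dZ": "dʒ",
    "E": "ɛ", "eI": "eɪ", "f": "f", "g": "ɡ", "h": "h", "hw": "hw",
    "i": "iː", "I": "ɪ", "j": "j", "k": "k", "l": "l", "m": "m",
    "n": "n", "N": "ŋ", "O": "ɔː", "Oi": "ɔɪ", "oU": "oʊ", "p": "p",
    "r": "r", "s": "s", "S": "ʃ", "t": "t", "T": "θ", "tS": "tʃ",
    "u": "uː", "U": "ʊ", "v": "v", "w": "w", "z": "z", "Z": "ʒ",
    # Special symbols: stress marks, word separator, phoneme separator.
    "'": "ˈ", ",": "ˌ", "_": " ", "/": "",
}

def to_ipa(pron):
    """Convert a pronunciation string to IPA, longest symbol first."""
    out = []
    i = 0
    while i < len(pron):
        for size in (2, 1):                # try two-character symbols first
            piece = pron[i:i + size]
            if piece in IPA:
                out.append(IPA[piece])
                i += len(piece)
                break
        else:
            out.append(pron[i])            # unknown character: keep as-is
            i += 1
    return "".join(out)

print(to_ipa("'k/l/oU/z"))  # verb "close"
print(to_ipa("'k/l/oU/s"))  # adjective "close"
```

Trying two-character symbols before single characters keeps digraphs such as `oU`, `tS`, and `@r` from being split, which is safe as long as adjacent phonemes are separated by `/`.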