I've put together a brief written tutorial on using Aiko's style of CVVC English voicebank. All the info in this tutorial is based on a combination of what I was taught by Cz and my own experience with this type of voicebank.
Since it was a one-off reclist that has already been rewritten a few times since I recorded it, this tutorial will ONLY cover USAGE of the voicebank, not recording or otoing. I'm leaving that to Cz (and of course she's free to edit and add onto this tutorial whenever!)
While this is meant in particular for Aiko's list, it can also be used for Cz's and Cdra's past reclists. The principle is about the same for each of them even if there are minor changes in otoing or phonetics.
Phonetic Reference Charts
Please note that pronunciation will of course differ between dialects and accents.
a = awful
e = bet
i = pick
o = loot
u = cut
A = aim
E = beet
I = right
O = bowl
@ = cat
3 = heard
1 = king or think, please combine with 1ng or enk endings (can also be used as a less-harsh E)
& = and / amber
6 = look, or your when combined with 6r ending
8 = town, out
9 = ball, gawk
Q = void, or boar when combined with Or ending
Most consonants are typical English consonants and resemble the words they are used in, but these 3 are notably unusual:
dh = there
th = wrath
zh = azure
Basic CVVC English Usage
Stringing CV and VC
Generally, CVVC works as follows to create smooth transitions between consonants and vowels:
[starting CV or -V] + [blending VC] + [blending CV] + [ending VC or V-]
For example in English, to create the word "bigger" you would write as follows:
[bi] [ig] [g3] [3-]
Generally, the VC used in this way can be a 32nd note long.
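If it helps to see the pattern above spelled out, here is a minimal Python sketch of it. This is just my illustration, not a tool that ships with the bank: the function name `cvvc_aliases` and the (onset, vowel, coda) way of writing syllables are made up for this example, and it only handles the simple cases (no eVCV, no cluster endings).

```python
def cvvc_aliases(syllables):
    """Turn (onset, vowel, coda) syllables into a flat list of CVVC aliases.

    Hypothetical helper: the tuple layout is an assumption for this sketch.
    """
    aliases = []
    for index, (onset, vowel, coda) in enumerate(syllables):
        # Starting/blending CV, or -V when the syllable has no onset consonant.
        aliases.append(onset + vowel if onset else "-" + vowel)
        has_next = index + 1 < len(syllables)
        next_onset = syllables[index + 1][0] if has_next else ""
        if coda:
            # The syllable closes on its own consonant: ending VC.
            aliases.append(vowel + coda)
        elif next_onset:
            # Blend into the next syllable's consonant: blending VC.
            aliases.append(vowel + next_onset)
        else:
            # Nothing follows the vowel: plain vowel ending V-.
            aliases.append(vowel + "-")
    return aliases


# "bigger" split as bi + g3
print(cvvc_aliases([("b", "i", ""), ("g", "3", "")]))  # ['bi', 'ig', 'g3', '3-']
```

You would still place and size the notes by hand; the sketch only shows which aliases come out of the pattern.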
English, however, has many consonant combinations, including consonants that meet across word boundaries.
If you were to write "fun time", for instance:
[fu] [un] [tI] [Im]
You can see that the "n" in "fun" and the "t" in "time" are back to back. If you laid this out exactly like "bigger", the "n" might eat into the "t" (unless you use consonant velocity).
In this case, spacing out the notes a bit can be useful.
See "Articulation and Pronunciation" for more notes on how to get these types of consonant clusters to work.
Basic vowel endings are indicated with -
For example: e-, o-, A-
These endings are VERY IMPORTANT for the uppercase and "number" vowel sounds. Those are diphthongs, where the pronunciation shifts over the course of the entire sound, so the ending is essential for pronouncing the whole vowel.
Please be sure to use "-" endings for correct pronunciation at the end of words that do not need VC.
Alternatively, there are ending breaths indicated with TWO hyphens instead of one.
Ending Consonant Clusters
This bank contains some consonant cluster endings that simply use neutral "e" (such as est, eks, ent, end, enjd and so on).
These endings are meant to be used on any vowel. HOWEVER, for the vowels A, I, O, 8 and Q, you should use these vowels' endings before the ending consonant cluster.
For example to write "ranged" you would need:
[rA] [A-] [enjd]
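The rule for cluster endings can be sketched in a few lines of Python. Again, this is only an illustration under my own assumptions: `with_cluster_ending` and `NEEDS_OWN_ENDING` are names I made up, and the sketch assumes every other vowel can go straight into the cluster ending.

```python
# Vowels that need their own "-" ending before a neutral "e" cluster ending.
NEEDS_OWN_ENDING = {"A", "I", "O", "8", "Q"}


def with_cluster_ending(onset, vowel, cluster):
    """Return the aliases for one syllable that ends in a cluster such as "est"."""
    aliases = [onset + vowel]
    if vowel in NEEDS_OWN_ENDING:
        # Finish the diphthong first so its full pronunciation is heard.
        aliases.append(vowel + "-")
    aliases.append(cluster)
    return aliases


print(with_cluster_ending("r", "A", "enjd"))  # ['rA', 'A-', 'enjd']
print(with_cluster_ending("b", "e", "est"))   # ['be', 'est']
```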
"Fake VCV" or "eVCV" aliases are blending sounds for unusual consonants. You use them by putting an apostrophe ' in front of the CV. They use the neutral vowel "e", which is crossfaded out in the oto.
'd = tongue-tap, as in American pronunciation of "butter" or "ladder"
For example: butter = [bu] ['d3] [3-]
W, H, R and Y are tricky gliding consonants that also have eVCV to assist in clean transitions: 'w 'h 'r 'y
• Use CV wa at the start of a line or after a VC.
• Use eVCV 'wa in the middle of a line to blend with a preceding vowel.
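The two bullets above boil down to one small decision, sketched here in Python. The function name `glide_alias` and the "V" / "VC" labels for the previous note are my own assumptions for the example, not anything built into UTAU or the bank.

```python
GLIDES = {"w", "h", "r", "y"}


def glide_alias(cv, previous):
    """Pick the plain CV or the eVCV form for a gliding consonant.

    previous is None at the start of a line, "VC" after a VC note,
    or "V" when the note before it ends on an open vowel.
    """
    if cv[0] in GLIDES and previous == "V":
        # Mid-line after a vowel: use the apostrophe (eVCV) alias to blend.
        return "'" + cv
    return cv


print(glide_alias("wa", None))  # wa
print(glide_alias("wa", "VC"))  # wa
print(glide_alias("wa", "V"))   # 'wa
```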
Articulation and Pronunciation
The English language is extremely variable. Do not feel restricted to the phonetic guides in this tutorial if you find something that works better for the accent or pronunciation you want.
Changing small details can make a world of difference for the articulation of sounds.
"You better believe it's true."
[yo] [ob] [be] ['d3] [3b] [bE] ['lE] [Ev] [-i] [ets] [tro] [o-]
[yo] [ob] [be] ['d3] [3b] [bu] ['lE] [Ev] [vi] [ets] [tro] [o-]
The first pronounces "believe" like bee-leave, while the second uses neutral "u" for buh-leave.
The first separates "believe" and "it's" using a glottal stop on the i, while the second has the v in "believe" leading into "it's" as though it were pronounced "leavits".
Of course, both of these require strategically placed consonant velocity and spaces in order to be fully pronounced correctly, depending on the tempo of the song.
You can experiment with all sorts of combinations to get the stresses and pronunciation you want, depending on the speed and length of notes, if the song is more staccato or legato, etc.
Consonant Velocity and Note Length
Many consonant clusters in English need a good bit of space to be fully pronounced, and this becomes more of a challenge the faster the notes get!
Don't be afraid to use 64th notes (change your quantize settings), consonant velocity, and STP, or to drop certain endings or sounds if a song has particularly fast or short notes.
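To get a feel for how little room a 32nd or 64th note leaves, here is a rough Python sketch of the arithmetic. It assumes a resolution of 480 ticks per quarter note, which is the usual one for UTAU projects as far as I know, so treat that number as an assumption. The helper names are made up for this example.

```python
TICKS_PER_QUARTER = 480  # assumption: the usual UTAU project resolution


def note_ticks(division):
    """Length in ticks of a 1/division note (32 means a 32nd note)."""
    return TICKS_PER_QUARTER * 4 // division


def split_for_vc(total_ticks, vc_division=32):
    """Split one syllable's time into (CV length, VC length) in ticks."""
    vc = note_ticks(vc_division)
    if vc >= total_ticks:
        raise ValueError("note is too short for a VC of that length")
    return total_ticks - vc, vc


print(note_ticks(32), note_ticks(64))  # 60 30
print(split_for_vc(480))               # (420, 60)
print(split_for_vc(120, 64))           # (90, 30)
```

So on a 16th note (120 ticks), switching the VC from a 32nd to a 64th gives the CV 90 ticks instead of 60. Whether that sounds better still depends on the consonants involved, so use your ears.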
"S" clusters can be particularly difficult in some cases. For example, "explain" could be pronounced:
[-e] [eks] [splA] [An]
for slower songs. However, it might need to be changed to either
[-e] [eks] [plA] [An]
or
[-e] [ek] [splA] [An]
for faster songs, using higher consonant velocity on the VC and CV in the middle of the word.
It goes without saying really, but CVVC ENGLISH IS NOT AN EXACT SCIENCE!
Playing around with pronunciation, speed and the way the consonants and vowels link together is very important!
Different people will undoubtedly use English CVVC differently. It's designed to make using an English bank easier, but there is no truly structured, perfect "plug n' play" method out there.
So practice, experiment, and have fun!