Yamazaki-san says that he does not want a merely realistic human voice. Instead, he aimed to make a unique instrument that sounds like a machine-generated human voice. (According to another interview article, "machine generated" does not necessarily exclude the possibility of being even clearer than an actual human voice. Well, AquesTone2 hasn't reached that point yet, but that's his ultimate goal.)
For version 2 he wrote the voice engine from scratch. Unlike UTAU, AquesTone2 has no VC (vowel-consonant transition) sound data at all. It simply crossfades the consonant into the vowel to cover up the transition phase, which worked surprisingly well for vocaloid-style synthesis. And the polyphonic voicing was actually a by-product of this simple approach.
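To make the idea concrete: here is a minimal sketch of what a consonant-to-vowel crossfade could look like, not AquesTone2's actual engine. The function name `crossfade`, the `overlap` parameter, and the equal-power fade curves are all my own assumptions for illustration.

```python
import numpy as np

def crossfade(consonant: np.ndarray, vowel: np.ndarray,
              overlap: int) -> np.ndarray:
    """Join a consonant onset to a sustained vowel by blending the
    last `overlap` samples of the consonant into the first `overlap`
    samples of the vowel, hiding the transition without VC data."""
    # Equal-power fade curves keep the perceived loudness steady
    # through the overlap region (a plain linear fade can dip).
    t = np.linspace(0.0, 1.0, overlap)
    fade_out = np.cos(t * np.pi / 2)
    fade_in = np.sin(t * np.pi / 2)

    mixed = consonant[-overlap:] * fade_out + vowel[:overlap] * fade_in
    return np.concatenate([consonant[:-overlap], mixed, vowel[overlap:]])

# Toy example: a noise burst standing in for an "s"-like consonant,
# a sine wave standing in for a vowel, with a 20 ms overlap.
sr = 44100
consonant = 0.3 * np.random.randn(int(0.05 * sr))
vowel = 0.5 * np.sin(2 * np.pi * 220 * np.arange(int(0.4 * sr)) / sr)
note = crossfade(consonant, vowel, overlap=int(0.02 * sr))
```

You can also see why polyphony falls out of this almost for free: since each note is just two samples blended together, rendering a chord is just running the same blend once per note and summing, with no per-transition data to manage.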
He has already implemented a function to import new voice data files in version 2. Unfortunately, there is no concrete plan yet for when, or what type of, voice will be added. But he did suggest it may be some "neutral" voice.
Well, I don't know what he meant by a "neutral" voice, but my guess is it's something like the default voice we have now (Lina), just with more optimized, higher-quality data. He doesn't seem happy with the clarity of the pronunciation, and part of that problem might come from the voice data; in an interview with AV Watch about the old version, he said something along those lines. I may be wrong, though. Or it could be a gender-neutral voice?