After compilation, Alize provides every binary needed for training and extraction.
HTK can also be downloaded if the precompiled Alize binaries are not sufficient or simply won't work on your architecture.
The config files of this tutorial are needed as well, since I followed it closely but automated the whole process. I will point out the 'pain in the ass' parts of the configuration.
After downloading and compiling, create a new directory in which you would like to estimate a model (e.g. model) and change into it.
Then execute the following commands in the console:
mkdir lib
mkdir lib/scp
mkdir lst
mkdir data
mkdir data/prm
mkdir data/lbl
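If you prefer to set up the layout from a script (the rest of the process is driven by Python anyway), the mkdir commands above could be sketched roughly like this. The function name create_layout is my own invention for illustration and is not part of Alize or of the provided scripts.

```python
import os
import tempfile

# Directory names taken from the mkdir commands above.
# "lib" and "data" are created implicitly as parents.
SUBDIRS = ["lib/scp", "lst", "data/prm", "data/lbl"]


def create_layout(root):
    """Create the working-directory layout under root and return the created paths.

    Hypothetical helper, not part of Alize.
    """
    created = []
    for sub in SUBDIRS:
        path = os.path.join(root, *sub.split("/"))
        os.makedirs(path, exist_ok=True)  # safe to call again on an existing layout
        created.append(path)
    return created


if __name__ == "__main__":
    # Demonstration in a throwaway directory; pass your own model directory instead.
    with tempfile.TemporaryDirectory() as tmp:
        for created_path in create_layout(tmp):
            print(created_path)
```

Because exist_ok=True is used, running it a second time on the same directory does no harm.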
After that, we need to link the Alize binaries into this directory and copy the config files, so just create a soft link and execute:
ln -s <your path to Alize>/bin .
cp -r {YourIVectorDir} cfg
To begin the process, it is necessary to generate (if not already done) an .scp file, which maps the given raw files (either .wav or .sph) to the corresponding feature file splits (plp, mfcc, or whatever you like).
The format of this file is, e.g.:
/dnn_data/8h-plp39-z/jaat-B_10065_10348.plp
where the last part indicates the length of the speech utterance within that speaker's full speech.
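To make the naming scheme a bit more concrete, here is a small sketch of how such an entry could be taken apart in Python. It assumes that every entry follows the pattern of the single example above, that is, a name followed by two underscore-separated numbers and an extension. That pattern is only my reading of the example, and parse_scp_entry is a hypothetical helper, not something from the provided scripts.

```python
import posixpath
import re

# Assumed pattern (derived from the one example above): <name>_<number>_<number>
_STEM_PATTERN = re.compile(r"^(?P<name>.+)_(?P<first>\d+)_(?P<second>\d+)$")


def parse_scp_entry(line):
    """Split one .scp entry into its path, name, trailing numbers, and extension.

    Hypothetical helper; raises ValueError if the entry does not fit the assumed pattern.
    """
    path = line.strip()
    stem, ext = posixpath.splitext(posixpath.basename(path))
    match = _STEM_PATTERN.match(stem)
    if match is None:
        raise ValueError("entry does not match the assumed pattern: " + path)
    return {
        "path": path,
        "name": match.group("name"),
        "numbers": (int(match.group("first")), int(match.group("second"))),
        "ext": ext.lstrip("."),
    }


if __name__ == "__main__":
    entry = parse_scp_entry("/dnn_data/8h-plp39-z/jaat-B_10065_10348.plp")
    print(entry["name"], entry["numbers"], entry["ext"])
```

Entries that do not fit the pattern raise an error instead of being silently skipped, which makes a malformed .scp file easier to spot.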
The following scripts are provided by me. First, I wrote a script which generates a data.lst file (it simply cuts the given corpus filenames), concatenates the speech utterances, and writes the concatenated files into a directory.
Download these scripts and run the following commands:
python generateUBM.py -i {path to .scp} -o data/prm/ -glst lst/data.lst
and afterwards:
python TrainWorld_TrainTV_Train_IV.py
It is essential that all the config files in the cfg directory, which were downloaded from the tutorial, are copied completely to the working directory. The most likely error is an out-of-memory exception. This is probably caused by the configuration: if the feature file type is set to HTK but the "useBigEndian" parameter is still set to false, Alize misreads the file, cannot find the EOF, and therefore allocates all available memory. Just check that the type is set to HTK and that "useBigEndian" is set to true.
After the scripts have run, you should have a directory called "iv", into which the i-vectors are extracted.
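If you are not sure which byte order your feature files actually use, you can check it before training. The sketch below relies on the standard 12-byte HTK header (number of samples, sample period, bytes per sample, parameter kind) and tests which byte order makes the header agree with the real file size. It assumes uncompressed HTK files, and guess_htk_byte_order is a name I made up for illustration; it is not part of Alize or HTK.

```python
import os
import struct
import tempfile

# HTK header: nSamples (int32), sampPeriod (int32), sampSize (int16), parmKind (int16)
HTK_HEADER_SIZE = 12


def guess_htk_byte_order(path):
    """Return "big", "little", or None if neither byte order fits the file size.

    Hypothetical helper; assumes an uncompressed HTK parameter file.
    """
    file_size = os.path.getsize(path)
    with open(path, "rb") as handle:
        header = handle.read(HTK_HEADER_SIZE)
    if len(header) < HTK_HEADER_SIZE:
        return None
    for name, prefix in (("big", ">"), ("little", "<")):
        n_samples, _period, samp_size, _kind = struct.unpack(prefix + "iihh", header)
        if n_samples <= 0 or samp_size <= 0:
            continue
        # With the right byte order, header plus payload matches the file size exactly.
        if HTK_HEADER_SIZE + n_samples * samp_size == file_size:
            return name
    return None


def write_fake_htk(path, prefix, n_samples=3, n_coeffs=39):
    """Write a synthetic feature file of zeros, only for the demonstration below."""
    samp_size = 4 * n_coeffs  # 4-byte floats per coefficient
    with open(path, "wb") as handle:
        handle.write(struct.pack(prefix + "iihh", n_samples, 100000, samp_size, 6))  # 6 = MFCC
        handle.write(b"\x00" * (n_samples * samp_size))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo.plp")
        write_fake_htk(demo, ">")
        print(guess_htk_byte_order(demo))
```

If this reports "big" for your files while the config still says false for the big-endian parameter, that mismatch is a likely candidate for the memory problem: read with the wrong byte order, the sample count in the header turns into a meaningless, possibly huge number.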