Wednesday, April 16, 2014

Tutorial: Training a UBM and Extracting I-Vectors with Alize 3.0

To train a model and get familiar with how i-vectors behave, I trained my model with the existing Alize 3.0 framework.

Once compiled, Alize provides every binary needed for training and extraction.
HTK can also be downloaded if the precompiled Alize binaries are not sufficient or simply won't run on your architecture.

The config files of this tutorial are also needed, since I followed that tutorial closely but automated the whole process. I will point out the 'pain in the ass' parts of the configuration.

After downloading and compiling, create a new directory in which you want to estimate the model (e.g. model).

Execute the following commands in the console:

mkdir lib

mkdir lib/scp

mkdir lst

mkdir data

mkdir data/prm

mkdir data/lbl
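
If you prefer to script this step, the same layout can be created from Python. This is only a minimal sketch: the function name create_layout is made up for illustration, and the directory names are exactly the ones from the mkdir commands above.

```python
import os

# Sub-directories from the mkdir commands above; the parents lib/ and data/
# are created implicitly by os.makedirs.
SUBDIRS = ["lib/scp", "lst", "data/prm", "data/lbl"]


def create_layout(root):
    """Create the working-directory layout under root and return the created paths."""
    created = []
    for sub in SUBDIRS:
        path = os.path.join(root, *sub.split("/"))
        os.makedirs(path, exist_ok=True)  # no error if the directory already exists
        created.append(path)
    return created


if __name__ == "__main__":
    for path in create_layout("."):
        print(path)
```

Running it a second time is harmless, since existing directories are left untouched.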


After that we need to link the Alize binaries into this directory, so just create a symlink by executing:
ln -s <your path to Alize>/bin .

Then copy the config directory of the tutorial to cfg:

cp -r {YourIVectorDir} cfg

To begin the process it is necessary to generate (if not already done) a .scp file, which maps the given raw files (either .wav or .sph) to the corresponding feature file splits (plp, mfcc, or whatever you like).

An entry in this file looks like this, for example:

/dnn_data/8h-plp39-z/jaat-B_10065_10348.plp
where the last part of the name (here 10065_10348) signals the extent of the speech utterance within that speaker's full speech.
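
To make the naming scheme concrete, here is a small sketch that splits such an entry into its parts. It assumes the name follows the pattern speaker_start_end.ext and that the two numbers are the start and end of the utterance. That reading is an assumption based on the single example above, and parse_scp_entry is a hypothetical helper, not part of Alize or HTK.

```python
import os
import re

# Assumed naming pattern: <speaker>_<start>_<end> (extension already removed).
NAME_PATTERN = re.compile(r"^(?P<speaker>.+)_(?P<start>\d+)_(?P<end>\d+)$")


def parse_scp_entry(path):
    """Split a feature file path into speaker, start, end, length and extension."""
    stem, ext = os.path.splitext(os.path.basename(path))
    match = NAME_PATTERN.match(stem)
    if match is None:
        raise ValueError("unexpected file name: %r" % path)
    start = int(match.group("start"))
    end = int(match.group("end"))
    return {
        "speaker": match.group("speaker"),
        "start": start,
        "end": end,
        "length": end - start,  # assumes both numbers use the same unit
        "ext": ext,
    }


entry = parse_scp_entry("/dnn_data/8h-plp39-z/jaat-B_10065_10348.plp")
print(entry["speaker"], entry["start"], entry["end"], entry["length"])
# prints: jaat-B 10065 10348 283
```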

I provide the following scripts. First, I wrote a script that generates a data.lst file (it simply cuts the given corpus filenames), concatenates the speech utterances and writes the concatenated files to a directory.
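
The list-generation part of that idea can be sketched roughly as follows. This is not the actual generateUBM.py, only an illustration under assumptions: the feature path is the last whitespace-separated field of each .scp line, the speaker is everything before the last two underscore-separated fields of the file name, and data.lst holds one name per line without extension. Concatenating the feature files themselves is left out, because that depends on the feature format.

```python
import os
from collections import OrderedDict


def speaker_of(path):
    """Return the speaker part of a name like jaat-B_10065_10348.plp (assumed pattern)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem.rsplit("_", 2)[0]


def group_by_speaker(scp_lines):
    """Group feature file paths by speaker, keeping the order of first appearance."""
    groups = OrderedDict()
    for line in scp_lines:
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        feature_path = fields[-1]  # assumption: the feature file is the last field
        groups.setdefault(speaker_of(feature_path), []).append(feature_path)
    return groups


def write_lst(groups, lst_path):
    """Write one speaker name per line, without path or extension."""
    with open(lst_path, "w") as handle:
        for speaker in groups:
            handle.write(speaker + "\n")


if __name__ == "__main__":
    groups = group_by_speaker(["/dnn_data/8h-plp39-z/jaat-B_10065_10348.plp"])
    print(list(groups))
```

Each group then holds the splits that would have to be concatenated into one feature file per speaker.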

Download these scripts and run the following commands:



python generateUBM.py -i {path to .scp} -o data/prm/ -glst lst/data.lst
and afterwards:
python TrainWorld_TrainTV_Train_IV.py
It is essential that all the config files in the cfg directory, which were downloaded from the tutorial, are fully copied to the working directory.

The main error you will probably run into is an out-of-memory exception. This is most likely due to the configuration: if the feature file type is set to HTK but the "useBigEndian" parameter is still set to false, Alize can't detect the EOF and will therefore allocate all available memory. Just check that the type is set to HTK and that "useBigEndian" is set to true.

After the scripts have run, you should have a directory called "iv", which the i-vectors are extracted to.
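
The HTK/big-endian check described above can be automated before starting a long training run. The sketch below rests on several assumptions: the config files are plain text with one "key value" pair per line, lines starting with * or # are comments, and the two parameters are called loadFeatureFileFormat and loadFeatureFileBigEndian. Those key names are a guess at what the type and "useBigEndian" parameters are called in your files, so pass your own names if they differ. The helper functions are hypothetical and not part of Alize.

```python
import glob


def read_config(path):
    """Read 'key value' lines into a dict, skipping comments and blank lines."""
    params = {}
    with open(path) as handle:
        for line in handle:
            parts = line.split()
            if len(parts) < 2 or parts[0].startswith(("*", "#")):
                continue
            params[parts[0]] = parts[1]
    return params


def endian_problem(params,
                   format_key="loadFeatureFileFormat",       # assumed key name
                   endian_key="loadFeatureFileBigEndian"):   # assumed key name
    """Return True if the format is HTK but big-endian reading is not enabled."""
    is_htk = params.get(format_key, "").upper() == "HTK"
    big_endian = params.get(endian_key, "false").lower() == "true"
    return is_htk and not big_endian


if __name__ == "__main__":
    for cfg_path in sorted(glob.glob("cfg/*.cfg")):
        if endian_problem(read_config(cfg_path)):
            print("check the big-endian setting in", cfg_path)
```

A file that sets the format to HTK but leaves the endian parameter at false (or omits it) is reported. Files using another format are ignored.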
