Unsupervised Fine-tuning of Speaker Diarisation Pipelines using Silhouette Coefficients
Abstract
We investigate the use of silhouette coefficients in cluster analysis for speaker diarisation, with the dual purpose of unsupervised fine-tuning during domain adaptation and determining the number of speakers in an audio file. Our main contribution is to demonstrate the use of silhouette coefficients to perform per-file domain adaptation, which we show to deliver an improvement over per-corpus domain adaptation. Secondly, we show that this method of silhouette-based cluster analysis can be used to accurately determine more than one hyperparameter at the same time. Finally, we propose a novel method for calculating the silhouette coefficient of clusters using a PLDA score matrix as input
Collections
- Faculty of Engineering [1129]