The Speaker-Conditional Chain Model

In this work, we propose a general strategy, the Speaker-Conditional Chain Model (SCCM), for processing complex speech recordings.

Our model first infers the identities of a variable number of speakers from the observation using a sequence-to-sequence model. It then uses the information from the inferred speakers as conditions to extract their speech sources. Because the speaker information is predicted from the whole observation, our model helps address the problems that conventional speech separation and speaker extraction face on multi-round long recordings.
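The two-stage flow described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `infer_speakers`, `extract_sources`, `sccm_pipeline`, `toy_step` and `toy_extract` are hypothetical, and the toy functions are simple stand-ins for the neural sequence-to-sequence and extraction modules. Only the control flow is shown: speakers are emitted one at a time until a stop signal, then each source is extracted conditioned on one inferred speaker.

```python
from typing import Callable, List, Optional, Tuple

# Toy observation: a list of (speaker_label, sample) frames.
# A real system would operate on audio features, not labeled frames.
Observation = List[Tuple[str, float]]


def infer_speakers(observation: Observation,
                   step: Callable[[Observation, List[str]], Optional[str]],
                   max_speakers: int = 10) -> List[str]:
    """Stage 1: emit one speaker at a time, conditioned on the whole
    observation and on the speakers found so far, until `step` returns None."""
    speakers: List[str] = []
    while len(speakers) < max_speakers:
        nxt = step(observation, speakers)
        if nxt is None:  # stop signal: no further speaker detected
            break
        speakers.append(nxt)
    return speakers


def extract_sources(observation: Observation,
                    speakers: List[str],
                    extract: Callable[[Observation, str], List[float]]
                    ) -> List[List[float]]:
    """Stage 2: extract one source per inferred speaker, using that
    speaker's information as the condition."""
    return [extract(observation, spk) for spk in speakers]


def sccm_pipeline(observation, step, extract, max_speakers=10):
    """Chain the two stages: speaker inference, then conditional extraction."""
    speakers = infer_speakers(observation, step, max_speakers)
    return speakers, extract_sources(observation, speakers, extract)


# --- Toy stand-ins (assumptions, for illustration only) ---

def toy_step(observation: Observation, found: List[str]) -> Optional[str]:
    """Return the first speaker label not yet found, or None to stop."""
    for label, _ in observation:
        if label not in found:
            return label
    return None


def toy_extract(observation: Observation, speaker: str) -> List[float]:
    """Keep the samples of the target speaker and zero out the rest."""
    return [s if label == speaker else 0.0 for label, s in observation]


if __name__ == "__main__":
    obs = [("a", 1.0), ("b", 2.0), ("a", 3.0)]
    print(sccm_pipeline(obs, toy_step, toy_extract))
```

The number of speakers is not fixed in advance in this sketch; the loop in `infer_speakers` simply runs until the step function signals a stop, which mirrors the variable number of speakers mentioned above.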

Basic Introduction Demo and Samples

Please see the introduction video below, or follow this link.

Supplementary Material for the paper

Please see the supplementary material at this link.

Thank you.

Speaker-Conditional Chain Model for Speech Separation and Extraction

An introduction to "Speaker-Conditional Chain Model for Speech Separation and Extraction", submitted to INTERSPEECH 2020.