Kaspars Grinvalds @ stock.adobe.com
Features such as the length of the vocal tract, nasal passage, pitch, accent, and even behavioral characteristics of an individual, are used to create an algorithmic model known as a voiceprint. Voiceprints, similar to fingerprints, utilize specific vocal features of a particular person to create a unique algorithmic model of the voice. Voiceprints are commonly used for biometric identification and authentification processes.
Although various machine-learning techniques are used to process and store voiceprints, current models are based on deep learning, which has been shown to outperform other models on a broad range of tasks involving noisy sensory data. Models based on deep learning can reconstruct a speaker's face by merely analyzing a voice recording and using a technique called deep clustering, which disentangles multiple conversations happening simultaneously in a crowded room.