Features such as vocal-tract length, nasal passage shape, pitch, accent, and even behavioral characteristics of an individual are used to create an algorithmic model known as a voiceprint. Like a fingerprint, a voiceprint uniquely identifies a person, but it is built from the specific characteristics of that person's voice rather than from the marks left by the friction ridges of a finger. Voiceprints are commonly used in biometric identification and authentication.
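The idea of turning vocal characteristics into a comparable numeric model can be sketched in a few lines. The following is a minimal, hypothetical illustration, not a real voiceprint system: it replaces the spectral features production systems use (such as MFCCs or learned embeddings) with crude normalized band energies computed with NumPy, and uses synthetic tones in place of recorded speech. All names and parameter choices here are assumptions for the sketch.

```python
import numpy as np

def band_energies(signal, n_bands=64):
    """Energy in n_bands equal-width frequency bands, normalized to sum to 1.
    A crude stand-in for the spectral features a real voiceprint model uses."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    bands = np.array([b.sum() for b in np.array_split(spectrum, n_bands)])
    return bands / bands.sum()

def cosine_similarity(a, b):
    """Compare two 'voiceprints' by the angle between their feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
# Two synthetic "speakers" with different pitch and overtone structure.
speaker_a = np.sin(2*np.pi*120*t) + 0.5*np.sin(2*np.pi*240*t)
speaker_b = np.sin(2*np.pi*210*t) + 0.3*np.sin(2*np.pi*630*t)
# A second, noisier "recording" of speaker A.
speaker_a_again = speaker_a + 0.05*np.random.default_rng(0).standard_normal(sr)

va, vb, va2 = map(band_energies, (speaker_a, speaker_b, speaker_a_again))
sim_same = cosine_similarity(va, va2)   # same voice: close to 1
sim_diff = cosine_similarity(va, vb)    # different voices: noticeably lower
print(sim_same, sim_diff)
```

Even with such crude features, the two recordings of the same synthetic "speaker" score far higher than recordings of different "speakers", which is the basic decision a biometric verification system makes, at much higher fidelity, with a learned model.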
Although various machine-learning techniques have been used to process and store voiceprints, current models are based on deep learning, which has been shown to outperform other approaches on a broad range of tasks involving noisy sensory data. Deep-learning models can reconstruct an approximation of a speaker's face from a voice recording alone and, using a technique called deep clustering, can disentangle multiple conversations happening at the same time in a crowded room.
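The core idea behind deep clustering can be sketched without a neural network: assign each time-frequency bin of a spectrogram an embedding, cluster the embeddings, and use each cluster as a mask that keeps one talker's energy. The sketch below is a heavily simplified, hypothetical illustration: the learned embedding network of real deep clustering is replaced by a trivial hand-crafted embedding (the bin's frequency index), and the "talkers" are two synthetic tones rather than overlapping speech.

```python
import numpy as np

def stft_mag(x, frame=512, hop=256):
    """Magnitude spectrogram via framed rFFT (no window, for simplicity)."""
    frames = [x[i:i+frame] for i in range(0, len(x) - frame + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1))  # shape (time, freq)

def kmeans_2(values, iters=20):
    """Tiny two-cluster k-means on scalar embeddings, min/max initialized."""
    centers = np.array([values.min(), values.max()], dtype=float)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(2):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels

sr = 8000
t = np.linspace(0, 1, sr, endpoint=False)
# A mixture of two synthetic "talkers" occupying different frequency regions.
mixture = np.sin(2*np.pi*300*t) + np.sin(2*np.pi*2000*t)

S = stft_mag(mixture)
T, F = S.shape
# Real deep clustering learns an embedding per time-frequency bin;
# here the bin's frequency index stands in for that embedding, and only
# energetic bins are clustered.
active = S > (0.1 * S.max())
freq_idx = np.tile(np.arange(F, dtype=float), (T, 1))
labels = kmeans_2(freq_idx[active])

# Binary masks: each cluster keeps the bins assigned to one "talker".
mask = np.zeros((T, F), dtype=int)
mask[active] = labels + 1   # 0 = inactive bin, 1/2 = cluster id
print(np.unique(mask[:, :F//2][mask[:, :F//2] > 0]))  # low-frequency talker
print(np.unique(mask[:, F//2:][mask[:, F//2:] > 0]))  # high-frequency talker
```

In this toy setting the clustering cleanly assigns the low-frequency and high-frequency "talkers" to different masks; the contribution of deep clustering is that a trained network produces embeddings that separate in exactly this way even when real voices overlap in both time and frequency.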