On January 28, 2020, staff of the Federal Trade Commission examined voice cloning technologies that enable users to make near-perfect reproductions of a real person’s voice. Advances in artificial intelligence and text-to-speech (TTS) synthesis have allowed researchers to create a near-perfect voice clone from a recording of a person’s voice less than five seconds long.
Although this technology has a number of promising uses (for example, editing the work of voice actors and enabling people with tracheotomies and other conditions to use TTS systems with voices derived from their previously recorded audio samples), it also has the potential to cause substantial harm when used maliciously. For instance, numerous consumers already fall prey to “grandparent scams” (where an elderly person receives a phone call supposedly from a grandchild in distress who needs cash) and phishing scams (where an employee is contacted by someone purporting to be a superior and directed to immediately wire funds to a vendor). Voice cloning may make it harder for consumers to identify these sorts of social engineering scams.
The workshop examined:
- Speech synthesis using the voice of an actual person.
- Development and deployment of voice cloning technologies, from healthcare and consumer-oriented applications (customer service, entertainment, etc.) to fraudulent schemes.
- Ethical concerns related to the use of cloned voices.