Voice Clone Detection 2.0

Detect voice clones, synthetic speech, and TTS-generated audio with explainable signals and enterprise-grade accuracy.

What We Detect

Comprehensive voice cloning and synthetic audio detection

🎙️ Voice Clones

Identify cloned voices produced by services such as ElevenLabs and Play.ht, as well as by custom voice cloning systems

🔊 TTS Detection

Detect text-to-speech output from major TTS providers and open-source models

🎵 Voice Conversion

Identify audio processed by voice conversion and voice changer tools

🗣️ Synthetic Speech

Detect fully synthetic AI-generated speech across multiple languages

🎭 Deepfake Audio

Identify audio deepfakes and synthetic impersonations

📊 Signal Analysis

Analyze spectral patterns, timing artifacts, and vocal characteristics

Explainable Evidence

Understand exactly why audio was flagged

Evidence Types

✓ Timestamp-level confidence scores
✓ Spectral analysis with anomaly detection
✓ Vocal characteristic comparison
✓ Timing and prosody signal analysis

Performance

⚡ Sub-second average response time
📊 Real-time-capable processing
🎯 93%+ accuracy on an in-the-wild dataset

API Integration

Get started in minutes with our unified API

cURL

curl -X POST https://api.zerotrue.app/v1/detect \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "modality": "audio",
    "url": "https://example.com/audio.mp3",
    "options": {
      "include_evidence": true
    }
  }'
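
For callers working in Python, the same request can be sketched with the standard library alone. The endpoint, headers, and JSON body mirror the cURL example. The function names `build_detect_request` and `send_detect_request` are illustrative, and the sketch assumes the response body is JSON; since the response schema is not described here, it returns the decoded object without assuming any field names.

```python
import json
import urllib.request

API_URL = "https://api.zerotrue.app/v1/detect"  # endpoint from the cURL example


def build_detect_request(audio_url: str, api_key: str,
                         include_evidence: bool = True) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    payload = {
        "modality": "audio",
        "url": audio_url,
        "options": {"include_evidence": include_evidence},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_detect_request(req: urllib.request.Request, timeout: float = 30.0):
    """Send the request and return the decoded body (assumed to be JSON)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call would look like `send_detect_request(build_detect_request("https://example.com/audio.mp3", "YOUR_API_KEY"))`. Keeping construction separate from sending makes the payload easy to inspect or log before any network traffic occurs.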

Limitations & Best Practices

Understanding model constraints for optimal results

False Positive Risk

Low-quality audio, heavy compression, or background noise can increase the false positive rate.

Adversarial Attacks

Voice anonymization and anti-forensic techniques may reduce detection accuracy.

Generalization Bounds

Performance varies across languages and TTS systems. Rolling evaluations track real-world performance.

Frequently Asked Questions

What audio formats are supported?

We support MP3, WAV, M4A, and AAC. Files up to 100 MB can be processed.
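
A client-side pre-flight check can catch unsupported files before they are submitted. The following is a minimal sketch, not part of the API: `check_audio_file` is a hypothetical helper, it judges format by file extension only, and it reads the 100 MB limit as 100,000,000 bytes, the more conservative interpretation.

```python
from pathlib import Path
from typing import List

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac"}  # formats listed above
MAX_BYTES = 100 * 1000 * 1000  # assumption: "100MB" taken as decimal megabytes


def check_audio_file(name: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    ext = Path(name).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(no extension)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes (limit {MAX_BYTES})")
    return problems
```

For a local file, the size can be taken from `Path(path).stat().st_size`. An extension check does not confirm the actual encoding, so the service may still reject a mislabeled file.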

How fast is audio detection?

Average response time is under 1 second for standard audio clips.

Can you detect all TTS systems?

Not all of them. We cover major TTS providers and popular open-source models, and performance varies across TTS systems.

Does it work on all languages?

We support 20+ languages, with the best performance on English, Spanish, and Mandarin.

Is my audio data stored?

By default, audio is not stored. Enable zero-retention mode for strict privacy.