Transparent Performance

Benchmarks & Rolling Metrics

Quarterly rolling evaluations on in-the-wild data. Honest metrics, transparent methodology, and continuous improvement.

  • 94%+ Accuracy
  • <2s Avg Latency
  • <3% False Positive Rate
  • 92%+ Recall
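
For reference, a minimal sketch of how headline figures like these are derived from a labeled evaluation set. The label convention (1 = AI-generated, 0 = authentic) and the function name are illustrative assumptions, not the production evaluation pipeline.

```python
# Illustrative sketch: headline metrics from a labeled evaluation set.
# Label convention (1 = AI-generated, 0 = authentic) is an assumption.
from typing import Sequence

def summarize(labels: Sequence[int], predictions: Sequence[int]) -> dict:
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,                               # target: 94%+
        "recall": tp / (tp + fn) if tp + fn else 0.0,                # target: 92%+
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,   # target: <3%
    }
```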

Methodology

How we evaluate performance in the wild

Datasets

  • In-the-wild dataset
  • Adversarial samples
  • Compression variants
  • Cross-platform tests

Evaluation

  • Quarterly rolling eval
  • Blind test sets
  • Human baseline
  • A/B testing

Reporting

  • Per-modality metrics
  • Confidence calibration (see the sketch after this list)
  • Error analysis
  • False positive tracking
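
As an illustration of calibration reporting, here is a minimal sketch of expected calibration error (ECE), one standard way to measure how well confidence scores match observed accuracy. The equal-width binning scheme is an assumption, not the published reporting format.

```python
# Illustrative sketch: expected calibration error (ECE) over confidence bins.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """Weighted average gap between mean confidence and observed accuracy per bin."""
    confidences = np.asarray(confidences, dtype=float)  # model confidence per sample
    correct = np.asarray(correct, dtype=float)          # 1.0 if the prediction was right
    # Assign each sample to one of n_bins equal-width confidence bins.
    bin_ids = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap   # weight the gap by the bin's sample share
    return float(ece)

# Example: expected_calibration_error([0.9, 0.6, 0.8], [1, 0, 1])
```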

Real-World Generalization

Compression Robustness

Evaluated on heavily compressed content common in social media and messaging apps
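
A minimal sketch of what such a robustness check can look like: re-encode a sample at decreasing JPEG quality and record the detector score for each variant. Here `detect_score` is a hypothetical callable standing in for the actual model, and the quality levels are illustrative.

```python
# Illustrative sketch: score stability under recompression.
import io
from typing import Callable, Iterable
from PIL import Image

def scores_under_compression(
    path: str,
    detect_score: Callable[[Image.Image], float],  # hypothetical detector: P(AI-generated)
    qualities: Iterable[int] = (95, 75, 50, 30),
) -> dict:
    """Re-encode an image at decreasing JPEG quality, approximating the
    recompression applied by social media and messaging platforms, and
    record the detector score for each variant."""
    original = Image.open(path).convert("RGB")
    results = {}
    for q in qualities:
        buffer = io.BytesIO()
        original.save(buffer, format="JPEG", quality=q)  # simulate platform re-encode
        results[q] = detect_score(Image.open(io.BytesIO(buffer.getvalue())))
    return results
```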

Novel Generator Handling

Testing on new AI models and generation techniques as they emerge

Adversarial Resilience

Continuous evaluation against anti-detection techniques and obfuscation methods