Backdoors survive
alignment.

— 001

We remove adversarial backdoors from AI models. Independent third-party model cleaning for defense and compliance.

Request Access

Alignment doesn't mean clean.

Adversaries inject poisoned samples into training data. These samples embed hidden triggers that survive standard alignment and activate under specific conditions in production.

01 — The Problem

The attack chain

001
Poisoning

Poisoning

Backdoor triggers injected into training data at scale.

002
Survival

Survival

Triggers persist through RLHF and alignment procedures.

003
Deployment

Deployment

Compromised model passes all standard safety evaluations.

004
Activation

Activation

Hidden behavior fires when trigger condition is met.

02 — The Research

Continued training
removes backdoors.

Our defense uses continued training on verified, clean data to neutralize triggers without needing prior knowledge of the attack mechanism.

Request Access

Cleanup Effort by Model Size

Steps to remove backdoor (normalized)

410M
0.52x
1B
1.0x
2.8B
1.1x
6.9B
2.45x
0.0B
Largest model tested
0.00x
Cleanup scaling factor
0
Poison types removed

Based on Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training (Hubinger et al., 2024)

Validated on pretrained language models up to 6.9B parameters. Fine-tuned model support in development.

03 — Our Service

Model cleaning
service.

We don't host your models. We clean them. You provide a model, we apply continued training on verified clean data, you receive a cleaned model.

01

You provide the model

Any architecture, up to 6.9B+ parameters.

02

We apply continued training

Verified clean data neutralizes hidden trigger mechanisms.

03

You receive a cleaned model

Backdoor behaviors removed. No hosting, no data retention.

Compliance & Audit

Independent verification for regulatory and audit requirements.

Trusted Third Party

Third-party in the loop instead of relying solely on model providers.

Research-Backed

Built on rigorous research, tested on models up to 6.9B parameters.

Get Started

Request access.

We work with organizations that need independent verification of AI model safety for compliance and audit purposes.

48h response timeNo cost evaluationCleared personnel