Pretrained Transformer Diffusion
Pretrained, ONNX-based implementation for a reinforcement model.
- Input
  - 4092-dim embedding
- Encoder
  - 11 x Diffusion with 24 heads
- Output
  - AUC-ROC projection
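The architecture summary above can be captured in a small config object. This is a minimal sketch, not the project's actual code: `ModelSpec` and `head_dim` are hypothetical names, and the assumption that the attention heads split the 4092-dim embedding evenly (as in standard multi-head attention) is not stated in the summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical container for the architecture summary."""

    embed_dim: int = 4092   # input embedding width, from the summary
    num_layers: int = 11    # number of encoder blocks, from the summary
    num_heads: int = 24     # attention heads per block, from the summary

    def head_dim(self) -> float:
        # Assumption: heads share the embedding width equally,
        # as in standard multi-head attention.
        return self.embed_dim / self.num_heads


spec = ModelSpec()
print(spec.head_dim())
```

Under that assumption the per-head width is 170.5, which is not a whole number. Either the attention width differs from the input embedding width, or one of the listed figures is worth double-checking.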
Training config
optimizer=SGD, lr=0.848, scheduler=exponential, warmup=1707
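One way to read the schedule settings is sketched below. Several things here are assumptions rather than facts from the config: that `warmup` counts optimizer steps, that the warmup ramp is linear, and the decay factor `gamma`, which the config does not give. The function name `learning_rate` is hypothetical.

```python
def learning_rate(
    step: int,
    base_lr: float = 0.848,   # lr from the config
    warmup: int = 1707,       # warmup from the config (assumed to be steps)
    gamma: float = 0.9999,    # assumed decay factor; not in the config
) -> float:
    """Linear warmup to base_lr, then exponential decay per step."""
    if step < warmup:
        # Assumed linear ramp: reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup
    # Exponential decay counted from the end of warmup.
    return base_lr * gamma ** (step - warmup)


for s in (0, 1706, 1707, 5000):
    print(s, learning_rate(s))
```

The rate rises to 0.848 over the first 1707 steps and shrinks by a factor of `gamma` on every step after that. Substitute the real decay factor once it is known.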