Reconstruction metrics computed on the ImageNet 256×256 validation set (50,000 images). Lower is better for rFID and LPIPS; higher is better for PSNR.

| model | encoder | decoder | rFID | LPIPS | PSNR (dB) |
|---|---|---|---|---|---|
| SD 1.5 | base | base | 0.9275 | 0.0753 | 25.35 |
| SD 1.5 | TAE | base | 2.5711 | 0.1109 | 23.89 |
| SD 1.5 | base | TAE | 2.9040 | 0.0995 | 23.47 |
| SD 1.5 | TAE | TAE | 3.8339 | 0.1092 | 23.52 |
| SDXL | base | base | 0.7690 | 0.0703 | 25.60 |
| SDXL | TAE | base | 1.4093 | 0.0888 | 24.83 |
| SDXL | base | TAE | 2.8986 | 0.0964 | 23.76 |
| SDXL | TAE | TAE | 3.6565 | 0.1023 | 23.59 |
| Flux 1 | base | base | 0.2556 | 0.0202 | 30.90 |
| Flux 1 | TAE | base | 0.8840 | 0.0424 | 27.78 |
| Flux 1 | base | TAE | 0.7103 | 0.0467 | 27.75 |
| Flux 1 | TAE | TAE | 0.9433 | 0.0458 | 27.57 |
| Flux 2 | base | base | 0.2121 | 0.0157 | 32.18 |
| Flux 2 | TAE | base | 0.3699 | 0.0288 | 29.51 |
| Flux 2 | base | TAE | 0.4814 | 0.0368 | 29.04 |
| Flux 2 | TAE | TAE | 0.4680 | 0.0383 | 28.43 |
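
For readers unfamiliar with the PSNR column, here is a minimal sketch of the standard per-image definition, `10 * log10(MAX^2 / MSE)`. It is not the evaluation script behind the table. The function name `psnr`, the `max_value` parameter, and the assumption that pixels are scaled to `[0, 1]` are illustrative choices. How the reported numbers were aggregated over the 50,000 images is not stated here.

```python
import numpy as np


def psnr(reference, reconstruction, max_value=1.0):
    """Return the peak signal-to-noise ratio in dB for two same-shaped images.

    Assumes both inputs share the same value range, with `max_value` as the
    largest possible pixel value (1.0 for floats in [0, 1], 255 for uint8).
    """
    reference = np.asarray(reference, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    if reference.shape != reconstruction.shape:
        raise ValueError("images must have the same shape")

    # Mean squared error over every pixel and channel.
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((256, 256, 3))
    # Simulate a lossy reconstruction by adding small Gaussian noise.
    noisy = np.clip(image + rng.normal(0.0, 0.02, image.shape), 0.0, 1.0)
    print(f"PSNR: {psnr(image, noisy):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a gap of about 3 dB between two rows corresponds to roughly a factor of two in MSE.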