@rwightman
Last active November 28, 2025 00:03

Old RA2 hparams for some ResNet models, trained during the transition from the SGD + RandAugment based 'RA' settings to the RMSProp based 'RA2' settings with Mixup added.

The yaml files are for a 2x GPU distributed setup, so adjust batch_size and lr accordingly to keep global batch size / LR equivalence on a different number of GPUs.
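
As a rough illustration of that adjustment, here is a minimal sketch. It assumes `batch_size` in these files is per GPU (so the first config's global batch size would be 352 x 2 = 704), and it assumes a linear or square-root scaling rule; neither rule is stated in this gist, and the helper name `scale_lr` is hypothetical.

```python
import math


def scale_lr(ref_lr, ref_batch_per_gpu, ref_gpus,
             new_batch_per_gpu, new_gpus, rule="linear"):
    """Rescale a reference LR for a different global batch size.

    Assumption: batch sizes are per GPU, so global = per_gpu * gpus.
    Assumption: 'linear' and 'sqrt' are common heuristics, not rules
    taken from this gist.
    """
    ref_global = ref_batch_per_gpu * ref_gpus
    new_global = new_batch_per_gpu * new_gpus
    ratio = new_global / ref_global
    if rule == "linear":
        return ref_lr * ratio
    if rule == "sqrt":
        return ref_lr * math.sqrt(ratio)
    raise ValueError(f"unknown scaling rule: {rule!r}")


# First config (lr 0.082, batch_size 352, 2 GPUs) moved to 8 GPUs
# with the same per-GPU batch size.
lr_linear = scale_lr(0.082, 352, 2, 352, 8, rule="linear")
lr_sqrt = scale_lr(0.082, 352, 2, 352, 8, rule="sqrt")
print(lr_linear, lr_sqrt)
```

Whichever rule is used, treat the result as a starting point and re-tune; adaptive optimizers such as RMSProp do not always follow linear scaling.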

aa: rand-m9-mstd1.0-inc1
amp: false
apex_amp: false
aug_splits: 0
batch_size: 352
bn_eps: null
bn_momentum: null
bn_tf: false
channels_last: true
clip_grad: null
color_jitter: 0.4
cooldown_epochs: 10
crop_pct: 0.94
cutmix: 0.0
cutmix_minmax: null
data: /imagenet/
decay_epochs: 1.0
decay_rate: 0.988
dist_bn: reduce
drop: 0.19
drop_block: null
drop_connect: null
drop_path: 0.1
epochs: 700
eval_metric: top1
gp: fast
hflip: 0.5
img_size: 256
initial_checkpoint: ''
input_size: null
interpolation: ''
jsd: false
local_rank: 0
log_interval: 50
lr: 0.082
lr_cycle_limit: 1
lr_cycle_mul: 1.0
lr_noise:
- 0.4
- 0.9
lr_noise_pct: 0.67
lr_noise_std: 1.0
mean: null
min_lr: 1.0e-05
mixup: 0.19
mixup_mode: batch
mixup_off_epoch: 0
mixup_prob: 1.0
mixup_switch_prob: 0.5
model: ecaresnet26d
model_ema: true
model_ema_decay: 0.999987
model_ema_force_cpu: false
momentum: 0.9
native_amp: true
no_aug: false
no_prefetcher: false
no_resume_opt: true
num_classes: 1000
opt: rmsproptf
opt_betas: null
opt_eps: 0.001
output: ''
patience_epochs: 10
pin_mem: false
pretrained: false
ratio:
- 0.75
- 1.3333333333333333
recount: 6
recovery_interval: 0
remode: pixel
reprob: 0.4
resplit: false
resume: ''
save_images: false
scale:
- 0.08
- 1.0
sched: step
seed: 42
smoothing: 0.1
split_bn: false
start_epoch: null
std: null
sync_bn: false
torchscript: false
train_interpolation: random
tta: 0
use_multi_epochs_loader: false
validation_batch_size_multiplier: 1
vflip: 0.0
warmup_epochs: 3
warmup_lr: 1.0e-06
weight_decay: 7.0e-06
workers: 8
---
aa: rand-m8-mstd0.5-inc1
amp: true
aug_splits: 0
batch_size: 320
bn_eps: null
bn_momentum: null
bn_tf: false
color_jitter: 0.4
cooldown_epochs: 10
crop_pct: null
cutmix: 0.0
cutmix_minmax:
- 0.2
- 0.6
data: /imagenet/
decay_epochs: 1.0
decay_rate: 0.9875
dist_bn: ''
drop: 0.17
drop_block: null
drop_connect: null
drop_path: 0.09
epochs: 600
eval_metric: top1
gp: avg
hflip: 0.5
img_size: null
initial_checkpoint: ''
interpolation: bicubic
jsd: false
local_rank: 0
log_interval: 50
lr: 0.092
lr_cycle_limit: 1
lr_cycle_mul: 1.0
lr_noise:
- 0.4
- 0.9
lr_noise_pct: 0.67
lr_noise_std: 1.0
mean: null
min_lr: 1.0e-05
mixup: 0.19
mixup_mode: pair
mixup_off_epoch: 0
mixup_prob: 1.0
mixup_switch_prob: 0.4
model: resnet18d
model_ema: true
model_ema_decay: 0.9999833
model_ema_force_cpu: false
momentum: 0.9
no_aug: false
no_prefetcher: false
no_resume_opt: false
num_classes: 1000
num_gpu: 1
opt: rmsproptf
opt_eps: 0.001
output: ''
patience_epochs: 10
pin_mem: false
pretrained: false
ratio:
- 0.75
- 1.3333333333333333
recount: 1
recovery_interval: 0
remode: pixel
reprob: 0.2
resplit: false
resume: ''
save_images: false
scale:
- 0.08
- 1.0
sched: step
seed: 42
smoothing: 0.1
split_bn: false
start_epoch: null
std: null
sync_bn: false
train_interpolation: random
tta: 0
use_multi_epochs_loader: false
validation_batch_size_multiplier: 1
vflip: 0.0
warmup_epochs: 3
warmup_lr: 1.0e-06
weight_decay: 7.0e-06
workers: 7
---
aa: rand-m8-mstd0.5-inc1
amp: true
aug_splits: 0
batch_size: 320
bn_eps: null
bn_momentum: null
bn_tf: false
color_jitter: 0.4
cooldown_epochs: 10
crop_pct: null
cutmix: 0.0
cutmix_minmax: null
data: /imagenet/
decay_epochs: 1.0
decay_rate: 0.9875
dist_bn: ''
drop: 0.17
drop_block: null
drop_connect: null
drop_path: 0.09
epochs: 600
eval_metric: top1
gp: avg
hflip: 0.5
img_size: null
initial_checkpoint: ''
interpolation: bicubic
jsd: false
local_rank: 0
log_interval: 50
lr: 0.092
lr_cycle_limit: 1
lr_cycle_mul: 1.0
lr_noise:
- 0.4
- 0.9
lr_noise_pct: 0.67
lr_noise_std: 1.0
mean: null
min_lr: 1.0e-05
mixup: 0.19
mixup_mode: pair
mixup_off_epoch: 0
mixup_prob: 1.0
mixup_switch_prob: 0.5
model: resnet34d
model_ema: true
model_ema_decay: 0.9999833
model_ema_force_cpu: false
momentum: 0.9
no_aug: false
no_prefetcher: false
no_resume_opt: false
num_classes: 1000
num_gpu: 1
opt: rmsproptf
opt_eps: 0.001
output: ''
patience_epochs: 10
pin_mem: false
pretrained: false
ratio:
- 0.75
- 1.3333333333333333
recount: 1
recovery_interval: 0
remode: pixel
reprob: 0.22
resplit: false
resume: ''
save_images: false
scale:
- 0.08
- 1.0
sched: step
seed: 42
smoothing: 0.1
split_bn: false
start_epoch: null
std: null
sync_bn: false
train_interpolation: random
tta: 0
use_multi_epochs_loader: false
validation_batch_size_multiplier: 1
vflip: 0.0
warmup_epochs: 3
warmup_lr: 1.0e-06
weight_decay: 7.0e-06
workers: 7
---
aa: rand-m8-mstd0.5-inc1
amp: true
apex_amp: false
aug_splits: 0
batch_size: 320
bn_eps: null
bn_momentum: null
bn_tf: false
channels_last: false
color_jitter: 0.4
cooldown_epochs: 10
crop_pct: null
cutmix: 0.0
cutmix_minmax: null
data: /imagenet/
decay_epochs: 1.0
decay_rate: 0.9875
dist_bn: ''
drop: 0.17
drop_block: null
drop_connect: null
drop_path: 0.09
epochs: 660
eval_metric: top1
gp: null
hflip: 0.5
img_size: null
initial_checkpoint: ''
interpolation: bicubic
jsd: false
local_rank: 0
log_interval: 50
lr: 0.092
lr_cycle_limit: 1
lr_cycle_mul: 1.0
lr_noise:
- 0.4
- 0.9
lr_noise_pct: 0.67
lr_noise_std: 1.0
mean: null
min_lr: 1.0e-05
mixup: 0.19
mixup_mode: pair
mixup_off_epoch: 0
mixup_prob: 1.0
mixup_switch_prob: 0.5
model: resnet50d
model_ema: true
model_ema_decay: 0.9999833
model_ema_force_cpu: false
momentum: 0.9
native_amp: false
no_aug: false
no_prefetcher: false
no_resume_opt: false
num_classes: 1000
num_gpu: 1
opt: rmsproptf
opt_eps: 0.001
output: ''
patience_epochs: 10
pin_mem: false
pretrained: false
ratio:
- 0.75
- 1.3333333333333333
recount: 1
recovery_interval: 0
remode: pixel
reprob: 0.22
resplit: false
resume: ''
save_images: false
scale:
- 0.08
- 1.0
sched: step
seed: 42
smoothing: 0.1
split_bn: false
start_epoch: null
std: null
sync_bn: false
train_interpolation: random
tta: 0
use_multi_epochs_loader: false
validation_batch_size_multiplier: 1
vflip: 0.0
warmup_epochs: 3
warmup_lr: 1.0e-06
weight_decay: 7.0e-06
workers: 7
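
To get a feel for the schedule implied by `sched: step`, `decay_epochs`, `decay_rate`, `warmup_epochs` and `warmup_lr`, the sketch below computes a per-epoch LR using the resnet50d values above. It is a simplified approximation, not the training script's scheduler: it assumes linear warmup from `warmup_lr`, assumes decay steps are counted from epoch 0, and ignores `lr_noise`, `cooldown_epochs` and `min_lr`. The function name `step_lr` is hypothetical.

```python
def step_lr(epoch, base_lr=0.092, decay_rate=0.9875, decay_epochs=1.0,
            warmup_epochs=3, warmup_lr=1.0e-06):
    """Approximate LR at the start of `epoch` for a step-decay schedule.

    Assumptions (not confirmed by the gist): linear warmup, decay
    counted from epoch 0, no LR noise, no cooldown, no min_lr floor.
    """
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr toward base_lr.
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    # Multiply by decay_rate once per completed decay interval.
    return base_lr * decay_rate ** (epoch // decay_epochs)


for e in (0, 1, 3, 100, 300, 659):
    print(e, step_lr(e))
```

With `decay_epochs: 1.0` the LR shrinks by a small constant factor every epoch, so the schedule behaves like a smooth exponential decay over the full run.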