problem_test_code.py
Namespace(batch_size=10, ignore_label=0, input_size='228, 304', learning_rate=0.00025, model='model_joint4', momentum=0.9, num_classes=4, num_steps=15001, power=0.9, random_mirror=False, random_scale=False, random_seed=1234, restore_model=False, save_num_images=2, save_pred_every=1000, train_list='', weight_decay=0.0005)
begin test
Model size:3,600,248
Model size:7,209,840
2017-09-16 17:03:50.563408: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-16 17:03:50.563431: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-16 17:03:50.563439: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-16 17:03:50.563445: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-16 17:03:50.563451: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-09-16 17:03:51.268766: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: TITAN X (Pascal)
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:08:00.0
Total memory: 11.90GiB
Free memory: 11.76GiB
2017-09-16 17:03:51.268825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-09-16 17:03:51.268834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2017-09-16 17:03:51.268849: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:08:00.0)
2017-09-16 17:03:54.261152: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 822.66MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-09-16 17:03:54.264369: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 815.94MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-09-16 17:03:54.273886: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 885.94MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-09-16 17:03:54.273918: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.60GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-09-16 17:03:54.273934: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.10GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-09-16 17:03:54.288580: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.11GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-09-16 17:03:54.288611: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.57GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-09-16 17:03:54.305776: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.19GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-09-16 17:03:54.338472: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.49GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-09-16 17:03:54.338534: W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 815.94MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-09-16 17:04:04.350564: W tensorflow/core/common_runtime/bfc_allocator.cc:273] Allocator (GPU_0_bfc) ran out of memory trying to allocate 177.19MiB. Current allocation summary follows.
2017-09-16 17:04:04.350611: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (256): Total Chunks: 3, Chunks in use: 0 768B allocated for chunks. 64B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350629: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (512): Total Chunks: 3, Chunks in use: 0 1.8KiB allocated for chunks. 516B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350647: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (1024): Total Chunks: 4, Chunks in use: 0 5.5KiB allocated for chunks. 1.8KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350664: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (2048): Total Chunks: 1, Chunks in use: 0 3.2KiB allocated for chunks. 2.9KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350680: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (4096): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350695: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (8192): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350710: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (16384): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350726: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (32768): Total Chunks: 1, Chunks in use: 0 46.2KiB allocated for chunks. 1.0KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350742: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (65536): Total Chunks: 1, Chunks in use: 0 64.0KiB allocated for chunks. 2.0KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350757: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (131072): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350773: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (262144): Total Chunks: 1, Chunks in use: 0 260.0KiB allocated for chunks. 2.50MiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350788: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (524288): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350803: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (1048576): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350858: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (2097152): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350880: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (4194304): Total Chunks: 1, Chunks in use: 0 4.28MiB allocated for chunks. 2.0KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350897: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (8388608): Total Chunks: 1, Chunks in use: 0 14.00MiB allocated for chunks. 2.0KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350912: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (16777216): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350927: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (33554432): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350944: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (67108864): Total Chunks: 1, Chunks in use: 0 126.56MiB allocated for chunks. 126.56MiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350958: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (134217728): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
2017-09-16 17:04:04.350973: I tensorflow/core/common_runtime/bfc_allocator.cc:643] Bin (268435456): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
...
2017-09-16 17:04:24.406910: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102876fd200 of size 44359680
2017-09-16 17:04:24.406920: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1028a14b200 of size 44359680
2017-09-16 17:04:24.406930: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1028cb99200 of size 11089920
2017-09-16 17:04:24.406940: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1028d62ca00 of size 44359680
2017-09-16 17:04:24.406950: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1029007aa00 of size 22179840
2017-09-16 17:04:24.406960: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102915a1a00 of size 258048
2017-09-16 17:04:24.406970: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10291917900 of size 3870720
2017-09-16 17:04:24.406980: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10292ac8a00 of size 77629440
2017-09-16 17:04:24.406990: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102974d1200 of size 88719360
2017-09-16 17:04:24.407000: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1029c96d200 of size 88719360
2017-09-16 17:04:24.407010: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102a1e09200 of size 11089920
2017-09-16 17:04:24.407020: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102a289ca00 of size 11089920
2017-09-16 17:04:24.407031: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102a3330200 of size 33269760
2017-09-16 17:04:24.407041: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102a52eaa00 of size 88719360
2017-09-16 17:04:24.407052: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102aa786a00 of size 46694400
2017-09-16 17:04:24.407062: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102ad40ea00 of size 44359680
2017-09-16 17:04:24.407072: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102afe5ca00 of size 44359680
2017-09-16 17:04:24.407082: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102b28aaa00 of size 44359680
2017-09-16 17:04:24.407092: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102b52f8a00 of size 44359680
2017-09-16 17:04:24.407102: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102b7d46a00 of size 88719360
2017-09-16 17:04:24.407111: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102bd1e2a00 of size 11089920
2017-09-16 17:04:24.407121: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102bdc76200 of size 22179840
2017-09-16 17:04:24.407131: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102bf19d200 of size 88719360
2017-09-16 17:04:24.407141: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102c4639200 of size 88719360
2017-09-16 17:04:24.407151: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102c9ad5200 of size 597196800
2017-09-16 17:04:24.407162: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x102ed45d200 of size 353894400
2017-09-16 17:04:24.407172: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x103025dd200 of size 88719360
2017-09-16 17:04:24.407183: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10307a79200 of size 3870720
2017-09-16 17:04:24.407193: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x103086d1000 of size 88719360
2017-09-16 17:04:24.407203: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1030db6d000 of size 22179840
2017-09-16 17:04:24.407213: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1030f094000 of size 44359680
2017-09-16 17:04:24.407222: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10311ae2000 of size 88719360
2017-09-16 17:04:24.407232: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10316f7e000 of size 44359680
2017-09-16 17:04:24.407242: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x103199cc000 of size 88719360
2017-09-16 17:04:24.407252: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x1031ee68000 of size 44359680
2017-09-16 17:04:24.407262: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x103218b6000 of size 88719360
2017-09-16 17:04:24.407273: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Chunk at 0x10326d52000 of size 132710400
2017-09-16 17:04:24.407288: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020f879600 of size 65536
2017-09-16 17:04:24.407310: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020feaba00 of size 256
2017-09-16 17:04:24.407320: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020feabc00 of size 1536
2017-09-16 17:04:24.407330: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020feac300 of size 512
2017-09-16 17:04:24.407340: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020feac600 of size 256
2017-09-16 17:04:24.407351: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020feaca00 of size 1024
2017-09-16 17:04:24.407361: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020fead000 of size 512
2017-09-16 17:04:24.407371: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020fead300 of size 1792
2017-09-16 17:04:24.407381: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020feae000 of size 3328
2017-09-16 17:04:24.407391: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020feaf100 of size 1280
2017-09-16 17:04:24.407403: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020ff0a600 of size 768
2017-09-16 17:04:24.407413: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020ff0e500 of size 256
2017-09-16 17:04:24.407423: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1020ff12a00 of size 47360
2017-09-16 17:04:24.407434: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x102915e0a00 of size 3370752
2017-09-16 17:04:24.407445: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x10291cc8900 of size 14680320
2017-09-16 17:04:24.407455: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x10307e2a200 of size 9072128
2017-09-16 17:04:24.407465: I tensorflow/core/common_runtime/bfc_allocator.cc:687] Free at 0x1032ebe2000 of size 258046208
2017-09-16 17:04:24.407475: I tensorflow/core/common_runtime/bfc_allocator.cc:693] Summary of in-use Chunks by size:
2017-09-16 17:04:24.407490: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 241 Chunks of size 256 totalling 60.2KiB
2017-09-16 17:04:24.407503: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 26 Chunks of size 512 totalling 13.0KiB
2017-09-16 17:04:24.407516: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 119 Chunks of size 1024 totalling 119.0KiB
2017-09-16 17:04:24.407528: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 1280 totalling 1.2KiB
2017-09-16 17:04:24.407539: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 1536 totalling 1.5KiB
2017-09-16 17:04:24.407552: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 52 Chunks of size 2048 totalling 104.0KiB
2017-09-16 17:04:24.407563: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 4 Chunks of size 16384 totalling 64.0KiB
2017-09-16 17:04:24.407576: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 2 Chunks of size 37632 totalling 73.5KiB
2017-09-16 17:04:24.407588: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 64512 totalling 63.0KiB
2017-09-16 17:04:24.407600: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 19 Chunks of size 65536 totalling 1.19MiB
2017-09-16 17:04:24.407612: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 66304 totalling 64.8KiB
2017-09-16 17:04:24.407624: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 81920 totalling 80.0KiB
2017-09-16 17:04:24.407636: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 11 Chunks of size 147456 totalling 1.55MiB
2017-09-16 17:04:24.407648: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 241920 totalling 236.2KiB
2017-09-16 17:04:24.407660: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 14 Chunks of size 258048 totalling 3.45MiB
2017-09-16 17:04:24.407672: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 3 Chunks of size 258304 totalling 756.8KiB
2017-09-16 17:04:24.407684: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 10 Chunks of size 262144 totalling 2.50MiB
2017-09-16 17:04:24.407696: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 266240 totalling 260.0KiB
2017-09-16 17:04:24.407707: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 4 Chunks of size 294912 totalling 1.12MiB
2017-09-16 17:04:24.407719: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 322816 totalling 315.2KiB
2017-09-16 17:04:24.407732: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 21 Chunks of size 524288 totalling 10.50MiB
2017-09-16 17:04:24.407743: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 4 Chunks of size 589824 totalling 2.25MiB
2017-09-16 17:04:24.407755: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 3 Chunks of size 1179648 totalling 3.38MiB
2017-09-16 17:04:24.407767: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 1835008 totalling 1.75MiB
2017-09-16 17:04:24.407779: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 11 Chunks of size 2359296 totalling 24.75MiB
2017-09-16 17:04:24.407791: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 2 Chunks of size 3870720 totalling 7.38MiB
2017-09-16 17:04:24.407802: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 3924992 totalling 3.74MiB
2017-09-16 17:04:24.407814: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 8317440 totalling 7.93MiB
2017-09-16 17:04:24.407826: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 17 Chunks of size 11089920 totalling 179.79MiB
2017-09-16 17:04:24.407839: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 7 Chunks of size 22179840 totalling 148.07MiB
2017-09-16 17:04:24.407851: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 33269760 totalling 31.73MiB
2017-09-16 17:04:24.407863: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 30 Chunks of size 44359680 totalling 1.24GiB
2017-09-16 17:04:24.407875: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 45137920 totalling 43.05MiB
2017-09-16 17:04:24.407887: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 46694400 totalling 44.53MiB
2017-09-16 17:04:24.407899: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 50790400 totalling 48.44MiB
2017-09-16 17:04:24.407911: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 52741120 totalling 50.30MiB
2017-09-16 17:04:24.407923: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 2 Chunks of size 77629440 totalling 148.07MiB
2017-09-16 17:04:24.407935: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 17 Chunks of size 88719360 totalling 1.40GiB
2017-09-16 17:04:24.407947: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 99809280 totalling 95.19MiB
2017-09-16 17:04:24.407959: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 132710400 totalling 126.56MiB
2017-09-16 17:04:24.407972: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 353894400 totalling 337.50MiB
2017-09-16 17:04:24.407984: I tensorflow/core/common_runtime/bfc_allocator.cc:696] 1 Chunks of size 597196800 totalling 569.53MiB
2017-09-16 17:04:24.407996: I tensorflow/core/common_runtime/bfc_allocator.cc:700] Sum Total of in-use chunks: 4.50GiB
2017-09-16 17:04:24.408010: I tensorflow/core/common_runtime/bfc_allocator.cc:702] Stats:
Limit: 5112830361
InUse: 4827536384
MaxInUse: 5112738816
NumAllocs: 1291
MaxAllocSize: 4275296000
2017-09-16 17:04:24.408085: W tensorflow/core/common_runtime/bfc_allocator.cc:277] *********************************************************************xxxx**********************_____
2017-09-16 17:04:24.408109: W tensorflow/core/framework/op_kernel.cc:1158] Resource exhausted: OOM when allocating tensor with shape[5760,512,5,6]
Traceback (most recent call last):
File "bug1.py", line 558, in <module>
train(args)
File "bug1.py", line 507, in train
sess.run(train_op, feed_dict=feed_dict)
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1440,512,7,9]
[[Node: gradients/fc1_voc12_c1/convolution_grad/Conv2DBackpropInput = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](gradients/fc1_voc12_c1/convolution_grad/Shape, fc1_voc12_c1/weights/read, gradients/fc1_voc12_c1/convolution/BatchToSpaceND_grad/SpaceToBatchND)]]
Caused by op u'gradients/fc1_voc12_c1/convolution_grad/Conv2DBackpropInput', defined at:
File "bug1.py", line 558, in <module>
train(args)
File "bug1.py", line 490, in train
grads = tf.gradients(reduced_loss_with_l2, tf.trainable_variables())
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 540, in gradients
grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 346, in _MaybeCompile
return grad_fn() # Exit early
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 540, in <lambda>
grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/ops/nn_grad.py", line 445, in _Conv2DGrad
op.get_attr("data_format")),
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 488, in conv2d_backprop_input
data_format=data_format, name=name)
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()
...which was originally created as op u'fc1_voc12_c1/convolution', defined at:
File "bug1.py", line 558, in <module>
train(args)
File "bug1.py", line 458, in train
is_training=is_training, num_classes=num_classes)
File "bug1.py", line 78, in __init__
self.setup(is_training, num_classes)
File "bug1.py", line 408, in setup
.atrous_conv(3, 3, num_classes, 12, padding='SAME', relu=False, name='fc1_voc12_c1'))
File "bug1.py", line 53, in layer_decorated
layer_output = op(self, layer_input, *args, **kwargs)
File "bug1.py", line 203, in atrous_conv
output = convolve(input, kernel)
File "bug1.py", line 198, in <lambda>
convolve = lambda i, k: tf.nn.atrous_conv2d(i, k, dilation, padding=padding)
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 972, in atrous_conv2d
name=name)
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 670, in convolution
op=op)
File "/hik/home/houzhi/.conda/envs/houzhi/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 453, in with_space_to_batch
result = op(input_converted, num_spatial_dims, "VALID")
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1440,512,7,9]
[[Node: gradients/fc1_voc12_c1/convolution_grad/Conv2DBackpropInput = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](gradients/fc1_voc12_c1/convolution_grad/Shape, fc1_voc12_c1/weights/read, gradients/fc1_voc12_c1/convolution/BatchToSpaceND_grad/SpaceToBatchND)]]
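The failing allocation is consistent with the numbers in the log above: a float32 tensor of shape [1440, 512, 7, 9] occupies exactly the 177.19MiB that the BFC allocator reports it cannot find, and 1440 is the batch size 10 multiplied by 12² (the batch expansion SpaceToBatchND performs for the rate-12 atrous branch fc1_voc12_c1). A quick sanity check of that arithmetic (illustrative only, not part of the reproduction script below):

# Back-of-the-envelope check of the failed allocation reported above.
batch_size = 10
block_expansion = 12 * 12                          # SpaceToBatchND for dilation rate 12
shape = (batch_size * block_expansion, 512, 7, 9)  # [1440, 512, 7, 9] from the traceback
num_elements = 1
for d in shape:
    num_elements *= d
size_mib = num_elements * 4 / float(1024 ** 2)     # float32 -> 4 bytes per element
print('%.2f MiB' % size_mib)                       # 177.19 MiB, matching the allocator message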
from __future__ import print_function
import numpy as np
import tensorflow as tf
# TODO change
IMG_MEAN = np.array((104.00698793, 116.66876762, 122.67891434), dtype=np.float32)
BATCH_SIZE = 10
IGNORE_LABEL = 0
INPUT_SIZE = '228, 304'
LEARNING_RATE = 2.5e-4
MOMENTUM = 0.9
NUM_CLASSES = 4
NUM_STEPS = 15001
POWER = 0.9
RANDOM_SEED = 1234
SAVE_NUM_IMAGES = 2
SAVE_PRED_EVERY = 1000
MAX_TO_KEEP = 2
WEIGHT_DECAY = 0.0005
# from kaffe.tensorflow import Network
import numpy as np
import tensorflow as tf
slim = tf.contrib.slim
DEFAULT_PADDING = 'SAME'
def layer(op):
'''Decorator for composable network layers.'''
def layer_decorated(self, *args, **kwargs):
# Automatically set a name if not provided.
name = kwargs.setdefault('name', self.get_unique_name(op.__name__))
# Figure out the layer inputs.
if len(self.terminals) == 0:
raise RuntimeError('No input variables found for layer %s.' % name)
elif len(self.terminals) == 1:
layer_input = self.terminals[0]
else:
layer_input = list(self.terminals)
# Perform the operation and get the output.
layer_output = op(self, layer_input, *args, **kwargs)
# Add to layer LUT.
self.layers[name] = layer_output
# This output is now the input for the next layer.
self.feed(layer_output)
# Return self for chained calls.
return self
return layer_decorated
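# Usage sketch (hypothetical layer names, illustration only): a chained call such as
#   (net.feed('data')
#        .conv(3, 3, 64, 1, 1, name='example_conv')
#        .relu(name='example_relu'))
# works because each decorated op reads its input from self.terminals, stores its output
# under `name` in self.layers, and re-feeds that output for the next call in the chain.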
class Network(object):
def __init__(self, inputs, trainable=True, is_training=False, num_classes=21):
# The input nodes for this network
self.inputs = inputs
# The current list of terminal nodes
self.terminals = []
# Mapping from layer names to layers
self.layers = dict(inputs)
# If true, the resulting variables are set as trainable
self.trainable = trainable
# Switch variable for dropout
self.use_dropout = tf.placeholder_with_default(tf.constant(1.0),
shape=[],
name='use_dropout')
self.setup(is_training, num_classes)
def setup(self, is_training, num_classes):
'''Construct the network. '''
raise NotImplementedError('Must be implemented by the subclass.')
def load(self, data_path, session, ignore_missing=False):
'''Load network weights.
data_path: The path to the numpy-serialized network weights
session: The current TensorFlow session
ignore_missing: If true, serialized weights for missing layers are ignored.
'''
data_dict = np.load(data_path).item()
for op_name in data_dict:
with tf.variable_scope(op_name, reuse=True):
for param_name, data in data_dict[op_name].iteritems():
try:
var = tf.get_variable(param_name)
session.run(var.assign(data))
except ValueError:
if not ignore_missing:
raise
def feed(self, *args):
'''Set the input(s) for the next operation by replacing the terminal nodes.
The arguments can be either layer names or the actual layers.
'''
assert len(args) != 0
self.terminals = []
for fed_layer in args:
if isinstance(fed_layer, str):
try:
fed_layer = self.layers[fed_layer]
except KeyError:
raise KeyError('Unknown layer name fed: %s' % fed_layer)
self.terminals.append(fed_layer)
return self
def get_output(self):
'''Returns the current network output.'''
return self.terminals[-1]
def get_unique_name(self, prefix):
'''Returns an index-suffixed unique name for the given prefix.
This is used for auto-generating layer names based on the type-prefix.
'''
ident = sum(t.startswith(prefix) for t, _ in self.layers.items()) + 1
return '%s_%d' % (prefix, ident)
def make_var(self, name, shape):
'''Creates a new TensorFlow variable.'''
return tf.get_variable(name, shape, trainable=self.trainable)
def validate_padding(self, padding):
'''Verifies that the padding is one of the supported ones.'''
assert padding in ('SAME', 'VALID')
@layer
def conv(self,
input,
k_h,
k_w,
c_o,
s_h,
s_w,
name,
relu=True,
padding=DEFAULT_PADDING,
group=1,
biased=True):
# Verify that the padding is acceptable
self.validate_padding(padding)
# Get the number of channels in the input
c_i = input.get_shape()[-1]
# Verify that the grouping parameter is valid
assert c_i % group == 0
assert c_o % group == 0
# Convolution for a given input and kernel
convolve = lambda i, k: tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding)
with tf.variable_scope(name) as scope:
kernel = self.make_var('weights', shape=[k_h, k_w, c_i / group, c_o])
if group == 1:
# This is the common-case. Convolve the input without any further complications.
output = convolve(input, kernel)
else:
# Split the input into groups and then convolve each of them independently
input_groups = tf.split(input, group, axis=3)
kernel_groups = tf.split(kernel, group, axis=3)
output_groups = [convolve(i, k) for i, k in zip(input_groups, kernel_groups)]
# Concatenate the groups
output = tf.concat(output_groups, axis=3)
# Add the biases
if biased:
biases = self.make_var('biases', [c_o])
output = tf.nn.bias_add(output, biases)
if relu:
# ReLU non-linearity
output = tf.nn.relu(output, name=scope.name)
return output
@layer
def atrous_conv(self,
input,
k_h,
k_w,
c_o,
dilation,
name,
relu=True,
padding=DEFAULT_PADDING,
group=1,
biased=True):
# Verify that the padding is acceptable
self.validate_padding(padding)
# Get the number of channels in the input
c_i = input.get_shape()[-1]
# Verify that the grouping parameter is valid
assert c_i % group == 0
assert c_o % group == 0
# Convolution for a given input and kernel
convolve = lambda i, k: tf.nn.atrous_conv2d(i, k, dilation, padding=padding)
with tf.variable_scope(name) as scope:
kernel = self.make_var('weights', shape=[k_h, k_w, c_i / group, c_o])
if group == 1:
# This is the common-case. Convolve the input without any further complications.
output = convolve(input, kernel)
else:
# Split the input into groups and then convolve each of them independently
input_groups = tf.split(input, group, axis=3)
kernel_groups = tf.split(kernel, group, axis=3)
output_groups = [convolve(i, k) for i, k in zip(input_groups, kernel_groups)]
# Concatenate the groups
output = tf.concat(output_groups, axis=3)
# Add the biases
if biased:
biases = self.make_var('biases', [c_o])
output = tf.nn.bias_add(output, biases)
if relu:
# ReLU non-linearity
output = tf.nn.relu(output, name=scope.name)
return output
@layer
def relu(self, input, name):
return tf.nn.relu(input, name=name)
@layer
def max_pool(self, input, k_h, k_w, s_h, s_w, name, padding=DEFAULT_PADDING):
self.validate_padding(padding)
return tf.nn.max_pool(input,
ksize=[1, k_h, k_w, 1],
strides=[1, s_h, s_w, 1],
padding=padding,
name=name)
@layer
def avg_pool(self, input, k_h, k_w, s_h, s_w, name, padding=DEFAULT_PADDING):
self.validate_padding(padding)
return tf.nn.avg_pool(input,
ksize=[1, k_h, k_w, 1],
strides=[1, s_h, s_w, 1],
padding=padding,
name=name)
@layer
def lrn(self, input, radius, alpha, beta, name, bias=1.0):
return tf.nn.local_response_normalization(input,
depth_radius=radius,
alpha=alpha,
beta=beta,
bias=bias,
name=name)
@layer
def concat(self, inputs, axis, name):
return tf.concat(values=inputs, axis=axis, name=name)
@layer
def add(self, inputs, name):
return tf.add_n(inputs, name=name)
@layer
def fc(self, input, num_out, name, relu=True):
with tf.variable_scope(name) as scope:
input_shape = input.get_shape()
if input_shape.ndims == 4:
# The input is spatial. Vectorize it first.
dim = 1
for d in input_shape[1:].as_list():
dim *= d
feed_in = tf.reshape(input, [-1, dim])
else:
feed_in, dim = (input, input_shape[-1].value)
weights = self.make_var('weights', shape=[dim, num_out])
biases = self.make_var('biases', [num_out])
op = tf.nn.relu_layer if relu else tf.nn.xw_plus_b
fc = op(feed_in, weights, biases, name=scope.name)
return fc
@layer
def softmax(self, input, name):
input_shape = map(lambda v: v.value, input.get_shape())
if len(input_shape) > 2:
# For certain models (like NiN), the singleton spatial dimensions
# need to be explicitly squeezed, since they're not broadcast-able
# in TensorFlow's NHWC ordering (unlike Caffe's NCHW).
if input_shape[1] == 1 and input_shape[2] == 1:
input = tf.squeeze(input, squeeze_dims=[1, 2])
else:
raise ValueError('Rank 2 tensor input expected for softmax!')
return tf.nn.softmax(input, name=name)
@layer
def batch_normalization(self, input, name, is_training, activation_fn=None, scale=True):
with tf.variable_scope(name) as scope:
output = slim.batch_norm(
input,
activation_fn=activation_fn,
is_training=is_training,
updates_collections=None,
scale=scale,
scope=scope)
return output
@layer
def dropout(self, input, keep_prob, name):
keep = 1 - self.use_dropout + (self.use_dropout * keep_prob)
return tf.nn.dropout(input, keep, name=name)
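# Note on the dropout layer above: self.use_dropout acts as a 0/1 switch, so the
# effective keep probability is keep_prob when use_dropout is 1 (training) and 1.0
# (dropout disabled) when it is fed as 0.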
class Model(Network):
def setup(self, is_training, num_classes):
'''Network definition.
Args:
is_training: whether to update the running mean and variance of the batch normalisation layer.
If the batch size is small, it is better to keep the running mean and variance of
the pre-trained model frozen.
num_classes: number of classes to predict (including background).
'''
last_both_name = 'bn4a_branch2c'
(self.feed('data')
.conv(7, 7, 64, 2, 2, biased=False, relu=False, name='conv1')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn_conv1')
.max_pool(3, 3, 2, 2, name='pool1')
.conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2a_branch1')
.batch_normalization(is_training=is_training, activation_fn=None, name='bn2a_branch1'))
(self.feed('pool1')
.conv(1, 1, 64, 1, 1, biased=False, relu=False, name='res2a_branch2a')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn2a_branch2a')
.conv(3, 3, 64, 1, 1, biased=False, relu=False, name='res2a_branch2b')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn2a_branch2b')
.conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2a_branch2c')
.batch_normalization(is_training=is_training, activation_fn=None, name='bn2a_branch2c'))
(self.feed('bn2a_branch1',
'bn2a_branch2c')
.add(name='res2a')
.relu(name='res2a_relu')
.conv(1, 1, 64, 1, 1, biased=False, relu=False, name='res2b_branch2a')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn2b_branch2a')
.conv(3, 3, 64, 1, 1, biased=False, relu=False, name='res2b_branch2b')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn2b_branch2b')
.conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2b_branch2c')
.batch_normalization(is_training=is_training, activation_fn=None, name='bn2b_branch2c'))
(self.feed('res2a_relu',
'bn2b_branch2c')
.add(name='res2b')
.relu(name='res2b_relu')
.conv(1, 1, 64, 1, 1, biased=False, relu=False, name='res2c_branch2a')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn2c_branch2a')
.conv(3, 3, 64, 1, 1, biased=False, relu=False, name='res2c_branch2b')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn2c_branch2b')
.conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2c_branch2c')
.batch_normalization(is_training=is_training, activation_fn=None, name='bn2c_branch2c'))
(self.feed('res2a_relu')
.conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4a_branch2a')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn4a_branch2a')
.atrous_conv(3, 3, 256, 2, padding='SAME', biased=False, relu=False, name='res4a_branch2b')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn4a_branch2b')
.conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res4a_branch2c')
.batch_normalization(is_training=is_training, activation_fn=None, name='bn4a_branch2c'))
(self.feed('bn4a_branch2c')
.relu(name='res4a_relu')
.conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b1_branch2a')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn4b1_branch2a')
.atrous_conv(3, 3, 256, 2, padding='SAME', biased=False, relu=False, name='res4b1_branch2b')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn4b1_branch2b')
.conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res4b1_branch2c')
.batch_normalization(is_training=is_training, activation_fn=None, name='bn4b1_branch2c'))
# segmentation
(self.feed('res4a_relu',
'bn4b1_branch2c')
.add(name='res5b')
.relu(name='res5b_relu')
.conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res5c_branch2a')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bn5c_branch2a')
.atrous_conv(3, 3, 256, 4, padding='SAME', biased=False, relu=False, name='res5c_branch2b')
.batch_normalization(activation_fn=tf.nn.relu, name='bn5c_branch2b', is_training=is_training)
.conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res5c_branch2c')
.batch_normalization(is_training=is_training, activation_fn=None, name='bn5c_branch2c'))
(self.feed(last_both_name)
.conv(3, 3, 64, 1, 1, biased=False, relu=False, name='bc_' + 'con1')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bc_' + 'bn1')
.conv(3, 3, 64, 1, 1, biased=False, relu=False, name='bc_' + 'con2')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bc_' + 'bn2')
.conv(3, 3, 128, 1, 1, biased=False, relu=False, name='bc_' + 'con3')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bc_' + 'bn3')
.conv(3, 3, 128, 1, 1, biased=False, relu=False, name='bc_' + 'con4')
.batch_normalization(is_training=is_training, activation_fn=tf.nn.relu, name='bc_' + 'bn4')
.conv(1, 1, 512, 1, 1, biased=False, relu=False, name='bc_' + 'con5')
.batch_normalization(is_training=is_training, activation_fn=None, name='bc_' + 'bn5')
)
(self.feed('res5b_relu',
'bn5c_branch2c', 'bc_' + 'bn5')
.add(name='res5c')
.relu(name='res5c_relu')
.atrous_conv(3, 3, num_classes, 6, padding='SAME', relu=False, name='fc1_voc12_c0'))
(self.feed('res5c_relu')
.atrous_conv(3, 3, num_classes, 12, padding='SAME', relu=False, name='fc1_voc12_c1'))
(self.feed('res5c_relu')
.atrous_conv(3, 3, num_classes, 18, padding='SAME', relu=False, name='fc1_voc12_c2'))
(self.feed('res5c_relu')
.atrous_conv(3, 3, num_classes, 24, padding='SAME', relu=False, name='fc1_voc12_c3'))
(self.feed('fc1_voc12_c0',
'fc1_voc12_c1',
'fc1_voc12_c2',
'fc1_voc12_c3')
.add(name='fc1_voc12'))
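# The four dilated branches above (rates 6, 12, 18 and 24) are summed into 'fc1_voc12';
# the rate-12 branch 'fc1_voc12_c1' is the op named in the OOM traceback above.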
def prepare_label(input_batch, new_size, num_classes, one_hot=True):
"""Resize masks and perform one-hot encoding.
Args:
input_batch: input tensor of shape [batch_size H W 1].
new_size: a tensor with new height and width.
num_classes: number of classes to predict (including background).
one_hot: whether to perform one-hot encoding.
Returns:
Outputs a tensor of shape [batch_size h w num_classes]
with last dimension comprised of 0's and 1's only.
"""
with tf.name_scope('label_encode'):
input_batch = tf.image.resize_nearest_neighbor(input_batch,
new_size) # as labels are integer numbers, need to use NN interp.
input_batch = tf.squeeze(input_batch, squeeze_dims=[3]) # reducing the channel dimension.
if one_hot:
input_batch = tf.one_hot(input_batch, depth=num_classes)
return input_batch
def train(args):
"""Create the model and start the training."""
h, w = map(int, args.input_size.split(','))
tf.set_random_seed(args.random_seed)
num_classes = 14
batch_size = 10
image_ph = tf.placeholder(tf.float32, (10, h, w, 3))
seg_ph = tf.placeholder(tf.uint8, (10, h, w, 1))
print('begin test')
# Create network.
is_training = tf.placeholder(tf.bool)
net = Model({'data': image_ph},
is_training=is_training, num_classes=num_classes)
print('Model size:{:,d}'.format(np.sum([np.prod(v.get_shape().as_list()) for v in tf.trainable_variables()])))
# Predictions.
raw_output = net.layers['fc1_voc12']
####################################################################################################################
all_trainable = [v for v in tf.trainable_variables() if 'beta' not in v.name and 'gamma' not in v.name]
####################################################################################################################
# Calculate loss
# Segmentation loss.
raw_prediction = tf.reshape(raw_output, [-1, num_classes])
label_proc = prepare_label(seg_ph, tf.stack(raw_output.get_shape()[1:3]), num_classes=num_classes,
one_hot=False) # [batch_size, h, w]
raw_gt = tf.reshape(label_proc, [-1, ])
indices = tf.squeeze(tf.where(tf.less_equal(raw_gt, num_classes - 1)), 1)
gt = tf.cast(tf.gather(raw_gt, indices), tf.int32)
prediction = tf.gather(raw_prediction, indices)
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=prediction, labels=gt)
l2_losses_seg = [args.weight_decay * tf.nn.l2_loss(v) for v in all_trainable if 'weights' in v.name]
reduced_loss = tf.reduce_mean(loss)
reduced_loss_with_l2 = reduced_loss + tf.add_n(l2_losses_seg)
# Define loss and optimisation parameters.
base_lr = tf.constant(args.learning_rate)
step_ph = tf.placeholder(dtype=tf.float32, shape=())
learning_rate = tf.scalar_mul(base_lr, tf.pow((1 - step_ph / args.num_steps), args.power))
opt = tf.train.MomentumOptimizer(learning_rate, args.momentum)
grads = tf.gradients(reduced_loss_with_l2, tf.trainable_variables())
train_op = opt.apply_gradients(zip(grads, tf.trainable_variables()))
print('Model size:{:,d}'.format(np.sum([np.prod(v.get_shape().as_list()) for v in tf.global_variables()])))
########################################################################################################################
# When is_training is fed as False, the memory fraction has to be raised to 0.5 to avoid OOM.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.4)
config = tf.ConfigProto(gpu_options=gpu_options)
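# A possible mitigation (a sketch only, not verified to resolve the is_training issue):
# letting TensorFlow grow its GPU allocation on demand instead of capping it at a fixed
# fraction, e.g.
#   gpu_options = tf.GPUOptions(allow_growth=True)
#   config = tf.ConfigProto(gpu_options=gpu_options)
# can reduce fragmentation-related OOM errors in some setups.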
with tf.Session(config=config) as sess:
init_op = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
sess.run(init_op)
for step in range(11):
image = np.random.random_integers(0, 255, 10 * h * w * 3).reshape((10, h, w, 3)).astype(np.float32)
seg = np.random.random_integers(0, 14, 10 * h * w * 1).reshape((10, h, w, 1))
# Feeding is_training=False makes the program raise an OOM error.
feed_dict = {step_ph: step, is_training: False, image_ph: image, seg_ph: seg}
sess.run(train_op, feed_dict=feed_dict)
def get_arguments():
"""Parse all the arguments provided from the CLI.
Returns:
The parsed arguments as an argparse.Namespace.
"""
import argparse
parser = argparse.ArgumentParser(description="DeepLab-ResNet Network")
parser.add_argument("--batch-size", type=int, default=BATCH_SIZE,
help="Number of images sent to the network in one step.")
parser.add_argument("--ignore-label", type=int, default=IGNORE_LABEL,
help="The index of the label to ignore during the training.")
parser.add_argument("--input-size", type=str, default=INPUT_SIZE,
help="Comma-separated string with height and width of images.")
parser.add_argument("--learning-rate", type=float, default=LEARNING_RATE,
help="Base learning rate for training with polynomial decay.")
parser.add_argument("--momentum", type=float, default=MOMENTUM,
help="Momentum component of the optimiser.")
parser.add_argument("--num-classes", type=int, default=NUM_CLASSES,
help="Number of classes to predict (including background).")
parser.add_argument("--num-steps", type=int, default=NUM_STEPS,
help="Number of training steps.")
parser.add_argument("--power", type=float, default=POWER,
help="Decay parameter to compute the learning rate.")
parser.add_argument("--random-mirror", action="store_true",
help="Whether to randomly mirror the inputs during the training.")
parser.add_argument("--random-scale", action="store_true",
help="Whether to randomly scale the inputs during the training.")
parser.add_argument("--random-seed", type=int, default=RANDOM_SEED,
help="Random seed to have reproducible results.")
parser.add_argument("--restore-model", action="store_true",
help="Whether to restore model from restore-from")
parser.add_argument("--save-num-images", type=int, default=SAVE_NUM_IMAGES,
help="How many images to save.")
parser.add_argument("--save-pred-every", type=int, default=SAVE_PRED_EVERY,
help="Save summaries and checkpoint every often.")
parser.add_argument("--weight-decay", type=float, default=WEIGHT_DECAY,
help="Regularisation parameter for L2-loss.")
parser.add_argument("--model", type=str, default='model_joint4',
help="Model path")
parser.add_argument("--train-list", type=str, default='',
help="train file list contains image file list")
return parser.parse_args()
if __name__ == '__main__':
args = get_arguments()
print(args)
train(args)
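For reference, the Namespace printed at the top of the log matches the defaults defined in get_arguments(), so that run is consistent with simply invoking the script (named bug1.py in the traceback) without any flags:

python bug1.py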