Created January 30, 2017 14:09
Batch Normalization
[01/30/17 12:34:39 INFO] Using GPU(s): 1
[01/30/17 12:34:39 INFO] Loading data from '../data/translate-train.t7'...
[01/30/17 12:34:44 INFO] * vocabulary size: source = 50004; target = 50004
[01/30/17 12:34:44 INFO] * additional features: source = 0; target = 0
[01/30/17 12:34:44 INFO] * maximum sequence length: source = 50; target = 51
[01/30/17 12:34:44 INFO] * number of training sentences: 100000
[01/30/17 12:34:44 INFO] * maximum batch size: 64
[01/30/17 12:34:44 INFO] Building model...
[01/30/17 12:34:48 INFO] * using input feeding
[01/30/17 12:34:48 INFO] Initializing parameters...
[01/30/17 12:34:50 INFO] * number of parameters: 84814004
[01/30/17 12:34:50 INFO] Preparing memory optimization...
[01/30/17 12:34:51 INFO] * sharing 69% of output/gradInput tensors memory between clones
[01/30/17 12:34:51 INFO] Start training...
[01/30/17 12:34:51 INFO]
[01/30/17 12:36:40 INFO] Epoch 1 ; Iteration 50/1588 ; Learning rate 1.0000 ; Source tokens/s 611 ; Perplexity 4847.73
[01/30/17 12:37:58 INFO] Epoch 1 ; Iteration 100/1588 ; Learning rate 1.0000 ; Source tokens/s 745 ; Perplexity 2854.90
[01/30/17 12:39:13 INFO] Epoch 1 ; Iteration 150/1588 ; Learning rate 1.0000 ; Source tokens/s 794 ; Perplexity 2136.63
[01/30/17 12:40:33 INFO] Epoch 1 ; Iteration 200/1588 ; Learning rate 1.0000 ; Source tokens/s 840 ; Perplexity 1737.48
[01/30/17 12:41:52 INFO] Epoch 1 ; Iteration 250/1588 ; Learning rate 1.0000 ; Source tokens/s 856 ; Perplexity 1491.10
[01/30/17 12:43:12 INFO] Epoch 1 ; Iteration 300/1588 ; Learning rate 1.0000 ; Source tokens/s 873 ; Perplexity 1312.14
[01/30/17 12:44:29 INFO] Epoch 1 ; Iteration 350/1588 ; Learning rate 1.0000 ; Source tokens/s 878 ; Perplexity 1191.59
[01/30/17 12:45:42 INFO] Epoch 1 ; Iteration 400/1588 ; Learning rate 1.0000 ; Source tokens/s 884 ; Perplexity 1097.34
[01/30/17 12:46:55 INFO] Epoch 1 ; Iteration 450/1588 ; Learning rate 1.0000 ; Source tokens/s 893 ; Perplexity 1021.58
[01/30/17 12:48:06 INFO] Epoch 1 ; Iteration 500/1588 ; Learning rate 1.0000 ; Source tokens/s 894 ; Perplexity 957.77
[01/30/17 12:49:22 INFO] Epoch 1 ; Iteration 550/1588 ; Learning rate 1.0000 ; Source tokens/s 899 ; Perplexity 902.44
[01/30/17 12:50:38 INFO] Epoch 1 ; Iteration 600/1588 ; Learning rate 1.0000 ; Source tokens/s 903 ; Perplexity 852.25
[01/30/17 12:51:50 INFO] Epoch 1 ; Iteration 650/1588 ; Learning rate 1.0000 ; Source tokens/s 905 ; Perplexity 807.01
[01/30/17 12:53:08 INFO] Epoch 1 ; Iteration 700/1588 ; Learning rate 1.0000 ; Source tokens/s 908 ; Perplexity 768.79
[01/30/17 12:54:31 INFO] Epoch 1 ; Iteration 750/1588 ; Learning rate 1.0000 ; Source tokens/s 914 ; Perplexity 732.96
[01/30/17 12:55:47 INFO] Epoch 1 ; Iteration 800/1588 ; Learning rate 1.0000 ; Source tokens/s 917 ; Perplexity 701.21
[01/30/17 12:56:57 INFO] Epoch 1 ; Iteration 850/1588 ; Learning rate 1.0000 ; Source tokens/s 916 ; Perplexity 676.49
[01/30/17 12:58:08 INFO] Epoch 1 ; Iteration 900/1588 ; Learning rate 1.0000 ; Source tokens/s 917 ; Perplexity 652.33
[01/30/17 12:59:21 INFO] Epoch 1 ; Iteration 950/1588 ; Learning rate 1.0000 ; Source tokens/s 918 ; Perplexity 630.01
[01/30/17 13:00:36 INFO] Epoch 1 ; Iteration 1000/1588 ; Learning rate 1.0000 ; Source tokens/s 917 ; Perplexity 610.12
[01/30/17 13:01:53 INFO] Epoch 1 ; Iteration 1050/1588 ; Learning rate 1.0000 ; Source tokens/s 918 ; Perplexity 590.27
[01/30/17 13:03:09 INFO] Epoch 1 ; Iteration 1100/1588 ; Learning rate 1.0000 ; Source tokens/s 918 ; Perplexity 573.21
[01/30/17 13:04:17 INFO] Epoch 1 ; Iteration 1150/1588 ; Learning rate 1.0000 ; Source tokens/s 920 ; Perplexity 557.00
[01/30/17 13:05:30 INFO] Epoch 1 ; Iteration 1200/1588 ; Learning rate 1.0000 ; Source tokens/s 919 ; Perplexity 541.43
[01/30/17 13:06:41 INFO] Epoch 1 ; Iteration 1250/1588 ; Learning rate 1.0000 ; Source tokens/s 918 ; Perplexity 526.93
[01/30/17 13:07:55 INFO] Epoch 1 ; Iteration 1300/1588 ; Learning rate 1.0000 ; Source tokens/s 918 ; Perplexity 513.11
[01/30/17 13:09:14 INFO] Epoch 1 ; Iteration 1350/1588 ; Learning rate 1.0000 ; Source tokens/s 919 ; Perplexity 499.66
[01/30/17 13:10:36 INFO] Epoch 1 ; Iteration 1400/1588 ; Learning rate 1.0000 ; Source tokens/s 921 ; Perplexity 486.83
[01/30/17 13:11:57 INFO] Epoch 1 ; Iteration 1450/1588 ; Learning rate 1.0000 ; Source tokens/s 924 ; Perplexity 474.97
[01/30/17 13:13:10 INFO] Epoch 1 ; Iteration 1500/1588 ; Learning rate 1.0000 ; Source tokens/s 925 ; Perplexity 464.14
[01/30/17 13:14:26 INFO] Epoch 1 ; Iteration 1550/1588 ; Learning rate 1.0000 ; Source tokens/s 925 ; Perplexity 453.61
[01/30/17 13:15:38 INFO] Validation perplexity: 234.03
[01/30/17 13:15:38 INFO] Saving checkpoint to 'model_epoch1_234.03.t7'...
[01/30/17 13:15:41 INFO]
[01/30/17 13:16:56 INFO] Epoch 2 ; Iteration 50/1588 ; Learning rate 1.0000 ; Source tokens/s 926 ; Perplexity 201.09
[01/30/17 13:18:09 INFO] Epoch 2 ; Iteration 100/1588 ; Learning rate 1.0000 ; Source tokens/s 918 ; Perplexity 201.03
[01/30/17 13:19:23 INFO] Epoch 2 ; Iteration 150/1588 ; Learning rate 1.0000 ; Source tokens/s 928 ; Perplexity 197.92
[01/30/17 13:20:35 INFO] Epoch 2 ; Iteration 200/1588 ; Learning rate 1.0000 ; Source tokens/s 921 ; Perplexity 196.31
[01/30/17 13:21:57 INFO] Epoch 2 ; Iteration 250/1588 ; Learning rate 1.0000 ; Source tokens/s 936 ; Perplexity 196.82
[01/30/17 13:23:11 INFO] Epoch 2 ; Iteration 300/1588 ; Learning rate 1.0000 ; Source tokens/s 933 ; Perplexity 194.64
[01/30/17 13:24:21 INFO] Epoch 2 ; Iteration 350/1588 ; Learning rate 1.0000 ; Source tokens/s 929 ; Perplexity 191.81
[01/30/17 13:25:41 INFO] Epoch 2 ; Iteration 400/1588 ; Learning rate 1.0000 ; Source tokens/s 933 ; Perplexity 190.83
[01/30/17 13:27:00 INFO] Epoch 2 ; Iteration 450/1588 ; Learning rate 1.0000 ; Source tokens/s 937 ; Perplexity 189.19
[01/30/17 13:28:13 INFO] Epoch 2 ; Iteration 500/1588 ; Learning rate 1.0000 ; Source tokens/s 939 ; Perplexity 187.49
[01/30/17 13:29:32 INFO] Epoch 2 ; Iteration 550/1588 ; Learning rate 1.0000 ; Source tokens/s 940 ; Perplexity 186.26
[01/30/17 13:30:49 INFO] Epoch 2 ; Iteration 600/1588 ; Learning rate 1.0000 ; Source tokens/s 943 ; Perplexity 185.08
[01/30/17 13:32:04 INFO] Epoch 2 ; Iteration 650/1588 ; Learning rate 1.0000 ; Source tokens/s 944 ; Perplexity 183.53
[01/30/17 13:33:20 INFO] Epoch 2 ; Iteration 700/1588 ; Learning rate 1.0000 ; Source tokens/s 945 ; Perplexity 181.81
[01/30/17 13:34:35 INFO] Epoch 2 ; Iteration 750/1588 ; Learning rate 1.0000 ; Source tokens/s 946 ; Perplexity 180.63
[01/30/17 13:35:50 INFO] Epoch 2 ; Iteration 800/1588 ; Learning rate 1.0000 ; Source tokens/s 945 ; Perplexity 178.76
[01/30/17 13:37:06 INFO] Epoch 2 ; Iteration 850/1588 ; Learning rate 1.0000 ; Source tokens/s 946 ; Perplexity 177.39
[01/30/17 13:38:21 INFO] Epoch 2 ; Iteration 900/1588 ; Learning rate 1.0000 ; Source tokens/s 946 ; Perplexity 175.86
[01/30/17 13:39:37 INFO] Epoch 2 ; Iteration 950/1588 ; Learning rate 1.0000 ; Source tokens/s 947 ; Perplexity 174.15
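
To compare this run against another, it can help to pull the per-iteration numbers out of the log. The following is a minimal Python sketch, not part of the original training tool. It assumes only the progress-line layout visible in the log above; the names `LINE_RE` and `parse_progress` are made up for illustration.

```python
import re

# Matches the per-iteration progress lines as they appear in the log above.
# The layout is inferred from this log only; other versions may differ.
LINE_RE = re.compile(
    r"Epoch (?P<epoch>\d+) ; Iteration (?P<it>\d+)/(?P<total>\d+) ; "
    r"Learning rate (?P<lr>[\d.]+) ; Source tokens/s (?P<tps>\d+) ; "
    r"Perplexity (?P<ppl>[\d.]+)"
)


def parse_progress(lines):
    """Return one dict per progress line; all other lines are skipped."""
    records = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # setup, validation, and checkpoint lines
        records.append({
            "epoch": int(m["epoch"]),
            "iteration": int(m["it"]),
            "total": int(m["total"]),
            "learning_rate": float(m["lr"]),
            "tokens_per_s": int(m["tps"]),
            "perplexity": float(m["ppl"]),
        })
    return records


# Three lines copied from the log above.
sample = [
    "[01/30/17 12:36:40 INFO] Epoch 1 ; Iteration 50/1588 ; Learning rate 1.0000 ; Source tokens/s 611 ; Perplexity 4847.73",
    "[01/30/17 13:15:38 INFO] Validation perplexity: 234.03",
    "[01/30/17 13:16:56 INFO] Epoch 2 ; Iteration 50/1588 ; Learning rate 1.0000 ; Source tokens/s 926 ; Perplexity 201.09",
]
records = parse_progress(sample)
for r in records:
    print(r["epoch"], r["iteration"], r["perplexity"])
```

The validation line is deliberately skipped because it does not carry an epoch or iteration field; a second pattern would be needed to capture it.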