jonathan-taylor/a4311d4c0f662c4e97f99475f389ef90
Last active March 7, 2020 07:54
Comparison to regular sparse group LASSO
{ | |
"cells": [ | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"Loading required package: rms\n", | |
"Loading required package: Hmisc\n", | |
"Loading required package: lattice\n", | |
"Loading required package: survival\n", | |
"Loading required package: Formula\n", | |
"Loading required package: ggplot2\n", | |
"\n", | |
"Attaching package: ‘Hmisc’\n", | |
"\n", | |
"The following objects are masked from ‘package:base’:\n", | |
"\n", | |
" format.pval, units\n", | |
"\n", | |
"Loading required package: SparseM\n", | |
"\n", | |
"Attaching package: ‘SparseM’\n", | |
"\n", | |
"The following object is masked from ‘package:base’:\n", | |
"\n", | |
" backsolve\n", | |
"\n", | |
"Loading required package: mgcv\n", | |
"Loading required package: nlme\n", | |
"This is mgcv 1.8-28. For overview type 'help(\"mgcv-package\")'.\n", | |
"Warning message in FUN(X[[i]], ...):\n", | |
"“946 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“992 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“989 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1001 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1039 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“987 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1053 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1050 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“905 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“989 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1023 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1064 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1123 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1072 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1046 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1152 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n" | |
] | |
}, | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1092 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1083 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1006 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”Warning message in FUN(X[[i]], ...):\n", | |
"“1149 observations have drawn durations\n", | |
" at the minimum or maximum possible value. Generating coefficients\n", | |
" and other quantities of interest are unlikely to be returned\n", | |
" by models due to truncation. Consider making user-supplied coefficients\n", | |
" smaller, making T bigger, or decreasing the variance of the X variables.”" | |
] | |
} | |
], | |
"source": [ | |
"#install.packages('coxed', repos='http://cloud.r-project.org')\n", | |
"library(coxed)\n", | |
"set.seed(0)\n", | |
"K = 20 # number of studies\n", | |
"p = 5000 # number of features\n", | |
"X_list = list()\n", | |
"censor_list = list()\n", | |
"y_list = list()\n", | |
"# one row of true coefficients per study\n", | |
"true_beta = matrix(rnorm(K*p),K,p)\n", | |
"\n", | |
"# simulate one survival data set per study; study i has 1000+10*i rows\n", | |
"for (i in 1:K){\n", | |
" N = (1000+10*i)\n", | |
" X = matrix(as.numeric(rbinom(N*p, 1, 0.5)), N, p)\n", | |
" simdata = sim.survdata(T=120, num.data.frames=1, X = X, beta=true_beta[i,])\n", | |
" df = simdata$data\n", | |
" # df$failed is TRUE when the event was observed, FALSE when right-censored\n", | |
" censor_list[[i]] = as.numeric(df$failed)\n", | |
" X_list[[i]] = X\n", | |
" y_list[[i]] = df$y\n", | |
"}\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"save(X_list, y_list, censor_list, file='instance.RData')\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Loss\n", | |
"\n", | |
"So, is our loss\n", | |
"$$\n", | |
"\\sum_{k=1}^K \\frac{1}{{\\tt nrow}(X_k)} \\text{CoxLoss}(X_k\\beta[,k], Y_k ,\\delta_k)\n", | |
"$$\n", | |
"with penalty\n", | |
"$$\n", | |
"\\lambda \\left(\\alpha \\|\\beta\\|_1 + (1 - \\alpha) \\sum_{j=1}^p \\sqrt{K} \\|\\beta[j,]\\|_2\\right)\n", | |
"$$\n", | |
"?\n", | |
"\n", | |
"And is your code using a grid of 10 values, equally spaced on the log scale, from 1 to 0.1?" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"jupytext": { | |
"cell_metadata_filter": "all,-slideshow" | |
}, | |
"kernelspec": { | |
"display_name": "R", | |
"language": "R", | |
"name": "ir" | |
}, | |
"language_info": { | |
"codemirror_mode": "r", | |
"file_extension": ".r", | |
"mimetype": "text/x-r-source", | |
"name": "R", | |
"pygments_lexer": "r", | |
"version": "3.6.1" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |
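To make the question in the Loss cell concrete, here is a minimal R sketch of the penalty and of the grid as I read them. It is not taken from any package: the function name `sparse_group_penalty`, the `p x K` layout of `beta` (row `j` holds feature `j` across the `K` studies), and the `sqrt(K)` group weight are all assumptions, and the grid is only assumed to be for `lambda`.

```r
# Hypothetical helper: sparse group lasso penalty for a p x K matrix `beta`.
# Assumed layout: row j = coefficients of feature j across the K studies.
sparse_group_penalty <- function(beta, lambda, alpha) {
  K <- ncol(beta)
  l1_term <- sum(abs(beta))                           # ||beta||_1 over all entries
  group_term <- sqrt(K) * sum(sqrt(rowSums(beta^2)))  # sum_j sqrt(K) * ||beta[j,]||_2
  lambda * (alpha * l1_term + (1 - alpha) * group_term)
}

# 10 values equally spaced on the log scale, from 1 down to 0.1
# (assumed here to be the lambda grid).
lambda_grid <- exp(seq(log(1), log(0.1), length.out = 10))

# Tiny example: p = 2 features, K = 2 studies; the second feature is all zero.
beta <- matrix(c(3, 4,
                 0, 0), nrow = 2, byrow = TRUE)
penalty_value <- sparse_group_penalty(beta, lambda = 1, alpha = 0.5)
```

With `alpha = 1` this reduces to the plain lasso penalty and with `alpha = 0` to the group lasso over features, which is a quick way to check that a fitted path matches the intended objective.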