Open Collections
BIRS Workshop Lecture Videos
Selection of Variables and Functional Forms in Multivariable Analysis: Current Issues and Future Directions
Harrell, Frank
Description
This talk begins with a contrast of exploratory data analysis (à la Tukey) and formal analysis. Challenges of "too many variables and too few subjects" are briefly discussed in this context. The discussion turns to ways in which variable selection is misleading, contrasting feature selection with successful "kitchen sink" machine learning approaches. This leads to a statistical analogy of Maxwell's demon in which some of the information in the system is "stolen" by feature selection. An example in which the bootstrap is useful in quantifying the difficulty of the task will be shown; this involves getting confidence intervals for importance ranks for predictors. Instead of feature selection, pooled tests of overlapping predictors are advocated for assisting in model interpretation.
Some issues relating to fitting predictor functional form will be addressed, and the statistical advantages of pre-specifying knot locations in regression splines will be outlined. Many statistical analysts are unaware that modern methods for high-dimensional data such as lasso and elastic net frequently trade one set of problems for another, especially with respect to predictor transformations. This talk attempts to bring these issues more into the open, mentioning how a Bayesian might operate. Finally, some future directions in interaction modeling will be covered.
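As a rough illustration of the bootstrap idea mentioned in the description, the sketch below computes percentile intervals for the importance ranks of predictors. It is not taken from the talk. The simulated data, the ordinary least squares fit, the use of the absolute t statistic as the importance measure, and the function names `importance_ranks` and `bootstrap_rank_intervals` are all assumptions made for illustration only.

```python
import numpy as np


def importance_ranks(X, y):
    """Rank predictors by |t| from an OLS fit (rank 1 = most important).

    The |t| statistic is an assumed importance measure, chosen for simplicity.
    """
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - p - 1)           # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)          # coefficient covariance matrix
    t_abs = np.abs(beta[1:] / np.sqrt(np.diag(cov)[1:]))
    order = np.argsort(-t_abs)                     # most to least important
    ranks = np.empty(p, dtype=int)
    ranks[order] = np.arange(1, p + 1)
    return ranks


def bootstrap_rank_intervals(X, y, n_boot=200, level=0.95, seed=0):
    """Percentile bootstrap intervals for each predictor's importance rank."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    boot = np.empty((n_boot, p), dtype=int)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)           # resample rows with replacement
        boot[b] = importance_ranks(X[idx], y[idx])
    alpha = (1 - level) / 2
    lower = np.quantile(boot, alpha, axis=0)
    upper = np.quantile(boot, 1 - alpha, axis=0)
    return lower, upper


# Simulated example (hypothetical data): six predictors, two of them pure noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 6))
true_beta = np.array([2.0, 1.0, 0.5, 0.25, 0.0, 0.0])
y = X @ true_beta + rng.normal(size=100)

ranks = importance_ranks(X, y)
lower, upper = bootstrap_rank_intervals(X, y)
for j in range(X.shape[1]):
    print(f"x{j + 1}: rank {ranks[j]}, 95% interval [{lower[j]:.0f}, {upper[j]:.0f}]")
```

Under these assumptions, wide intervals for the weaker predictors would indicate how unstable an apparent ranking can be in a sample of this size, which is the kind of difficulty the description refers to.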
Item Metadata
Title | Selection of Variables and Functional Forms in Multivariable Analysis: Current Issues and Future Directions
Creator | Harrell, Frank
Publisher | Banff International Research Station for Mathematical Innovation and Discovery
Date Issued | 2016-07-04T16:49
Extent | 35 minutes
File Format | video/mp4
Language | eng
Notes | Author affiliation: Vanderbilt University
Date Available | 2017-01-03
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0340468
Peer Review Status | Unreviewed
Scholarly Level | Faculty
Aggregated Source Repository | DSpace