- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- BIRS Workshop Lecture Videos /
- On Data Reduction of Big Data
Open Collections
BIRS Workshop Lecture Videos
BIRS Workshop Lecture Videos
On Data Reduction of Big Data Yang, Min
Description
Extraordinary amounts of data are being produced in many branches of science. Proven statistical methods are no longer applicable with extraordinary large data sets due to computational limitations. A critical step in Big Data analysis is data reduction. In this presentation, I will review some existing approaches in data reduction and introduce a new strategy called information-based optimal subdata selection (IBOSS). Under linear and nonlinear models set up, theoretical results and extensive simulations demonstrate that the IBOSS approach is superior to other approaches in term of parameter estimation and predictive performance. The tradeoff between accuracy and computation cost is also investigated. When models are mis-specified, the performance of different data reduction methods are compared through simulation studies. Some ongoing research work as well as some open questions will also be discussed.
Item Metadata
Title |
On Data Reduction of Big Data
|
Creator | |
Publisher |
Banff International Research Station for Mathematical Innovation and Discovery
|
Date Issued |
2017-08-10T11:33
|
Description |
Extraordinary amounts of data are being produced in many branches of science. Proven statistical methods are no longer applicable with extraordinary large data sets due to computational limitations. A critical step in Big Data analysis is data reduction. In this presentation, I will review some existing approaches in data reduction and introduce a new strategy called information-based optimal subdata selection (IBOSS). Under linear and nonlinear models set up, theoretical results and extensive simulations demonstrate that the IBOSS approach is superior to other approaches in term of parameter estimation and predictive performance. The tradeoff between accuracy and computation cost is also investigated. When models are mis-specified, the performance of different data reduction methods are compared through simulation studies. Some ongoing research work as well as some open questions will also be discussed.
|
Extent |
49 minutes
|
Subject | |
Type | |
File Format |
video/mp4
|
Language |
eng
|
Notes |
Author affiliation: University of Illinois at Chicago
|
Series | |
Date Available |
2018-02-07
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution-NonCommercial-NoDerivatives 4.0 International
|
DOI |
10.14288/1.0363438
|
URI | |
Affiliation | |
Peer Review Status |
Unreviewed
|
Scholarly Level |
Faculty
|
Rights URI | |
Aggregated Source Repository |
DSpace
|
Item Media
Item Citations and Data
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International