BIRS Workshop Lecture Videos
Recent Software Development for Big Data Analysis
Yan, Jun
A Partial Review of Software for Big Data Statistics
Big data brings challenges to even simple statistical analysis because of barriers in computer memory and computing time. The memory barrier is usually handled by a database connection that extracts data in chunks for processing. The computing time barrier is handled by parallel computing, often accelerated by graphics processing units. In this partial review, we summarize the open source R packages that break the computer memory limit, such as biglm and bigmemory, as well as the academic version of the commercial Revolution R, and R packages that support parallel computing. Commercial software products will also be sketched for completeness.
Joint work with Ming-Hui Chen, Elizabeth Schifano, Chun Wang, and Jing Wu of the University of Connecticut.
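The chunk-wise idea mentioned in the abstract can be illustrated with a minimal sketch, which is not taken from the talk. It fits a simple linear regression by accumulating sufficient statistics one chunk at a time, so the full data set never has to sit in memory. The function names `read_in_chunks` and `fit_line_chunked` are hypothetical, the in-memory generator stands in for a database cursor, and the running-sums approach is a simplification chosen for brevity: biglm, as far as I know, updates a QR decomposition incrementally instead, which is numerically more stable.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[float, float]  # one observation: (x, y)


def read_in_chunks(rows: Iterable[Row], chunk_size: int) -> Iterator[List[Row]]:
    """Yield lists of at most chunk_size rows (stand-in for a database cursor)."""
    chunk: List[Row] = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # final, possibly shorter, chunk
        yield chunk


def fit_line_chunked(rows: Iterable[Row], chunk_size: int = 1000) -> Tuple[float, float]:
    """Fit y = intercept + slope * x, holding only one chunk in memory at a time."""
    n = 0
    sum_x = sum_y = sum_xx = sum_xy = 0.0
    for chunk in read_in_chunks(rows, chunk_size):
        # Update the running sums; the chunk can be discarded afterwards.
        for x, y in chunk:
            n += 1
            sum_x += x
            sum_y += y
            sum_xx += x * x
            sum_xy += x * y
    denom = n * sum_xx - sum_x * sum_x
    if n == 0 or denom == 0.0:
        raise ValueError("need at least two distinct x values")
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


if __name__ == "__main__":
    # Synthetic data generated lazily, so no full copy is ever materialised.
    data = ((float(x), 2.0 + 3.0 * x) for x in range(10))
    print(fit_line_chunked(data, chunk_size=3))
```

Because the running sums are additive, the same accumulation could be split across workers and the partial sums combined at the end, which is one way the memory and computing time barriers described above can be addressed together.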
Attribution-NonCommercial-NoDerivs 2.5 Canada