ASHRAE Global Thermal Comfort Database II
Földváry Ličina, Veronika; Cheung, Toby; Zhang, Hui; de Dear, Richard; Parkinson, Thomas; Arens, Edward; Chun, Chungyoon; Schiavon, Stefano; Luo, Maohui; Brager, Gail; Li, Peixian; Kaam, Soazig
Recognizing the value of open-source research databases in advancing the art and science of HVAC, in 2014 the ASHRAE Global Thermal Comfort Database II project was launched under the leadership of University of California at Berkeley’s Center for the Built Environment and The University of Sydney’s Indoor Environmental Quality (IEQ) Laboratory.
The exercise began with a systematic collection and harmonization of raw data from the last two decades of thermal comfort field studies around the world. The final database comprises field studies conducted between 1995 and 2015, with contributors releasing their raw data to the project for wider dissemination to the thermal comfort research community. After the quality-assurance process, there were 81 846 rows of data pairing subjective comfort votes with objective instrumental measurements of thermal comfort parameters. An additional 25 617 rows of data from the original ASHRAE RP-884 database are included, bringing the total number of entries to 107 463.
The database is intended to support diverse inquiries about thermal comfort in field settings. To achieve this goal, two web-based tools were developed to accompany the database:
1. Interactive visualization tool: provides a user-friendly interface for researchers and practitioners to explore the large volume of data in ASHRAE Global Thermal Comfort Database II.
2. Query builder tool: allows users to filter the database according to a set of selection criteria, and then download the results of that query as a generic comma-separated-values (.csv) file.
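The kind of filtering the query builder performs can also be reproduced offline on a downloaded file. The following is a minimal sketch using only Python's standard library; the column names (`Country`, `Season`, `Thermal sensation`), the sample rows and the helper functions are hypothetical placeholders, not the database's actual headers, data or tooling.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded .csv export.
# Column names and values are illustrative assumptions only.
SAMPLE_CSV = """Country,Season,Thermal sensation
Australia,Summer,1
Australia,Winter,-1
India,Summer,2
"""


def filter_rows(rows, criteria):
    """Return the rows whose values equal every (column, value) pair in criteria."""
    return [
        row for row in rows
        if all(row.get(col) == val for col, val in criteria.items())
    ]


def rows_to_csv(rows, fieldnames):
    """Serialize rows back to CSV text, starting with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
all_rows = list(reader)

# Selection criteria are combined with a logical AND.
selected = filter_rows(all_rows, {"Country": "Australia", "Season": "Summer"})
result_csv = rows_to_csv(selected, reader.fieldnames)
print(result_csv)
```

Values are compared as strings here, which keeps the sketch simple; numeric range criteria would need the relevant columns converted first.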
In order to ensure that the quality of the database would permit end-users to conduct robust hypothesis testing, the team built the data collection methodology on specific requirements, as follows:
1. Data needed to come from field experiments rather than climate chamber research, so that it represented research conducted in “real” buildings occupied by “real” people doing their normal day-to-day activities, rather than paid college students sitting in a controlled indoor environment of a climate chamber.
2. Both instrumental (indoor climatic) and subjective (questionnaire) data were required, such that they were recorded in the same space at the same time.
3. The database needed to be built up from the raw data files generated by the original researchers, instead of their processed or published findings.
4. The raw data needed to come with a supporting codebook explaining the coding conventions used by the data contributor, to allow harmonization with the standardized data formatting within the database.
5. Data must have been published either in a peer-reviewed journal or conference paper.
All datasets from individual studies were subject to a stringent quality assurance process (Figure 1) before being assimilated into the database. As a final validation, the research team first compared each raw dataset with its related publication, provided by the data contributor, to guard against transmission errors. Systematic quality control of each study was then performed to ensure that records within the database were reasonable. First, distributions of each variable were visualized to identify aberrant values. Then, cross-plots between two variables (e.g. thermal sensation and thermal comfort) were used to check for incorrectly coded data. Finally, a few rows from each study were randomly selected to verify consistency between the original dataset and the standardized database. Since the data came from multiple independent studies, not every record included all of the thermal comfort variables. Where data were missing, that particular range of cells was filled with a null value.

Usage notes
The dataset is provided as a comma-separated values (.csv) file using UTF-8 character encoding. The first row contains human-readable column headers. Each row represents one individual’s questionnaire responses and the associated instrumental measurements, thermal index values and outdoor meteorological observations where available. Full details can be found in the related work.
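Reading a file with this layout might look like the following sketch. It assumes that null cells appear either as empty strings or as the token `NA`; this description does not state which null representation the file uses, so `NULL_TOKENS` should be adjusted after inspecting the data. The column names, sample rows and helper functions are hypothetical.

```python
import csv
import io

# Assumed representations of a null cell; the actual token used in the
# database file is not stated in this description.
NULL_TOKENS = {"", "NA"}

# Hypothetical two-column excerpt; real headers and values will differ.
SAMPLE_CSV = """Air temperature (C),Thermal sensation
23.5,0
NA,1
26.1,
"""


def load_records(text_stream):
    """Read CSV rows into dicts keyed by header, mapping null tokens to None."""
    reader = csv.DictReader(text_stream)
    return [
        {col: (None if val in NULL_TOKENS else val) for col, val in row.items()}
        for row in reader
    ]


def completeness(records):
    """Count the non-null entries in each column."""
    counts = {}
    for record in records:
        for col, val in record.items():
            counts[col] = counts.get(col, 0) + (val is not None)
    return counts


# For a real file, replace the StringIO with:
#   open(path, encoding="utf-8", newline="")
records = load_records(io.StringIO(SAMPLE_CSV))
print(completeness(records))
```

Because not every contributing study recorded every variable, a per-column count of non-null entries such as this is a reasonable first step before selecting variables for analysis.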