UBC Faculty Research and Publications

Metadata for Discovery : Disciplinary Standards and Crosswalk Progress Report Leahey, Amber; Barsky, Eugene; Brosz, John; Garnett, Alex; Gray, Vincent; Hafner, Joseph; Handren, Kara; Harrigan, Amanda; Lacroix, Christian; Pascoe, Julienne; Sahadath, Catelynne; Savard, Dany; Senior, Andrew; Towell, Barbara; Wilson, Lee 2017-09


Full Text

www.portagenetwork.ca

Metadata for Discovery: Disciplinary Standards and Crosswalk Progress Report

Prepared by the Portage Network, Metadata Working Group of the Data Discovery Expert Group, on behalf of the Canadian Association of Research Libraries (CARL)

Amber Leahey (Scholars Portal, Chair)
Eugene Barsky (University of British Columbia)
John Brosz (University of Calgary)
Alex Garnett (Simon Fraser University)
Vincent Gray (Western University)
Joseph Hafner (McGill University)
Kara Handren (Scholars Portal)
Amanda Harrigan (University of Alberta)
Christian Lacroix (Université Laval)
Julienne Pascoe (Canadiana.org)
Catelynne Sahadath (University of Ottawa)
Dany Savard (York University)
Andrew Senior (McGill University)
Barbara Towell (University of British Columbia)
Lee Wilson (Portage/ACENET)

SEPTEMBER 2017

Portage Network
Canadian Association of Research Libraries
portage@carl-abrc.ca

PORTAGE NETWORK / ASSOCIATION OF CANADIAN RESEARCH LIBRARIES

Table of Contents

Background
Progress Report - Phase 1: 2017
    List of Standards
    FRDR Discovery Profile
    Standards Evaluation and Mapping / Crosswalk
    Recommendations for Improved Discovery in FRDR
Next Steps - Phase 2 2017-2018
Appendix 1

Background

The Portage Data Discovery Expert Group works across a wide range of disciplines to support research data creators and curators in planning, producing, and managing descriptive metadata for effective discovery and reuse. The group facilitates discussions about data discovery in Canada to promote the use of metadata standards for research data that support both machine-to-machine and human-to-machine discovery activities.

The Data Discovery Metadata Working Group (DDMWG) [1] works to define the scope of metadata standards and considerations for developing the Canadian Federated Research Data Repository (FRDR) [2], a national data discovery and repository service (in beta stage as of July 2017). The main purpose of the DDMWG is to identify metadata standards used by Canadian data repositories and to develop detailed crosswalks that make research data discoverable. The work includes gathering a list of descriptive metadata standards, evaluating, at a granular level, the metadata elements in use across disciplines and repositories, and making recommendations for a core set of elements for discovery in FRDR. The work scoped out by this group is intended to benefit not only the FRDR project but also anyone involved in the management of metadata for discovery systems and research data repositories. Some discussions have also focused on outreach with other national initiatives and organizations such as Research Data Canada [3], Linked Open Data in Libraries, Archives, and Museums [4], and the Association of Research Libraries (ARL) SHARE Project [5], for broader understanding and knowledge sharing about open research data discovery systems and tools.
Moving forward, closer ties to relevant national communities and linked data initiatives may be required.

[1] Portage Data Discovery Expert Group - Metadata Working Group: https://portagenetwork.ca/working-with-portage/network-of-expertise/data-discovery/call-for-participation/metadata-wg
[2] More information about this national infrastructure project to support discovery of research is provided online through CARL Portage (https://portagenetwork.ca/frdr-dfdr)
[3] Research Data Canada: https://www.rdc-drc.ca
[4] LODLAM: http://lodlam.net
[5] ARL SHARE Project: http://www.share-research.org

Progress Report - Phase 1: 2017

List of Standards

The DDMWG formed in February 2017 and started evaluating metadata standards used by a variety of Canadian research data repositories, including Open Data Canada, Ocean Networks Canada, and Canadian Dataverses [6]. Because similar work has been done elsewhere, the group evaluated existing disciplinary metadata standards mappings, crosswalks, data models, ontologies, and similar resources for research data. These were shared and discussed via our community meeting calls and e-mail listserv. Since work on the FRDR had already begun several months before the formation of the DDMWG, a recommendation for a data model was determined to be out of scope for the group (at least in the initial stages of our work). The group focused on gathering standards, including looking at repositories listed in the CISTI National Gateway to Research Data registry [7] and the registry of global repositories Re3data.org [8], to further identify metadata standards in use across disciplinary repositories.
Ultimately, the rich and varied expertise within the group (spanning metadata for the social sciences, geosciences, health sciences, cultural heritage, and library sectors, among others) led to a list of over twenty disciplinary metadata standards for further evaluation (see Figure 1).

Figure 1 - List of General and Disciplinary Metadata Standards (considered)

Metadata Standard | General/Discipline | Organization | URL
Audio-MD | | Library of Congress | https://www.loc.gov/standards/amdvmd
CIDOC | | ICOM International Committee for Documentation | http://network.icom.museum/cidoc
Darwin Core | Biodiversity information | Biodiversity Information Standards (TDWG) | http://rs.tdwg.org/dwc/terms/simple/#simpledwcasxml
Data Documentation Initiative (Codebook and Lifecycle versions) | General / Social Science | DDI Alliance | https://www.ddialliance.org
Data Tag Suite (DATS) | General / Health Sciences | BioCADDIE | https://biocaddie.org
DataCite | General, issuing DOIs | DataCite | https://www.datacite.org
EAD3 | “International metadata transmission standard for hierarchical descriptions of archival records.” | Society of American Archivists | https://www2.archivists.org/sites/all/files/TagLibrary-VersionEAD3.pdf
Ecological Metadata Language (EML) | Ecology | The Knowledge Network for Biocomplexity | https://knb.ecoinformatics.org/#external//emlparser/docs/index.html
FGDC (CSDGM) | Geographic | Federal Geographic Data Committee | https://www.fgdc.gov/metadata/csdgm-standard
ISAAR / EAC-CPF | “General rules for the standardization of archival descriptions of records creators and the context of records creation.” | International Council on Archives | http://www.ica.org/sites/default/files/CBPS_Guidelines_ISAAR_Second-edition_EN.pdf
ISAD(G) | General guidance for the preparation of archival descriptions. | International Council on Archives | http://www.ica.org/en/isadg-general-international-standard-archival-description-second-edition
ISO 19115 (NAP) | Geographic | Natural Resources Canada | http://nap.geogratis.gc.ca/metadata/napMetadata-eng.html
MARC 21 | Bibliographic records description. | Library of Congress | https://www.loc.gov/marc/marcdocz.html
MODS | Bibliographic element set | Library of Congress | http://www.loc.gov/standards/mods
NetCDF CF Metadata Conventions 1.6 | Climate and Forecast / Geospatial | Climate and Forecast Conventions | http://cfconventions.org
Open Data Canada Profile (DCAT) | General | W3C | https://www.w3.org/TR/vocab-dcat
Protocol Data Element Definitions | Health Sciences. Used to describe interventional studies (clinical trials) and observational studies | U.S. National Institutes of Health | https://clinicaltrials.gov
RAD | General. Description of archives. | Canadian Committee on Archival Description | http://www.cdncouncilarchives.ca/archdesrules.html
Sensor Model Language (SensorML) | Multidisciplinary / Geoscience | Open Geospatial Consortium | http://www.opengeospatial.org/standards/sensorml
Video-MD | | Library of Congress | https://www.loc.gov/standards/amdvmd
VRA-CORE | General. Used for the “description of works of visual culture as well as the images that document them.” | Library of Congress | https://www.loc.gov/standards/vracore

[6] Dataverse Project, installations map: https://dataverse.org
[7] https://dr-dn.cisti-icist.nrc-cnrc.gc.ca/eng/home/collection/Gateway%20to%20Research%20Data
[8] http://www.re3data.org

Working from the definition of research data [9] provided by the Data Discovery Collections Development Working Group (DDCDWG) [10], a related sub-group of the Portage DDEG, the DDMWG decided to focus on just those standards that describe digital research data. Some mapping work was completed for the additional metadata standards listed above; however, this report focuses on research data standards.
Working group members signed up for standards and provided detailed field-level notes and analysis in a shared spreadsheet. As the standards covered a range of disciplines and data types, such as social science, health and life science, and geoscience, it was important that the group took the time to review and learn about each of the disciplinary standards before any metadata mapping or crosswalking began. In many cases, members had deep knowledge of and experience with a particular standard, making for very interesting discussion and cross-domain knowledge exchange.

In some cases, it was straightforward to gather full field-level metadata for the different standards, while in other cases it required additional investigation, including retrieval of sample metadata from known repositories, standards organizations' websites, and even conceptual documents referencing the metadata elements. In all, we ended up with thirteen complete sets of descriptive metadata standards to evaluate and then crosswalk. A simple Dublin Core set (model) was chosen for the FRDR data model [11] before the initial WG formed, so this was the common standard used for the mapping.

[9] Definition: Data that are used as primary sources to support technical or scientific enquiry, research, scholarship, or artistic activity, and that are used as evidence in the research process and/or are commonly accepted in the research community as necessary to validate research findings and results (based on the CASRAI definition http://dictionary.casrai.org/Research_data).
[10] https://portagenetwork.ca/wp-content/uploads/2017/04/DDEG-CollectionsWG-TOR-EN.pdf
[11] This was chosen with the understanding that any domain metadata that could be harvested natively (or provided by depositors) would always be preserved and indexed in the FRDR backend with its original namespace, so that discovery would not be limited to Dublin Core; rather, Dublin Core is used as a minimum baseline.

FRDR Discovery Profile

The FRDR baseline Discovery Profile is based on Simple Dublin Core and is defined using the Open Archives Initiative (OAI) Dublin Core (OAI_DC) standard, which is widely used for harvesting metadata from digital repositories. Some elements from the DataCite Schema are also used to define geographic information relating to the data, such as place or location. Together these elements make up the FRDR Metadata Profile.

The full profile contains 18 elements that are searchable in FRDR:

dc:title | dc:date | dc:relation
dc:creator | dc:type | dc:coverage
dc:subject | dc:format | dc:rights
dc:description | dc:identifier | DataCite_geolocationPlace
dc:publisher | dc:source | DataCite_geolocationPoint
dc:contributor | dc:language | DataCite_geolocationBox

FRDR contains a metadata harvesting system (see Figure 2) that works in the backend to harvest metadata from research data repositories that use a variety of disciplinary metadata standards. This is primarily done using open metadata APIs such as CKAN and OAI-PMH [12]. Custom harvesters rely on the conceptual metadata crosswalk to the FRDR Discovery Profile and a JSON schematic mapping [13] in order to display consistent metadata for searches conducted in the FRDR discovery platform.

[12] https://www.openarchives.org/pmh
[13] OAI JSON schema mappings stored in GitHub: https://github.com/axfelix/globus_oai

Figure 2 - FRDR Harvesting [14]

In the initial discussions about the mapping to the FRDR Discovery Profile, the group had several concerns about how each member would map disciplinary standards to a simple Dublin Core set of elements. The first concern was consistency in the intellectual mapping process across all standards, since individual members of the group were performing different mappings. The second concern was the interpretation of the Dublin Core fields and the way they would be used in the FRDR discovery interface. This included discussions about access links back to the original data provider / repository, contributor definitions and roles, and relationships and linkages between research data and other related resources. Figure 3 provides information about how certain Dublin Core elements were interpreted and used by the working group members.

[14] Figure reproduced with permission from Compute Canada.

Figure 3 - Interpretation and Application of Dublin Core Elements for Mapping

Dublin Core Element | Interpretation and Application
dc:contributor | A contributor is defined as a person or organization that has contributed either directly to the intellectual content or to the dissemination of and access to the resource (e.g. distributor role, as defined in some standards). Repeatable.
dc:coverage | Coverage is used to define the temporal coverage or time period that the dataset covers or occurs within. It can include other types of coverage not defined here, but not spatial coverage, which is managed using a set of DataCite GeoLocation elements in conjunction with the FRDR Discovery Profile.
dc:source | Source is defined as the original source metadata that has been presented for discovery and reuse in the FRDR repository. This is a mandatory field that can be derived and also auto-generated as part of the harvesting / metadata exchange process. The field must be a URL that resolves to a unique resource location, so that the system can point users to the original source metadata. Additional note: This could be the same as the dc:identifier field in some cases; however, dc:source must be web resolvable.
dc:relation | Relations include references and citations, in either structured or unstructured form, to related materials, publications, documentation, studies, etc.

Standards Evaluation and Mapping / Crosswalk

First, the group developed criteria to codify the elements to assist with the mapping process. A colour coding scheme was used during the evaluation of elements and was entered directly into the metadata spreadsheet. The scheme identified elements thought to be closely related to Dublin Core, unique identifiers or related-resource linkages that may provide links to sources and other related studies, and elements considered to be discipline-specific.
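As a rough illustration of how a harvested record could be checked against the reading of dc:source given in Figure 3, the following Python sketch parses a Simple Dublin Core (oai_dc) record and tests that dc:source is present and looks like a web URL. The sample record, its values, and the function names parse_oai_dc and check_source are assumptions made for illustration; they are not taken from the FRDR harvester, and the check is syntactic only (it does not try to resolve the URL).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace of the Dublin Core element set used inside oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample record; all values and the URL are placeholders.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example dataset</dc:title>
  <dc:contributor>Example University</dc:contributor>
  <dc:contributor>Example Repository (distributor)</dc:contributor>
  <dc:source>https://repository.example.org/dataset/123</dc:source>
</oai_dc:dc>"""


def parse_oai_dc(xml_text):
    """Collect every dc:* child into a dict of lists (elements are repeatable)."""
    root = ET.fromstring(xml_text)
    prefix = "{" + DC_NS + "}"
    record = {}
    for child in root:
        if child.tag.startswith(prefix) and child.text:
            name = "dc:" + child.tag[len(prefix):]
            record.setdefault(name, []).append(child.text.strip())
    return record


def check_source(record):
    """Return a list of problems with dc:source (mandatory, must be a web URL)."""
    values = record.get("dc:source", [])
    if not values:
        return ["dc:source is missing"]
    problems = []
    for value in values:
        parsed = urlparse(value)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("dc:source is not a web URL: " + value)
    return problems


record = parse_oai_dc(SAMPLE)
print(record["dc:contributor"])  # both contributors are kept, since the element is repeatable
print(check_source(record))      # an empty list means no problems were found
```

Keeping every element as a list reflects the "Repeatable" note on dc:contributor; a production harvester would presumably also confirm that the dc:source URL actually resolves.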
Figure 4 - DATS Metadata Mapping to Dublin Core (example selection)

FRDR-MD (OAI_Dublin Core) http://www.openarchives.org/OAI/2.0/oai_dc.xsd | Refinement | DATS
dc:title | | entity:Dataset property:title; "title" : "Recurrent somatic mutations in POLR2A define a distinct subset of meningiomas [RNA-seq]"
dc:title | Alternative | entity:Dataset property:alternateIdentifiers
dc:creator | | entity:Dataset property:creators
dc:subject | | entity:Dataset property:keywords; "keywords" : "functional genomics"
dc:description | Abstract | entity:Dataset property:description; "description" : "RNA polymerase II mediates the transcription of all protein-coding genes in eukaryotic cells, a process that is fundamental to life..."
dc:publisher | | entity:Person property:fullName or entity:Organization property:name
dc:contributor | Distributor | entity: DataRepository property:name
dc:contributor | Contact | entity:Person property:fullName
dc:date | | entity:Dataset property:dates (The type of date is specified in the dateType field, following the DataCite practice.); "dateModified" : "09-12-2016"
dc:type | | entity:Dataset property:type; "types" : "gene expression"

Legend (colour coding): Close mapping to Dublin Core; Linkage potential; Discipline specific

Evaluating the thirteen data standards required discussion among group members to understand the inherent differences between the standards, such as the granularity and discipline-specific language used to define elements. One problem the group identified was weighing the benefits of including, with some degree of flexibility, parts of various standards that did not map to Dublin Core. For example, if a repository uses the DDI metadata standard (which provides descriptive fields down to the dataset variable and value level), it would be a shame to lose this information in the discovery interface.
At the same time, it is challenging to establish a standard model for discovery that fits all disciplines and to assume that such metadata will always be included. Variations in the metadata received would make searches on fields incomplete and potentially misleading.

Next, after the initial mapping had been performed for each of the standards, the group combined these mappings into a super-mapping that provides a high-level crosswalk of all the disciplinary and general metadata standards to the FRDR Discovery Profile (see Figure 5; see Appendix 1 for the full-size version). This high-level crosswalk provides a comparative view of the conceptual mapping across disciplinary standards. It is helpful for determining which elements are core across all standards, and how we may fill in the gaps or develop a flexible approach for displaying metadata across disciplines in a single discovery interface.

Figure 5 - High-level Disciplinary Metadata Crosswalk (FRDR) [15]

[15] See Appendix 1 for full-sized version.
FRDR-MD (OAI_Dublin Core) http://www.openarchives.org/OAI/2.0/oai_dc.xsd | Datacite | DCATS | Open Data Canada | Darwin Core | EML | DATS | Protocol Data Element Definitions | SensorML | CF 1.6 | DDI 3.2 | DDI 2.5 | FGDC | ISO19115
dc:title | title | title | 10.65 resource_name_en | dwc:datasetName | title | entity:Dataset property:title; entity:Dataset property:alternateIdentifiers | Official Title; Brief Title | <gml:name> | title; long_name | <r:Title>;<r:SubTitle>;<r:AlternateTitle> | <titl>;<subTitl>;<altTitl> | title | <gmd:title>
dc:creator | creator | | 10.10 creator | dwc:recordedBy; dwc:identifiedBy | creator | entity:Dataset property:creators | Overall Official | <sml:contacts>; <sml:contact>; <sml:ResponsibleParty> | institution | <r:Creator>;<r:ResearcherID> | <AuthEnty> | originator | <gmd:citedResponsibleParty> <gmd:role> "PrincipalInvestigator" or "Author"
dc:subject | subject | dct:subject; dcat:theme | 10.92 topic_category; 10.87 subject | dwc:genus; dwc:subgenus | keyword | entity:Dataset property:keywords | Conditions; Keywords | <sml:classification>; <sml:classified>; <sml:keywords> | | <r:Topical Coverage>;<r:Subject>;<r:Keyword> | <keyword>;<topcClas> | subject | <gmd:topicCategory>
dc:description | description | | 10.16 notes_en, 10.17 notes_fr | dc:description | abstract | entity:Dataset property:description | Study Purpose | <gml:description> | comment; cell_methods; source; history | <r:Abstract> | <abstract> | abstract; purpose; progress; currentness reference | <gmd:abstract>
dc:publisher | publisher | dct:publisher | 10.56 owner_org; 10.59 org_title_at_publication_en; 10.60 org_title_at_publication_fr | dwc:institutionCode | publisher | entity:Person property:fullName; entity:Organization property:name | N/A | | | <r:Publisher> | <producer> | publisher | <gmd:citedResponsibleParty> <gmd:role> "Publisher"
dc:contributor | contributor | | 10.8 contributor_en, 10.9 contributor_fr | dc:contributor | metadataProvider | entity:Person property:fullName; entity: DataRepository property:name | Collaborators | | | <r:Contributor> | <distrbtr>;<othId> | datacred | <gmd:citedResponsibleParty> <gmd:role> "Collaborator" or "Distributor"
dc:date | publicationyear | dct:issued | 10.11 date_captured; 10.69 resource_date_published | dwc:eventDate; dwc:dateIdentified; dcterms:modified | pubdate | entity:Dataset property:dates | First Received; Last Updated; Last Changed Date | <sml:validTime> | | <PublicationDate>;<r:Date>;<r:SimpleDate>;<r:StartDate>;<r:EndDate> | <prodDate>;<collDate>; <distDate>;<depDate> | date | <gmd:date>
dc:type | resourcetype | dcat:mediaType | 10.76 resource_type | dcterms:type | dataset,citation,protocol,software | entity:dataset property:type | Available Study Data/Documents:Type [from list : Individual Participant Data Set, Study Protocol, Statistical Analysis Plan, Informed Consent Form, Clinical Study Report, Analytic Code, Other (specify)] | | featureType; char; byte; short; int; float; real; double | <dc:type>;<r:KindOfData> | <dataKind> | resdesc, digform | <gmd:spatialRepresentationType>
dc:format | size | dct:format | 10.70 resource_format | | physical | entity:DatasetDistribution property:formats | N/A | <sml:characteristics name="generalProperties"> | .nc (NetCDF file extension) | <pd:FileFormat> | <fileType>;<format> | digform, formname | <gmd:fileType>, <gmd:resourceFormat>
dc:identifier | identifier | dcat:identifier | 10.71 resource_unique_identifier, 10.41 id | dwc:collectionCode; dwc:catalogNumber; dwc:recordNumber; dwc:organismID | packageId, alternateIdentifier, & URL to EML document | entity:Dataset property:identifier; entity:Dataset property:alternateidentifier; entity:Dataset property:relatedidentifier | NCT ID | <gml:identifier> | standard_name | <r:UserID>;<r:InternationalIdentifier> | <IDNo> | | <gmd:fileIdentifier>
dc:source | | dcat:downloadURL | 10.77 resource_url, 10.18 digital_object_identifier | | dataSource | entity:Access property:landingpage | | | | <dc:source> | <sources> | srccite | <gmd:source>
dc:language | language | dcat:language | 10.62 resource_language | dcterms:language | language | N/A | N/A | xml:lang="en" | | <r:Language> | | language | <gmd:language>
dc:relation | RelatedIdentifier | dcat:landingPage | 10.27 program_page_url_en; 10.28 program_page_url_fr, 10.64 resource_related_relationship, 10.63 resource_record_type; 10.67 resource_url | dwc:associatedReferences | citation | entity:Dataset property:primarypublications; entity:Publication; entity:Dataset property:relatedidentifier | Publication Citation | <sml:documentation> | references | <r:Relationship> | <othrStdyMat> <otherMat> | crossref |
dc:coverage | | dct:temporal | 10.88 time_period_coverage_end; 10.89 time_period_coverage_start | dwc:eventDate | coverage, temporalCoverage | entity:Dataset property:dates | Study Start Date; Primary Completion Date | | calendar; (T); timeSeries | <r:TemporalCoverage>;<r:SpatialCoverage>;<r:country> | <timePrd>;<geogCover>;<nation> | temporal | <gmd:extent>
dc:rights | rights | dct:licence | 10.35 license_id | dcterms:rightsHolder | intellectualRights | entity: Dataset property: licenses | Available Study Data/Documents: Comments | <sml:legalConstraints> | | <r:Copyright> | <copyright> | access constraints; use constraints | <gmd:resourceContraints>
DataCite_geolocationPlace | geolocationPlace | dct:spatial | 10.24 geographic_region | dwc:locationID; dwc:continent; dwc:country; dwc:stateProvince; dwc:locality | geographicCoverage | entity: DataAcquisition (subclass of Activity) property:locations | City; State/Province; Country | <sml:location> | region | <r:GeographicLocation>;<r:SpatialCoverage>; <r:country> | <geogCover>;<nation> | place keyword | <gmd:keyword> <gmd:MD_KeywordTypeCode> "place"
DataCite_geolocationPoint | geolocationPoint | | | dwc:verbatimCoordinates | | N/A | N/A | <sml:point> | axis; coordinates; unit; (Z); (Y); (X) | <r:Point> | <point> | horizsys | <gmd:EX_Extent>
DataCite_geolocationBox | geolocationBox | | | | | N/A | N/A | | bounds; cell_measures | <r:GeographicBoundary>;<r:BoundingBox> | <geoBndBox> | bounding | <gmd:EX_GeographicBoundingBox>

The Phase 1 recommendations from the DDCDWG are to include ten Canadian data repositories in FRDR for discovery. These repositories cover a range of academic disciplines.
In some cases, further evaluation of the repositories' standards and subjects will be required to assess whether the initial metadata mappings and crosswalk provided by the DDMWG will be sufficient for harvesting from these ten repositories.

Figure 6 - Repository Shortlist with Subject and Standard

Research Data Repository | Top-level Subject (using re3data.org) | Standard(s) | Included in FRDR Beta (Y/N)
Canadian Opinion Research Archive (http://www.queensu.ca/cora) | Humanities and Social Sciences | OAI / DDI-Codebook | No
Ocean Networks Canada (http://www.oceannetworks.ca) | Life Sciences - Natural Sciences - Engineering Science | N/A - still investigating | No
Polar Data Catalogue (https://www.polardata.ca) | Humanities and Social Sciences - Life Sciences | OAI / FGDC | Yes
Canadian Dataverses (http://dataverse.scholarsportal.info, https://dataverse.library.ualberta.ca, http://dvn.library.ubc.ca/dvn) | Humanities and Social Sciences - Life Sciences - Natural Sciences - Engineering Sciences | OAI / DC / DDI | Yes
Mouse Atlas of Gene Expression (http://www.mouseatlas.org/mouseatlas_index_html) | Life Sciences | N/A | No
World Ozone and Ultraviolet Radiation Data Centre (http://woudc.org) | Natural Sciences | OAI / ISO 19115 | No
Ocean Tracking Network (OTN) (http://oceantrackingnetwork.org) | Life Sciences - Natural Sciences | CSW | No
Hakai Institute (https://www.hakai.org) | Natural Sciences | N/A | No
BC Conservation Data Centre (http://www2.gov.bc.ca/gov/content/environment/plants-animals-ecosystems/conservation-data-centre) | Life Sciences | OAI / CKAN API | Yes - through Open Data B.C.

Mapping of new repositories will continue into the fall of 2017, as we may need to determine whether a repository uses a custom schema. If so, additional mapping to the FRDR Profile will be required.
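To illustrate how a conceptual crosswalk like the one in Figure 5 might be applied during harvesting, the sketch below maps a few DDI 2.5 (Codebook) tags to FRDR Discovery Profile elements. The tag-to-element pairs come from the DDI 2.5 column of Figure 5; the function name crosswalk, the ddi25 prefix, and the sample values are hypothetical, and the actual FRDR harvesters may work differently. Unmapped tags are kept under their original names, in the spirit of footnote 11.

```python
# Partial crosswalk from DDI 2.5 (Codebook) tags to FRDR Discovery Profile
# elements, copied from the DDI 2.5 column of Figure 5.
DDI25_TO_FRDR = {
    "titl": "dc:title",
    "subTitl": "dc:title",
    "altTitl": "dc:title",
    "AuthEnty": "dc:creator",
    "keyword": "dc:subject",
    "topcClas": "dc:subject",
    "abstract": "dc:description",
    "producer": "dc:publisher",
    "distrbtr": "dc:contributor",
    "IDNo": "dc:identifier",
}


def crosswalk(source_fields, mapping, namespace):
    """Map (tag, value) pairs to profile elements; keep unmapped tags natively."""
    profile = {}
    native = {}
    for tag, value in source_fields:
        target = mapping.get(tag)
        if target is not None:
            profile.setdefault(target, []).append(value)
        else:
            # Preserve discipline-specific metadata under its original name.
            native.setdefault(namespace + ":" + tag, []).append(value)
    return profile, native


# Hypothetical harvested fields from a DDI Codebook record.
harvested = [
    ("titl", "Example Survey 2016"),
    ("AuthEnty", "Example, A."),
    ("keyword", "elections"),
    ("topcClas", "Political science"),
    ("var", "Q1: vote intention"),
]

profile, native = crosswalk(harvested, DDI25_TO_FRDR, "ddi25")
print(profile)
print(native)
```

Returning the unmapped fields separately keeps variable-level metadata available for a richer, discipline-specific display without forcing it into Dublin Core.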
Recommendations for Improved Discovery in FRDR

In order to support the granularity and discipline-specific elements required for rich discovery in the FRDR discovery interface, the DDMWG recommends that additional support be developed in the FRDR system.

Recommendations for improved data discovery in FRDR:
●   Support for subject faceting / browsing of search results with faceting
●   Flexible and granular disciplinary metadata support (e.g. dataset variable-level)
●   Enhanced linkages to related resources and source datasets

There may be several ways to achieve these recommendations for improved discovery in FRDR, some of which may require additional investigation and research by the DDMWG and others, in close collaboration with the FRDR development team. So far, the DDMWG has evaluated approaches that may provide solutions for improved discovery in FRDR, including the use of a standard subject classification, such as the re3data.org subjects accessible via an open metadata API, for assigning subjects to all repositories' metadata (for those listed in re3data.org) [16], and/or the use of data services such as the OCLC FAST Service [17] for programmatic access to Library of Congress Subject Headings. Nevertheless, there are significant challenges associated with maintaining additional enhanced metadata for external metadata resources. Enhanced metadata is unlikely to be sustainable, since each repository may use different standards for assigning subjects to metadata / data sources, thus requiring additional and resource-intensive maintenance of subject mappings to FRDR core subjects (if adopted).

The group has discussed the need for FRDR to support 'Disciplinary Views' for different standards in order to be flexible and offer up metadata elements to the search interface as needed.
For example, some repositories include rich metadata down to the dataset's variable / element level, which can be very helpful to researchers searching for particular data elements. Storing original source metadata in its entirety and developing flexible metadata viewers to be called upon from the FRDR discovery interface will be necessary to support this kind of discovery in the system.

[16] Re3data.org API: http://www.re3data.org/api/doc
[17] OCLC FAST Search: http://fast.oclc.org/searchfast

Linking data to related publications and resources on the web is a growing area of concern for libraries, archives, and researchers. It is important to present research data within the context of other related research outputs available in repositories, and with citations and resource links on the web. Discovery of research data across a variety of systems will be a requirement for understanding the full research lifecycle and tracking research outputs for evaluation of tenure and other academic achievement. To support a model where research data are linked to related research outputs, including publications, structured linkages must exist between resources on the web. This includes structured approaches to including linkages to publications, related studies, and source data within adopted data models and discovery frameworks. Currently, FRDR does not support a linked data model such as the Dublin Core Resource Description Framework (RDF) schema [18], which describes the semantic relationships between elements defined in Dublin Core and to other related resources and ontologies found on the web.

The group also discussed the use of other well-documented linked data models such as DCATS [19] and the Portland Common Data Model [20].
While it may take considerable resources to achieve a comprehensive linked data model for research data in FRDR, we recommend considering worthwhile partnerships with other organizations aiming to achieve similar goals. The ARL-OSF SHARE Project is one such organization that has demonstrated interest in working closely with Portage and, beginning in 2017, will be addressing linked data with its current data model review.

[18] DC-RDF: http://dublincore.org/documents/dc-rdf
[19] DCATS: https://www.w3.org/TR/vocab-dcat
[20] PCDM: https://github.com/duraspace/pcdm/wiki

Next Steps - Phase 2 2017-2018

Over the next few months and into early 2018, the DDMWG will work closely with the FRDR development team to provide support with metadata mapping and crosswalking for the remaining repositories for inclusion. This will include identifying the metadata standards in use where no standard has yet been identified. We will also work toward a better understanding of the best approaches for providing subject faceting and browsing in FRDR, granular and flexible discovery of disciplinary metadata in the system, and approaches to linked data, as they take shape within the broader metadata community.
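Subject faceting, mentioned above as an area for further work, comes down to counting how many records carry each subject value. The following Python sketch shows the idea on records shaped like the FRDR Discovery Profile; the function name subject_facets, the case-insensitive normalization, and the sample records are assumptions for illustration, not a description of the FRDR implementation.

```python
from collections import Counter


def subject_facets(records):
    """Count how many records carry each dc:subject value (case-insensitive)."""
    counts = Counter()
    for record in records:
        # Use a set so a record counts once per subject even if a value repeats.
        subjects = {s.strip().lower() for s in record.get("dc:subject", []) if s.strip()}
        counts.update(subjects)
    return counts


# Hypothetical records harvested from different repositories.
records = [
    {"dc:title": ["Example A"], "dc:subject": ["Oceanography", "Biodiversity"]},
    {"dc:title": ["Example B"], "dc:subject": ["oceanography"]},
    {"dc:title": ["Example C"]},  # no subject supplied by the source repository
]

facets = subject_facets(records)
for subject, count in facets.most_common():
    print(subject, count)
```

The third record shows the sustainability concern raised in the recommendations: records without subjects, or with subjects drawn from different vocabularies, fall outside the facets unless subject mappings are maintained.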
Appendix 1
High-level Disciplinary Metadata Crosswalk (FRDR)

FRDR-MD (OAI_Dublin Core) http://www.openarchives.org/OAI/2.0/oai_dc.xsd | DataCite | DCATS | Open Data Canada | Darwin Core | EML | DATS | Protocol Data Element Definitions | SensorML | CF 1.6 | DDI 3.2 | DDI 2.5 | FGDC | ISO19115
dc:title | title | title | 10.65 resource_name_en | dwc:datasetName | title | entity:Dataset property:title; entity:Dataset property:alternateIdentifiers | Official Title; Brief Title | <gml:name> | title; long_name | <r:Title>; <r:SubTitle>; <r:AlternateTitle> | <titl>; <subTitl>; <altTitl> | title | <gmd:title>
dc:creator | creator |  | 10.10 creator | dwc:recordedBy; dwc:identifiedBy | creator | entity:Dataset property:creators | Overall Official | <sml:contacts>; <sml:contact>; <sml:ResponsibleParty> | institution | <r:Creator>; <r:ResearcherID> | <AuthEnty> | originator | <gmd:citedResponsibleParty> <gmd:role> "PrincipalInvestigator" or "Author"
dc:subject | subject | dct:subject; dcat:theme | 10.92 topic_category, 10.87 subject | dwc:genus; dwc:subgenus | keyword | entity:Dataset property:keywords | Conditions; Keywords | <sml:classification>; <sml:classified>; <sml:keywords> |  | <r:Topical Coverage>; <r:Subject>; <r:Keyword> | <keyword>; <topcClas> | subject | <gmd:topicCategory>
dc:description | description |  | 10.16 notes_en, 10.17 notes_fr | dc:description | abstract | entity:Dataset property:description | Study Purpose | <gml:description> | comment; cell_methods; source; history | <r:Abstract> | <abstract> | abstract; purpose; progress; currentness reference | <gmd:abstract>
dc:publisher | publisher | dct:publisher | 10.56 owner_org, 10.59 org_title_at_publication_en, 10.60 org_title_at_publication_fr | dwc:institutionCode | publisher | entity:Person property:fullName; entity:Organization property:name | N/A |  |  | <r:Publisher> | <producer> | publisher | <gmd:citedResponsibleParty> <gmd:role> "Publisher"
dc:contributor | contributor |  | 10.8 contributor_en, 10.9 contributor_fr | dc:contributor | metadataProvider | entity:Person property:fullName; entity:DataRepository property:name | Collaborators |  |  | <r:Contributor> | <distrbtr>; <othId> | datacred | <gmd:citedResponsibleParty> <gmd:role> "Collaborator" or "Distributor"
dc:date | publicationyear | dct:issued | 10.11 date_captured, 10.69 resource_date_published | dwc:eventDate; dwc:dateIdentified; dcterms:modified | pubdate | entity:Dataset property:dates | First Received; Last Updated; Last Changed Date | <sml:validTime> |  | <PublicationDate>; <r:Date>; <r:SimpleDate>; <r:StartDate>; <r:EndDate> | <prodDate>; <collDate>; <distDate>; <depDate> | date | <gmd:date>
dc:type | resourcetype | dcat:mediaType | 10.76 resource_type | dcterms:type | dataset, citation, protocol, software | entity:dataset property:type | Available Study Data/Documents: Type [from list: Individual Participant Data Set, Study Protocol, Statistical Analysis Plan, Informed Consent Form, Clinical Study Report, Analytic Code, Other (specify)] |  | featureType; char; byte; short; int; float; real; double | <dc:type>; <r:KindOfData> | <dataKind> | resdesc, digform | <gmd:spatialRepresentationType>
dc:format | size | dct:format | 10.70 resource_format |  | physical | entity:DatasetDistribution property:formats | N/A | <sml:characteristics name="generalProperties"> | .nc (NetCDF file extension) | <pd:FileFormat> | <fileType>; <format> | digform, formname | <gmd:fileType>, <gmd:resourceFormat>
dc:identifier | identifier | dcat:identifier | 10.71 resource_unique_identifier, 10.41 id | dwc:collectionCode; dwc:catalogNumber; dwc:recordNumber; dwc:organismID | packageId, alternateIdentifier, & URL to EML document | entity:Dataset property:identifier; entity:Dataset property:alternateidentifier; entity:Dataset property:relatedidentifier | NCT ID | <gml:identifier> | standard_name | <r:UserID>; <r:InternationalIdentifier> | <IDNo> |  | <gmd:fileIdentifier>
dc:source |  | dcat:downloadURL | 10.77 resource_url, 10.18 digital_object_identifier |  | dataSource | entity:Access property:landingpage |  |  |  | <dc:source> | <sources> | srccite | <gmd:source>
dc:language | language | dcat:language | 10.62 resource_language | dcterms:language | language | N/A | N/A | xml:lang="en" |  | <r:Language> |  | language | <gmd:language>
dc:relation | RelatedIdentifier | dcat:landingPage | 10.27 program_page_url_en, 10.28 program_page_url_fr, 10.64 resource_related_relationship, 10.63 resource_record_type, 10.67 resource_url | dwc:associatedReferences | citation | entity:Dataset property:primarypublications; entity:Publication; entity:Dataset property:relatedidentifier | Publication Citation | <sml:documentation> | references | <r:Relationship> | <othrStdyMat> <otherMat> | crossref |
dc:coverage |  | dct:temporal | 10.88 time_period_coverage_end, 10.89 time_period_coverage_start | dwc:eventDate | coverage, temporalCoverage | entity:Dataset property:dates | Study Start Date; Primary Completion Date |  | calendar; (T); timeSeries | <r:TemporalCoverage>; <r:SpatialCoverage>; <r:country> | <timePrd>; <geogCover>; <nation> | temporal | <gmd:extent>
dc:rights | rights | dct:licence | 10.35 license_id | dcterms:rightsHolder | intellectualRights | entity:Dataset property:licenses | Available Study Data/Documents: Comments | <sml:legalConstraints> |  | <r:Copyright> | <copyright> | access constraints; use constraints | <gmd:resourceConstraints>
DataCite_geolocationPlace | geolocationPlace | dct:spatial | 10.24 geographic_region | dwc:locationID; dwc:continent; dwc:country; dwc:stateProvince; dwc:locality | geographicCoverage | entity:DataAcquisition (subclass of Activity) property:locations | City; State/Province; Country | <sml:location> | region | <r:GeographicLocation>; <r:SpatialCoverage>; <r:country> | <geogCover>; <nation> | place keyword | <gmd:keyword> <gmd:MD_KeywordTypeCode> "place"
DataCite_geolocationPoint | geolocationPoint |  |  | dwc:verbatimCoordinates |  | N/A | N/A | <sml:point> | axis; coordinates; unit; (Z); (Y); (X) | <r:Point> | <point> | horizsys | <gmd:EX_Extent>
DataCite_geolocationBox | geolocationBox |  |  |  |  | N/A | N/A |  | bounds; cell_measures | <r:GeographicBoundary>; <r:BoundingBox> | <geoBndBox> | bounding | <gmd:EX_GeographicBoundingBox>
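
To show how a crosswalk like the one above might be applied, here is a minimal sketch that maps a source record onto FRDR's Dublin Core elements. The mapping entries come from the DDI 2.5 mappings in Appendix 1. The flattened record layout, the function name, and the sample values are assumptions for illustration only and do not describe the FRDR harvester.

```python
from collections import defaultdict

# Subset of the DDI 2.5 mappings from Appendix 1 (element name -> FRDR element).
DDI25_TO_FRDR = {
    "titl": "dc:title",
    "subTitl": "dc:title",
    "altTitl": "dc:title",
    "AuthEnty": "dc:creator",
    "keyword": "dc:subject",
    "topcClas": "dc:subject",
    "abstract": "dc:description",
    "producer": "dc:publisher",
    "IDNo": "dc:identifier",
}


def crosswalk(record, mapping):
    """Map a flattened source record onto FRDR elements.

    record: dict of source element name -> list of values (assumed layout).
    Returns (mapped record, sorted list of source elements with no mapping).
    """
    target = defaultdict(list)
    unmapped = []
    for element, values in record.items():
        frdr_element = mapping.get(element)
        if frdr_element is None:
            unmapped.append(element)  # kept so nothing is dropped silently
            continue
        target[frdr_element].extend(values)
    return dict(target), sorted(unmapped)


# Hypothetical DDI 2.5 record, flattened to element name -> values.
sample = {
    "titl": ["Example Survey, 2016"],
    "AuthEnty": ["Doe, Jane"],
    "keyword": ["health"],
    "topcClas": ["surveys"],
    "var": ["Age of respondent"],
}
frdr_record, unmapped = crosswalk(sample, DDI25_TO_FRDR)
print(frdr_record)
print(unmapped)
```

Returning the unmapped elements separately reflects the point made earlier in the report: variable-level detail has no place in the high-level profile, so it has to be kept in the original source metadata if it is to remain discoverable.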

