Keynote event : the case for Open Data and eScience : establishing a university data management program. Choudhury, G. Sayeed. 2010-10-22

Full Text

The Case for Open Data and eScience – Establishing a University Data Management Program at Johns Hopkins
Sayeed Choudhury
sayeed@jhu.edu
BCRLG/Open Access Week
October 22, 2010

Vision

In the beginning…
• Digital Knowledge Center founded in 1997
• Mission specifically emphasized research and development
• Non-traditional manager, staff and culture
• Early grants from US National Science Foundation, Andrew W. Mellon Foundation and US Institute of Museum and Library Services

Early principles
• Automated systems instead of automation
• Emphasis on new processes that raised human involvement or intervention to higher level
• Engagement with new communities or researchers
• Diversity of funding sources including venture capital group and corporate

Creative Tension
• Cultural dissonance
• Challenges of managing R&D projects within operational environment
• Benefits of managing R&D projects within operational environment… service oriented R&D
• Gaining the respect of faculty and associated credibility

Initial Projects

Meanwhile…
• The faculty vanguards were pushing their own frontiers
• Growing interest in digital collections and services
• Little emphasis on infrastructure
• Initially inspired by specific disciplinary problems or needs

A Repository by Any Other Name…
• With funding from the Mellon Foundation, we conducted an analysis of DSpace, Fedora, and Digital Commons
• Locally, we deferred our specific choice while we conducted the analysis
• We engaged the community by gathering use cases
• Ultimately, we made a better choice – not without controversy

Data Flow (Levels of Data)
• Pixel data collected by telescope
• Sent to Fermilab for processing
• Beowulf cluster produces catalog
• Loaded in a SQL database

Data and Publication Curation
• Author
• Publisher
• Archive

JHU-based eResearch

“…not a rigid road map but principles of navigation. There is no one way to design cyberinfrastructure, but there are tools we can teach the designers to help them appreciate the true size of the solution space – which is often much larger than they may think, if they are tied into technical fixes for all problems.”

“…The natural path of industrialization: invention, propagation, adoption, control”
– Chris Anderson, Wired Magazine

Data Conservancy
• One of two current awards through the US National Science Foundation’s DataNet program
• 5 year, $20 million award
• Second phase of program could result in three additional awards forming DataNet federation

Data Conservancy partners

Data Curation
The Data Conservancy embraces a shared vision: data curation is a means to collect, organize, validate and preserve data so that scientists can find new ways to address the grand research challenges that face society.

OAIS Mapping to DC Architecture

Technological Results
• OAIS based architecture and PLANETS based data model
• Storage framework
• Ingested data from Sloan Digital Sky Survey and Dry Valleys Project
• Integration with services from:

Domain coverage/methods
• Multi-site user research methods are a blend of:
– Case study & domain comparisons
– Depth & breadth
– Local & global
Domains: Astronomy, Earth Sciences, Life Sciences, Social Sciences
UCAR: Task-based design and usability testing ⇒ Use cases, data requirements, system recommendations
UCLA: Ethnography, virtual ethnography, oral histories ⇒ Use cases, data requirements
UIUC: Interviews, Surveys, Worksheets, Content analysis ⇒ Curation requirements, taxonomy, metadata/provenance framework
Information science research

Educational Results
• Data Curation Summer Institute at Illinois
• New courses at Illinois and UCLA
• Illinois has a Data Curation Education Program
• Data scientist within Sheridan Libraries at Johns Hopkins

Sustainability
• Incorporate findings from Blue Ribbon Task Force on Sustainable Digital Preservation and Access
• Memoranda of Understanding with Astrophysical Research Consortium (and Walters Art Museum)
• Carey Business School capstone projects
• Data management plans
• “Business” partners
PBSJ.com

Lessons Learned – Innovation
• Innovation arises from chaos – Urgency is the mother of innovation
• One or a few individuals typically initiate innovation, but only an organizational commitment will foster and advance innovation
• Innovation requires courage, including acceptance of “failure”
• Innovation can lead to long-term, unanticipated results but those aren’t the drivers for initial activity

What allowed us to be innovative?
• Leadership being willing to support yet defer – “You are the expert”
• Trust between decision-makers and the doers
• Funding – provides the “release” time and validation
• Knowing when to lead and when to follow
• Being aware of global issues while being cognizant of local needs

Lessons Learned – Cultural
• Is there a single library culture?
• Perhaps even more important question: Is there a “correct” library culture?
• It’s probably more important to unlearn than it is to learn
• Human interoperability is a lot harder than machine interoperability
• Librarians can become the human dimension of infrastructure

Data Management Program
• Critical to identify needs and associated requirements
• Focus on service provision, but consider scientific data as the new special collections
• Essential to engage faculty or researcher champions who become ambassadors – Institute for Data Intensive Engineering and Science (IDIES)
• Choose carefully – scope is important

Vision

Thank you!
• Questions?
• Comments?
• Suggestions?
• “The future is already here. It’s just not widely distributed yet.” – William Gibson





