Keynote event : the case for Open Data and eScience : establishing a university data management program. Choudhury, G. Sayeed. 2010-10-22


Full Text

The Case for Open Data and eScience – Establishing a University Data Management Program at Johns Hopkins
Sayeed Choudhury
BCRLG/Open Access Week
October 22, 2010

Vision

In the beginning…
• Digital Knowledge Center founded in 1997
• Mission specifically emphasized research and development
• Non-traditional manager, staff and culture
• Early grants from US National Science Foundation, Andrew W. Mellon Foundation and US Institute of Museum and Library Services

Early principles
• Automated systems instead of automation
• Emphasis on new processes that raised human involvement or intervention to higher level
• Engagement with new communities or researchers
• Diversity of funding sources including venture capital group and corporate

Creative Tension
• Cultural dissonance
• Challenges of managing R&D projects within operational environment
• Benefits of managing R&D projects within operational environment…service oriented R&D
• Gaining the respect of faculty and associated credibility

Initial Projects

Meanwhile…
• The faculty vanguards were pushing their own frontiers
• Growing interest in digital collections and services
• Little emphasis on infrastructure
• Initially inspired by specific disciplinary problems or needs

A Repository by Any Other Name…
• With funding from the Mellon Foundation, we conducted an analysis of DSpace, Fedora, and Digital Commons
• Locally, we deferred our specific choice while we conducted the analysis
• We engaged the community by gathering use cases
• Ultimately, we made a better choice – not without controversy

Data Flow (Levels of Data)
Pixel data collected by telescope
Sent to Fermilab for processing
Beowulf Cluster produces catalog
Loaded in a SQL database

Data and Publication Curation
Publisher
Author
Archive

JHU-based eResearch

…not a rigid road map but principles of navigation. There is no one way to design cyberinfrastructure, but there are tools we can teach the designers to help them appreciate the true size of the solution space – which is often much larger than they may think, if they are tied into technical fixes for all problems.

“…The natural path of industrialization: invention, propagation, adoption, control” -- Chris Anderson, Wired Magazine

Data Conservancy
• One of two current awards through the US National Science Foundation’s DataNet program
• 5 year, $20 million award
• Second phase of program could result in three additional awards forming DataNet federation

Data Conservancy partners

Data Curation
The Data Conservancy embraces a shared vision: data curation is a means to collect, organize, validate and preserve data so that scientists can find new ways to address the grand research challenges that face society.

OAIS Mapping to DC Architecture

Technological Results
• OAIS based architecture and PLANETS based data model
• Storage framework
• Ingested data from Sloan Digital Sky Survey and Dry Valleys Project
• Integration with services from:

Domain coverage/methods
• Multi-site user research methods are a blend of:
– Case study & domain comparisons
– Depth & breadth
– Local & global
Astronomy
Earth Sciences
Life Sciences
Social Sciences
UCAR
Task-based design and usability testing ⇒ Use cases, data requirements, system recommendations
UCAR
UCLA
Ethnography, virtual ethnography, oral histories ⇒ Use cases, data requirements
UIUC
Interviews, Surveys, Worksheets, Content analysis ⇒ Curation requirements, taxonomy, metadata/provenance framework
Information science research

Educational Results
• Data Curation Summer Institute at Illinois
• New courses at Illinois and UCLA
• Illinois has a Data Curation Education Program
• Data scientist within Sheridan Libraries at Johns Hopkins

Sustainability
• Incorporate findings from Blue Ribbon Task Force on Sustainable Digital Preservation and Access
• Memoranda of Understanding with Astrophysical Research Consortium (and Walters Art Museum)
• Carey Business School capstone projects
• Data management plans
• “Business” partners

Lessons Learned – Innovation
• Innovation arises from chaos – Urgency is the mother of innovation
• One or a few individuals typically initiate innovation, but only an organizational commitment will foster and advance innovation
• Innovation requires courage, including acceptance of “failure”
• Innovation can lead to long-term, unanticipated results but those aren’t the drivers for initial activity

What allowed us to be innovative?
• Leadership being willing to support yet defer – “You are the expert”
• Trust between decision-makers and the doers
• Funding – provides the “release” time and validation
• Knowing when to lead and when to follow
• Being aware of global issues while being cognizant of local needs

Lessons Learned – Cultural
• Is there a single library culture?
• Perhaps even more important question: Is there a “correct” library culture?
• It’s probably more important to unlearn than it is to learn
• Human interoperability is a lot harder than machine interoperability
• Librarians can become the human dimension of infrastructure

Data Management Program
• Critical to identify needs and associated requirements
• Focus on service provision, but consider scientific data as the new special collections
• Essential to engage faculty or researcher champions who become ambassadors – Institute for Data Intensive Engineering and Science (IDIES)
• Choose carefully – scope is important

Vision

Thank you!
• Questions?
• Comments?
• Suggestions?
• “The future is already here. It’s just not widely distributed yet.” – William Gibson

