USGIN Best Practices

This page contains a list of best practices for each step of the USGIN Data Provider Workflow. Though not mandatory, these practices are highly recommended.

Best Practices for Data Assessment

Best practices for data assessment include the following:

Best Practices for Data Integration

Data integration can be divided into two steps: schema mapping and vocabulary mapping.

During schema mapping, datasets should be mapped to an interchange format structured according to the GeoSciML-Portrayal schema. It is further recommended that individual feature classes be mapped to GeoSciML-Portrayal Simple Feature Schemas. For more information, see the GeoSciML-Portrayal Cookbook.
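As a rough illustration of what schema mapping involves, the sketch below renames the fields of a source record to the field names of a target interchange schema. The source column names (`UNIT_NAME`, `DESCRIP`, `LITH`) and the target field names are hypothetical placeholders, not names taken from the GeoSciML-Portrayal schema; the actual target fields should be taken from the GeoSciML-Portrayal Cookbook.

```python
# Hypothetical mapping from source column names to target schema field names.
# Replace both sides with the real names for your dataset and target schema.
FIELD_MAP = {
    "UNIT_NAME": "name",
    "DESCRIP": "description",
    "LITH": "lithology_uri",
}


def map_record(record, field_map=FIELD_MAP):
    """Rename the keys of one source record to the target field names.

    Returns a tuple (mapped, unmapped): the renamed record, and the list of
    source fields that have no entry in the mapping, so that nothing is
    dropped without the data provider noticing.
    """
    mapped = {}
    unmapped = []
    for key, value in record.items():
        target = field_map.get(key)
        if target is None:
            unmapped.append(key)
        else:
            mapped[target] = value
    return mapped, unmapped


if __name__ == "__main__":
    source = {"UNIT_NAME": "Example Formation", "LITH": "limestone", "OBJECTID": 7}
    print(map_record(source))
```

Reporting unmapped fields separately, rather than discarding them, makes it easier to decide whether each one needs a target field or can be left out of the interchange format.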

Vocabulary mapping is slightly more complicated. The ultimate goal is to provide a web-accessible definition for each vocabulary term used in a given dataset, so that the data can be communicated and understood clearly. There are several ways to do this, listed here from most to least preferred:

  1. Usage of GeoSciML CGI vocabulary terms, identified by HTTP-URIs that dereference to vocabulary definitions; optimally, these vocabulary terms are accessible as a vocabulary service
    • A web-accessible directory containing the *.rtf and *.ttl files used to create a CGI vocabulary service can be found here
    • A sub-directory containing *.HTML representations of CGI Simple Feature vocabulary term definitions can be found here; these *.HTML files give a sense of how the CGI Simple Feature vocabulary terms work. For example, the CGI Simple Lithology Categories *.HTML document contains definitions for the CGI simple lithology vocabulary terms
  2. If GeoSciML CGI vocabulary terms are not used, then the vocabulary terms used in the dataset should be furnished with hyperlinks to a glossary website containing definitions of all of those terms
  3. If it is not possible for the user to provide a glossary website, users should furnish vocabulary terms in their dataset with hyperlinks to one or more generic websites containing definitions for those terms; a website such as Wikipedia is acceptable here
  4. If a suitable generic website containing vocabulary terms cannot be found, users are encouraged to add one or more fields to the dataset; when applicable, these fields will contain vocabulary definitions for terms present in each record in the dataset
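The order of preference above can be sketched as a simple lookup that tries each source of definitions in turn. This is a minimal sketch under stated assumptions: every term, URI, and definition in the tables below is a hypothetical placeholder, and `resolve_term` is an illustrative helper, not part of any USGIN or CGI tool.

```python
# Hypothetical lookup tables, one per option in the list above.
CGI_URIS = {"granite": "http://example.org/cgi/simplelithology/granite"}
GLOSSARY_URLS = {"caliche": "http://example.org/glossary#caliche"}
GENERIC_URLS = {"tuff": "https://en.wikipedia.org/wiki/Tuff"}
INLINE_DEFINITIONS = {
    "desert pavement": "A surface layer of closely packed rock fragments.",
}

# Sources ordered from most to least preferred, numbered as in the list.
SOURCES = (
    (1, CGI_URIS),
    (2, GLOSSARY_URLS),
    (3, GENERIC_URLS),
    (4, INLINE_DEFINITIONS),
)


def resolve_term(term):
    """Return (option_number, definition_or_link) for a vocabulary term.

    The most preferred source that knows the term wins. Returns None when
    no source defines the term, which flags it for follow-up.
    """
    key = term.strip().lower()
    for option, source in SOURCES:
        if key in source:
            return option, source[key]
    return None


if __name__ == "__main__":
    for term in ("Granite", "tuff", "unknownite"):
        print(term, "->", resolve_term(term))
```

Running a check like this over every distinct term in a dataset shows which terms still lack a web-accessible definition.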

Best Practices for Data Deployment

Data deployment can likewise be broken down into two parts:

  • Hosting web-accessible data via web service server software
  • Registering web-accessible data via a metadata record in the USGIN Catalog

A recommended software package is the OpenGeo Suite (http://opengeo.org/), which includes PostgreSQL and PostGIS (a database management system and its spatial extension) and OpenLayers (a JavaScript library that gives web applications map-related functions, including consuming data from web map services and web feature services).
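Once data is hosted, a quick way to check that it is web-accessible is to request it through the service interface. The sketch below only builds request URLs for a web feature service, using the standard WFS 1.1.0 key-value parameters (`service`, `version`, `request`, `typeName`, `maxFeatures`). The endpoint and the feature type name are hypothetical placeholders; substitute the values for your own server.

```python
from urllib.parse import urlencode


def wfs_url(endpoint, request, type_name=None, max_features=None, version="1.1.0"):
    """Build a WFS key-value-pair request URL.

    Uses the WFS 1.1.0 parameter names; WFS 2.0.0 renames typeName to
    typeNames and maxFeatures to count, so adjust if you target 2.0.0.
    """
    params = {"service": "WFS", "version": version, "request": request}
    if type_name is not None:
        params["typeName"] = type_name
    if max_features is not None:
        params["maxFeatures"] = str(max_features)
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    endpoint = "http://example.org/geoserver/wfs"  # hypothetical endpoint
    # Ask the service to describe itself.
    print(wfs_url(endpoint, "GetCapabilities"))
    # Fetch a small sample of one (hypothetical) feature type.
    print(wfs_url(endpoint, "GetFeature", type_name="ActiveFault", max_features=10))
```

Opening the first URL in a browser should return the service's capabilities document; the second should return a small sample of features.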