How to retrieve metadata about MOBY resources

Proposal for how to obtain metadata, and the minimum content of that metadata, for MOBY resources

Last modified Feb 23, 2005

There is metadata available to describe entities in the Object, Service and Namespace Ontologies, as well as individual Service Instances in the MOBY Central registry. There are two ways to obtain this metadata:

  1. To retrieve entire ontologies (Object, Namespace, Service), or the complete set of Service Instances in the MOBY Central registry, execute an HTTP GET of a Resource URL.
  2. To retrieve metadata for individual entities (e.g. individual ontology terms, or individual service instances), perform metadata resolution of an LSID specifying that entity.

In both cases, the metadata takes the form of an RDF document.
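
As a small illustration of the second route, the sketch below splits an LSID into its parts before it is handed to a resolver. It relies only on the general LSID layout (urn:lsid:authority:namespace:object, with an optional revision); the sample identifier is the one used in the Service Instance example at the end of this document, and the names Lsid and parse_lsid are hypothetical helpers, not part of any MOBY API.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> Lsid:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    # The leading "urn" and "lsid" labels are compared case-insensitively.
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(parts[2], parts[3], parts[4], revision)


lsid = parse_lsid(
    "urn:lsid:biomoby.org:serviceinstance:"
    "bioinfo.icapture.ubc.ca,getSHound3DNeighboursFromGi"
)
print(lsid.authority, lsid.namespace, lsid.object_id)
```

The sketch only validates the shape of the identifier; the getMetadata call itself needs an LSID resolution client, which is not shown here.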

  • Case 1: Retrieving Full Ontologies or Full Registry Metadata. To retrieve metadata describing the Object, Service, or Namespace ontologies, or the MOBY Central Registry:
    1. Call the retrieveResourceURLs method of the MOBY Central API.
    2. From the response XML document, select the Resource URL of the ontology you are interested in.
    3. Retrieve the ontology through HTTP GET of the corresponding Resource URL. The returned document is in RDF-XML format.
    4. The content of this document depends on which resource you are retrieving, but it uses the same predicate sets as discussed in detail below for Objects, Namespaces, Service types, and Service Instances (for the registry, the document covers all currently registered Service Instances in MOBY Central).
  • Case 2: Retrieving metadata for a single entity. Rather than retrieving a complete ontology, or the complete contents of the registry, it is possible to retrieve the metadata relating to an individual ontology node, or an individual service instance in the registry. This will be accomplished by LSID resolution using the LSID getMetadata method. The returned document is in RDF-XML format. The content of this document depends on which type of entity is described by the LSID, and is discussed in detail below for Objects, Namespaces, Service types, and Service Instances.
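
Step 3 of Case 1 can be sketched in Python as follows. This is a minimal illustration, assuming the Resource URL has already been obtained from retrieveResourceURLs (the shape of that response is not described here, so it is not parsed). The helper names fetch_rdf and class_uris are hypothetical, and the inline sample is a condensed copy of the Namespace example in this document so that the parsing step can be tried without network access.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import List

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def fetch_rdf(resource_url: str, timeout: float = 30.0) -> bytes:
    """HTTP GET a Resource URL and return the raw RDF-XML bytes."""
    with urllib.request.urlopen(resource_url, timeout=timeout) as response:
        return response.read()


def class_uris(rdf_xml: bytes) -> List[str]:
    """Return the rdf:about URI of every top-level rdfs:Class in the document."""
    about = f"{{{RDF}}}about"
    root = ET.fromstring(rdf_xml)
    return [c.attrib[about] for c in root.findall(f"{{{RDFS}}}Class") if about in c.attrib]


# Offline sample (assumption: condensed from the Namespace example in this document).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Namespaces#NCBI_gi">
    <rdfs:subClassOf>
      <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Namespaces#Namespace"/>
    </rdfs:subClassOf>
  </rdfs:Class>
  <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Namespaces#Namespace"/>
</rdf:RDF>"""

uris = class_uris(SAMPLE)
print(uris)
```

Against a live registry one would call class_uris(fetch_rdf(url)) with a URL taken from the retrieveResourceURLs response.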

Metadata Content for Object, Namespace, and Service Ontologies, and Service Instances

Objects

The metadata returned by getMetadata resolution of LSIDs representing Objects is an RDF-formatted document describing the ontological relationships traversing from the given node back to the root node (‘Object’). The resources (a.k.a. “nodes”) of the RDF document take the form of URLs in URL#Fragment format. The URL is identical to that returned for Objects by the retrieveResourceURLs method of the MOBY Central API, and resolves by HTTP GET to a complete RDF document in RDF-XML format representing the entire Object ontology; the #Fragment identifies a particular resource within that RDF document.
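
The URL#Fragment convention can be illustrated with a short Python sketch that separates a node URI into the ontology document URL and the term it names. The node URI is taken from the example in this section; whether the URL part matches the registry's Resource URL character for character is an assumption based on the paragraph above.

```python
from urllib.parse import urldefrag

node = "http://biomoby.org/RESOURCES/MOBY-S/Objects#DNASequence"

# urldefrag splits a URL into the part before "#" and the fragment after it.
ontology_url, term = urldefrag(node)

print(ontology_url)  # the document holding the whole ontology
print(term)          # the resource within that document
```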

RDF Predicates:

For ease of reading, the following XML namespaces are defined (the moby prefix is bound as in the example document below):
xmlns:moby="http://biomoby.org/RESOURCES/MOBY-S/Predicates#"
xmlns:mobyObject="http://biomoby.org/RESOURCES/MOBY-S/Objects#"

rdfs:subClassOf
This represents the MOBY “isa” relationship, indicating that the Subject of the triple is a sub-class of the Object, and carries all transitive properties of that object.
moby:has
This indicates that the Object of the triple is in a 1:many relationship with the Subject; i.e. that the MOBY Object is defined as containing at least one instance of the designated Object. This is a transitive property.
moby:hasa
This indicates that the Object of the triple is in a 1:1 relationship with the Subject; i.e. that the MOBY Object is defined as containing a single instance of the designated Object. This is a transitive property.
moby:articleName
The articleName is a “label” attached to contained Objects (i.e. Objects in a moby:has or moby:hasa relation to another object) that indicates their semantic relationship with the containing Object class. This also identifies the articleName attribute to be used in the XML representation of an Object instance.
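
To make the containment predicates concrete, the following Python sketch reads the moby:has and moby:hasa members of one class and reports each member's type and articleName. It assumes the RDF-XML shape used for GenericSequence in the example document of this section (a typed node nested inside the predicate) and the Predicates# namespace declared there; the function name members is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in this section's example document.
MOBY = "http://biomoby.org/RESOURCES/MOBY-S/Predicates#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

SAMPLE = """<rdf:RDF
    xmlns:moby="http://biomoby.org/RESOURCES/MOBY-S/Predicates#"
    xmlns:mobyObject="http://biomoby.org/RESOURCES/MOBY-S/Objects#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Objects#GenericSequence">
    <moby:hasa>
      <mobyObject:String>
        <moby:articleName>SequenceString</moby:articleName>
      </mobyObject:String>
    </moby:hasa>
  </rdfs:Class>
</rdf:RDF>"""


def members(class_node):
    """Yield (relation, member type URI, articleName) for each contained Object."""
    for relation in ("hasa", "has"):
        for edge in class_node.findall(f"{{{MOBY}}}{relation}"):
            for member in edge:
                # ElementTree tags look like "{namespace}local"; joining the two
                # parts gives the URI of the member's Object class.
                type_uri = member.tag[1:].replace("}", "", 1)
                article = member.findtext(f"{{{MOBY}}}articleName")
                yield relation, type_uri, article


generic_sequence = ET.fromstring(SAMPLE).find(f"{{{RDFS}}}Class")
result = list(members(generic_sequence))
print(result)
```

Because the containment predicates are described as transitive, a complete member list for a class would also need the members of its ancestors; the sketch reads only the class it is given.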

EXAMPLE


<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:moby="http://biomoby.org/RESOURCES/MOBY-S/Predicates#"
    xmlns:mobyObject="http://biomoby.org/RESOURCES/MOBY-S/Objects#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:xs="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
     <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Objects#Object">
          <rdfs:subClassOf>
               <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Predicates#DataComponent"/>
          </rdfs:subClassOf>
          <rdfs:comment xml:lang="en">a base object class consisting of a namespace and an identifier</rdfs:comment>
          <rdfs:label xml:lang="en">Object</rdfs:label>
     </rdfs:Class>
     <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Objects#GenericSequence">
          <rdfs:label xml:lang="en">GenericSequence</rdfs:label>
          <rdfs:subClassOf>
               <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Objects#VirtualSequence"/>
          </rdfs:subClassOf>
          <moby:hasa>
               <mobyObject:String>
                    <moby:articleName rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
                    >SequenceString</moby:articleName>
               </mobyObject:String>
          </moby:hasa>
          <rdfs:comment xml:lang="en">Represents any molecule that can be described as a 'Sequence', including nucleotide (DNA, RNA, other), amino acid, or any other molecule with a sequence.  This is most useful for cases where it is not known what specific type of molecule will be represented in the object.</rdfs:comment>
     </rdfs:Class>
     <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Objects#DNASequence">
          <rdfs:label xml:lang="en">DNASequence</rdfs:label>
          <rdfs:comment xml:lang="en">Lightweight representation a DNA sequence</rdfs:comment>
          <rdfs:subClassOf>
               <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Objects#NucleotideSequence"/>
          </rdfs:subClassOf>
     </rdfs:Class>
     <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Objects#NucleotideSequence">
          <rdfs:subClassOf rdf:resource="http://biomoby.org/RESOURCES/MOBY-S/Objects#GenericSequence"/>
          <rdfs:label xml:lang="en">NucleotideSequence</rdfs:label>
          <rdfs:comment xml:lang="en">Lightweight representation of any type of nucleotide sequence (DNA, RNA, etc)</rdfs:comment>

     </rdfs:Class>
     <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Objects#VirtualSequence">
          <rdfs:label xml:lang="en">VirtualSequence</rdfs:label>
          <rdfs:comment xml:lang="en">This root of the Sequence object hierarchy carries only the length of the sequence.  Can be used, for example, to pre-screen results based on sequence size before retrieving a more heavyweight sequence object</rdfs:comment>
          <moby:hasa>
               <mobyObject:Integer>
                    <moby:articleName rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
                    >Length</moby:articleName>
               </mobyObject:Integer>
          </moby:hasa>
          <rdfs:subClassOf rdf:resource="http://biomoby.org/RESOURCES/MOBY-S/Objects#Object"/>
     </rdfs:Class>
</rdf:RDF>

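Or as N3

     @prefix : <http://biomoby.org/RESOURCES/MOBY-S/Objects#> .
     @prefix dc: <http://purl.org/dc/elements/1.1/> .
     @prefix moby: <http://biomoby.org/RESOURCES/MOBY-S/Predicates#> .
     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
     @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
     @prefix xs: <http://www.w3.org/2001/XMLSchema#> .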

    :DNASequence     a rdfs:Class;
         rdfs:comment "Lightweight representation a DNA sequence"@en;
         rdfs:label "DNASequence"@en;
         rdfs:subClassOf :NucleotideSequence .

    :GenericSequence     a rdfs:Class;
         moby:hasa  [
             a :String;
             moby:articleName "SequenceString"^^xs:string ];

         rdfs:comment "Represents any molecule that can be described as a 'Sequence', including nucleotide (DNA, RNA, other), amino acid, or any other molecule with a sequence.  This is most useful for cases where it is not known what specific type of molecule will be represented in the object."@en;
         rdfs:label "GenericSequence"@en;
         rdfs:subClassOf :VirtualSequence .

    :NucleotideSequence     a rdfs:Class;
         rdfs:comment "Lightweight representation of any type of nucleotide sequence (DNA, RNA, etc)"@en;
         rdfs:label "NucleotideSequence"@en;
         rdfs:subClassOf :GenericSequence .

    :Object     a rdfs:Class;
         rdfs:comment "a base object class consisting of a namespace and an identifier"@en;
         rdfs:label "Object"@en;
         rdfs:subClassOf moby:DataComponent .

    :VirtualSequence     a rdfs:Class;
         moby:hasa  [
             a :Integer;
             moby:articleName "Length"^^xs:string ];
         rdfs:comment "This root of the Sequence object hierarchy carries only the length of the sequence.  Can be used, for example, to pre-screen results based on sequence size before retrieving a more heavyweight sequence object"@en;
         rdfs:label "VirtualSequence"@en;
         rdfs:subClassOf :Object .

    moby:DataComponent     a rdfs:Class .
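
The “isa” chain from a node back to the root can be walked with a few lines of Python. The sketch below is illustrative only: it uses a condensed copy of the example above (labels, comments and the moby:DataComponent parent of Object are left out), assumes each class has at most one rdfs:subClassOf, and accepts both spellings that appear in the example, a nested rdfs:Class and an rdf:resource attribute. The names parent_map and lineage are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OBJ = "http://biomoby.org/RESOURCES/MOBY-S/Objects#"

SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:about="{OBJ}DNASequence">
    <rdfs:subClassOf>
      <rdfs:Class rdf:about="{OBJ}NucleotideSequence"/>
    </rdfs:subClassOf>
  </rdfs:Class>
  <rdfs:Class rdf:about="{OBJ}NucleotideSequence">
    <rdfs:subClassOf rdf:resource="{OBJ}GenericSequence"/>
  </rdfs:Class>
  <rdfs:Class rdf:about="{OBJ}GenericSequence">
    <rdfs:subClassOf>
      <rdfs:Class rdf:about="{OBJ}VirtualSequence"/>
    </rdfs:subClassOf>
  </rdfs:Class>
  <rdfs:Class rdf:about="{OBJ}VirtualSequence">
    <rdfs:subClassOf rdf:resource="{OBJ}Object"/>
  </rdfs:Class>
  <rdfs:Class rdf:about="{OBJ}Object"/>
</rdf:RDF>"""


def parent_map(rdf_xml):
    """Map each top-level class URI to the URI it is declared a sub-class of."""
    parents = {}
    for cls in ET.fromstring(rdf_xml).findall(f"{{{RDFS}}}Class"):
        edge = cls.find(f"{{{RDFS}}}subClassOf")
        if edge is None:
            continue
        # The parent is either an rdf:resource attribute or a nested rdfs:Class.
        target = edge.get(f"{{{RDF}}}resource")
        if target is None:
            nested = edge.find(f"{{{RDFS}}}Class")
            target = nested.get(f"{{{RDF}}}about") if nested is not None else None
        if target:
            parents[cls.get(f"{{{RDF}}}about")] = target
    return parents


def lineage(term, parents):
    """Follow subClassOf links from `term` until no further parent is known."""
    chain = [term]
    while chain[-1] in parents and parents[chain[-1]] not in chain:
        chain.append(parents[chain[-1]])
    return chain


names = [uri.split("#")[-1] for uri in lineage(OBJ + "DNASequence", parent_map(SAMPLE))]
print(" -> ".join(names))
```

Run against the full example document, the same walk would continue one step further, from Object to moby:DataComponent.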
    

Namespaces

The metadata returned by getMetadata resolution of LSIDs representing Namespaces is an RDF-formatted document describing the ontological relationships traversing from the given node back to the root node (‘Namespace’). The resources (a.k.a. “nodes”) of the RDF document take the form of URLs in URL#Fragment format. The URL is identical to that returned for Namespaces by the retrieveResourceURLs method of the MOBY Central API, and resolves by HTTP GET to a complete RDF document in RDF-XML format representing the entire Namespace ontology; the #Fragment identifies a particular resource within that RDF document.

At this time, no MOBY-specific predicates are defined for Namespaces.

EXAMPLE


<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:xs="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
     <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Namespaces#NCBI_gi">
          <rdfs:subClassOf>
               <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Namespaces#Namespace"/>
          </rdfs:subClassOf>
          <rdfs:label xml:lang="en">NCBI_gi</rdfs:label>
          <rdfs:comment xml:lang="en">NCBI databases.</rdfs:comment>
     </rdfs:Class>
     <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Namespaces#Namespace">
          <rdfs:label xml:lang="en">Namespace</rdfs:label>
          <rdfs:comment xml:lang="en">a base Namespace identifier, never instantiated</rdfs:comment>
     </rdfs:Class>
</rdf:RDF>

Or as N3

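     @prefix : <http://biomoby.org/RESOURCES/MOBY-S/Namespaces#> .
     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
     @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
     @prefix xs: <http://www.w3.org/2001/XMLSchema#> .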

    :NCBI_gi     a rdfs:Class;
         rdfs:comment "NCBI databases."@en;
         rdfs:label "NCBI_gi"@en;
         rdfs:subClassOf :Namespace .

    :Namespace     a rdfs:Class;
         rdfs:comment "a base Namespace identifier, never instantiated"@en;
         rdfs:label "Namespace"@en .
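
Since Namespace metadata carries only RDFS predicates, reading it mostly means collecting labels and comments. The Python sketch below does that for the example above and picks literals by their xml:lang tag. It is a minimal illustration; the helper names literal and describe are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
NS = "http://biomoby.org/RESOURCES/MOBY-S/Namespaces#"
# ElementTree exposes the xml:lang attribute under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:about="{NS}NCBI_gi">
    <rdfs:subClassOf>
      <rdfs:Class rdf:about="{NS}Namespace"/>
    </rdfs:subClassOf>
    <rdfs:label xml:lang="en">NCBI_gi</rdfs:label>
    <rdfs:comment xml:lang="en">NCBI databases.</rdfs:comment>
  </rdfs:Class>
  <rdfs:Class rdf:about="{NS}Namespace">
    <rdfs:label xml:lang="en">Namespace</rdfs:label>
    <rdfs:comment xml:lang="en">a base Namespace identifier, never instantiated</rdfs:comment>
  </rdfs:Class>
</rdf:RDF>"""


def literal(node, predicate, lang="en"):
    """Return the first rdfs literal for `predicate` tagged with `lang`, else None."""
    for child in node.findall(f"{{{RDFS}}}{predicate}"):
        if child.get(XML_LANG) == lang:
            return child.text
    return None


def describe(rdf_xml):
    """Map each term's fragment to its (label, comment) pair."""
    terms = {}
    for cls in ET.fromstring(rdf_xml).findall(f"{{{RDFS}}}Class"):
        fragment = cls.get(f"{{{RDF}}}about").split("#", 1)[1]
        terms[fragment] = (literal(cls, "label"), literal(cls, "comment"))
    return terms


terms = describe(SAMPLE)
for name, (label, comment) in terms.items():
    print(f"{name}: {label} ({comment})")
```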

Service types

The metadata returned by getMetadata resolution of LSIDs representing Service types is an RDF-formatted document describing the ontological relationships traversing from the given node back to the root node (‘Service’). The resources (a.k.a. “nodes”) of the RDF document take the form of URLs in URL#Fragment format. The URL is identical to that returned for Service types by the retrieveResourceURLs method of the MOBY Central API, and resolves by HTTP GET to a complete RDF document in RDF-XML format representing the entire Service ontology; the #Fragment identifies a particular resource within that RDF document.

At this time, no MOBY-specific predicates are defined for Service type metadata descriptions.

EXAMPLE


<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:xs="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
     <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Services#Retrieval">
          <rdfs:comment xml:lang="en">The base retrieval type</rdfs:comment>
          <rdfs:subClassOf>
               <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Services#Service"/>
          </rdfs:subClassOf>
          <rdfs:label xml:lang="en">Retrieval</rdfs:label>
     </rdfs:Class>
     <rdfs:Class rdf:about="http://biomoby.org/RESOURCES/MOBY-S/Services#Service">
          <rdfs:comment xml:lang="en">a base Service class, never instantiated</rdfs:comment>
          <rdfs:label xml:lang="en">Service</rdfs:label>
     </rdfs:Class>
</rdf:RDF>

Or as N3

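     @prefix : <http://biomoby.org/RESOURCES/MOBY-S/Services#> .
     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
     @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
     @prefix xs: <http://www.w3.org/2001/XMLSchema#> .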

    :Retrieval     a rdfs:Class;
         rdfs:comment "The base retrieval type"@en;
         rdfs:label "Retrieval"@en;
         rdfs:subClassOf :Service .

    :Service     a rdfs:Class;
         rdfs:comment "a base Service class, never instantiated"@en;
         rdfs:label "Service"@en .
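
When a complete Service ontology is retrieved rather than a single lineage, it can be useful to turn the subClassOf links around and view the terms as a tree below the root. The Python sketch below does this for the two-term example above. It treats any class without an rdfs:subClassOf as a root, which is an assumption that holds for this example; parent_of, build_tree and render are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SVC = "http://biomoby.org/RESOURCES/MOBY-S/Services#"

SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:about="{SVC}Retrieval">
    <rdfs:subClassOf>
      <rdfs:Class rdf:about="{SVC}Service"/>
    </rdfs:subClassOf>
  </rdfs:Class>
  <rdfs:Class rdf:about="{SVC}Service"/>
</rdf:RDF>"""


def parent_of(cls):
    """Return the URI named by rdfs:subClassOf, or None if the class has none."""
    edge = cls.find(f"{{{RDFS}}}subClassOf")
    if edge is None:
        return None
    target = edge.get(f"{{{RDF}}}resource")
    if target is None:
        nested = edge.find(f"{{{RDFS}}}Class")
        target = nested.get(f"{{{RDF}}}about") if nested is not None else None
    return target


def build_tree(rdf_xml):
    """Return (roots, children) where children maps a parent URI to its sub-classes."""
    roots, children = [], defaultdict(list)
    for cls in ET.fromstring(rdf_xml).findall(f"{{{RDFS}}}Class"):
        uri = cls.get(f"{{{RDF}}}about")
        parent = parent_of(cls)
        if parent is None:
            roots.append(uri)
        else:
            children[parent].append(uri)
    return roots, dict(children)


def render(uri, children, depth=0):
    """Return indented lines for `uri` and everything below it."""
    lines = ["  " * depth + uri.split("#")[-1]]
    for child in children.get(uri, []):
        lines.extend(render(child, children, depth + 1))
    return lines


roots, children = build_tree(SAMPLE)
tree_lines = [line for root in roots for line in render(root, children)]
print("\n".join(tree_lines))
```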

Service Instances

The metadata returned by getMetadata resolution of LSIDs representing Service Instances is an RDF-formatted document describing the Service instance, its input and output MOBY objects and their namespaces, the service endpoint, the service type, as well as information about the service provider and the author of the service description. This document conforms to both the MOBY and the myGrid metadata representations for Service Instances, and has the following predicates:

For ease of reading, the following XML namespaces are defined:
xmlns:mygrid="http://www.mygrid.org.uk/mygrid-moby-service#"
xmlns:moby="http://biomoby.org/RESOURCES/MOBY-S/ServiceDescription#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"

mygrid:hasServiceDescriptionLocation
Domain mygrid:serviceDescription
Range String Literal
The value is a URL to a document describing this Service Instance. For MOBY services, resolving this URL through HTTP GET will retrieve an RDF-XML document using (at least) the same predicate set as the current document, but potentially with additional/extended metadata from the service provider.
mygrid:hasOperation
Domain mygrid:serviceDescription
Range mygrid:operation
The value of this predicate is an instance of a mygrid:operation, which describes a service interface, including the inputs, outputs, and service type. For MOBY services, there is only one operation per service, so there will be only one hasOperation predicate in the RDF.
mygrid:performsTask
Domain mygrid:operation
Range moby:Services || mygrid:operationTask
For a given service operation, it performs a task as specified by this predicate. In MOBY, the value of this predicate is a URL indicating a Service Ontology node, of the form http://biomoby.org/RESOURCES/MOBY-S/Services#ontology_term
mygrid:outputParameter
Domain mygrid:operation
Range mygrid:parameter
A given operation has zero or more output parameters, each one indicated by this predicate.
mygrid:inputParameter
Domain mygrid:operation
Range mygrid:parameter
A given operation has zero or more input parameters, each one indicated by this predicate.
mygrid:hasParameterType
Domain mygrid:parameter
Range mygrid:parameterType
For MOBY Services, the range will be one of mygrid:simpleParameter, mygrid:collectionParameter, or mygrid:secondaryParameter
mygrid:inNamespaces
Domain mygrid:parameter
Range mygrid:parameterNamespace
inNamespaces points to an instance of a parameterNamespace, which in turn will have zero or more mygrid:namespaceType predicates, each of which points to an ontology term defined in the Namespace ontology, of the form http://biomoby.org/RESOURCES/MOBY-S/Namespaces#ontology_term
mygrid:namespaceType
Domain mygrid:parameterNamespace
Range moby:Namespaces
The valid values for this predicate are the URLs corresponding to the terms in the MOBY Namespace ontology, in the form http://biomoby.org/RESOURCES/MOBY-S/Namespaces#ontology_term
mygrid:objectType
Domain mygrid:parameter
Range moby:Objects
The valid values for this predicate are the URLs corresponding to the terms in the MOBY Object ontology, in the form http://biomoby.org/RESOURCES/MOBY-S/Objects#ontology_term
mygrid:hasParameterNameText
Domain mygrid:parameter
Range String Literal
This is the parameter name for the input or output parameter of a service
mygrid:providedBy
Domain mygrid:serviceDescription
Range mygrid:organisation
A description of the service provider
dc:creator
Domain mygrid:organisation
Range String Literal
The email address of the service provider
dc:publisher
Domain mygrid:organisation
Range String Literal
The distinct URI (authURI) of the service provider
mygrid:authoritative
Domain mygrid:organisation
Range String Literal
One of “true” or “false” indicating whether the service provider is “authoritative” for the service they are providing, or if they are (for example) an unofficial mirror or secondary site.
mygrid:hasServiceNameText
Domain mygrid:serviceDescription
Range String Literal
The service name as a string – in MOBY, this is equivalent to the soapAction required to invoke the service.
mygrid:hasServiceDescriptionText
Domain mygrid:serviceDescription
Range String Literal
A free-text description of what the service does.
mygrid:locationURI
Domain mygrid:serviceDescription
Range String Literal
The endpoint of the service, as a string
dc:format
Domain mygrid:serviceDescription
Range Ontology Term from myGrid ontology indicating the technical type of service.
Currently, the only valid value for this predicate in MOBY is “moby”, to indicate that this service uses the MOBY messaging system. In the future, the MOBY Central registry may be expanded to include web services using non-MOBY messaging, at which time the range of valid values will be expanded.
mygrid:hasOperationNameText
Domain mygrid:operation
Range String Literal
For MOBY services, the hasOperationNameText will be identical to the hasServiceNameText; this predicate is necessary for myGrid compatibility, where a service can have more than one operation.

An example of a service instance RDF document follows:

<?xml version="1.0" encoding="UTF-8"?>

<rdf:RDF
    xmlns:serviceInstances="http://biomoby.org/RESOURCES/MOBY-S/ServiceInstances#"
    xmlns:mygrid="http://www.mygrid.org.uk/mygrid-moby-service#"
    xmlns:moby="http://biomoby.org/RESOURCES/MOBY-S/ServiceDescription#"
    xmlns:mobyObject="http://biomoby.org/RESOURCES/MOBY-S/Objects#"
    xmlns:mobyNamespace="http://biomoby.org/RESOURCES/MOBY-S/Namespaces#"

    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:mobyService="http://biomoby.org/RESOURCES/MOBY-S/Services#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
     <mygrid:serviceDescription rdf:about="urn:lsid:biomoby.org:serviceinstance:bioinfo.icapture.ubc.ca,getSHound3DNeighboursFromGi">
          <mygrid:hasServiceDescriptionText>

Retrieves a list of protein BLAST neighbours possessing 3-D structure. Uses redundancy information for the query protein. The BLAST protein
neighbours were calculated using 0.01 maximum E-value cutoff.
	</mygrid:hasServiceDescriptionText>
          <mygrid:hasServiceNameText>getSHound3DNeighboursFromGi</mygrid:hasServiceNameText>
          <dc:format>moby</dc:format>
          <mygrid:hasServiceDescriptionLocation></mygrid:hasServiceDescriptionLocation>

          <mygrid:locationURI>http://mobycentral.icapture.ubc.ca/cgi-bin/Services/Services.cgi</mygrid:locationURI>
          <mygrid:hasOperation>
               <mygrid:operation>
                    <mygrid:hasOperationNameText>getSHound3DNeighboursFromGi</mygrid:hasOperationNameText>
                    <mygrid:outputParameter>

                         <mygrid:parameter>
                              <mygrid:hasParameterNameText>3d_neighbours</mygrid:hasParameterNameText>
                              <mygrid:hasParameterType>
                                   <mygrid:collectionParameter/>
                              </mygrid:hasParameterType>
                              <mygrid:inNamespaces>

                                   <mobyNamespace:PDB>
                                        <rdf:type rdf:resource="http://www.mygrid.org.uk/mygrid-moby-service#parameterNamespace"/>
                                   </mobyNamespace:PDB>
                              </mygrid:inNamespaces>
                              <mygrid:objectType>
                                   <mobyObject:Object/>
                              </mygrid:objectType>

                         </mygrid:parameter>
                    </mygrid:outputParameter>
                    <mygrid:performsTask>
                         <mygrid:operationTask>
                              <rdf:type rdf:resource="http://biomoby.org/RESOURCES/MOBY-S/Services#Retrieval"/>

                         </mygrid:operationTask>
                    </mygrid:performsTask>
                    <mygrid:inputParameter>
                         <mygrid:parameter>
                              <mygrid:hasParameterNameText>record_id</mygrid:hasParameterNameText>
                              <mygrid:objectType>
                                   <mobyObject:Object/>
                              </mygrid:objectType>

                              <moby:objectType rdf:resource="http://biomoby.org/RESOURCES/MOBY-S/Objects#Object"/>

                              <mygrid:inNamespaces>
                                   <mobyNamespace:NCBI_gi>
                                        <rdf:type rdf:resource="http://www.mygrid.org.uk/mygrid-moby-service#parameterNamespace"/>
                                   </mobyNamespace:NCBI_gi>
                              </mygrid:inNamespaces>

                              <mygrid:hasParameterType>
                                   <mygrid:simpleParameter/>
                              </mygrid:hasParameterType>
                         </mygrid:parameter>
                    </mygrid:inputParameter>
               </mygrid:operation>

          </mygrid:hasOperation>
          <mygrid:providedBy>
               <mygrid:organisation>
                    <mygrid:authoritative>false</mygrid:authoritative>
                    <dc:publisher>bioinfo.icapture.ubc.ca</dc:publisher>

                    <dc:creator>markw@illuminae.com</dc:creator>
               </mygrid:organisation>
          </mygrid:providedBy>
     </mygrid:serviceDescription>
</rdf:RDF>
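
A client might reduce such a document to the handful of facts needed to invoke the service, as in the Python sketch below. It is an illustration only, under several assumptions: the input is a condensed copy of the example above (description text, provider block and rdf:type children omitted), object types and namespaces are read from the element names of the nested typed nodes as the example writes them, and the moby:objectType rdf:resource form that also appears in the example is ignored. The helper names q, term_names, read_parameter and read_service are hypothetical.

```python
import xml.etree.ElementTree as ET

MYGRID = "http://www.mygrid.org.uk/mygrid-moby-service#"

SAMPLE = """<rdf:RDF
    xmlns:mygrid="http://www.mygrid.org.uk/mygrid-moby-service#"
    xmlns:mobyObject="http://biomoby.org/RESOURCES/MOBY-S/Objects#"
    xmlns:mobyNamespace="http://biomoby.org/RESOURCES/MOBY-S/Namespaces#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <mygrid:serviceDescription rdf:about="urn:lsid:biomoby.org:serviceinstance:bioinfo.icapture.ubc.ca,getSHound3DNeighboursFromGi">
    <mygrid:hasServiceNameText>getSHound3DNeighboursFromGi</mygrid:hasServiceNameText>
    <mygrid:locationURI>http://mobycentral.icapture.ubc.ca/cgi-bin/Services/Services.cgi</mygrid:locationURI>
    <mygrid:hasOperation>
      <mygrid:operation>
        <mygrid:inputParameter>
          <mygrid:parameter>
            <mygrid:hasParameterNameText>record_id</mygrid:hasParameterNameText>
            <mygrid:objectType><mobyObject:Object/></mygrid:objectType>
            <mygrid:inNamespaces><mobyNamespace:NCBI_gi/></mygrid:inNamespaces>
            <mygrid:hasParameterType><mygrid:simpleParameter/></mygrid:hasParameterType>
          </mygrid:parameter>
        </mygrid:inputParameter>
        <mygrid:outputParameter>
          <mygrid:parameter>
            <mygrid:hasParameterNameText>3d_neighbours</mygrid:hasParameterNameText>
            <mygrid:hasParameterType><mygrid:collectionParameter/></mygrid:hasParameterType>
            <mygrid:inNamespaces><mobyNamespace:PDB/></mygrid:inNamespaces>
            <mygrid:objectType><mobyObject:Object/></mygrid:objectType>
          </mygrid:parameter>
        </mygrid:outputParameter>
      </mygrid:operation>
    </mygrid:hasOperation>
  </mygrid:serviceDescription>
</rdf:RDF>"""


def q(local):
    """Qualify a local name with the myGrid namespace, ElementTree style."""
    return f"{{{MYGRID}}}{local}"


def term_names(param, predicate):
    """Local names of the typed nodes nested under `predicate`."""
    return [node.tag.rsplit("}", 1)[-1] for edge in param.findall(q(predicate)) for node in edge]


def read_parameter(param):
    kinds = term_names(param, "hasParameterType")
    objects = term_names(param, "objectType")
    return {
        "name": param.findtext(q("hasParameterNameText")),
        "kind": kinds[0] if kinds else None,
        "object": objects[0] if objects else None,
        "namespaces": term_names(param, "inNamespaces"),
    }


def read_service(rdf_xml):
    service = ET.fromstring(rdf_xml).find(q("serviceDescription"))
    # MOBY services have exactly one operation, so the first one is enough.
    operation = service.find(q("hasOperation")).find(q("operation"))

    def params(predicate):
        return [
            read_parameter(p)
            for edge in operation.findall(q(predicate))
            for p in edge.findall(q("parameter"))
        ]

    return {
        "name": service.findtext(q("hasServiceNameText")),
        "endpoint": service.findtext(q("locationURI")),
        "inputs": params("inputParameter"),
        "outputs": params("outputParameter"),
    }


service = read_service(SAMPLE)
print(service["name"], service["endpoint"])
print(service["inputs"])
print(service["outputs"])
```

A general RDF parser would be a sturdier choice for documents that serialise the same triples differently; the element-based approach here is tied to the layout shown in the example.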

