Wrapping WSDL-Described Web Services as Moby Services (using SAWSDL)

What is this?

This document explains how, as a service provider, you can publish your existing Web Services, described by a WSDL file, as Moby (semantic) services. The wrapping approach described here requires no Java coding, only the addition of semantic annotations to the WSDL file (SAWSDL markup) and probably some XSLT. The servlet used here proxies requests to the original Web Service.

For scientific publications using this methodology, please cite:
Gordon P.M.K., Sensen C.W. (2008) Creating Bioinformatics Semantic Web Services from Existing Web Services: A real-world application of SAWSDL. In: Proceedings of the IEEE International Conference on Web Services (Beijing, China), September 23-26, 2008.

What are the prerequisites?

The following should be (pre-)installed:

  1. A Java Runtime Environment 5.0+
  2. A Java Servlet container, such as Apache Tomcat. If you don't have one set up, here's a very quick guide.
  3. The SAWSDLServlet Web Archive (WAR) that will actually provide the service.
  4. An existing WSDL file whose Web Service is already deployed

Step 1: Customize the servlet configuration file

You need to tell SAWSDLServlet where your WSDL file is located. This is done by modifying the WEB-INF/web.xml file in the SAWSDLServlet WAR you just downloaded. First, extract the file from the WAR:

jar xvf SAWSDLServlet.war WEB-INF/web.xml
Then customize it, using your favorite text editor, for 1) the target Web Service operation and 2) the location of your WSDL file. Here's the section relevant to the target WSDL operation (KEGG's get_genes_by_enzyme in our example). The values to customize are the operation-specific names (in display-name and both servlet-name elements) and the wsdlURL value.
    <display-name>BioMoby Web Service (WSDL): KEGG - get_genes_by_enzyme</display-name>
    <servlet>
      <servlet-class>ca.ucalgary.services.WSDLService</servlet-class>
      <servlet-name>get_genes_by_enzyme</servlet-name>
    </servlet>
    <servlet-mapping>
      <servlet-name>get_genes_by_enzyme</servlet-name>
      <url-pattern>/</url-pattern>
    </servlet-mapping>

    <!-- Tell the servlet where the WSDL for the operation being wrapped is located. -->
    <context-param>
      <param-name>wsdlURL</param-name>
      <param-value>http://.../</param-value>
    </context-param>

Now, put the updated configuration file back into the servlet WAR. For simplicity of deployment, first copy the original WAR to a new file named after the service that will be published (in our example, get_genes_by_enzyme):

cp SAWSDLServlet.war get_genes_by_enzyme.war
jar uvf get_genes_by_enzyme.war WEB-INF/web.xml

Step 2: Mark up the WSDL file

Creating the Moby service involves three main steps:
  1. Provide the operation metadata
  2. Provide the primary parameter semantic types
  3. Provide the secondary parameter sources
Before any of these, declare the sawsdl and moby namespaces on the WSDL definitions element:
<?xml version="1.0"?>
<definitions
    name="KEGG_v6.2"
    xmlns:typens="SOAP/KEGG"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:sawsdl="http://www.w3.org/ns/sawsdl"
    xmlns:moby="http://www.biomoby.org/moby"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    targetNamespace="SOAP/KEGG">...
Provide operation metadata:
  <operation name="get_genes_by_enzyme">
    <sawsdl:attrExtensions sawsdl:modelReference="urn:lsid:biomoby.org:servicetype:Retrieval"
         moby:serviceName="getGenesByECNumber"
         moby:serviceAuthority="genome.jp"
         moby:serviceContact="ktym@hgc.jp"
         moby:serviceDesc="Retrieves the list of genes in a given organism that are annotated as having a particular EC (enzyme) number"/>
    <input message="typens:get_genes_by_enzymeRequest"/>
    <output message="typens:get_genes_by_enzymeResponse"/>
  </operation>
Provide the primary parameter semantic types: Mark up the inputs with LSIDs pointing to the Moby ontology terms, and with the (bioxml.info) rules for converting incoming Moby XML to plain text ("DEM rule" XSLT).
  <message name="get_genes_by_enzymeRequest">
    <part name="enzyme_id" type="xsd:string"
          sawsdl:modelReference="urn:lsid:biomoby.org:namespacetype:EC"
          sawsdl:loweringSchemaMapping="urn:lsid:bioxml.info:mobyLoweringSchemaMapping:EC2ecPrefixedString"/>
    <part name="org" type="xsd:string"
          sawsdl:modelReference="urn:lsid:biomoby.org:secondaryParamClass:String"
          sawsdl:loweringSchemaMapping="urn:lsid:bioxml.info:mobyLoweringSchemaMapping:KeggDefSecondaryString2String"
          moby:secondaryParamSource="list_organisms"/>
  </message>
  <message name="get_genes_by_enzymeResponse">
    <part name="return" type="typens:ArrayOfstring"
          sawsdl:modelReference="urn:lsid:biomoby.org:namespace:KEGG_GENES"
          sawsdl:liftingSchemaMapping="urn:lsid:bioxml.info:mobyLiftingSchemaMapping:string2KEGG_GENES"/>
  </message>
Mark up the output item with an LSID pointing to the Moby ontology term, and with the (bioxml.info) rule for converting the plain text into semantic Moby XML (a "MOB rule"). Note that the schema lifting and lowering mappings don't have to be LSIDs; they can also be URLs that point directly to XSLT files. Because the output's type is an array in XML Schema, it will implicitly be wrapped as a Moby "Collection" of base Objects in the KEGG_GENES namespace.
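As a sketch of the URL alternative mentioned above, the enzyme_id part could reference its lowering rule by URL instead of by LSID. The host and path used here (example.org/rules/EC2ecPrefixedString.xsl) are hypothetical placeholders, not a real rule location; only the attribute names and the LSID come from the markup shown earlier.

```xml
<!-- Same part as in get_genes_by_enzymeRequest, but with a URL-based lowering mapping. -->
<!-- Hypothetical URL: point it at wherever you actually host the lowering XSLT. -->
<part name="enzyme_id" type="xsd:string"
      sawsdl:modelReference="urn:lsid:biomoby.org:namespacetype:EC"
      sawsdl:loweringSchemaMapping="http://example.org/rules/EC2ecPrefixedString.xsl"/>
```

The semantic type (modelReference) is unchanged; only the way the servlet locates the transformation differs.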
<!-- list_organisms -->
  <message name="list_organismsRequest"/>
  <message name="list_organismsResponse">
    <part name="return" type="typens:ArrayOfDefinition"/> 
  </message>
Provide the secondary parameter sources, defining a lifting mapping so we can use definitions as secondary params for other services:
<xsd:complexType name="Definition" sawsdl:liftingSchemaMapping="urn:lsid:bioxml.info:mobyLiftingSchemaMapping:Definition2KeggDefSecondaryString">
  <xsd:all>
    <xsd:element name="entry_id" type="xsd:string"/>
    <xsd:element name="definition" type="xsd:string"/>
  </xsd:all>
</xsd:complexType>

<xsd:complexType name="ArrayOfDefinition">
  <xsd:complexContent>
    <xsd:restriction base="soapenc:Array">
      <xsd:attribute ref="soapenc:arrayType" wsdl:arrayType="typens:Definition[]"/>
    </xsd:restriction>
  </xsd:complexContent>
</xsd:complexType>
An array of Definition, when used as a secondary parameter source, automatically becomes an enumerated string parameter.

Step 3: Write transformation rules (if necessary)

If an exception is thrown when you test your servlet (see Step 5), you may be trying to use a Moby object for which no mapping to the WSDL's XML Schema data type is available, or vice versa. In this case, you will need to write either a DEM (XSLT) or MOB (regular expression) rule to do the conversion. Instructions on how to write these rules can be found here. We hope to create a Web-based MOB and DEM rule repository in the near future.
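To give a rough idea of what a DEM-style lowering transformation can look like, here is a minimal XSLT sketch that turns a Moby Object in the EC namespace into a plain string. It is not the actual bioxml.info EC2ecPrefixedString rule. It assumes that the incoming Moby XML carries the identifier in namespace and id attributes on a moby:Object element, that the target operation expects an "ec:" prefix, and that the servlet consumes the stylesheet's text output; check each of these against the rule-writing instructions before relying on it.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only: assumed input shape is
     <moby:Object moby:namespace="EC" moby:id="..."/> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:moby="http://www.biomoby.org/moby">

  <!-- Emit plain text rather than XML -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Take the first Object in the EC namespace, whether or not
         its attributes are namespace-qualified -->
    <xsl:for-each select="(//moby:Object[@moby:namespace='EC' or @namespace='EC'])[1]">
      <!-- Assumed target format: the identifier with an "ec:" prefix -->
      <xsl:text>ec:</xsl:text>
      <xsl:value-of select="@moby:id | @id"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```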

Step 4: Deploy the servlet

Your WAR file is now ready to host your MOBY-S Web Service, and can be deployed in the servlet container. How you do this depends on the container. For Tomcat, the easiest way is to use the management Web interface (e.g. http://your.servlet.host:8080/manager/html, but change 8080 appropriately if you had to follow Step 0) and upload the WAR.


Figure 1: Specification of WAR upload

Figure 2: Successful deployment

If you don't have a nice Web interface available to do this, here are the alternatives (where $NAME is where you installed the container):
Servlet Container   Where to put get_genes_by_enzyme.war           Further action
Tomcat              $CATALINA_HOME/webapps                         Restart Tomcat
WebLogic            $WEBLOGIC_HOME/config/mydomain/applications    Nothing, it loads the WAR automatically
JBOSS               $JBOSS_HOME/server/default/deploy              Nothing, it loads the WAR automatically

Step 5: Register the service

You should first test your service to make sure it works in the servlet environment. A testing client program is automagically included in your WAR, so type (with the fully qualified host name, and change 8080 appropriately if you had to follow Step 0):

java -jar get_genes_by_enzyme.war http://your.servlet.host:8080/get_genes_by_enzyme/ keggTest.xml
If the service fails, a useful error message should be printed to help you diagnose the problem. The usual causes are an incorrect web.xml, an incorrectly specified WSDL location, or a missing data transformation rule for the required input or output (bad SAWSDL markup, see Step 2).

Finally, if the output looks reasonable (no errors or faults, a real object is returned), you're ready to register your service with MOBY Central so anyone can use it. To do this, rerun the previous command with an extra argument:

java -jar get_genes_by_enzyme.war http://your.servlet.host:8080/get_genes_by_enzyme/ keggTest.xml registerPermanent

This will run the service to make sure it works, then register it with MOBY Central. You'll see the message "Service Successfully Registered!".

Important Notes (Please Read!)

Permanent services: Your service hosts its own RDF specification (retrievable by doing a GET on the servlet URL). An automated agent at MOBY Central retrieves that RDF a few times a day, and if it fails to retrieve it several times in a row, the service is automatically deregistered. Therefore, if you stop or undeploy your servlet for a few days, it will automatically be deregistered. This "passive" deregistration method is for security purposes: no one can (intentionally or accidentally) deregister the service except those who control the servlet container itself.

Syncing with MOBY Central: Your service downloads the MOBY Central ontologies when it starts. If you want your service to pick up an important update to the ontologies, simply restart it (e.g. with the "reload" link in the Tomcat manager interface shown previously).

Using a different MOBY Central: By default, the servlet assumes that the default public Moby Central will be used (this is determined automagically by a combination of hardcoding and fetching info from biomoby.org). If you want to use a different Moby Central, e.g. a test registry, uncomment the mobyCentralURL parameter in the web.xml.
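
A minimal sketch of what the uncommented entry might look like, assuming it follows the same context-param form as the wsdlURL entry in Step 1. The registry URL is a hypothetical placeholder; substitute the endpoint of your own test registry, and check the commented-out entry in your web.xml for the exact form it expects.

```xml
<!-- Assumed form, modeled on the wsdlURL context-param shown in Step 1. -->
<context-param>
  <param-name>mobyCentralURL</param-name>
  <!-- Hypothetical test registry endpoint; replace with your own -->
  <param-value>http://registry.example.org/MOBY-Central</param-value>
</context-param>
```

After editing, put web.xml back into the WAR and redeploy as described in Steps 1 and 4 so the servlet picks up the new registry.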