Configuring the RDF Agent

Overview

The RDFAgent is a program that automatically checks the service signatures held in mobycentral databases against those published on the servers of the service providers. The agent's main purpose is to retrieve RDF documents from URLs, parse these documents and compare their contents with what is available in the mobycentral registry.

RDF documents that contain modified service descriptions will have their descriptions updated in the registry. RDF documents that contain new services will cause the agent to register those services with mobycentral. Finally, services that are missing, i.e. that used to be described in the RDF document but no longer are, will be removed by the agent. The addition, modification and removal of any service is done using the mobycentral API.
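The classification described above can be pictured as a comparison of two lists of service names. The following is only an illustration of that idea, not the agent's actual code; the file names and service names are made up, and detecting a modified description would additionally require comparing the contents of the services found in both lists.

```shell
#!/bin/sh
# Illustration only: classify services by comparing two sorted name lists.
# All file names and service names below are hypothetical.
LC_ALL=C; export LC_ALL          # make sort and comm agree on ordering
workdir=$(mktemp -d)

# Services currently registered under one signature URL.
printf '%s\n' getSequence runBlast oldService | sort > "$workdir/registry.txt"
# Services described in the RDF document found at that URL.
printf '%s\n' getSequence runBlast newService | sort > "$workdir/rdf.txt"

# comm needs sorted input: -13 keeps lines found only in the second file,
# -23 lines found only in the first, -12 lines present in both.
to_register=$(comm -13 "$workdir/registry.txt" "$workdir/rdf.txt")
to_remove=$(comm -23 "$workdir/registry.txt" "$workdir/rdf.txt")
to_compare=$(comm -12 "$workdir/registry.txt" "$workdir/rdf.txt")

echo "register: $to_register"
echo "remove: $to_remove"
echo "compare for modifications:"
echo "$to_compare"
rm -rf "$workdir"
```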

The RDFAgent is very flexible. A registry administrator can choose to have the agent email service providers if their service is invalid or if their service has been modified. The agent can email service providers using SMTP or using 'mail', a UNIX/Linux/*nix email client. Moreover, the agent can be configured either to ignore services that do not contain signature URLs or to remove them.

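This document is broken down into the following sections:

    Getting Started
    Configuring the Agent
    Setting up a Cron Job
    MOBY-Admin Dispatcher
    Uses for the Agent
    FAQ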

Getting Started

  1. A Java Virtual Machine version 1.5 or later is required to run the RDFAgent.

  2. You can build the latest RDFAgent from CVS (or download a packaged version as a zip or tar.gz file and go to step 3):
        cd /path/to/moby-live/Java
        ant bindist-rdfagent


  3. The newly created archive will be placed at /moby-live/Java/docs/dist/, with the filename 'rdfagent-yyyy-mm-dd.zip' or 'rdfagent-yyyy-mm-dd.tar.gz'

    Place the RDFAgent archive in your /path_to/rdfagent_home/ directory (for example, /home/agents/) and then unpack it.

    The directory /path_to/rdfagent_home/rdfagent should be created and populated automatically.

  4. Setting up run-RDFAgent and/or run-RDFAgent.bat:
    The two scripts that run the agent both contain placeholders for the
    following variables:

    	JAVA_HOME = the actual directory of your JDK (not its bin directory)

    	RDF_AGENT_HOME = the agent's home directory, e.g. /home/agents/rdfagent

    These variables must be set for the agent to work properly with these scripts.

    Also, note that it is important to place the variable JAVA_HOME in the environment of the web server you are using so that Mobycentral can access it!

  5. Add the path to the RDFagent to the [mobycentral] section of your mobycentral.config file:
    	  rdfagent = /path/to-your/rdfagent/home/directory
    For example:
    	  
          [mobycentral]
          username = username
          password = password
          url = localhost
          port = 3306
          dbname = mobycentral
          rdfagent = /home/agents/rdfagent
          keyphrase = this is the phrase that i will use to remove services
    	  
  6. Add the deregistration keyphrase to the [mobycentral] section of your mobycentral.config file:
          keyphrase = this is the phrase that i will use to remove services
          
    For example:
    	  
          [mobycentral]
          username = username
          password = password
          url = localhost
          port = 3306
          dbname = mobycentral
          rdfagent = /home/agents/rdfagent/
          keyphrase = this is the phrase that i will use to remove services
          
    More on this phrase in the config section of this document.

  7. Add a new table "service_validation" to the mobycentral database by running the reset script from the rdfagent directory:

          > cd /RDFagentHomeDirectory/rdfagent/
          > ./reset
          password: your root password for MySQL
    	  
  8. Add a new column (signatureURL) to the service_instance table, if it is missing, as the MySQL root user:
    	
        mysql> ALTER TABLE service_instance 
        ADD signatureURL varchar(255) default null;
    	
  9. Adding the Moby Admin dispatcher:
    The script MOBY-Admin.pl must be copied from /moby-live/Perl/scripts/ to /path/to/your/biomoby/central/cgi-bin/.

    This is most likely the same cgi-bin directory in which you placed MOBY-Central.pl.

    In addition, you should create password protection for this file. To do so, create a .htaccess file and place it in the same directory as MOBY-Admin.pl.

    For a template file, please take a look at the .htaccess file included with the agent or read the section on the MOBY-Admin Dispatcher.

  10. Assuming that you have configured the agent via the RDFagent_config.txt file, you are now ready to use the agent.

    The agent has three modes of operation:
    1. a registry-wide validator that iterates through the registry determining which services have been added, removed or updated,
    2. a url mode, in which the agent is told which url to visit, and
    3. a file mode, in which the agent is given a file containing a newline-delimited list of urls that it will attempt to visit.

    The registry wide validator would be the usual mode of operation used by the Mobycentral registry administrator.

    To invoke the agent in registry wide mode do the following:

    On a *nix machine

    	cd /path/to/rdfagent/
    	./run-RDFagent
    	    


    On a windows machine

    	cd /path/to/rdfagent/
    	run-RDFagent.bat
    	    


    To invoke the agent on a specific url do the following:

    On a *nix machine

    	cd /path/to/rdfagent/
        ./run-RDFagent -url www.someURL.com
    	    


    On a windows machine

     	cd /path/to/rdfagent/
    	run-RDFagent.bat -url www.someURL.com
    	    
  11. To invoke the agent on a file containing urls do the following:

    On a *nix machine,
    	cd /path/to/rdfagent/
    	./run-RDFagent -file /path/to/file/myFile
    	  
    On a windows machine
    	cd /path/to/rdfagent/
    	run-RDFagent.bat -file C:\path\to\file\myFile
          

  1. Note: If the path contains whitespace, place the path in quotes or it will not be interpreted correctly.

  2. A brief note regarding the behaviour of -url and -file: if the agent is invoked with -url or -file, it will attempt to perform an HTTP GET on each url to retrieve a file and then parse that file into services. If the url is invalid, the agent will remove all of the services whose signature url is equal to that url.
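Before the first run, the two variables from step 4 can be sanity-checked from a shell. The helper below is a minimal sketch and is not part of the agent distribution: the function name check_agent_env is made up, and the test for bin/java assumes a *nix JDK layout.

```shell
#!/bin/sh
# Hypothetical helper: verify the two variables from step 4 before a run.
# $1 = JAVA_HOME candidate, $2 = RDF_AGENT_HOME candidate
check_agent_env() {
    if [ -z "$1" ] || [ ! -d "$1" ]; then
        echo "JAVA_HOME is unset or not a directory: '$1'" >&2
        return 1
    fi
    # JAVA_HOME should be the JDK root, so java is expected under bin/.
    if [ ! -x "$1/bin/java" ]; then
        echo "no java executable under $1/bin" >&2
        return 1
    fi
    if [ -z "$2" ] || [ ! -d "$2" ]; then
        echo "RDF_AGENT_HOME is unset or not a directory: '$2'" >&2
        return 1
    fi
    return 0
}

# Example call; the values would normally come from the environment.
if check_agent_env "${JAVA_HOME:-}" "${RDF_AGENT_HOME:-}"; then
    echo "environment looks usable"
else
    echo "fix the variables above before running the agent"
fi
```

Passing the values as arguments, rather than reading them inside the function, makes it easy to try the check on the values you intend to put into the scripts before editing them.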

Configuring the Agent

The agent is highly configurable. Every user of the agent will have to configure it for their particular registry's needs. The configuration file is called RDFagent_config.txt and is made up of key = value pairs.

Below is a listing of the parameters:

admin.name
    The name of the registry administrator. This name will be used on outgoing emails sent by the agent.
admin.email
    The email address of the registry administrator.
admin.registry.conf
    The path to the mobycentral.config file. This is the file that contains the mobycentral registry's database configuration.
admin.registry.url
    The registry endpoint; the URL location of the mobycentral registry.
admin.registry.uri
    The registry namespace; the URI of the registry.
admin.registry.ignore.null
    This tells the agent whether or not to ignore services that do not have a signatureURL. By default, all null signatureURLs are ignored, but for those of you wishing to purge them from your local registry this is the option for you.
    Acceptable values are yes or no.
admin.registry.removal.endpoint
    The endpoint of the MOBY-Admin dispatcher.
admin.registry.removal.uri
    The namespace of the MOBY-Admin dispatcher.
admin.registry.removal.username
    The username that the agent should use to access the MOBY-Admin dispatcher.
admin.registry.removal.password
    The password that the agent should use to access the MOBY-Admin dispatcher.
admin.registry.removal.keyphrase
    The keyphrase that the agent will give to the dispatcher when trying to remove a service. This phrase must be equal to the one in the mobycentral.config configuration file or the removal of services will fail.
admin.mail.smtp.enable
    This tells the agent whether or not it should attempt to send email using SMTP. If SMTP is disabled, then the agent will attempt to use the program 'mail' if it exists.
    Acceptable values are yes or no.
admin.mail.smtp.server
    The SMTP server address.
admin.mail.smtp.login
    The user login that the agent should use while sending mail via SMTP.
admin.mail.smtp.password
    The user password needed for sending SMTP mail.
registry.email
    This tells the agent whether or not it should send a service provider an email message if their services are modified.
    Acceptable values are yes or no.
registry.deregister
    This tells the agent whether or not it should remove services that are candidates for removal from the registry.
    Acceptable values are yes or no.
registry.deregister.from_url
    This tells the agent whether or not it should remove services that are candidates for removal when the agent is invoked on a particular URL.
    Acceptable values are yes or no.
registry.update.services
    This tells the agent whether or not it should attempt to update the registry with modified services that are found in service providers' RDF documents. Note that if the modified service is invalid, the old unmodified version of the service will be re-registered.
    If the old unmodified version of the service is itself invalid or rejected by the registry, the service will no longer be present in the registry.
    Acceptable values are yes or no.
registry.count
    This tells the agent how many chances a signatureURL should have before it is considered unreachable or invalid.
    Acceptable values are non-negative integers.
log.level
    This specifies the level of logging that the agent should perform.
    Acceptable values are all, info, warn or severe.
log.enable
    This tells the agent whether or not logging is enabled.
agent.log.directory
    This tells the agent where to store the log files. The path must be a valid path or the agent will not be able to save its log files.

Note that it may be necessary for you to create this directory and set up the appropriate permissions so that the agent can write to it when invoked by the registry.
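Several of the parameters above accept only yes or no. The following sketch checks a configuration file for those parameters before the agent is run. It is not part of the agent: the function name check_yes_no is made up, the parsing assumes one key = value pair per line, and it accepts only lowercase yes and no (whether the agent itself accepts other spellings is not known).

```shell
#!/bin/sh
# Hypothetical checker: report yes/no parameters that are missing or that
# hold some other value. $1 = path to RDFagent_config.txt
check_yes_no() {
    conf=$1
    rc=0
    for key in admin.registry.ignore.null admin.mail.smtp.enable \
               registry.email registry.deregister \
               registry.deregister.from_url registry.update.services; do
        # Take the value of the last non-comment line whose key matches.
        value=$(awk -F= -v k="$key" '
            /^[ \t]*#/ { next }
            {
                name = $1
                gsub(/^[ \t]+|[ \t]+$/, "", name)
                if (name == k) {
                    v = substr($0, index($0, "=") + 1)
                    gsub(/^[ \t]+|[ \t]+$/, "", v)
                    found = v
                }
            }
            END { print found }' "$conf")
        case $value in
            yes|no) ;;
            *) echo "$key: expected yes or no, found '$value'" >&2
               rc=1 ;;
        esac
    done
    return $rc
}

# Usage (the path is a placeholder):
#   check_yes_no /home/agents/rdfagent/RDFagent_config.txt
```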

Note: Comments are introduced with a #. The following characters must be escaped if you intend to use them:

		# = :

You can escape a character by prepending a \ to it, e.g. \=.
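As a small illustration of the escaping rule, the sketch below prepends a backslash to each of the three characters in a value before it is pasted into the config file. The function name escape_config_value and the sample value are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: backslash-escape '#', '=' and ':' in a config value.
escape_config_value() {
    # In the sed replacement, \\ is a literal backslash and & is the match.
    printf '%s\n' "$1" | sed 's/[#=:]/\\&/g'
}

# Example with a made-up password:
escape_config_value 'pa#ss=wo:rd'
# prints: pa\#ss\=wo\:rd
```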

Setting up a Cron Job

If you are interested in setting up a cron job to run the RDFAgent automatically, you will need to do the following:

    1. Create a text file that contains the following text:

      05 8 1 * * source /homedir/rdfagent/run-RDFagent

      where
      	  
            05 specifies the minute
            8  specifies the hour (24 hour clock)
            1  specifies the day of the month
      	  

      The last two *s stand for
      the number of the month (or an abbreviation like Jan, Feb, etc.) and
      the day of the week as a number (or an abbreviation like Mon, etc.).

      So, in our example, cron would run the agent on the first day of every month at
      8:05am.

    2. Save this text file in /etc/some_name, e.g. /etc/moby.cron

    3. As root, create a crontab for the user:
      	  crontab -u username /etc/moby.cron
      (assuming the cron file is called moby.cron and is located in /etc/).

      Users' crontab files are saved in the /var/spool/cron directory and should not be edited directly. They should only be accessed via the crontab command.

    4. That's all. If you would like to make sure that the new crontab is picked up, you can restart the crond service by issuing the following commands:
            /sbin/service crond stop
            /sbin/service crond start
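To experiment with other schedules, the five time fields can be assembled by a small helper. This is only a sketch; the function name cron_entry is made up, and the command is the one used in step 1.

```shell
#!/bin/sh
# Hypothetical helper: build one crontab entry from its fields.
# Arguments: minute hour day-of-month month day-of-week command...
cron_entry() {
    minute=$1 hour=$2 dom=$3 month=$4 dow=$5
    shift 5
    printf '%s %s %s %s %s %s\n' "$minute" "$hour" "$dom" "$month" "$dow" "$*"
}

# The monthly entry from step 1 (the quotes stop the shell expanding '*'):
cron_entry 05 8 1 '*' '*' source /homedir/rdfagent/run-RDFagent
# prints: 05 8 1 * * source /homedir/rdfagent/run-RDFagent

# A weekly alternative: every Monday at 02:30.
cron_entry 30 2 '*' '*' 1 source /homedir/rdfagent/run-RDFagent
# prints: 30 2 * * 1 source /homedir/rdfagent/run-RDFagent
```

Once the table has been installed, root can list it with crontab -l -u username to confirm that the entry is present.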

MOBY-Admin Dispatcher

The MOBY-Admin dispatcher is necessary for service removal. This dispatcher should only be accessible through the use of authentication.

In order to set up authentication, you will have to place a file with the following contents in the same directory as the MOBY-Admin.pl dispatcher:

  
AuthName "MOBY Central Admin"
AuthType Basic

# path to the config file
AuthUserFile c:/apache2/Apache2/users

# protect the Moby Central Admin dispatcher by a password
<Files "MOBY-Admin.pl">
	# require a valid-user (specified in users)
	Require valid-user
</Files>

This file must be called '.htaccess'.
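On a *nix server the file can also be generated from a shell. The sketch below writes an .htaccess with the directives described above into a given directory; the function name write_htaccess and both paths in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: write the .htaccess file for the dispatcher.
# $1 = directory holding MOBY-Admin.pl, $2 = path to the htpasswd users file
write_htaccess() {
    cat > "$1/.htaccess" <<EOF
AuthName "MOBY Central Admin"
AuthType Basic

# path to the users file created with htpasswd
AuthUserFile $2

# protect the Moby Central Admin dispatcher by a password
<Files "MOBY-Admin.pl">
    Require valid-user
</Files>
EOF
}

# Usage (both paths are placeholders):
#   write_htaccess /usr/local/apache2/cgi-bin /path/that/exists/users
```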

Assuming that the MOBY-Admin.pl file is located in the cgi-bin directory of apache, you will have to add the following line to httpd.conf where the cgi-bin directory is defined:

AllowOverride AuthConfig

i.e.

<Directory "/usr/local/apache2/cgi-bin">
	Options Indexes
	# old value: AllowOverride None
	# new value (Apache does not allow a comment on the same line as a directive):
	AllowOverride AuthConfig
	Order allow,deny
	Allow from all
	<Files *.html>
		SetHandler type-map
	</Files>
	SetEnvIf Request_URI ^/manual/(de|en|es|fr|ja|ko|ru)/ prefer-language=$1
	RedirectMatch 301 ^/manual(?:/(de|en|es|fr|ja|ko|ru)){2,}(/.*)?$ /manual/$1$2
</Directory>

To create a username and password to access the dispatcher with, do the following:

  cd /path/to/apache/bin
  htpasswd -c /path/that/exists/users username
  

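The -c option tells htpasswd to create a *new* password file; here the file is called users and is placed in /path/that/exists/. If the file already exists, -c overwrites it.

username is the login name to create.

You will be prompted for a password.

This login name and password will have to be entered into the agent's configuration file (admin.registry.removal.username and admin.registry.removal.password).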

Don't forget to restart apache each and every time that you modify the httpd.conf file!
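After the restart, one way to check that the password protection is active is to request the dispatcher with and without credentials and compare the HTTP status codes. The following is a rough sketch that assumes curl is installed; the function names, URL and login are placeholders, and the status the dispatcher returns to an authenticated plain GET is not known, so only the 401 answer is interpreted.

```shell
#!/bin/sh
# Hypothetical check: fetch only the HTTP status code of a URL with curl.
# $1 = dispatcher URL, $2 = optional "username:password"
http_status() {
    if [ -n "${2:-}" ]; then
        curl -s -o /dev/null -w '%{http_code}' -u "$2" "$1"
    else
        curl -s -o /dev/null -w '%{http_code}' "$1"
    fi
}

# Interpret the two codes: without credentials the server should answer 401;
# with credentials, anything other than 401 (or 000, which curl prints when
# no response arrived) suggests that the login was accepted.
auth_ok() {
    [ "$1" = 401 ] && [ "$2" != 401 ] && [ "$2" != 000 ]
}

# Usage (URL and login are placeholders):
#   anon=$(http_status http://your.domain/cgi-bin/MOBY-Admin.pl)
#   auth=$(http_status http://your.domain/cgi-bin/MOBY-Admin.pl username:password)
#   auth_ok "$anon" "$auth" && echo "dispatcher is password protected"
```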

Uses for the Agent

The following are some uses for the agent:

  1. Curating a mobycentral registry so that abandoned services are removed.

  2. Initializing an empty registry with services.

    If one has an RDF document describing the services they would like to keep in their registry, all the user has to do is place that document on the web and invoke the agent on that url using the -url option. Their registry will now contain the services described in that document.

    This can be done using the RDF obtained from the RESOURCES script running on mobycentral to create a registry that mirrors the mobycentral registry.

  3. As mentioned in 2 above, the agent could be used for mirroring of registries.

  4. Creating a registry that contains only a filtered subset of services. If a user had a script that output a signature url for each service type, authority, etc. of interest and saved these urls in a file, then they could invoke the agent using the -file option and have a filtered set of services in their custom registry.
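As a sketch of the filtering idea in use 4, the helper below extracts the signature urls of one authority from a listing and prints them one per line, ready to be saved for the -file option. The listing format (authority, a tab, then the signature url) and the function name urls_for_authority are assumptions made for this example; they are not something the agent or mobycentral provides.

```shell
#!/bin/sh
# Hypothetical input: one "authority<TAB>signatureURL" pair per line.
# Print the urls of one authority, without duplicates or empty entries.
# $1 = authority, $2 = listing file
urls_for_authority() {
    awk -F'\t' -v a="$1" '$1 == a && $2 != "" { print $2 }' "$2" | sort -u
}

# Usage (all names are placeholders):
#   urls_for_authority your.domain listing.tsv > /path/to/file/myFile
#   cd /path/to/rdfagent/
#   ./run-RDFagent -file /path/to/file/myFile
```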


If you have any other uses for the agent, please let me know.

FAQ

Below are the frequently asked questions. If you have a question not on this list, please let me know!

  1. I have installed everything according to this document, but when I run the agent, I get an error stating:

        "An empty value was found where a yes or no was expected in the config file ..."

    What did I do wrong?

    A.
    First, please ensure that all of the values are filled out properly. If that wasn't the problem, then try adding the following line to the bottom of the config file:

        agent.output.flatfile = yes 

    Once you have added that value to the config file, try testing the agent again.

  2. I am getting the following error:

        org.xml.sax.SAXException: Bad types (class java.lang.String -> int)

    What's going on?

    A.
    This happens whenever you provide the wrong value for admin.registry.removal.endpoint. Make sure that this url points to MOBY-Admin.pl. Also, make sure that admin.registry.removal.uri looks something like

        http://your.domain/MOBY/Admin

    Once you have fixed that value, try testing the agent again.
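For question 2, a quick check of the config file can rule out the most obvious mistake. The sketch below only verifies that the value of admin.registry.removal.endpoint ends in /MOBY-Admin.pl; the function name check_removal_endpoint is made up, and the check assumes that the dispatcher script has not been renamed.

```shell
#!/bin/sh
# Hypothetical check: does the removal endpoint point at MOBY-Admin.pl?
# $1 = path to RDFagent_config.txt
check_removal_endpoint() {
    # Strip trailing whitespace, then print what follows "key =" for the
    # matching line; keep the last match if the key appears more than once.
    endpoint=$(sed -n -e 's/[[:space:]]*$//' \
        -e 's/^[[:space:]]*admin\.registry\.removal\.endpoint[[:space:]]*=[[:space:]]*//p' \
        "$1" | tail -n 1)
    case $endpoint in
        */MOBY-Admin.pl) return 0 ;;
        *) echo "removal endpoint does not end in /MOBY-Admin.pl: '$endpoint'" >&2
           return 1 ;;
    esac
}

# Usage (the path is a placeholder):
#   check_removal_endpoint /home/agents/rdfagent/RDFagent_config.txt
```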


Edward Kawas
Last modified: Thu Feb 14 15:07:15 2008
