# Configuring the RDF Agent

### Overview

The RDFAgent is a program that automatically checks the service signatures held in a mobycentral registry against those published on the service providers' servers. The agent's main purpose is to retrieve RDF documents from URLs, parse these documents, and compare their contents with what is available in the mobycentral registry.

Services whose descriptions have been modified in an RDF document will have their descriptions updated in the registry. Services that are new to an RDF document will be registered with mobycentral by the agent. Finally, services that are missing, i.e. that used to be described in the RDF document but no longer are, will be removed by the agent. The addition, modification and removal of any service is done using the mobycentral API.

The RDFagent is flexible. A registry administrator can choose to have the agent email service providers if their service is invalid or if their service has been modified. The agent can email service providers using SMTP or using 'mail', a UNIX/Linux/*nix email client. Moreover, the agent can be set either to ignore those services that do not contain signature URLs or to remove them.

### Getting Started

1. A Java Virtual Machine version 1.5 or later is required to run the RDFAgent.

2. You can build the latest RDFAgent from CVS (or download a packaged version as a zip or tar.gz file and go to step 3):

        cd /path/to/moby-live/Java
        ant bindist-rdfagent

3. The newly created archive will be placed in /moby-live/Java/docs/dist/, with the filename 'rdfagent-yyyy-mm-dd.zip' or 'rdfagent-yyyy-mm-dd.tar.gz'.

Place the RDFagent archive in your /path_to/rdfagent_home/ directory (for example, /home/agents/) and then unpack it.

The directory /path_to/rdfagent_home/rdfagent should be created and populated automatically.

4. Set up run-RDFagent and/or run-RDFagent.bat. Both of the scripts that run the agent contain placeholders for two variables:

        JAVA_HOME = actual directory for your JDK (not the bin dir)
        RDF_AGENT_HOME = the agent's home directory, e.g. /home/agents/rdfagent

These variables must be set for the agent to work properly with these scripts.

Also note that JAVA_HOME must be set in the environment of the web server you are using so that Mobycentral can access it.

5. Add the path to the RDFagent to the [mobycentral] section of your mobycentral.config file:

        rdfagent = /path/to-your/rdfagent/home/directory

For example:

    [mobycentral]
    url = localhost
    port = 3306
    dbname = mobycentral
    rdfagent = /home/agents/rdfagent
    keyphrase = this is the phrase that i will use to remove services

6. Add the deregistration keyphrase to the [mobycentral] section of your mobycentral.config file:

        keyphrase = this is the phrase that i will use to remove services

For example:

    [mobycentral]
    url = localhost
    port = 3306
    dbname = mobycentral
    rdfagent = /home/agents/rdfagent
    keyphrase = this is the phrase that i will use to remove services

More on this phrase in the config section of this document.

7. Add a new table, "service_validation", to the mobycentral database by running the reset script from the rdfagent directory:

        cd /RDFagentHomeDirectory/rdfagent/
        ./reset

8. As the MySQL root user, add a new column, signatureURL, to the service_instance table (if it is missing):

        mysql> ALTER TABLE service_instance

9. The script MOBY-Admin.pl must be copied from /moby-live/Perl/scripts/ to /path/to/your/biomoby/central/cgi-bin/.

This cgi-bin is most likely the same directory in which you placed MOBY-Central.pl.

In addition, you should create password protection for this file. To do so, create a .htaccess file and place it in the same directory as MOBY-Admin.pl.

For a template file, please take a look at the .htaccess file included with the agent or read the section on the MOBY-Admin Dispatcher.

10. Assuming that you have configured the agent via the RDFagent_config.txt file, you are now ready to use the agent.

The agent has three modes of operation:
1. a registry wide validator that iterates through the registry, determining which services have been added, removed or updated,
2. a url mode, in which you tell the agent exactly which URL to visit, and
3. a file mode, in which you give the agent a file containing a newline delimited list of URLs that it will attempt to visit.

The registry wide validator would be the usual mode of operation used by the Mobycentral registry administrator.

To invoke the agent in registry wide mode do the following:

On a *nix machine

    cd /path/to/rdfagent/
    ./run-RDFagent

On a windows machine

    cd /path/to/rdfagent/
    run-RDFagent.bat


To invoke the agent on a specific url do the following:

On a *nix machine

    cd /path/to/rdfagent/
    ./run-RDFagent -url www.someURL.com

On a windows machine

    cd /path/to/rdfagent/
    run-RDFagent.bat -url www.someURL.com

To invoke the agent on a file containing urls do the following:

On a *nix machine

    cd /path/to/rdfagent/
    ./run-RDFagent -file /path/to/file/myFile

On a windows machine

    cd /path/to/rdfagent/
    run-RDFagent.bat -file C:\path\to\file\myFile


Note: If the path contains whitespace, place the path in quotes or it will not be interpreted correctly.

A brief note regarding the behaviour of -url and -file: for each URL it is given, the agent attempts an HTTP GET on that URL to retrieve a document and then parses that document into services. If the URL is invalid, the agent removes all of the services whose signature URL is equal to that URL.
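As a rough illustration of file mode and of the quoting note above, the following sketch writes a newline delimited list of URLs to a file whose name contains a space and then passes the quoted path to the agent. The example.org addresses, the file name and the AGENT_DIR variable are placeholders invented for this example; only run-RDFagent and the -file option come from this document.

```shell
#!/bin/sh
# Hypothetical locations; AGENT_DIR should point at your rdfagent directory.
AGENT_DIR="${AGENT_DIR:-/path/to/rdfagent}"
WORK_DIR=$(mktemp -d)
URL_FILE="$WORK_DIR/my urls.txt"   # the space in the name is deliberate

# One signature URL per line (placeholder addresses).
printf '%s\n' \
    'http://example.org/services/first.rdf' \
    'http://example.org/services/second.rdf' > "$URL_FILE"

# Quote the path so the shell passes it to the agent as a single argument.
if [ -x "$AGENT_DIR/run-RDFagent" ]; then
    (cd "$AGENT_DIR" && ./run-RDFagent -file "$URL_FILE")
else
    echo "run-RDFagent not found in $AGENT_DIR; would run: ./run-RDFagent -file \"$URL_FILE\""
fi
```

Because an invalid URL causes the matching services to be removed, it is probably worth checking each line of such a file by hand before running the agent against a production registry.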

### Configuring the Agent

The agent is highly configurable, and every user will have to configure it to their particular registry's needs. The configuration file is called RDFagent_config.txt and is made up of key = value pairs.

Below is a listing of the parameters:

Note: Comments begin with a #, and the following characters must be escaped if you intend to use them literally:

    # = :

You can escape a character by prepending a \ to it, e.g. \=.
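The escaping rule can be applied mechanically. The helper below is a minimal sketch, not part of the agent: escape_config_value is a hypothetical name, and the sketch assumes the rule applies to every #, = and : that appears in a value.

```shell
#!/bin/sh
# escape_config_value TEXT
# Prints TEXT with a backslash placed in front of every '#', '=' and ':'.
# (Hypothetical helper; it is not shipped with the RDFagent.)
escape_config_value() {
    printf '%s\n' "$1" | sed 's/[#=:]/\\&/g'
}

# Made-up value containing all three special characters.
escape_config_value 'Service modified: id=42 #urgent'
```

Run as written, this prints `Service modified\: id\=42 \#urgent`, which is the form the note above asks for.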

### Setting up a Cron Job

If you would like to set up a cron job that runs the RDFAgent automatically, you will need to do the following:

1. Create a text file that contains the following text:

        05 8 1 * * source /homedir/rdfagent/run-RDFagent

where

        05 specifies the minute
        8  specifies the hour (24 hour clock)
        1  specifies the day of the month

The last two *s denote, in order, the number of the month (or an abbreviation like Jan, Feb, etc.) and the day of the week as a number (or an abbreviation like Mon, etc.).

So, in our example, cron would run the agent on the first day of every month at 8:05am.

2. Save this text file in /etc/some_name, e.g. /etc/moby.cron

3. As root, create a crontab for the user (assuming the cron file is called moby.cron and is located in /etc/):

        crontab -u username /etc/moby.cron

Users' crontab files are stored in the /var/spool/cron directory and should not be edited directly. They should only be accessed via the crontab command.

4. That's all. If you would like to see whether you set up the cron job correctly, you can restart the crond service by issuing the following commands:

        /sbin/service crond stop
        /sbin/service crond start
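The steps above can be sketched as a single script. This is an illustration under stated assumptions rather than a recommended installer: it writes the cron file to a scratch location instead of /etc/moby.cron, the CRON_FILE, CRON_USER and INSTALL_CRON variables are invented for the example, and the install step is opt-in because `crontab -u username file` replaces that user's existing crontab.

```shell
#!/bin/sh
# Hypothetical settings; the text above uses /etc/moby.cron and "username".
CRON_FILE="${CRON_FILE:-${TMPDIR:-/tmp}/moby.cron}"
CRON_USER="${CRON_USER:-username}"

# Step 1 and 2: write the schedule, with a comment naming the five fields.
cat > "$CRON_FILE" <<'EOF'
# minute hour day-of-month month day-of-week command
05 8 1 * * source /homedir/rdfagent/run-RDFagent
EOF

# Step 3: install only when asked to, then list the result (run as root).
if [ "${INSTALL_CRON:-no}" = "yes" ]; then
    crontab -u "$CRON_USER" "$CRON_FILE"
    crontab -u "$CRON_USER" -l
else
    echo "Wrote $CRON_FILE; set INSTALL_CRON=yes to install it for $CRON_USER"
fi
```

Listing the crontab with `crontab -u username -l` after installing is a quick way to confirm that the entry was accepted.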

### MOBY-Admin Dispatcher

The MOBY-Admin dispatcher is necessary for service removal. This dispatcher should only be accessible through the use of authentication.

To set up authentication, place a file with the following contents in the same directory as the MOBY-Admin.pl dispatcher:

    <Files MOBY-Admin.pl>
    AuthType Basic

    # path to the password file
    AuthUserFile c:/apache2/Apache2/users

    # require a valid-user (specified in users)
    Require valid-user
    </Files>


This file must be called '.htaccess'.

Assuming that the MOBY-Admin.pl file is located in the cgi-bin directory of apache, you will have to add the following line to httpd.conf where the cgi-bin directory is defined:

    AllowOverride AuthConfig

For example:

    <Directory "/usr/local/apache2/cgi-bin">
    Options Indexes
    # old value:
    # AllowOverride None
    # new value:
    AllowOverride AuthConfig
    Order allow,deny
    Allow from all
    <Files *.html>
    SetHandler type-map
    </Files>
    SetEnvIf Request_URI ^/manual/(de|en|es|fr|ja|ko|ru)/ prefer-language=$1
    RedirectMatch 301 ^/manual(?:/(de|en|es|fr|ja|ko|ru)){2,}(/.*)?$ /manual/$1$2
    </Directory>

Note that Apache does not allow a comment on the same line as a directive, so the old and new values are marked on separate comment lines.


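To create a username and password for accessing the dispatcher, do the following (users and username are placeholders for your password file and login name):

    cd /path/to/apache/bin
    htpasswd -c /path/that/exists/users username

The -c option tells htpasswd to create a *new* password file, placed here in /path/that/exists/. This must be the same file that AuthUserFile points to in the .htaccess file.

You will be prompted for a password.

This login name and password will have to be entered into the agent's configuration file.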

Don't forget to restart apache each and every time that you modify the httpd.conf file!
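Before relying on the dispatcher, it can help to confirm that the .htaccess file really sits next to MOBY-Admin.pl. The function below is a small sketch of such a check; check_dispatcher_protection and the CGI_BIN variable are hypothetical names, and the check only inspects files on disk, so it cannot tell whether Apache actually honours them.

```shell
#!/bin/sh
# check_dispatcher_protection DIR
# Returns 0 when DIR holds MOBY-Admin.pl together with a .htaccess that
# requires a valid user, 1 when that protection is missing, and 2 when the
# dispatcher itself is not in DIR. (Hypothetical helper for illustration.)
check_dispatcher_protection() {
    dir=$1
    if [ ! -f "$dir/MOBY-Admin.pl" ]; then
        echo "MOBY-Admin.pl not found in $dir"
        return 2
    fi
    if [ ! -f "$dir/.htaccess" ]; then
        echo "$dir/.htaccess is missing"
        return 1
    fi
    if ! grep -qi '^[[:space:]]*Require[[:space:]][[:space:]]*valid-user' "$dir/.htaccess"; then
        echo "$dir/.htaccess does not contain 'Require valid-user'"
        return 1
    fi
    echo "$dir looks protected"
}

# Placeholder path from this document; point CGI_BIN at your real cgi-bin.
check_dispatcher_protection "${CGI_BIN:-/path/to/your/biomoby/central/cgi-bin}" || true
```

A file check like this complements, but does not replace, requesting the dispatcher's URL without credentials and confirming that the server refuses the request.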

### Uses for the Agent

The following are some uses for the agent:

1. Curating a mobycentral registry so that abandoned services are removed.

2. Initializing an empty registry with services.

If you have an RDF document describing the services you would like to keep in your registry, all you have to do is place that document on the web and invoke the agent on its URL using the -url option. Your registry will then contain the services described in that document.

This can be done with the RDF obtained from the RESOURCES script running on mobycentral, which creates a registry that mirrors the mobycentral registry.

3. As mentioned in 2 above, mirroring registries.

4. Creating a registry that is filtered to contain only a chosen subset of services. If you have a script that outputs a signature URL for each service of a given service type, authority, etc., and saves these URLs in a file, then you can invoke the agent using the -file option and have a filtered set of services in your custom registry.
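To make the filtered registry idea in item 4 concrete, here is one possible shape for such a filter. It assumes a listing file with one `authority<TAB>signatureURL` pair per line; that format, the function name filter_by_authority and the example.org data are all invented for this sketch and are not something the agent or mobycentral produces.

```shell
#!/bin/sh
# filter_by_authority AUTHORITY LISTING
# LISTING is a hypothetical tab-separated file: authority<TAB>signatureURL.
# Prints the distinct, non-empty signature URLs listed under AUTHORITY.
filter_by_authority() {
    awk -F '\t' -v auth="$1" '$1 == auth && $2 != "" { print $2 }' "$2" | sort -u
}

# Example run on made-up data in a scratch directory.
WORK_DIR=$(mktemp -d)
printf '%s\t%s\n' \
    'example.org' 'http://example.org/rdf/services.rdf' \
    'example.org' 'http://example.org/rdf/services.rdf' \
    'other.example' 'http://other.example/moby.rdf' \
    'example.org' 'http://example.org/rdf/more.rdf' > "$WORK_DIR/listing.tsv"

filter_by_authority 'example.org' "$WORK_DIR/listing.tsv" > "$WORK_DIR/filtered-urls.txt"
cat "$WORK_DIR/filtered-urls.txt"
```

The resulting file holds one URL per line, so it could then be handed to the agent with `./run-RDFagent -file "$WORK_DIR/filtered-urls.txt"`.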

If you have any other uses for the agent, please let me know.

### FAQ

Below are the frequently asked questions. If you have a question not on this list, please let me know!

1. I have installed everything according to this document, but when I run the agent, I get an error stating:

    "An empty value was found where a yes or no was expected in the config file ..."

What did I do wrong?

A. First, please ensure that all of the values are filled out properly. If that wasn't the problem, then try adding the following line to the bottom of the config file:

    agent.output.flatfile = yes

Once you have added that value to the config file, try testing the agent again.

2. I am getting the following error:

    org.xml.sax.SAXException: Bad types (class java.lang.String -> int)

What's going on?

A. This happens whenever you provide the wrong value for admin.registry.removal.endpoint. Make sure that this URL points to MOBY-Admin.pl. Also, make sure that admin.registry.removal.uri looks something like

    http://your.domain/MOBY/Admin

Once you have fixed that value, try testing the agent again.

Edward Kawas