                             User's Manual
                             ~~~~~~~~~~~~~
                     RDFagent version 2 - February 06
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                       =-=-=-=-=-=-=-=-=-=-=-=-=-=-=
                       Welcome to the Moby RDFAgent!
                       -=-=-=-=-=-=-=-=-=-=-=-=-=-=-


******************** TOC ********************

Section 1   ........................ Overview
Section 2   ................. Getting Started
Section 3   ........... Configuring the Agent
Section 4   ........... Setting up a Cron Job
Section 5   ........... MOBY-Admin Dispatcher
Section 6   .............. Uses for the Agent

******************** TOC ********************


Overview

	The RDFAgent is a program for the automatic checking of the service signatures
between mobycentral databases and servers of the service providers. The agent's
main purpose is to retrieve RDF documents from URLs, parse these documents and
compare their contents with what is available in the mobycentral registry.

	RDF documents that contain modified service descriptions will have their
descriptions updated in the registry. RDF documents that contain new services
will cause the agent to register those services with mobycentral. Finally,
services that are missing, i.e. that used to be described in the RDF document
but no longer are, will be removed by the agent. The addition, modification and
removal of any service is done using the mobycentral API.

	The RDFagent is very flexible. A registry administrator can choose to have
the agent email service providers if their service is invalid or if their
service has been modified. The agent can email service providers using SMTP or
using 'mail', a UNIX/Linux/*nix email client. Moreover, the agent can either
ignore services that do not have a signature url, warning the administrator
each time one is found, or remove them.
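
	The comparison described above can be pictured with a small Python sketch.
It is only an illustration, not the agent's code: the function name
classify_services and the plain dictionaries standing in for the parsed RDF
document and the registry contents are assumptions made for this example.

```python
def classify_services(registry, rdf_document):
    """Compare service descriptions keyed by service name.

    Both arguments are plain dicts mapping a service name to its
    description (hypothetical stand-ins for the registry contents and
    the parsed RDF document).
    """
    new = sorted(name for name in rdf_document if name not in registry)
    missing = sorted(name for name in registry if name not in rdf_document)
    modified = sorted(
        name for name in rdf_document
        if name in registry and rdf_document[name] != registry[name]
    )
    return {"register": new, "update": modified, "remove": missing}


# Hypothetical data: one unchanged, one modified, one new, one missing.
registry = {"getSeq": "v1", "blast": "v1", "oldService": "v1"}
rdf_document = {"getSeq": "v1", "blast": "v2", "newService": "v1"}
actions = classify_services(registry, rdf_document)
print(actions)
```

	In the real agent each of these three actions is carried out through the
mobycentral API, as described above.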
     

Getting Started

	1. A Java Virtual Machine version 1.4.2 or later is required to run the
	   RDFAgent.

	2. You can build the latest RDFAgent from CVS by running
		on a *nix box		/moby-live/Java/build.sh bindist_rdfagent
					or
		on a windows box	/moby-live/Java/build.bat bindist_rdfagent
		
		The archive will be located at /moby-live/Java/docs/dist/, called
			rdfagent-yyyy-mm-dd.zip or rdfagent-yyyy-mm-dd.tar.gz
			
	   	Place the RDFagent archive in your /path_to/rdfagent_home/ directory,
	   	for example: /home/agents/ and then unpack the program. The directory 
	   	/path_to/rdfagent_home/rdfagent should automatically be created and 
	   	populated.

	3. Setting up run-RDFAgent and/or run-RDFAgent.bat:
		The 2 scripts that run the agent both contain placeholders for the
		variables:
		
	    	JAVA_HOME = actual directory for your jdk (not the bin dir), and
	    	AGENT_HOME = the agent's home directory, e.g. /home/agents/rdfagent
	    
	    These variables must be set for the agent to work properly with these
	    scripts.
	
	4. Add the path to the RDFagent home directory to the [mobycentral] section 
	   of your mobycentral.config file:
	
	    rdfagent = /path/to-your/rdfagent/home/directory
	    
	    For example:
	
	    [mobycentral]
	    username = username
	    password = password
	    url = localhost
	    port = 3306
	    dbname = mobycentral
	    rdfagent = /home/agents/rdfagent/
	    keyphrase = this is the phrase that i will use to remove services
	
	5. Add the deregistration keyphrase to the [mobycentral] section 
	   of your mobycentral.config file:
	
	    keyphrase = this is the phrase that i will use to remove services
	    
	    For example:
	
	    [mobycentral]
	    username = username
	    password = password
	    url = localhost
	    port = 3306
	    dbname = mobycentral
	    rdfagent = /home/agents/rdfagent/run-RDFagent
	    keyphrase = this is the phrase that i will use to remove services
	    
	   More on this phrase in the config section of this document.
	   
	6. Add a new table "service_validation" to mobycentral database:
	
		  run *reset    (from /rdfagent directory)
		  > cd /RDFagentHomeDirectory/rdfagent/
		  > ./reset
	  	  password: your root password for mySQL
	  
	7. Add a new column (signatureURL) to the service_instance table (if it's
	   missing), as mySQL root:
	
		 mysql> ALTER TABLE service_instance
		 	ADD signatureURL varchar(255) default null;

	8. Adding the Moby Admin dispatcher:
		The script MOBY-Admin.pl must be copied from /moby-live/Perl/scripts/
		to /path/to/your/biomoby/central/cgi-bin/. This cgi-bin would most
		likely be the same place that you placed MOBY-Central.pl.
		
		In addition, you should create password protection for this file. To do
		so, create a .htaccess file and place it in the same directory as
		MOBY-Admin.pl. For a template file, please take a look at the .htaccess
		file included with the agent or read the section on the
		MOBY-Admin Dispatcher.

	9. Assuming that you have configured the agent via the RDFagent_config.txt
	   file, you are now ready to use the agent.
	   
	   The agent has 3 modes of operation: a registry wide validator, which
	   iterates through the registry determining which services have been
	   added, removed or updated; a url mode, in which the agent is told
	   specifically which url to visit; and a file mode, in which the agent is
	   given a file containing a newline delimited list of urls that it will
	   attempt to visit.
	   
	   The registry wide validator would be the usual mode of operation used by
	   the administrator.
	   
	   To invoke the agent in registry wide mode do the following:
	   		1. cd /path/to/rdfagent/
	   		2. On a *nix machine
	   		
	   			./run-RDFagent
	   		
	   		   or on a windows machine
	   		    
	   		    run-RDFagent.bat
	   		
	   To invoke the agent on a specific url do the following:
	   		1. cd /path/to/rdfagent/
	   		2. On a *nix machine
	   		
	   			./run-RDFagent -url www.someURL.com
	   		
	   		   or on a windows machine
	   		    
	   		    run-RDFagent.bat -url www.someURL.com

	   To invoke the agent on a file containing urls do the following:
	   		1. cd /path/to/rdfagent/
	   		2. On a *nix machine
	   		
	   			./run-RDFagent -file /path/to/file/myFile
	   		
	   		   or on a windows machine
	   		    
	   		    run-RDFagent.bat -file C:\path\to\file\myFile
	   		    
	   		    Note: If the path contains whitespace please place the path in 
	   		    	quotes or the path will not be interpreted correctly.
	
	Brief note regarding the behaviour of -url and -file:
		If the agent is invoked with -url or -file, the agent will attempt to
		perform an HTTP GET on that url (or, for -file, on each url listed in
		the file) to retrieve a document, and will then parse that document
		into services. If a url is invalid, the agent will remove all of the
		services whose signature url is equal to that url.
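
	The behaviour of -url and -file described in the note above can be sketched
	in Python as follows. This is a simplified illustration under stated
	assumptions, not the agent's implementation: read_url_file, check_urls
	and fake_fetch are hypothetical names, the urls are made up, and the
	HTTP GET is replaced by a stand-in function so that the sketch runs
	without a network.

```python
def read_url_file(text):
    """Return the non-empty lines of a newline delimited list of urls."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def check_urls(urls, fetch, services_by_signature_url):
    """Decide, per url, whether to parse the document or remove services.

    fetch is any callable returning the document text, or None when the
    HTTP GET fails (hypothetical interface, injected for testability).
    """
    plan = {}
    for url in urls:
        document = fetch(url)
        if document is None:
            # Invalid url: every service registered with this signature
            # url becomes a candidate for removal.
            plan[url] = ("remove", services_by_signature_url.get(url, []))
        else:
            plan[url] = ("parse", document)
    return plan


def fake_fetch(url):
    # Stand-in for an HTTP GET: only one hypothetical url "responds".
    pages = {"http://example.org/services.rdf": "<rdf:RDF/>"}
    return pages.get(url)


urls = read_url_file(
    "http://example.org/services.rdf\n\nhttp://example.org/gone.rdf\n"
)
registered = {"http://example.org/gone.rdf": ["oldService"]}
plan = check_urls(urls, fake_fetch, registered)
print(plan)
```

	Whether a removal is actually carried out presumably also depends on the
	registry.deregister.from_url setting described in the configuration
	section.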
		
			
Configuring the Agent
	
		The agent is highly configurable. Every user of the agent will have to
	configure the agent to their particular registry. The configuration file
	is called RDFagent_config.txt, and is made up of key = value pairs.

	Below is a listing of the parameters:
		
		admin.name - The name of the registry administrator. This name will be
					 used on outgoing emails sent by the agent.

		admin.email - The email address of the registry administrator.

		admin.registry.conf - The path to the mobycentral.config file. This is
							  the file that contains the mobycentral registry's
							  database configuration.

		admin.registry.url - The registry endpoint, the url location of the
							 mobycentral registry.

		admin.registry.uri - The registry namespace, the URI of the registry.
		
		admin.registry.ignore.null - This tells the agent whether or not to
									 ignore services that do not have a signature
									 url. If you choose to ignore these services
									 the agent will warn you every time one is
									 found. Acceptable values are yes or no.
		
		admin.registry.removal.endpoint - The endpoint of the MOBY-Admin
										  dispatcher.

		admin.registry.removal.uri - The namespace of the MOBY-Admin dispatcher.

		admin.registry.removal.username - The username that the agent should use 
										  to access the MOBY-Admin dispatcher.

		admin.registry.removal.password - The password that the agent should use
										  to access the MOBY-Admin dispatcher.

		admin.registry.removal.keyphrase - The keyphrase that the agent will
										   give to the dispatcher when trying to
										   remove a service. This phrase must be
										   equal to the one in the 
										   mobycentral.config or the removal of
										   the service will fail.

		admin.mail.smtp.enable - This tells the agent whether or not it should
								 attempt to send email using SMTP. If SMTP is
								 disabled, then the agent will attempt to use
								 the program 'mail' if it exists.
								 Acceptable values are yes or no.

		admin.mail.smtp.server - The SMTP server address.

		admin.mail.smtp.login - The user login that the agent should use while
								sending mail via SMTP.

		admin.mail.smtp.password - The user password needed for the login.
		
		registry.email	    - This tells the agent whether or not it should send
							  a service provider an email message if their
							  services are modified. Acceptable values are
							  yes or no.
		
		registry.deregister - This tells the agent whether or not it should
							  remove services that are invalid. Acceptable values
							  are yes or no.
							  
		registry.deregister.from_url - This tells the agent whether or not it should
							  		   remove services that are invalid when invoked 
							  		   on a particular url. Acceptable values
							  		   are yes or no.
		
		registry.update.services - This tells the agent whether or not it should
								   attempt to update the registry with modified
								   services that are found in service providers'
								   RDF documents. Note that if the modified
								   service is invalid, the old unmodified
								   version of the service will be re-registered.
								   If this service is invalid or rejected by the
								   registry, the service will no longer be
								   present in the registry. Acceptable values
								   are yes or no.
								   
		registry.count - This tells the agent how many chances a signature url
						 should have before it is considered unreachable or
						 invalid. Acceptable values are any non-negative
						 integers.
		
		log.level - This specifies the level of logging that the agent should
					perform. Acceptable values are all, info, warn or severe.

		log.enable - This tells the agent whether or not logging is enabled.
		
		agent.log.directory - This tells the agent where to store the log files.
							  The path must be a valid path or the agent will
							  not be able to save its log files. Note that it
							  may be necessary for you to create this directory
							  and set up the appropriate permissions so that the
							  agent can write to it when invoked by mobycentral.
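
		Before running the agent it can be handy to check the values in the
	configuration against the acceptable values listed above. The Python
	sketch below does this for a configuration that has already been read
	into a dictionary. It is not part of the agent; the function name
	validate is hypothetical, and treating the values as case sensitive is
	an assumption.

```python
# Parameters documented above as accepting only yes or no.
YES_NO_KEYS = [
    "admin.registry.ignore.null",
    "admin.mail.smtp.enable",
    "registry.email",
    "registry.deregister",
    "registry.deregister.from_url",
    "registry.update.services",
]
LOG_LEVELS = {"all", "info", "warn", "severe"}


def validate(config):
    """Return a list of problems found in config (a dict of strings)."""
    problems = []
    for key in YES_NO_KEYS:
        if key in config and config[key] not in ("yes", "no"):
            problems.append(key + " must be yes or no")
    if "log.level" in config and config["log.level"] not in LOG_LEVELS:
        problems.append("log.level must be all, info, warn or severe")
    if "registry.count" in config:
        value = config["registry.count"]
        # Only plain ASCII digits count as a non-negative integer here.
        if not (value.isascii() and value.isdigit()):
            problems.append("registry.count must be a non-negative integer")
    return problems


# Hypothetical configuration with two mistakes.
problems = validate(
    {"registry.email": "maybe", "registry.count": "-1", "log.level": "info"}
)
print(problems)
```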
				
		Note: Comments are initiated with #, and the characters # = : must be
		escaped if you intend to use them literally. You can escape a
		character by prepending a \ to it, e.g. \=.
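
		To make the comment and escaping rules concrete, here is a minimal
	Python sketch that reads one key = value line. It is not the parser used
	by the agent. It assumes that an unescaped # starts a comment anywhere
	on the line and that the first unescaped = or : separates the key from
	the value. The function name parse_line and the sample url are
	hypothetical.

```python
def parse_line(line):
    """Parse one 'key = value' line; return (key, value) or None."""
    key = None
    current = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            # Escaped character: keep it literally and skip the backslash.
            current.append(line[i + 1])
            i += 2
            continue
        if ch == "#":
            break  # an unescaped # starts a comment (assumption)
        if ch in "=:" and key is None:
            # First unescaped separator ends the key.
            key = "".join(current).strip()
            current = []
        else:
            current.append(ch)
        i += 1
    if key is None:
        return None  # blank line or comment only
    return key, "".join(current).strip()


entry = parse_line(
    r"admin.registry.url = http\://localhost/cgi-bin/MOBY-Central.pl # endpoint"
)
print(entry)
```

		Note how the escaped \: ends up as a plain : in the value, while the
	text after the unescaped # is dropped.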

Setting up a Cron Job
	
	If you are interested in setting up a cron job to run the RDFAgent
	automatically, you will need to do the following:
	
	  1. Create a text file that contains the following text:
	
	     05 8 1 * * source /homedir/rdfagent/run-RDFagent
	
	     	where
	     		 05 specifies the minute
	     		 8  specifies the hour (24 hour clock)
	     		 1  specifies the day of the month
	     		 
	     The last two *s stand for
	     		the number of the month (or an abbreviation like Jan, Feb, etc.)
				the day of the week as a number (or an abbreviation like Mon, etc.)
	     and a * in a field matches every value.
	     
	     So, in our example, cron would run the agent on the first day of every
	     month at 8:05am.
	
	  2. Save this text file in /etc/some_name, e.g. /etc/moby.cron
	  
	  3. As root, create a crontab for the user:
	        crontab -u username /etc/moby.cron
	     (assuming the cron file is called moby.cron and is located in /etc/).
	   
	    Users' crontab files are saved in the /var/spool/cron directory and
	    should not be edited directly. They should only be accessed via the
	    crontab command.
	 
	  4. That's all. If you would like to see whether you set up the cron
	  	 job correctly, you can restart the crond service by issuing the
	  	 following commands:
	  	 
	      	/sbin/service crond stop
		    /sbin/service crond start
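
	The layout of the crontab entry from step 1 can be checked with a short
	Python sketch that splits the line into its five time fields and the
	command. The names FIELD_NAMES and split_cron_line are made up for this
	illustration; the sketch only splits the line and does not validate the
	values.

```python
FIELD_NAMES = ["minute", "hour", "day_of_month", "month", "day_of_week"]


def split_cron_line(line):
    """Split a crontab entry into its five time fields and the command."""
    parts = line.split(None, 5)  # at most 5 splits: 5 fields + command
    if len(parts) < 6:
        raise ValueError("expected five time fields followed by a command")
    schedule = dict(zip(FIELD_NAMES, parts[:5]))
    return schedule, parts[5]


schedule, command = split_cron_line(
    "05 8 1 * * source /homedir/rdfagent/run-RDFagent"
)
print(schedule)
print(command)
```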
		    
		   
MOBY-Admin Dispatcher

		The MOBY-Admin dispatcher is necessary for service removal. This 
	dispatcher should only be accessible through the use of authentication.
		
		In order to set up authentication, you will have to place a file with
	the following contents in the same directory as the MOBY-Admin.pl
	dispatcher:
			
		AuthName "MOBY Central Admin"
		AuthType Basic
		
		# path to the password file (created with htpasswd, see below)
		AuthUserFile c:/apache2/Apache2/users
		
		# protect the Moby Central Admin dispatcher by a password
		<Files "MOBY-Admin.pl">
		 # require a valid-user (specified in users)
		  Require valid-user
		</Files>
		
	
		This file must be called '.htaccess'.
		
		Assuming that the MOBY-Admin.pl file is located in the cgi-bin directory
	of apache, you will have to add the following line to httpd.conf where the
	cgi-bin directory is defined:
		
			AllowOverride AuthConfig
		
		i.e.
		
		<Directory "/usr/local/apache2/cgi-bin">
		    Options Indexes
		    # old value: AllowOverride None
		    # new value (comments must be on their own line in httpd.conf):
		    AllowOverride AuthConfig
		    Order allow,deny
		    Allow from all
		    <Files *.html>
		        SetHandler type-map
		    </Files>
		    SetEnvIf Request_URI ^/manual/(de|en|es|fr|ja|ko|ru)/ prefer-language=$1
		    RedirectMatch 301 ^/manual(?:/(de|en|es|fr|ja|ko|ru)){2,}(/.*)?$ /manual/$1$2
		 </Directory>
	
		To create a username and password to access the dispatcher with, do the
	following:

    			cd /path/to/apache/bin
	   		    htpasswd -c /path/that/exists/users username
				
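			The -c option tells htpasswd to create a *new* password file, in
		this case the file 'users' in /path/that/exists/. This must be the
		file that AuthUserFile in .htaccess points to.
			
			username is the login name to be created.
			
			You will be prompted for a password.
		
		This login name and password will have to be entered into the agent's
		configuration file (admin.registry.removal.username and
		admin.registry.removal.password).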
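
		With AuthType Basic, a client sends the login name and password in
	an Authorization header on every request. The Python sketch below shows
	how that header value is formed. It illustrates the HTTP Basic scheme
	only; how the agent itself sends its credentials is not covered here,
	and the function name basic_auth_header is hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value used by HTTP Basic auth."""
    token = base64.b64encode((username + ":" + password).encode("utf-8"))
    return "Basic " + token.decode("ascii")


# Example credentials from the HTTP Basic authentication specification.
header = basic_auth_header("Aladdin", "open sesame")
print(header)  # prints: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

		Note that Basic authentication only encodes the credentials and does
	not encrypt them.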

Uses for the Agent
		
		The following are some uses for the agent:
		
			1. Curate a mobycentral registry so that abandoned services are
			   removed.
			
			2. Initializing an empty registry with services.
				If one has an RDF document describing the services they would
				like to keep in their registry, all the user has to do is place
				that document on the web and invoke the agent on that url using
				the -url option. Their registry will now contain the services
				described in that document. 
				
				This can be done using the rdf obtained from the RESOURCES
				script running on mobycentral to create a registry that
				mirrors mobycentral.
			
			3. As mentioned in 2 above, the agent could be used for mirroring
				of registries.
				
			4. Creating a registry that is filtered to contain only certain
			   services. If a user had a script that output a signature url
			   for each service type, authority, etc. and saved these urls in
			   a file, then they could invoke the agent using the -file
			   option and have a filtered set of services in their custom
			   registry.
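
		As an illustration of use 4, the following Python sketch collects
		the signature urls of all services from one authority and formats
		them as the newline delimited list expected by the -file option.
		Everything in it is hypothetical: the function names, the layout
		of the service records and the example urls are assumptions made
		for this example only.

```python
def signature_urls_for(services, authority):
    """Return the distinct signature urls of services from one authority."""
    urls = []
    for service in services:
        url = service.get("signature_url")
        if service.get("authority") == authority and url and url not in urls:
            urls.append(url)
    return urls


def to_url_file(urls):
    """Format urls as newline delimited text for use with -file."""
    return "\n".join(urls) + "\n"


# Hypothetical service records.
services = [
    {"authority": "example.org", "signature_url": "http://example.org/a.rdf"},
    {"authority": "example.org", "signature_url": "http://example.org/a.rdf"},
    {"authority": "example.com", "signature_url": "http://example.com/b.rdf"},
    {"authority": "example.org", "signature_url": None},
]
selected = signature_urls_for(services, "example.org")
file_text = to_url_file(selected)
print(file_text, end="")
```

		The resulting text would be saved to a file and that file passed
		to the agent with -file.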
			   
		If you have any other uses for the agent, please let me know.
		


Edward Kawas
edward [dot] kawas [at] gmail [dot] com