
Australian Spatial Data Directory (ASDD)

Modified: 2005-02-09

Upgrading existing ASDD Isite nodes


Current situation

  • Many nodes are using software that has known minor problems
  • It is difficult to build consistent gateways to the ASDD because some nodes do not respond to all Record Syntaxes (XML, HTML and SUTRS)
  • Some ASDD nodes are not properly configured and so do not respond to all client search queries (see Use Attributes)

Need to upgrade

  • All Isite-based nodes need to be using the current version of the Isite software with current configuration files
  • The gateway interfaces at Geoscience Australia could be enhanced, but that cannot be done until all nodes have reliable configuration
  • The basic capabilities of each node, and its responses to client requests, need to be standardised
  • Testing and bug fixing cannot begin in earnest until all nodes have the same setup

The expected time to carry out this upgrade (allowing for some holdups and thorough testing) is half a day. It would be a good idea to read this document before beginning and follow the sections as you upgrade.

Download the Isite package

The current Isite2 is available at A/WWW Enterprises (follow the link from the home page to "Isite Distribution", then to the "Isite2" directory).

Unpack the software into a preparation directory on your system. Follow the Isite2 instructions if you are building from source.

Create some directories

Set up the following directories. They can be anywhere on your server and do not need to be under the HTTP server documents root directory. You will probably want them in a different place from your current installation ...

  • $ISITE_BUILD ... where you unpacked the Isite distribution
  • $ISITE_BIN ... where Isite binaries and configuration files will be installed
  • $ISITE_CONF ... you might prefer to keep the configuration files separate to the binaries
  • $ISITE_DATA ... where the collections of metadata documents are located
  • $ISITE_DB ... where Isite databases will be built during indexing
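
The directory layout above could be set up along the following lines. This is only a sketch: ISITE_ROOT and the sub-directory names are hypothetical choices, not something that Isite requires, so substitute locations that suit your own server.

```shell
# Hypothetical parent directory; not part of Isite. Override it before running if needed.
ISITE_ROOT="${ISITE_ROOT:-$HOME/isite2}"

# The five locations described above (the sub-directory names are assumptions).
export ISITE_BUILD="$ISITE_ROOT/build"
export ISITE_BIN="$ISITE_ROOT/bin"
export ISITE_CONF="$ISITE_ROOT/conf"
export ISITE_DATA="$ISITE_ROOT/data"
export ISITE_DB="$ISITE_ROOT/db"

# -p creates missing parent directories and does not fail if a directory already exists.
mkdir -p "$ISITE_BUILD" "$ISITE_BIN" "$ISITE_CONF" "$ISITE_DATA" "$ISITE_DB"
```

Placing these lines in a small file that you source from your login shell keeps the variables available for the later steps.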

Compiling

If you are using the pre-compiled binaries, then you can skip to the next section.

To compile and build Isite you will need to have the "gcc/g++" compiler and libraries properly installed.

  • Change directory to where you unpacked Isite ($ISITE_BUILD)
  • You may need to edit the top-level Makefile to set the appropriate compiler flags for your platform
  • Type "make" (do not use "make install" unless you are very familiar with what it does)
  • Compiling should be straightforward. If you have problems, technical assistance is available via the ISITE-L email list server
  • The binaries that are produced will be in the $ISITE_BUILD/bin directory
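
As a rough sketch, the build steps above can be wrapped in a small shell function that also keeps the compiler output, which is useful if you need to ask for help. It assumes that ISITE_BUILD is already set; the log file name make.log is a hypothetical choice.

```shell
# Build Isite in the unpacked source tree and list the resulting binaries.
# The body runs in a subshell (note the parentheses) so the caller's
# working directory is left unchanged.
build_isite() (
    cd "$ISITE_BUILD" || exit 1
    # Keep all compiler output in make.log (hypothetical file name).
    if make > make.log 2>&1; then
        ls bin
    else
        echo "build failed, see $ISITE_BUILD/make.log" >&2
        exit 1
    fi
)
```

Run it as "build_isite" once any Makefile edits for your platform have been made.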

Installation

Installation is currently a manual process.

Copy the following binaries from $ISITE_BUILD/bin to $ISITE_BIN ...

  • zserver ... the Isite Z39.50 server
  • zbatch ... command-line client for conducting batch queries
  • zping ... command-line client for testing if a server is alive
  • zclient ... command-line client for conducting queries
  • izclient ... interactive text-based Z39.50 client
  • Iindex ... used to build the Isite index databases
  • Isearch ... command-line search client
  • Iutil ... utility for interrogating Isite databases
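
The manual copy could be scripted roughly as follows. This sketch assumes that ISITE_BUILD and ISITE_BIN are set as described earlier; the function name is hypothetical.

```shell
# Copy the programs listed above from the build tree to the install directory.
# Anything that was not built is reported on STDERR rather than stopping the loop.
install_isite_binaries() {
    for prog in zserver zbatch zping zclient izclient Iindex Isearch Iutil; do
        if [ -f "$ISITE_BUILD/bin/$prog" ]; then
            # -p preserves permissions and timestamps
            cp -p "$ISITE_BUILD/bin/$prog" "$ISITE_BIN/" && echo "copied $prog"
        else
            echo "missing: $ISITE_BUILD/bin/$prog" >&2
        fi
    done
}
```

Run it as "install_isite_binaries" and check that no program is reported missing.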

Configuration

There are example configuration files included in the Isite2 distribution at ./conf/anzmeta/

Compare those configuration files with your current set and ensure consistency.
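
One possible way to make that comparison is sketched below. The function name is hypothetical, and it assumes that the distribution examples and your current files are each held in a single directory.

```shell
# Report distribution configuration files that are missing from, or differ from,
# your current set.
# Usage: compare_isite_conf DISTRIBUTION_DIR CURRENT_DIR
compare_isite_conf() {
    dist_dir=$1
    current_dir=$2
    for f in "$dist_dir"/*; do
        [ -f "$f" ] || continue              # skip sub-directories and empty globs
        name=$(basename "$f")
        if [ ! -f "$current_dir/$name" ]; then
            echo "missing: $name"
        elif ! cmp -s "$f" "$current_dir/$name"; then
            echo "differs: $name"            # cmp -s compares silently, byte for byte
        fi
    done
}
```

For example, compare_isite_conf "$ISITE_BUILD/conf/anzmeta" "$ISITE_CONF" lists the files to look at; use "diff" on any file that is reported to see the actual changes.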

The only two configuration files that should differ from the distribution are sapi.ini and zserver.ini. All configuration files are described below ...

sapi.ini

  • Defines attributes of each document collection that is available to Isearch.
  • Defines the location of the database on the file system.
  • Identifies which method to use to search the collection:
    • The keyword ISEARCH says to use the Isearch database that was created by running Iindex (which used the relevant Isearch doctype to parse the structured data files).
  • Defines which mapping files (see below) to use with the collection.
  • Example sapi.ini

zserver.ini

  • Used for initialisation when the Zserver is started.
  • Defines which databases (that are described in sapi.ini) are to be mounted and made available for searching via the Zserver.
  • You could define a different port number in this file if you want to leave your existing Zserver running.
  • Example zserver.ini

anzlic.fields

  • The Isearch doctype uses this file to define the data type of each particular metadata element (field) within the XML document.
  • The anzlic.fields file is specified in the Iindex command at indexing time.
  • Values: num (numeric fields e.g. <northbc>), date (single date e.g. <begdate>, <metd>), date-range (a range of dates i.e. <timperd>), gpoly (greater bounding polygon e.g. <bounding>).
  • Any field that is not defined in this configuration file is assumed to be a text field.
  • Example anzlic.fields

"Use Attribute" maps

  • Z39.50 insulates the user from the actual schema (structure and field names) of each document management system.
  • These configuration files enable the standard interface "Use Attribute" numbers (the numbers behind the WWW interface pick-list names) to be mapped to the actual field names in each particular document collection.
  • Each collection of documents can have various different mapping files.
  • These mapping files are specified in the sapi.ini configuration file.
  • Examples for use with the Isearch doctype "anzmeta" for ANZMETA DTD XML files

Prepare the metadata collection

There is no need to touch your existing collection of metadata documents unless it is in an old format. This is only an upgrade of software and configuration files. However, every node does need to have XML, HTML and plain text versions of the metadata available for presentation (contact technical assistance if you do not).

A single collection of metadata documents can be indexed and served by two different Z39.50 servers on the same machine. So you can leave your existing ASDD server running and index the collection again with this new software.

All metadata should be available in XML format and should be validated against the ANZMETA Document Type Definition (DTD).

  • Each XML file should have the .xml filename extension so that the Z39.50 server can present it when a capable client requests it
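
Validation is not part of Isite itself. As one possible approach, the sketch below assumes that the "xmllint" tool from libxml2 is installed and that you have a local copy of the ANZMETA DTD; the function name and the way the DTD path is passed in are hypothetical.

```shell
# Validate every .xml file in a directory against a local copy of the DTD.
# Usage: validate_anzmeta DTD_FILE DATA_DIR
# Prints each file that fails and returns non-zero if any file failed.
validate_anzmeta() {
    dtd=$1
    data_dir=$2
    failed=0
    for xml in "$data_dir"/*.xml; do
        [ -f "$xml" ] || continue            # the glob matched nothing
        # --noout suppresses the parsed document; --dtdvalid names the DTD to check against
        if ! xmllint --noout --dtdvalid "$dtd" "$xml" 2>/dev/null; then
            echo "INVALID: $xml"
            failed=1
        fi
    done
    return "$failed"
}
```

Drop the 2>/dev/null redirection to see the detailed validation messages for a failing file.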

Each type of document collection will be indexed by a different Isearch "doctype".

Index the XML documents

The "Isearch" component of Isite uses software called "doctypes" to read and interpret the XML metadata documents. The doctypes have multiple roles: to index the metadata to create a searchable database, to conduct searching, and to present the results in whatever form is requested by the client.

Each collection of dataset descriptions has three files for each dataset description ...

  • basename.xml ... the structured metadata in valid XML format
  • basename.html ... the HTML file which is used at presentation time (could have .htm extension)
  • basename.txt ... the plain text file (SUTRS) which is used at presentation time
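
Before indexing, it may be worth confirming that every XML record has its two presentation files. The following is a minimal sketch with a hypothetical function name; it assumes that all three files for a record sit in the same directory.

```shell
# Report each XML record that lacks an HTML (.html or .htm) or
# plain text (.txt) companion file with the same basename.
# Usage: check_presentation_files DATA_DIR
check_presentation_files() {
    data_dir=$1
    for xml in "$data_dir"/*.xml; do
        [ -f "$xml" ] || continue            # the glob matched nothing
        base=${xml%.xml}                     # strip the .xml extension
        [ -f "$base.html" ] || [ -f "$base.htm" ] || echo "no HTML for $xml"
        [ -f "$base.txt" ] || echo "no text for $xml"
    done
}
```

Running check_presentation_files "$ISITE_DATA" should print nothing for a complete collection.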

Below are example Iindex commands to prepare a searchable database of your metadata. You could place the command in a shell script. The examples assume that all documents are in one data directory. If your data is in separate directories then you could use the UNIX commands "find" and "sed" to automatically prepare the list of pathnames to feed into Iindex.

To index a collection of XML files that conform to the ANZMETA Document Type Definition (DTD) ...

# the name for your index database of dataset descriptions
DB_NAME=test1
# run Iindex to parse the XML files
# using the Isearch doctype called "anzmeta" 
$ISITE_BIN/Iindex -d $ISITE_DB/$DB_NAME -t anzmeta -m 4 \
-o fieldtype=$ISITE_BIN/anzlic.fields $ISITE_DATA/*.xml

There are some issues with indexing large collections of documents ...

  • refer to the "Tutorial" in the Isite documentation
  • use the "find" command to feed a list of files to Iindex
  • the -m option tells Iindex how many megabytes of metadata to read into memory for indexing
  • if you use a small value for -m, then you will build a fragmented index (more than one .inx file) that will be slightly slower to search (we have not found this to be a problem)
  • do not use the "optimize" command anymore
  • here is an example indexing command for large collections ...
    find $ISITE_DATA -name '*.xml' -print | \
    $ISITE_BIN/Iindex -d $ISITE_DB/$DB_NAME -t anzmeta -m 8 \
    -o fieldtype=$ISITE_BIN/anzlic.fields -f -
    
  • Be patient, it may take a long time.
  • If you have a large collection, then please discuss your situation with one of the other ASDD node managers (and please provide feedback on your findings).

Start the Z39.50 server

Start the Zserver by issuing the following UNIX command. The server will start up, mount the specified databases, and then listen on the specified port. You could place this command in a shell script.

# start the Z39.50 server and run it in the background
$ISITE_BIN/zserver -i$ISITE_BIN/zserver.ini &

You may also want to redirect the output to a log file. This captures detailed connection and searching information that can be parsed to generate usage reports. The standard log file (specified in zserver.ini) collects only very basic connection information.

# start the Z39.50 server,
# redirect STDOUT and STDERR to be appended to a log file,
# and run zserver in the background
$ISITE_BIN/zserver -i$ISITE_BIN/zserver.ini >> zserver-200102.log 2>&1 &
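
If you keep one log file per month, the file name can be generated rather than typed by hand. The sketch below assumes that the log name follows a zserver-YYYYMM.log pattern, and both function names are hypothetical. It uses ">>" so that restarting the server does not overwrite earlier entries.

```shell
# Build a log file name of the form zserver-YYYYMM.log for the current month.
zserver_log_name() {
    echo "zserver-$(date +%Y%m).log"
}

# Start the Z39.50 server in the background, appending STDOUT and STDERR
# to this month's log file in the current directory.
start_zserver() {
    "$ISITE_BIN/zserver" -i"$ISITE_BIN/zserver.ini" >> "$(zserver_log_name)" 2>&1 &
}
```

Calling "start_zserver" from a startup script then gives you a new log file each month without further changes.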

Testing your node

The best way to test your node is to use the Isite program called "zbatch" which allows you to specify a set of queries in a plain text file. Zbatch will connect to the specified server and run the queries sequentially.

See the document Testing Isite ASDD nodes.

Hosted collections

Any one node can also host a collection of geospatial metadata for another organisation. In that way, an organisation that does not have an actual Z39.50 server can appear to be a fully-fledged node of the ASDD. The collection of XML documents and the corresponding presentation documents need to be on the same machine on which the Zserver is running. Simply index the collection of documents using Iindex and define all of the collections in the sapi.ini and zserver.ini configuration files.
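
As a sketch of how several collections might be indexed on one node, the function below wraps the indexing command for large collections shown earlier so that it can be run once per collection. The function name, the collection names and the directory layout in the usage note are hypothetical; each database still has to be defined in sapi.ini and zserver.ini by hand.

```shell
# Index one collection of ANZMETA XML files into its own Isite database.
# Usage: index_collection DB_NAME DATA_DIR
# Assumes ISITE_BIN and ISITE_DB are set as described earlier.
index_collection() {
    db_name=$1
    data_dir=$2
    # Feed the list of XML files to Iindex on standard input (-f -)
    find "$data_dir" -name '*.xml' -print | \
    "$ISITE_BIN/Iindex" -d "$ISITE_DB/$db_name" -t anzmeta -m 8 \
        -o fieldtype="$ISITE_BIN/anzlic.fields" -f -
}
```

For example, index_collection own "$ISITE_DATA/own" followed by index_collection hosted "$ISITE_DATA/hosted" would build two separate databases, one for your own metadata and one for the hosted organisation.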

Technical assistance

See ASDD Technical coordinating node