--- texttool-architecture/soft-cgi.tex 2004/01/07 23:46:33 1.6
+++ texttool-architecture/soft-cgi.tex 2004/01/16 11:21:14 1.13
@@ -2,7 +2,7 @@
 \label{sec:rec.cgi}
 \paragraph
-On the ECHO server, hte registration of new texts is implemented by
+On the ECHO server, the registration of new texts is implemented by
 means of a cgi script, reg.cgi
 (archimedes/web/cgi-bin/toc/admin/reg.cgi ). reg.cgi retrieves a
 metadata file in MPIWG archive metadata format from the entered uri
@@ -41,6 +41,7 @@ The input metadata file must have the fo
+\paragraph{archimedes object registration}
 \subsubsection{toc.cgi (display text)}
 \label{sec:toc.cgi}
@@ -48,18 +49,42 @@ The input metadata file must have the fo
 \paragraph{plan of this section }
 \begin{enumeration}
+\item An overview of toc.cgi architecture
 \item A walk-through of typical cgi queries for toc.cgi
 \item An index of cgi parameters and values with short descriptions of
 function
 \end{enumeration}
-\paragraph{}
+\paragraph{Overview of toc.cgi architecture}
+
+\subparagraph{}
 toc.cgi is a perl script for displaying collections of xml texts and
 linking them to related resources such as page-images, morphological
 analysis, commentaries, dictionaries, etc. It implements generic
 methods for resource-linking provided by a series of perl modules which are in
-turn based mainly on generic tools for xml manipulation and networking
+turn based mainly on generic open-source tools for xml manipulation and networking
 written in C.
+\subparagraph{toc.cgi collections--Network transparency}
+Each of the collections in toc.cgi is a ``virtual'' collection, that
+is, a collection of links or uri's to resources that reside somewhere on an accessible
+network, local or remote.
+
+\subparagraph{toc.cgi collections--remote resources}
+
+What is at the other end of the link is of no concern to toc.cgi, as
+long as the resource referenced by the link meets minimal toc.cgi
+requirements--how the resource is actually implemented and exposed is
+a matter for the resource provider. The link may, for instance, point
+directly to an xml text or it may point to a container which exposes a
+particular xml view of an underlying resource that is perhaps not in
+xml format at all.
+
+
+\subparagraph{resource registry}
+
+
+
+
 \paragraph{cgi parameters -- standard queries}
 \htmladdnormallink{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=corpus }{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=corpus }
@@ -116,6 +141,34 @@ get a page of text from a work from defa
 \subsubsection{Indexing}
 \label{sec:indexing}
+\paragraph{Status quo ECHO}
+Currently indexing is not implemented on the ECHO server.
+
+\paragraph{Plan ECHO}
+
+\begin{enumeration}
+\item construct remote (141.14.236.86) index for each file at
+ per-change or daily intervals
+\item store indices locally in
+archimedes/data/db/PROJECT_NAME/CORPUS_NAME/WORK
+\item 2 progs on server 1. cgi: indexer 2. backend da_remote
+\item 2 progs on client 1. cgi: sendindex 2. backend getindex
+\item indexing transaction handled by two cgi scripts, one on the
+ server the other on the client [this is the 1st implementation because
+ it's easiest and there are no port issues, but probably it'd be
+ better to have a separate port].
+\item client cgi: getindex -- sends 1. list of files to index
+ 2. uri to which xml notification of completion is to be sent. Upon
+ notification, activates backend prog that fetches and installs the
+ indices.
+\item server cgi: indexer receives filelist and notification
+ address. Activates backend that fetches files, indexes, places
+ completed indexes in a networked location, then sends xml
+ notification back to client.
+\item single script provides backend access to indices
+\item leave front-end issues like display, collection and navigation
+ to web-design programmers. Do only a sample for now.
+\end{enumeration}
 \subsubsection{Morphology}
 \label{sec:morphology}
@@ -125,6 +178,30 @@ get a page of text from a work from defa
 \label{sec:dictionary-server}
+\subsubsection{helper programs}
+
+\paragraph{addarch.pl ARCHIMEDES}
+
+Automatically registers new texts as toc.cgi objects when they appear in
+cvs. Automatically updates relevant morphological indices (slow!) each
+time a cvs update occurs. This program is called by a hook in the cvs
+``loginfo'' configuration file.
+
+
+\paragraph{makelemma.pl ARCHIMEDES}
+
+Updates lemmatization indices.
+Parameters:
+No parameter--update all lemmatization indices
+[latin | ital | greek | en | nl | de]-- update this language
+
+\paragraph{makefast.pl ARCHIMEDES}
+
+Updates the toc.cgi morphology indices
+Parameters
+No parameter--update all lemmatization indices
+[latin | ital | greek | en | nl | de]-- update this language
+
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: "texttools"