--- texttool-architecture/soft-cgi.tex 2004/01/06 21:34:05 1.4
+++ texttool-architecture/soft-cgi.tex 2004/06/01 13:07:27 1.18
@@ -2,11 +2,15 @@
 \label{sec:rec.cgi}
 \paragraph
-On the ECHO server, new texts are registered by means of reg.cgi (
-archimedes/web/cgi-bin/toc/admin/reg.cgi ). reg.cgi retrieves a
-metadata file from the entered uri (currently only local paths are
-supported ) and constructs from this file a toc.cgi object file (see
-below) , which it writes to toc.cgi's data section. [corpus???]
+On the ECHO server, the registration of new texts is implemented by
+means of a cgi script, reg.cgi
+(archimedes/web/cgi-bin/toc/admin/reg.cgi). reg.cgi retrieves a
+metadata file in MPIWG archive metadata format from the entered URI
+(currently only local paths are supported) and constructs from this
+file a toc.cgi object file (see below), which it writes to toc.cgi's
+data section. [corpus???] It should be stressed that this is a
+registration procedure developed for a particular implementation of
+toc.cgi and not a part of the core application.
 \paragraph
 reg.cgi takes two parameters, path and show. Path should give the
@@ -15,73 +19,143 @@ registered. If ``show'' is set to 1, reg
 inspection the toc.cgi object file that it has built out of the submitted metadata file.
+\paragraph{input metadata file}
+
+The input metadata file must have the following form:
+
+\begin{verbatim}
+
+ ...
+
+
+
+
+Mainzer Untergerichtsordnung (von 1534)
+anon
+1580
+ yes
+ pageimgtif
+ /mpiwg/online/experimental/echo_DRQEdit_test/anon_Mainz_1580/fulltextDW/mainzugo02_utf8.xml
+ pb01-presentation/info.xml
+
+
+\end{verbatim}
+
+\paragraph{archimedes object registration}
 \subsubsection{toc.cgi (display text)}
 \label{sec:toc.cgi}
 \paragraph{plan of this section }
-\begin{enumeration}
+\begin{enumerate}
+\item An overview of toc.cgi architecture
 \item A walk-through of typical cgi queries for toc.cgi
 \item An index of cgi parameters and values with short descriptions of function
-\end{enumeration}
+\item The TOC Perl modules
+\end{enumerate}
+
+\paragraph{Overview of toc.cgi architecture}
+
+\subparagraph{}
+toc.cgi is a Perl script for displaying collections of xml texts and
+linking them to related resources such as page-images, morphological
+analysis, commentaries, dictionaries, etc. It implements generic methods
+for resource-linking provided by a series of Perl modules which are in
+turn based mainly on generic open-source tools for xml manipulation and networking
+written in C.
+
+\subparagraph{toc.cgi collections--Network transparency}
+Each of the collections in toc.cgi is a ``virtual'' collection, that
+is, a collection of links or URIs to resources that reside somewhere on an accessible
+network, local or remote.
+
+\subparagraph{toc.cgi collections--remote resources}
+
+What is at the other end of the link is of no concern to toc.cgi, as
+long as the resource referenced by the link meets minimal toc.cgi
+requirements--how the resource is actually implemented and exposed is
+a matter for the resource provider. The link may, for instance, point
+directly to an xml text or it may point to a container which exposes a
+particular xml view of an underlying resource that is perhaps not in
+xml format at all.
+
+
+\subparagraph{resource registry}
+
+
+
 \paragraph{cgi parameters -- standard queries}
-\htmladdnormallink{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=corpus }{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=corpus }
+\url{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=corpus }
 \newline \newline get a listing of corpora
-\htmladdnormallink{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpusmanifest }{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpusmanifest }
+\url{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpusmanifest }
 \newline \newline get an xml listing of corpora
-
-\htmladdnormallink{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi }{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi }
+\url{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi }
 \newline \newline get a listing of works in default corpus
-\htmladdnormallink{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?corpus=1 }{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?corpus=1 }
+\url{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?corpus=1 }
 \newline \newline get a listing of works in corpus 1 [default corpus = 0]
-\htmladdnormallink{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpuslist }{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpuslist }
+\url{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpuslist }
 \newline \newline get an xml listing of works in default corpus
-\htmladdnormallink{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpuslist;corpus=1 }{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpuslist;corpus=1 }
+\url{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpuslist;corpus=1 }
 \newline \newline get an xml listing of works in corpus 1
-\htmladdnormallink{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=baifl_renav_006_la_1537;step=thumb }{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=baifl_renav_006_la_1537;step=thumb }
+\url{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=baifl_renav_006_la_1537;step=thumb }
 \newline \newline get a work from default corpus with thumbnail navbar displayed left
-\htmladdnormallink{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=jorda_ponde_050_la_1533;step=thumb;ftype=thumbright }{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=jorda_ponde_050_la_1533;step=thumb;ftype=thumbright }
+\url{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=jorda_ponde_050_la_1533;step=thumb;ftype=thumbright }
 \newline \newline get a work from default corpus with thumbnail navbar displayed right
-\htmladdnormallink{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=jorda_ponde_050_la_1533;step=textonly;corpus=;page=22 }{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=jorda_ponde_050_la_1533;step=textonly;corpus=;page=22 }
+\url{ http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=jorda_ponde_050_la_1533;step=textonly;corpus=;page=22 }
 \newline \newline get a page of text from a work from default corpus
+\paragraph{TOC Perl Modules}
+\subparagraph{general}The documentation for the TOC Perl modules is
+located in the modules themselves in POD format. The POD is the
+definitive documentation for the modules.
+
+The modules are available to archimedes staff from cvs on the archimedes server at
+141.14.236.86:/perseus/cvsroot in the module
+/perseus/cvsroot/mpitexts/perl/perllib. To get them, log on to the
+archimedes server and use the command-line command:
+\begin{verbatim}
+ cvs -d /perseus/cvsroot co /perseus/cvsroot/mpitexts/perl/perllib
+\end{verbatim}
+
+or from a remote location:
+
+\begin{verbatim}
+ bash; export CVS_RSH=ssh; cvs -d :ext:myusername@141.14.236.86:/perseus/cvsroot co /perseus/cvsroot/mpitexts/perl/perllib
+\end{verbatim}
-
-\subsubsection{Indexing}
-\label{sec:indexing}
-
+\input{soft-search}
 \subsubsection{Morphology}
 \label{sec:morphology}
@@ -91,6 +165,56 @@ get a page of text from a work from defa
 \label{sec:dictionary-server}
+\subsubsection{helper programs}
+
+\paragraph{addarch.pl ARCHIMEDES}
+
+Automatically registers new texts as toc.cgi objects when they appear in
+cvs. Automatically updates relevant morphological indices (slow!) each
+time a cvs update occurs. This program is called by a hook in the cvs
+``loginfo'' configuration file.
+
+
+\paragraph{makelemma.pl ARCHIMEDES}
+
+Updates lemmatization indices.
+Parameters:
+No parameter--update all lemmatization indices
+[latin | ital | greek | en | nl | de]--update this language
+
+\paragraph{makefast.pl ARCHIMEDES}
+
+Updates the toc.cgi morphology indices.
+Parameters:
+No parameter--update all lemmatization indices
+[latin | ital | greek | en | nl | de]--update this language
+
+\subsubsection{summary of differences between the archimedes toc.cgi
+ implementation and the echo toc.cgi implementation (toc.x.cgi)}
+
+\paragraph{missing in archimedes}
+\begin{enumerate}
+
+\item html templates (coded but phased out of cvs branch)
+\end{enumerate}
+
+\paragraph{missing in echo}
+\begin{enumerate}
+
+\item word-coloring?
+\item remote text method may work differently
+
+
+
+\end{enumerate}
+\paragraph{differences}
+\begin{enumerate}
+\item structure of info.xml
+\item resource-discovery algorithm for info.xml
+\end{enumerate}
+
+
+
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: "texttools"
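
The standard toc.cgi queries listed in this diff all follow one pattern: the script URL, a question mark, and name=value pairs separated by semicolons. The following Python sketch illustrates that pattern only. The helper name toc_url is hypothetical and is not part of toc.cgi or the TOC Perl modules; the base URL and the parameter names (step, corpus, dir, page, ftype) are taken from the example queries, and nothing here is checked against what the server actually accepts.

```python
from urllib.parse import quote

# Base URL copied from the example queries in this section.
TOC_BASE = "http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi"


def toc_url(base=TOC_BASE, **params):
    """Build a toc.cgi query URL (hypothetical helper, for illustration).

    Parameters are joined with ';' as in the example queries, in the
    order in which they are passed. With no parameters the bare script
    URL is returned, which the text describes as listing the works in
    the default corpus.
    """
    if not params:
        return base
    query = ";".join(
        f"{quote(str(name))}={quote(str(value))}"
        for name, value in params.items()
    )
    return f"{base}?{query}"


if __name__ == "__main__":
    print(toc_url())
    print(toc_url(step="xmlcorpuslist", corpus=1))
    print(toc_url(dir="jorda_ponde_050_la_1533", step="textonly", page=22))
```

Keeping the separator and the base URL in one place makes it easier to reproduce the walk-through queries consistently, for instance when checking the archimedes and echo installations against each other.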