version 1.4, 2004/01/06 21:34:05
|
version 1.19, 2004/08/16 22:15:28
|
\label{sec:rec.cgi}

\paragraph{}
On the ECHO server, the registration of new texts is implemented by
means of a cgi script, reg.cgi
(archimedes/web/cgi-bin/toc/admin/reg.cgi). reg.cgi retrieves a
metadata file in MPIWG archive metadata format from the entered uri
(currently only local paths are supported) and constructs from this
file a toc.cgi object file (see below), which it writes to toc.cgi's
data section. [corpus???] It should be stressed that this is a
registration procedure developed for a particular implementation of
toc.cgi and is not part of the core application.

\paragraph{}
reg.cgi takes two parameters, path and show. Path should give the
[...] registered. If ``show'' is set to 1, reg[...]
inspection the toc.cgi object file that it has built out of the
submitted metadata file.

\paragraph{input metadata file}

The input metadata file must have the following form:

\begin{verbatim}
<resource>
  ...
  <meta>
    <bib type="Book">
      <title>Mainzer Untergerichtsordnung (von 1534)</title>
      <author>anon</author>
      <year>1580</year>
    </bib>
    <texttool>
      <display>yes</display>
      <image>pageimgtif</image>
      <text>/mpiwg/online/experimental/echo_DRQEdit_test/anon_Mainz_1580/fulltextDW/mainzugo02_utf8.xml</text>
      <pagebreak>pb</pagebreak>
      <presentation>01-presentation/info.xml</presentation>
    </texttool>
  </meta>
</resource>
\end{verbatim}
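
To make the structure of the metadata file concrete, the following Python
sketch reads the bibliographic and texttool fields from a file of this form.
It is only an illustration and not part of reg.cgi, which is a Perl script;
the function names \verb|read_metadata| and \verb|children_as_dict|, the
layout of the returned dictionary, and the shortened \verb|<text>| path in
the sample are assumptions made for this example.

\begin{verbatim}
import xml.etree.ElementTree as ET

# Sample input in the form shown above (the <text> path is a placeholder).
SAMPLE = """<resource>
  <meta>
    <bib type="Book">
      <title>Mainzer Untergerichtsordnung (von 1534)</title>
      <author>anon</author>
      <year>1580</year>
    </bib>
    <texttool>
      <display>yes</display>
      <image>pageimgtif</image>
      <text>fulltext/example_utf8.xml</text>
      <pagebreak>pb</pagebreak>
      <presentation>01-presentation/info.xml</presentation>
    </texttool>
  </meta>
</resource>"""


def children_as_dict(element):
    """Map each child tag of an element to its stripped text content."""
    if element is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in element}


def read_metadata(xml_string):
    """Collect the bib and texttool fields of a metadata file (hypothetical helper)."""
    root = ET.fromstring(xml_string)
    meta = root.find("meta")
    if meta is None:
        raise ValueError("no <meta> element found")
    bib = meta.find("bib")
    return {
        "bib_type": bib.get("type") if bib is not None else None,
        "bib": children_as_dict(bib),
        "texttool": children_as_dict(meta.find("texttool")),
    }


record = read_metadata(SAMPLE)
print(record["bib"]["title"], record["texttool"]["pagebreak"])
\end{verbatim}
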

\paragraph{archimedes object registration}

\subsubsection{toc.cgi (display text)}
\label{sec:toc.cgi}

\paragraph{plan of this section}

\begin{enumerate}
\item An overview of toc.cgi architecture
\item A walk-through of typical cgi queries for toc.cgi
\item An index of cgi parameters and values with short descriptions of function
\item The TOC Perl modules
\end{enumerate}

\paragraph{Overview of toc.cgi architecture}

\subparagraph{}
toc.cgi is a Perl script for displaying collections of xml texts and
linking them to related resources such as page images, morphological
analyses, commentaries, dictionaries, etc. It implements generic methods
for resource linking provided by a series of Perl modules, which are in
turn based mainly on generic open-source tools for xml manipulation and
networking written in C.

\subparagraph{toc.cgi collections--Network transparency}
Each of the collections in toc.cgi is a ``virtual'' collection, that
is, a collection of links or uris to resources that reside somewhere on
an accessible network, local or remote.

\subparagraph{toc.cgi collections--remote resources}

What is at the other end of the link is of no concern to toc.cgi, as
long as the resource referenced by the link meets minimal toc.cgi
requirements. How the resource is actually implemented and exposed is
a matter for the resource provider. The link may, for instance, point
directly to an xml text, or it may point to a container which exposes a
particular xml view of an underlying resource that is perhaps not in
xml format at all.
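
The idea of network transparency can be illustrated with a short Python
sketch that opens a link in the same way whether it is a local path or a
remote uri. It is not part of toc.cgi, which is written in Perl. The
function name \verb|load_xml_resource| is hypothetical, POSIX-style local
paths are assumed, and treating ``returns well-formed xml'' as the only
requirement is an assumption of this example; the actual minimal toc.cgi
requirements are not spelled out here.

\begin{verbatim}
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.parse import urlparse


def load_xml_resource(link):
    """Return the root element of the xml document behind a link.

    A link without a scheme is treated as a local path and turned into a
    file: uri, so local and remote resources go through the same code path.
    """
    if not urlparse(link).scheme:
        link = Path(link).resolve().as_uri()
    with urllib.request.urlopen(link) as stream:
        return ET.parse(stream).getroot()
\end{verbatim}

The caller never learns how the provider produced the xml; it only sees
the document that comes back from the link.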

\subparagraph{resource registry}

\paragraph{cgi parameters -- standard queries}

\url{http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=corpus}
\newline
\newline
get a listing of corpora

\url{http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpusmanifest}
\newline
\newline
get an xml listing of corpora

\url{http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi}
\newline
\newline
get a listing of works in default corpus

\url{http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?corpus=1}
\newline
\newline
get a listing of works in corpus 1 [default corpus = 0]

\url{http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpuslist}
\newline
\newline
get an xml listing of works in default corpus

\url{http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?step=xmlcorpuslist;corpus=1}
\newline
\newline
get an xml listing of works in corpus 1

\url{http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=baifl_renav_006_la_1537;step=thumb}
\newline
\newline
get a work from default corpus with thumbnail navbar displayed left

\url{http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=jorda_ponde_050_la_1533;step=thumb;ftype=thumbright}
\newline
\newline
get a work from default corpus with thumbnail navbar displayed right

\url{http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi?dir=jorda_ponde_050_la_1533;step=textonly;corpus=;page=22}
\newline
\newline
get a page of text from a work from default corpus
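
The queries above all follow one pattern: the base address of toc.cgi,
followed by parameter pairs separated by semicolons. The following Python
sketch assembles such addresses on the client side. The function name
\verb|toc_url| is hypothetical; the parameter names and values are taken
from the examples above, and no claim is made about parameters not listed
there.

\begin{verbatim}
from urllib.parse import quote

BASE_URL = "http://archimedes.mpiwg-berlin.mpg.de/cgi-bin/toc/toc.cgi"


def toc_url(**params):
    """Build a toc.cgi query address from keyword parameters.

    Parameters are joined with ';' in the order given, as in the
    standard queries listed above. With no parameters, the bare
    address (works in the default corpus) is returned.
    """
    if not params:
        return BASE_URL
    query = ";".join(
        f"{quote(str(name))}={quote(str(value))}"
        for name, value in params.items()
    )
    return f"{BASE_URL}?{query}"


# A page of text from a work in the default corpus.
print(toc_url(dir="jorda_ponde_050_la_1533", step="textonly",
              corpus="", page=22))
\end{verbatim}
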

\paragraph{TOC Perl Modules}

\subparagraph{general}
The documentation for the TOC Perl modules is located in the modules
themselves, in POD format. The POD is the definitive documentation for
the modules.

The modules are available to archimedes staff from cvs on the archimedes server at
141.14.236.86:/perseus/cvsroot in the module
mpitexts/perl/perllib (that is, /perseus/cvsroot/mpitexts/perl/perllib
on the server). To get them, log on to the archimedes server and run:

\begin{verbatim}
cvs -d /perseus/cvsroot co mpitexts/perl/perllib
\end{verbatim}

or, from a remote location (bash syntax):

\begin{verbatim}
export CVS_RSH=ssh
cvs -d :ext:myusername@141.14.236.86:/perseus/cvsroot co mpitexts/perl/perllib
\end{verbatim}

\input{soft-search}
\subsubsection{Indexing}
\label{sec:indexing}

\subsubsection{Morphology}
\label{sec:morphology}

\label{sec:dictionary-server}

\subsubsection{helper programs}

\paragraph{addarch.pl ARCHIMEDES}

Automatically registers new texts as toc.cgi objects when they appear in
cvs, and automatically updates the relevant morphological indices (slow!)
each time a cvs update occurs. This program is called by a hook in the
cvs ``loginfo'' configuration file.

\paragraph{makelemma.pl ARCHIMEDES}

Updates lemmatization indices.

Parameters:
\begin{itemize}
\item no parameter: update all lemmatization indices
\item one of [latin | ital | greek | en | nl | de]: update the index for this language
\end{itemize}

\paragraph{makefast.pl ARCHIMEDES}

Updates the toc.cgi morphology indices.

Parameters:
\begin{itemize}
\item no parameter: update all lemmatization indices
\item one of [latin | ital | greek | en | nl | de]: update the index for this language
\end{itemize}

Currently stores the indices under the name xml:hit:\$lang, where \$lang is one of
[ital, greek, latin, de, en, fr, nl], in the directory
/usr/share/perlobjects/wordindex, in Archim::Object::Depot format
(Storable). Access to these indices is provided by
\verb|Archim::Toc::Utils->get_hits_hash($lang)|.
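
The relation between the command-line parameter and the index names can be
sketched in Python as follows. This is an illustration only, not the code of
makefast.pl, which is a Perl program: the function name \verb|index_names|
is hypothetical, and the sketch assumes that exactly the six languages
listed under ``Parameters'' are accepted (``fr'' appears only in the list of
stored indices and is therefore left out here).

\begin{verbatim}
# Languages accepted as a parameter, as listed above (assumption: no others).
DOCUMENTED_LANGUAGES = ("latin", "ital", "greek", "en", "nl", "de")


def index_names(language=None):
    """Return the names of the indices an update run would touch.

    No parameter selects all documented languages; a single language
    selects only its own index. Names follow the xml:hit:$lang pattern.
    """
    if language is None:
        languages = DOCUMENTED_LANGUAGES
    elif language in DOCUMENTED_LANGUAGES:
        languages = (language,)
    else:
        raise ValueError(f"unknown language: {language!r}")
    return [f"xml:hit:{lang}" for lang in languages]


print(index_names("greek"))
print(index_names())
\end{verbatim}
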

\subsubsection{summary of differences between the archimedes toc.cgi implementation and the echo toc.cgi implementation (toc.x.cgi)}

\paragraph{missing in archimedes}
\begin{enumerate}
\item html templates (coded but phased out of cvs branch)
\end{enumerate}

\paragraph{missing in echo}
\begin{enumerate}
\item word-coloring?
\item remote text method may work differently
\end{enumerate}

\paragraph{differences}
\begin{enumerate}
\item structure of info.xml
\item resource-discovery algorithm for info.xml
\end{enumerate}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "texttools"