| version 1.17, 2004/04/03 17:19:33 | version 1.20, 2004/08/16 22:34:04 |
|---|---|
| Line 155 or from a remote location | Line 155 or from a remote location |
| bash; export CVS_RSH=ssh; cvs -d :ext:myusername@141.14.236.86:/perseus/cvsroot co /perseus/cvsroot/mpitexts/perl/perllib | bash; export CVS_RSH=ssh; cvs -d :ext:myusername@141.14.236.86:/perseus/cvsroot co /perseus/cvsroot/mpitexts/perl/perllib |
| \end{verbatim} | \end{verbatim} |
| \subsubsection{Indexing} | \input{soft-search} |
| \label{sec:indexing} | |
| \paragraph{Status quo ECHO} | |
| Currently indexing is not implemented on the ECHO server. | |
| \paragraph{Plan ECHO} | |
| \begin{enumerate} | |
| \item construct remote (141.14.236.86) index for each file at | |
| per-change or daily intervals | |
| \item store indices locally in | |
| \url{archimedes/data/db/PROJECT_NAME/CORPUS_NAME/WORK} | |
| \item 2 progs on server 1. cgi: \url{indexer} 2. backend \url{da_remote} | |
| \item 2 progs on client 1. cgi: \url{sendindex} 2. backend \url{getindex} | |
| \item indexing transaction handled by two cgi scripts, one on the | |
| server the other on the client [this is the 1st implementation bcs | |
| its easiest and there are no port issues, but probably it'd be | |
| better to have a separate port]. | |
| \item client cgi: getindex -- sends 1. list of files to index | |
| 2. uri to which xml notification of completion is to be sent. Upon | |
| notification, activates backend prog that fetches and installs the | |
| indices. | |
| \item server cgi: indexer receives filelist and notification | |
| addess. Activates backend that fetches files, indexes, places | |
| completed indexes in a networked location, then sends xml | |
| notification back to client. | |
| \item single script provides backend access to indices | |
| \item leave front-end issues like display, collection and navigation | |
| to web-design programmers. Do only a sample for now. | |
| \end{enumerate} | |
| \subsubsection{Morphology} | \subsubsection{Morphology} |
| \label{sec:morphology} | \label{sec:morphology} |
| Line 215 No parameter--update all lemmatization i | Line 185 No parameter--update all lemmatization i |
| \paragraph{makefast.pl ARCHIMEDES} | \paragraph{makefast.pl ARCHIMEDES} |
| Updates the toc.cgi morphology indices | Updates the toc.cgi morphology indices |
| Parameters | Parameters: |
| No parameter--update all lemmatization indices | No parameter--update all lemmatization indices |
| [latin \| ital \| greek \| en \| nl \| de]-- update this language | [latin \| ital \| greek \| en \| nl \| de]-- update this language |
| \subsubsection{summary of differences btwn the archimedes toc.cgi | The indices are produced from the corpus word index 'xml:raw:norm', |
| implementation and the echo toc.cgi impelementation (toc.x.cgi)} | which correlates raw forms to normalized forms, and |
| | '\$lang:inc_lemma', which correlates incidentia to lemmata. The basic |
| | rule is, if exists \$raw->\$norm->\$inc_lemma, then \$raw is included |
| | in the 'fast' index for that language. |
| | Currently stores the indices with the name xml:hit:\$lang, where |
| | \$lang is one of [ital,greek,latin,de,en,fr,nl] in the directory |
| | /usr/share/perlobjects/wordindex in Archim::Object::Depot format |
| | (Storable). Access to these indices is provided by |
| | Archim::Toc::Utils->get_hits_hash(\$lang) . |
| | The functionality of makefast.pl is duplicated by Archim::Toc::Index->make_fast_lemma(\$lang); |
| | \subsubsection{summary of differences btwn the archimedes toc.cgi implementation and the echo toc.cgi impelementation (toc.x.cgi)} |
| \paragraph{missing in archimedes} | \paragraph{missing in archimedes} |
| \begin{enumerate} | \begin{enumerate} |
| Line 235 No parameter--update all lemmatization i | Line 219 No parameter--update all lemmatization i |
| \item remote text method may work differently | \item remote text method may work differently |
| \end{enumerate} | \end{enumerate} |
| \paragraph{differences} | \paragraph{differences} |
| \begin{enumerate} | \begin{enumerate} |