| 1 | {{{ |
| 2 | #!html |
| 3 | |
| 4 | <h1>Beyond Browsing: The Web of the Future</h1> |
| 5 | |
| 6 | <p>The current paradigm of the web — in which the user |
| 7 | <i>browses,</i> leaving behind a clicktrail that is of interest |
| 8 | primarily to marketers — falls far short of the needs of |
| 9 | scientists and scholars. <i>Browsing</i> the web is scarcely more |
| 10 | interactive than <i>surfing</i> television channels. True |
| 11 | interactivity — which will allow the web finally to achieve its |
| 12 | potential as a medium for scholarly, political, and social dialogue |
| 13 | |
| 14 | — demands something other than the current browser/server |
| 15 | paradigm. New tools will be needed, whose developers recognize that |
| 16 | information <i>consumers</i> are also information <i>producers.</i> |
| 17 | Scholarship is an inherently recursive activity, in that the scholar |
| 18 | uses <i>existing</i> scholarship to produce <i>new</i> scholarship. |
| 19 | Knowledge undergoes a process of accretion, akin to the formation of a |
| 20 | pearl; one exemplary model is a page of the <a |
| 21 | href="http://ccat.sas.upenn.edu/rs/2/Judaism/talmud.html" |
| 22 | target="_blank">Talmud</a>, on which there is a hierarchical |
| 23 | arrangement of commentary, super-commentary, annotation, and |
| 24 | cross-reference that spreads from center to margin.</p> |
| 25 | |
| 26 | <p>Information production is possible within the web browser of today |
| 27 | — using <a href="http://en.wikipedia.org/wiki/Wiki" |
| 28 | target="_blank">Wikis</a>, or content management systems such as <a |
| 29 | href="http://www.zope.org/" target="_blank">Zope</a>. But these tools |
| 30 | seem like primitive intruders in an environment that was engineered |
| 31 | primarily for publication. True interactivity demands a new tool: not |
| 32 | a <i>browser,</i> but an <i>interagent.</i> With these ideas in mind, |
| 33 | for the past few years, the <a href="http://www.mpiwg-berlin.mpg.de/" |
| 34 | target="_blank">Max-Planck-Institut für |
| 35 | Wissenschaftsgeschichte</a> has been developing, in collaboration with |
| 36 | |
| 37 | <a href="http://www.harvard.edu/" target="_blank">Harvard |
| 38 | University</a>, a prototype interagent called <i>Arboreal.</i> |
| 39 | Arboreal allows for flexible, non-linear navigation of arbitrary XML |
| 40 | documents and for granular annotation of these documents down to the |
| 41 | level of the individual word or term. Annotations themselves are XML data, which can be |
| 42 | shared, published, and further annotated in turn.</p> |
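<p>The idea that annotations are themselves XML, and can therefore be annotated in turn, can be illustrated with a minimal sketch. It does not reproduce Arboreal's actual data format: the element names (<code>w</code>, <code>annotation</code>), the <code>id</code>/<code>target</code> attributes, and the helper functions are all assumptions made for illustration.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical source document: every word carries an id so it can be targeted.
DOC = ET.fromstring(
    '<text><s id="s1"><w id="w1">arma</w> <w id="w2">virumque</w> '
    '<w id="w3">cano</w></s></text>'
)


def make_annotation(ann_id, target, body):
    """Build a standoff annotation as an XML element pointing at `target`."""
    ann = ET.Element("annotation", {"id": ann_id, "target": target})
    ann.text = body
    return ann


def resolve(target, doc, annotations):
    """Return the element with the given id: a document node or an annotation."""
    for el in doc.iter():
        if el.get("id") == target:
            return el
    for ann in annotations:
        if ann.get("id") == target:
            return ann
    return None


annotations = [
    # A word-level annotation on the document itself.
    make_annotation("a1", "w2", "virum + enclitic -que"),
    # A super-commentary: an annotation whose target is another annotation.
    make_annotation("a2", "a1", "Compare the standard commentaries on this line."),
]

# Because annotations are plain XML, they can be serialized and shared as-is.
serialized = [ET.tostring(a, encoding="unicode") for a in annotations]
```

<p>Keeping annotations outside the annotated document (standoff) is only one possible design; it is chosen here because it lets an annotation point at any node, including another annotation, without modifying the source text.</p>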
| 43 | |
| 44 | <p>Natural language is the primary means by which humans communicate |
| 45 | — though it is supplemented, of course, by formal languages and |
| 46 | other symbolic systems and by pictures and other audio-visual media. |
| 47 | Yet today's web browsers provide only the crudest tools to support |
| 48 | natural language documents. Most linguistic support in browsers is |
| 49 | focused on visual <i>presentation</i> of text in some writing system. |
| 50 | Even in this area, the technology comes up short: what browsers can |
| 51 | properly render Chinese or Mongolian in their traditional vertical |
| 52 | layouts or can adequately deal with Japanese ruby?<sup><a |
| 53 | href="#fn1">1</a></sup><a name="m1"></a> Beyond display, browsers also |
| 54 | typically allow for the <i>searching</i> of text — but again |
| 55 | only in an unsophisticated and inflexible way, which is of limited |
| 56 | value even for most western European languages, and is thoroughly |
| 57 | inadequate for highly inflected languages or languages written in |
| 58 | complex scripts.</p> |
| 59 | |
| 60 | <p>Tomorrow's interagents must provide more sophisticated linguistic |
| 61 | capabilities: language technology must be available from within the |
| 62 | interagent. Yet this is not to say that language technology should be |
| 63 | <i>built in</i> to the interagent; such a monolithic approach can only |
| 64 | fail users from complex and diverging linguistic, ethnic, and |
| 65 | professional backgrounds, and with equally heterogeneous needs and |
| 66 | interests. Rather, a <i>services-oriented</i> architecture is needed, |
| 67 | in which the interagent can communicate with linguistic web services |
| 68 | (via, for example, <a href="http://www.xmlrpc.com/" |
| 69 | target="_blank">XML-RPC</a> or <a |
| 70 | href="http://www.w3c.org/2000/xp/Group/" target="_blank">SOAP</a>) and |
| 71 | dynamically acquire new linguistic behaviors (via the dynamic |
| 72 | class-loading mechanisms offered by frameworks such as <a |
| 73 | href="http://java.sun.com/" target="_blank">Java</a> or <a |
| 74 | href="http://www.microsoft.com/net/" target="_blank">.NET</a>). Again, |
| 75 | Arboreal has been designed to implement these techniques: via web |
| 76 | services it can acquire morphological data that allow for lemmatized |
| 77 | searching, lexicon lookup, and other language-based functions. In |
| 78 | addition, it supports pluggable language behaviors that allow for |
| 79 | dynamic transliteration of writing systems such as Arabic, Greek, and |
| 80 | Chinese and for orthographically normalized searching that renders the |
| 81 | spelling peculiarities of (e.g.) early modern texts transparent to the |
| 82 | user. These are critical and basic functions that will serve as the |
| 83 | foundation in the future for a richer set of facilities, including |
| 84 | term and keyword discovery, language-neutral searching based on |
| 85 | concepts rather than words, automatic summarization, and sophisticated |
| 86 | semantic linking.</p> |
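<p>A rough sketch of how pluggable language behaviors and orthographically normalized searching might fit together is given below. It is not Arboreal's implementation: the registry, the language tag <code>x-early-modern</code>, and the three normalization rules (long s, u/v, i/j) are assumptions chosen only to make the principle concrete.</p>

```python
from typing import Callable, Dict, List

Normalizer = Callable[[str], str]

# Registry of language behaviors. New entries can be added at run time,
# standing in for behaviors that are loaded dynamically.
_BEHAVIORS: Dict[str, Normalizer] = {}


def register(lang: str, normalizer: Normalizer) -> None:
    """Plug in a normalization behavior for a language tag."""
    _BEHAVIORS[lang] = normalizer


def normalize_early_modern(word: str) -> str:
    """Illustrative rules only: fold long s, and merge u/v and i/j."""
    w = word.lower().replace("\u017f", "s")
    return w.replace("v", "u").replace("j", "i")


def search(query: str, tokens: List[str], lang: str) -> List[int]:
    """Return the positions of tokens that match the query after normalization."""
    # Fall back to plain case folding when no behavior is registered.
    norm = _BEHAVIORS.get(lang, str.lower)
    key = norm(query)
    return [i for i, tok in enumerate(tokens) if norm(tok) == key]


register("x-early-modern", normalize_early_modern)

tokens = ["Loue", "is", "iust", "and", "\u017fweet"]
hits_love = search("love", tokens, "x-early-modern")
hits_just = search("just", tokens, "x-early-modern")
hits_sweet = search("sweet", tokens, "x-early-modern")
hits_plain = search("love", tokens, "en")  # no behavior registered for "en"
```

<p>With the behavior registered, a modern spelling such as "love" finds the historical form "Loue"; without it, the same query finds nothing. The search routine itself knows nothing about any particular language, which is the point of keeping such behaviors pluggable.</p>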
| 87 | |
| 88 | <hr> |
| 89 | <h2>Notes</h2> |
| 90 | <ol> |
| 91 | <li><a name="fn1"></a>Cf. Y. Haralambous, <q>Unicode et typographie: |
| 92 | un amour impossible?</q>, <cite>Document Numérique</cite> 3/4 |
| 93 | (2002), 105-37. Ruby annotation is dealt with in a W3C recommendation: |
| 94 | |
| 95 | <a href="http://www.w3.org/TR/ruby" |
| 96 | target="_blank">http://www.w3.org/TR/ruby</a>. [<a href="#m1">main |
| 97 | text</a>]</li> |
| 98 | </ol> |
| 99 | }}} |