Imagine a network of concepts, each concept naming a combination of words (symbols). Imagine searching for words belonging to that concept through a collection of digital documents containing text.
Then each query corresponds to a path through that network, a path which ends at a collection of documents.
Then each document found can, in essence, be represented (as an alternative to its current representation as a sequence of words) as a collection of paths through that network.
This means that, given a refined enough ontology, an arbitrary digital document is formally rewritten according to that ontology through the search process (by collecting and inverting the hits).
This ultimately means that searching means rewriting too.
One consequence of this is that we have, in principle, an alternative route to semantic authoring (authors using semantic mark-up to write new documents). Another is that old, semantically flat documents can recover semantic depth over time, while the public engages in the process playfully.
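The "searching as rewriting" idea above can be sketched concretely. In this minimal sketch, a tiny hypothetical ontology maps concept paths to word sets; running each concept as a query over a toy document collection and then inverting the hits rewrites each document as a set of concept paths. The ontology, documents, and path notation are all invented for illustration.

```python
# Toy ontology: each concept path names a combination of words (symbols).
ontology = {
    "weather/rain": {"rain", "drizzle", "downpour"},
    "weather/sun": {"sun", "sunshine"},
    "mood/joy": {"joy", "delight"},
}

# Toy collection of digital documents containing text.
documents = {
    "doc1": "a downpour of rain brought no joy",
    "doc2": "sunshine and delight all day",
}

def search(concept_words, docs):
    """Return the ids of documents containing any word of the concept."""
    return {doc_id for doc_id, text in docs.items()
            if concept_words & set(text.split())}

# Invert the hits: document -> set of concept paths that reach it.
rewritten = {doc_id: set() for doc_id in documents}
for path, words in ontology.items():
    for doc_id in search(words, documents):
        rewritten[doc_id].add(path)

print(rewritten)
```

With a refined enough ontology, the inverted hit table is exactly the formal rewriting of each document according to that ontology.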
Why essays? Why libraries?
I was looking for "fait que ton rêve soit plus long que la nuit", a Vangelis artwork (104 hits on Google), and guess what, I found it on Walmart's website. Yes, Walmart can preserve history too. Yes, drones, this is customer-driven history. A search for
site:.fr "fait que ton rêve soit plus long que la nuit" gives no hits. Merci, Bibliothèque nationale de France. And no, the correct spelling,
site:.fr "fais que ton rêve soit plus long que la nuit" is not to be found in BNF either (out of a total of 3 hits on Google).
About two years ago, when I was still planning to come to Romania to build digital libraries for the public, I bought, at a kiosk in Brăila, an issue of a magazine called "Cultura", and found in it a little pamphlet, like a manea about Swedish girls, aimed at the net and at what will befall us if we stick our noses into such an unangelic thing, signed Pleșu.
Since, at the time, I was already nine years old as a netizen, I was left unimpressed that Mr. Pleșu's article resembled, in content, the saying of a lady from Pittsburgh, a former hairdresser in France, married to a successful American, who was opening an art gallery, out of boredom, in Lawrenceville (a cheaper neighborhood of the former city): "I've heard the Internet gives you mental illness"; and this although the signatory had been minister of culture at about the time the net was already joining people from all over, little Romania included.
Now he relapses, with a note inspired by a colleague of the same mental age, with the same placid trepidation with which elephants recognize one another and gather rump to rump, facing the unknown.
I recognize the verbal tic of the salaried intellectual (a frock-coated man, someone called him) who in fact commits to nothing: "One enters, without violence, into your intimate space, one works, in filigree, ..." "Nothing in particular is being pursued. A faceless nonchalance is, quite simply, being encouraged...". The impersonal "one" serves to raise mediocrity to the rank of the ambiguous. Who, exactly? Which one, pray tell? Under the excuse, if not the cover, that it is a note from Berlin, the author writes on his own forehead that he has not stepped into the present century, and presents himself as utterly ridiculous to netizens.
There is a certain regret here, shared with others used to preaching to gaping and, above all, silent crowds: that people are leaving his stall; that the net leaves the ordinary man, such as he is, some room to write as well, not merely to listen.
The net is the place where man is encouraged to be conscious of himself: he is left to observe and to observe himself, he can listen and he can reply, he can come and go without paying half a salary for a book or being obliged to drink juice standing in front of the bookshop at whatever event daubs the profane into the sacred.
In short, Mr. Pleșu, the netizen lives the fact that being an intellectual is not a profession, it is the condition of being human. Your orchestra has never played with enough grace to compare with what the Net is, and not only because it had to carry its own chairs from one expression to the next.
Even more simply: the blog is a cultural space with low resistance to the individual, and at least an order of magnitude better equipped for two-way communication (publishing, impact, reactions) than the structures built around publishing houses and print shops (though, admittedly, the latter are better suited for drinking coffee with one's friends, the same ones for decades).
I see the web more explicitly split into machine-friendly and human-friendly layers, with an on-the-fly cross-referencing mechanism in place between them.
A searching session: a user sets some parameters for a web metric, thus tuning his subjective view of the web landscape (e.g. a metric for strong/weak concept relationship).
After the user gets a landscape as a result of searching, he can mark areas of interest (based on that subjective web metric), thus generating a structured annotation which may be stored as a web service for others; he can also annotate classically (attach resources to objects found in that landscape).
Objects found in a web landscape result are marked as machine processable or human viewable. The user has a web client he can use as a visual tool to assemble objects in the landscape and, if they are fully machine processable, he can hand the model to a machine. The assembly, too, can be publicly saved as a web service, and can be incomplete (partially machine processable), along with a view of it.
Ease of storing web services may be one way to solve, in principle, the problem of permanent resource locators: save the instances of the component objects instead of references to them, along with an equivalence relationship which tells a retriever that 'this instance' is equivalent to the referenced one he was searching for.
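The equivalence idea can be sketched in a few lines. In this hypothetical sketch, a store keeps a saved instance of a resource together with an equivalence record, so a retriever asking for a vanished reference gets back an instance declared equivalent to it. All names and URLs are illustrative, not a real API.

```python
instance_store = {}   # instance_id -> saved content
equivalences = {}     # original reference -> instance_id

def save_as_service(original_ref, content):
    """Save an instance of a resource and record its equivalence."""
    instance_id = f"instance:{len(instance_store)}"
    instance_store[instance_id] = content
    equivalences[original_ref] = instance_id
    return instance_id

def retrieve(original_ref):
    """Resolve a reference through the equivalence relationship."""
    instance_id = equivalences.get(original_ref)
    if instance_id is None:
        raise KeyError(f"no stored instance equivalent to {original_ref}")
    return instance_store[instance_id]

save_as_service("http://example.org/old-location", "the resource body")
print(retrieve("http://example.org/old-location"))  # -> the resource body
```

The permanence comes from the stored instance, not from the original locator staying alive.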
The above imply a read-write web, where writing means, beyond editing text, creating web-services (or creating meaningful structures).
The future web is, in my view, made of human viewable entities (symbols with meaning proper for human receptors), some of which are marked as machine processable (objects with a machine semantics description), which can be located, assembled, annotated, saved and shared as web-services.
This piece had been reachable on my website, at a different URL, since 1999; I just decided to move it here.
Abstract (of a talk at IUK99), 1999, March 24
A possible way of maintaining the validity and accessibility of references in an electronically published scientific document is presented. A definition of an Active Brokers Network, as a technical means of maintaining valid links from the document containing the reference towards the referenced documents, is attempted. Such tools would enable diachronic publishing.
Science is built step by step, starting from an idea, axiom, reasoning or empirical result. So a further step must reference the earlier one(s). That is, a present-day reader of a scientific paper (A) must have access to the context, foundation, premises and details of what is read; therefore (A) contains references which must remain valid over time if (A) and the citations used in it are to remain a comprehensive document.
Many of us still spend a lot of time searching for and piling up cited papers in order to cover most of the subject of a paper being read. An Internet-based solution for citing/referencing naturally has a dynamic character, as opposed to the printed paper containing static pointers to other sources of information.
We read some electronically stored documents (Bx) and want to cite them when writing a document (A). We can do that with HTML, but what if one or all of those (Bx) documents are moved to another public place? A cited scientific paper (B) maintains its validity as a referenced document if it remains accessible and its content doesn't change over time. How can we keep the reference valid?
Some partially active solutions of keeping it accessible might be:
* (HTML) (B) leaves traces and the reader of (A) tracks them from the initial location and updates the reference to (B) to its final location; this is obviously inefficient because traces can be erased, or can become very long.
* (XML) (A) contains multiple references to the already mirrored (B); we encounter the same inefficiency because of the static character of the references.
Using entirely active solutions might be more promising, so where should we implement the 'active' mechanism: in (A), in (B), or at a point between them? (A) would generate traffic by checking the location of (B) periodically, and (B) doesn't care/know about (A); therefore we need a mid-point activation: an Active Brokers Network (ABN), to which (B) beeps when it is moving. (A) declares its own properties to an AB point (ABP) and cites the (potentially different) ABP to which (B) has already declared its properties.
The properties (A) should declare to the ABP might be the classical ones: title, author, date of creation, keywords, its (ABN assigned) Unique Identifier (UID), and the ABN UIDs of the cited documents.
Therefore the ABN is a distributed database of (the above) properties, which relates UIDs to the (older) referenced UIDs. A query on it should reveal paths through series of scientific articles having a property in common. Obviously, references point backwards in time; this feature may be used as part of the validation criteria of an electronic reference and/or as a simplifying constraint on searching.
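A single ABP, as described above, can be sketched as a small registry: documents declare their properties (including their ABN-assigned UID and the UIDs they cite), beep their new location when they move, and a citing document resolves a UID to its current location. Class, method, and field names here are illustrative assumptions, not a specification.

```python
class ActiveBrokerPoint:
    def __init__(self):
        self.registry = {}  # UID -> properties dict, including "location"

    def declare(self, uid, title, author, location, cites=()):
        """A document declares its classical properties to this ABP."""
        self.registry[uid] = {
            "title": title,
            "author": author,
            "location": location,
            "cites": tuple(cites),
        }

    def beep(self, uid, new_location):
        """A document notifies the ABP that it has moved."""
        self.registry[uid]["location"] = new_location

    def resolve(self, uid):
        """Return the current location of the document with this UID."""
        return self.registry[uid]["location"]

abp = ActiveBrokerPoint()
abp.declare("uid-B", "Cited paper", "Author B", "http://host1/B")
abp.declare("uid-A", "Citing paper", "Author A", "http://host2/A",
            cites=["uid-B"])
abp.beep("uid-B", "http://host3/B")  # (B) moves and beeps
print(abp.resolve("uid-B"))          # -> http://host3/B
```

The reference in (A) stays valid because it names the UID, and the ABP, not (A), tracks where (B) currently lives.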
Actually, a new (A) document can be seen as a folder containing, besides its internal objects (texts, equations, data tables, scripts, graphics...), other folders representing the referenced (B) documents, and so on recursively. That is, every (A) becomes a virtual, distributed file system based on the ABN.
The addition of a document to an ABN may be done through an enhancement to current operating systems, e.g. adding an active 'public/local/private' property beside the passive 'rwx' ones. When a document gets the 'public' attribute, the operating system managing it should beep its creation/movement to the closest ABP.
The ABPs could implement various mechanisms, such as the obsoleting of their documents (but better not, if we want a history of science: once a scientific document is made 'public' it must be frozen and cited as it is), or extinction: a document which is not cited at all for a long duration (say, ten years) should disappear (this way, scientific garbage is thrown out simply by time).
Possible results of implementing the Active Brokers Network:
o boneless, flexible documents (Octopus), which can incorporate knowledge related to a subject over time, i.e. diachronic publishing. The Octopus document becomes a continuously growing monograph written by several authors, and thus tends to form itself into an exhaustive/comprehensive unit of scientific knowledge.
o a keyword search in such a distributed database would reveal paths (unidirectional in time) of referenced/referring papers, creating ad-hoc dynamic documents structured around and focused on a specific scientific subject. This brings to mind percolation and all the mathematics/physics associated with random walks, allowing a deeper formal understanding of information retrieval and structured knowledge.
o a search by keyword AND author may reveal true schools of thought in science.
o the refereeing of (B) can be done by the authors of the (Ax) which cite (B), who are interested in / aware of its content, and not by hidden, possibly indifferent, readers.
o the editor's selection process for a review's issue can be done by simple path selections through these Octopus documents.
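The path search described in the bullets above can be sketched directly on the ABN property database: starting from a document, follow citations backwards in time, keeping only documents that carry the keyword, so the result is a path-shaped, ad-hoc document focused on one subject. The database contents and field names below are invented for illustration.

```python
# Toy ABN property database: UID -> keywords, year, cited (older) UIDs.
abn = {
    "uid-1": {"keywords": {"percolation"}, "year": 1990, "cites": []},
    "uid-2": {"keywords": {"percolation"}, "year": 1995, "cites": ["uid-1"]},
    "uid-3": {"keywords": {"random walks"}, "year": 1996, "cites": ["uid-1"]},
    "uid-4": {"keywords": {"percolation"}, "year": 1999,
              "cites": ["uid-2", "uid-3"]},
}

def paths_by_keyword(db, start_uid, keyword):
    """Enumerate backward-in-time citation paths whose documents all
    carry the keyword (references only point to older documents, so
    the traversal needs no cycle detection)."""
    doc = db[start_uid]
    if keyword not in doc["keywords"]:
        return []
    tails = []
    for cited in doc["cites"]:
        tails.extend(paths_by_keyword(db, cited, keyword))
    return [[start_uid] + t for t in tails] or [[start_uid]]

print(paths_by_keyword(abn, "uid-4", "percolation"))
# -> [['uid-4', 'uid-2', 'uid-1']]
```

Adding an author filter to the same traversal gives the "schools of thought" search; selecting among the returned paths is the editor's issue-selection step.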