The main work done during February was in the design area, in particular the design of the interaction model used for the cooperation and the design of a preliminary communication protocol.
For the prototype versions of explorer ants, it was felt that there would be benefits to separating the explorer process(es) from a coordinating process. Essentially, this means breaking up the task of exploring cooperatively into two distinct tasks: exploring and coordinating. The design chosen for this prototype uses a master-slave model of interaction, in which a single master process (the coordinator) coordinates the efforts of numerous slave processes (the explorers):
When a slave process wishes to explore (retrieve and index) a document, it first asks its master for permission. The master checks its records and, if the document is unexplored, grants permission to the slave and marks the document as explored.
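The permission check described above can be sketched in a few lines. This is only an illustration, not the prototype itself: the class name `Master`, the method `request_permission`, and the example URL are hypothetical, and the master's "records" are assumed to be a simple in-memory set.

```python
class Master:
    """Coordinator that records which documents have been handed out."""

    def __init__(self):
        # Assumption: the master's records are an in-memory set of URLs.
        self.explored = set()

    def request_permission(self, url):
        """Grant permission only if no slave has claimed this URL before."""
        if url in self.explored:
            return False
        # Mark the document as explored at the moment permission is granted.
        self.explored.add(url)
        return True


master = Master()
first = master.request_permission("http://example.org/a.html")
second = master.request_permission("http://example.org/a.html")
print(first, second)  # prints "True False"
```

Marking the document as explored when permission is granted, rather than when the slave reports back, means a second slave asking about the same document is refused straight away.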
This model alone assumes that there can be a single master coordinating all exploring efforts. This is, of course, unrealistic. To account for this, it is possible for a master process to communicate with other master processes:
This inter-master communication fits into the model during the permission check step. When a slave asks for permission to explore a document, a master should not only check its own records, but should also ask any other masters it is aware of. Conveniently, since the question ("Has this document been explored?") is the same whether it is being asked by a slave or by a master, no additional mechanisms should be required in the protocol for this step.
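A rough sketch of the extended permission check follows. The names (`PeerAwareMaster`, `peers`, `is_explored_locally`) are hypothetical. The sketch also assumes that a master answering another master consults only its own records, so that two masters who know about each other do not query one another in a loop. The report does not say how that case is handled.

```python
class PeerAwareMaster:
    """Master that also consults the other masters it is aware of."""

    def __init__(self, name):
        self.name = name
        self.explored = set()   # this master's own records
        self.peers = []         # other masters this one knows about

    def is_explored_locally(self, url):
        """Answer 'Has this document been explored?' from own records only."""
        return url in self.explored

    def claim(self, url):
        """Grant a slave's claim only if neither this master nor any peer has it."""
        if self.is_explored_locally(url):
            return False
        # Assumption: peers answer from their own records and do not forward.
        if any(peer.is_explored_locally(url) for peer in self.peers):
            return False
        self.explored.add(url)
        return True


a = PeerAwareMaster("a")
b = PeerAwareMaster("b")
a.peers.append(b)
b.peers.append(a)

url = "http://example.org/index.html"
granted_by_a = a.claim(url)   # nobody has it yet
granted_by_b = b.claim(url)   # b asks a, which has already recorded it
print(granted_by_a, granted_by_b)  # prints "True False"
```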
A preliminary, very simple (as yet unnamed) protocol has been designed for the communication between the various processes. Each communication consists of a single transaction of one of the following types:
"CLAIM url", where url is fully specified. This request asks the question "May I explore this document?" A master may respond in one of several ways:
"OKAY" meaning "It is okay to explore that document."
"NO" meaning "It is not okay to explore that document (for some reason)."
"WAIT" meaning "It is not okay to explore that document now, but it might be later (ask again)."
"ASSIGNMENT?". This request asks the question "What document may I explore?" A slave may send such a request if it runs out of documents to explore. A master may respond in one of several ways:
"ASSIGNMENT url" meaning "You may explore this document."
"NO" meaning "I have no way of providing an assignment (for some reason)."
"WAIT" meaning "I don't have any assignments now, but I might later (ask again)."
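The two transaction types above could be handled along the following lines. This is a minimal sketch under several assumptions that are not in the report: the keyword and the URL are separated by a single space on one line, the master keeps a hypothetical queue (`pending`) of known but unassigned URLs, a malformed request is answered with NO, and an empty queue is answered with WAIT rather than NO.

```python
from collections import deque


class ProtocolMaster:
    """Answers CLAIM and ASSIGNMENT? requests with OKAY, NO, WAIT or ASSIGNMENT."""

    def __init__(self, pending=()):
        self.explored = set()
        # Assumption: URLs the master knows about but has not yet handed out.
        self.pending = deque(pending)

    def handle(self, line):
        """Return the reply for one request line."""
        parts = line.strip().split(None, 1)
        if not parts:
            return "NO"  # assumption: empty request is refused
        verb = parts[0]
        if verb == "CLAIM" and len(parts) == 2:
            url = parts[1]
            if url in self.explored:
                return "NO"
            self.explored.add(url)
            return "OKAY"
        if verb == "ASSIGNMENT?" and len(parts) == 1:
            # Skip queued URLs that some slave has claimed in the meantime.
            while self.pending:
                url = self.pending.popleft()
                if url not in self.explored:
                    self.explored.add(url)
                    return "ASSIGNMENT " + url
            return "WAIT"
        return "NO"  # assumption: unknown or malformed requests are refused


m = ProtocolMaster(pending=["http://example.org/b.html"])
replies = [
    m.handle("CLAIM http://example.org/a.html"),
    m.handle("CLAIM http://example.org/a.html"),
    m.handle("ASSIGNMENT?"),
    m.handle("ASSIGNMENT?"),
]
print(replies)
# prints ['OKAY', 'NO', 'ASSIGNMENT http://example.org/b.html', 'WAIT']
```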
For example, a master with no assignments available could send a WAIT message and then deny permission for the next valid claim request, instead earmarking that document for the assignmentless slave when it asks again.
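The WAIT-then-earmark behaviour can be sketched as follows. The report does not say how a master tells slaves apart, so the `slave_id` argument is a hypothetical stand-in (in practice it might be the connection itself). The sketch also assumes that "deny permission" means a NO reply, and that a waiting slave which makes a claim of its own is no longer treated as idle.

```python
from collections import deque


class EarmarkingMaster:
    """Master that reserves the next claimed document for an idle slave."""

    def __init__(self):
        self.explored = set()
        self.waiting = deque()  # slave ids that were told WAIT
        self.earmarked = {}     # slave id -> URL reserved for that slave

    def assignment(self, slave_id):
        """Handle an ASSIGNMENT? request from the given slave."""
        if slave_id in self.earmarked:
            return "ASSIGNMENT " + self.earmarked.pop(slave_id)
        if slave_id not in self.waiting:
            self.waiting.append(slave_id)
        return "WAIT"

    def claim(self, slave_id, url):
        """Handle a CLAIM request from the given slave."""
        if slave_id in self.waiting:
            # Assumption: a slave that found work itself is no longer idle.
            self.waiting.remove(slave_id)
        if url in self.explored:
            return "NO"
        self.explored.add(url)  # the document is handed out either way
        if self.waiting:
            # Deny this valid claim and reserve the document for the idle slave.
            idle = self.waiting.popleft()
            self.earmarked[idle] = url
            return "NO"
        return "OKAY"


m = EarmarkingMaster()
replies = [
    m.assignment("s1"),                          # nothing to give yet
    m.claim("s2", "http://example.org/a.html"),  # denied, earmarked for s1
    m.assignment("s1"),                          # s1 asks again
    m.claim("s2", "http://example.org/b.html"),  # nobody waiting now
]
print(replies)
# prints ['WAIT', 'NO', 'ASSIGNMENT http://example.org/a.html', 'OKAY']
```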
Some initial work has been done on a prototype "master" in both C and Perl. At the moment, it looks like Perl is going to be the implementation language of choice, although the socket programming is cleaner in C.
I will be attending the Third International WWW Conference in Darmstadt, Germany in April to participate in the workshop on indexing the web.
There will be a short piece on WebAnts in the March issue of CMC Magazine. I've seen the piece and it just gives an overview of the project and mentions that it is now off the backburner due to funding from TI.
John December (editor of CMC Magazine) is writing a book about the Internet and will be including a short section on WebAnts at the end of the chapter on spiders. I have seen a draft of this section and it is quite complimentary.
Someone at Internet Direct has expressed initial interest in becoming a beta test site for WebAnts when the time comes for that.