Web Scholars

Simon Kampa, Dr. Les Carr
{srk98r, lac}@ecs.soton.ac.uk
Department of Computer Science, University of Southampton


The scholarly community has minimal support on the Internet for the day-to-day tasks of discourse and literature review. Aside from e-journals and some prototype discourse tools, scholars are offered few practical facilities. Consequently, they continue to rely on traditional and less effective methods. However, the Internet has the potential to act as an invaluable instrument for traversing the scholarly world, not only by giving ready access to a wealth of scholarly information, but also by adding rich linking and query services. These services should cover not only knowledge about the articles, but also in-depth information about the entire scholarly community, such as the authors, conferences, journals, institutes, activities and committees involved, as well as their characteristics and relationships to other objects.

This article gives a brief overview of a prototype system that provides a rich, semantic linking and query environment aimed at the scholarly community. Ontologies are used to model the scholarly community and a series of agents provide the query and linking services. The resource description framework (RDF) is named as a possible mechanism for adding scholarly semantic information to web pages.


Scholarly publication is the building block of the entire scholarly community. Its articles enable scholars to present their thoughts and claims to a large community of researchers. Following publication, debate ensues where fellow researchers refute, support and modify the original ideas by publishing further papers, citing the original paper(s).

This iterative process of debate is time-consuming because of the lengthy delay between journal publications and conference proceedings, which in turn reduces scholars' interest in a particular subject. The Internet makes this process immediate and effective by making publications rapidly available to a larger audience (through e-prints and e-journals). The WWW also provides the infrastructure for linking and querying, enabling scholars to easily research and understand the entire literature and the research community that produced it.

Citations play a prominent role in scholarly debate and have been "the way researchers have been interconnecting their writings all along" [3]. They allow scholars to discover related papers, thereby enabling a literature-wide survey. Providing links on the web between a paper and its citations is currently an active research area, and its importance and example applications have been thoroughly discussed [1,2,3,4,5].

However, citations alone do not provide the level of comprehensiveness required by scholars. Scholars make use of detailed knowledge about authors of articles, as well as the research activities hosted at their institutes and journal and conference information, to enable discovery of pools of related literature. Information on societies, their committees and the funding agencies involved are also part of the scholarly community and therefore of interest.

This complete understanding enables both specific queries (e.g. "I want contact details for author X") and more interesting analytical queries (e.g. "Who are the experts?"). Such support would be impossible to provide using the current Internet infrastructure, because web documents lack semantic knowledge, which prevents programs from understanding their content. Even once this is overcome, a program must still be able to understand, reason and communicate about all the objects within this community and their relationships.


To enable programs to reason and communicate about the scholarly community, that community must be formalised. Ontologies provide an excellent method of accomplishing this, as complex classes and relationships can be represented. Using notions of concepts, instances, relations, functions and axioms, elaborate ontologies can be constructed that present a domain in greater richness than taxonomies could. Fundamentally, they allow programs to communicate and reason about knowledge.
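The notions of concepts, instances, relations and axioms can be illustrated with a minimal Python sketch. It is not the prototype's ontology and does not use SoFAR; the class names (`Concept`, `Instance`, `Ontology`), the relation names and the sample instances are all hypothetical, chosen only to show how a typed relation differs from a flat taxonomy.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A class of objects in the scholarly community, e.g. Author."""
    name: str


@dataclass(frozen=True)
class Instance:
    """A concrete object that belongs to exactly one concept."""
    concept: Concept
    label: str


@dataclass
class Ontology:
    # relation name -> (domain concept, range concept)
    relations: dict = field(default_factory=dict)
    # asserted facts as (subject, relation name, object)
    facts: set = field(default_factory=set)

    def define_relation(self, name, domain, range_):
        self.relations[name] = (domain, range_)

    def assert_fact(self, subject, relation, obj):
        domain, range_ = self.relations[relation]
        # Axiom (assumed for illustration): a relation only holds between
        # instances of the concepts it was declared for.
        if subject.concept != domain or obj.concept != range_:
            raise TypeError(f"{relation} expects {domain.name} -> {range_.name}")
        self.facts.add((subject, relation, obj))

    def objects(self, subject, relation):
        """Return every object related to `subject` by `relation`."""
        return {o for s, r, o in self.facts if s == subject and r == relation}


# Hypothetical concepts and relations for the scholarly domain.
AUTHOR, PAPER, INSTITUTE = Concept("Author"), Concept("Paper"), Concept("Institute")

onto = Ontology()
onto.define_relation("wrote", AUTHOR, PAPER)
onto.define_relation("cites", PAPER, PAPER)
onto.define_relation("affiliated_with", AUTHOR, INSTITUTE)

# Hypothetical instances.
author_a = Instance(AUTHOR, "Author A")
paper_1 = Instance(PAPER, "Paper 1")
paper_2 = Instance(PAPER, "Paper 2")
institute_x = Instance(INSTITUTE, "Institute X")

onto.assert_fact(author_a, "wrote", paper_1)
onto.assert_fact(paper_1, "cites", paper_2)
onto.assert_fact(author_a, "affiliated_with", institute_x)
```

Because every relation declares its domain and range, a program can reject a nonsensical statement such as a paper "writing" an author, which a plain list of categories cannot express.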

The ontology was not encoded in a knowledge modeling language; instead, the Southampton Framework for Agent Research (SoFAR) [6] was used. SoFAR provides the mechanism to describe and manipulate ontologies within an agent framework.

The prototype consists of several agents that monitor the user, broker requests and answer both specific and analytical queries. It assists web scholars both by enriching documents with links to scholarly information and by providing comprehensive query support. The query interface allows scholars to pose specific queries as well as more interesting and useful deductive queries.
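The division of labour between a broker and query agents might look roughly like the following Python sketch. It is an illustration only and not the SoFAR API: the class names, the `handles` attribute, the sample facts and the heuristic used for "Who are the experts?" (counting citations received by an author's papers, with one author per paper) are all assumptions.

```python
from collections import Counter

# Hypothetical facts, stored as (subject, relation, object) triples.
FACTS = [
    ("author_a", "wrote", "paper_1"),
    ("author_b", "wrote", "paper_2"),
    ("author_b", "wrote", "paper_3"),
    ("paper_1", "cites", "paper_2"),
    ("paper_1", "cites", "paper_3"),
    ("paper_3", "cites", "paper_2"),
    ("author_a", "email", "a@example.org"),
]


class SpecificQueryAgent:
    """Answers direct look-ups such as 'contact details for author X'."""
    handles = "specific"

    def answer(self, subject, relation):
        return [o for s, r, o in FACTS if s == subject and r == relation]


class AnalyticalQueryAgent:
    """Deduces answers that are not stored directly, e.g. 'who are the experts?'."""
    handles = "analytical"

    def answer(self, top=1):
        # Assumed heuristic: expertise is approximated by the number of
        # citations received by the papers an author wrote.
        author_of = {o: s for s, r, o in FACTS if r == "wrote"}
        counts = Counter(
            author_of[o] for s, r, o in FACTS if r == "cites" and o in author_of
        )
        return [author for author, _ in counts.most_common(top)]


class Broker:
    """Routes each request to the agent registered for that kind of query."""

    def __init__(self, agents):
        self.agents = {agent.handles: agent for agent in agents}

    def request(self, kind, *args, **kwargs):
        return self.agents[kind].answer(*args, **kwargs)


broker = Broker([SpecificQueryAgent(), AnalyticalQueryAgent()])
contact = broker.request("specific", "author_a", "email")
experts = broker.request("analytical", top=1)
print(contact)   # ['a@example.org']
print(experts)   # ['author_b']
```

The specific query reads a stored fact, whereas the analytical query combines the `wrote` and `cites` relations to derive an answer that no single fact contains.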


The web scholars' prototype system aims to demonstrate the potential power the Internet has in providing scholars with an invaluable tool to conduct research. By modeling the scholars' community with a sophisticated ontology, agents can understand, reason and communicate about the community. Not only does this allow specific data to be easily extracted, but more interestingly, analytical queries can be answered by using deductive methods.

Although the scholarly data is currently entered manually using several scripts, future work will use RDF to enrich the web pages with semantic information and thereby allow agents to automatically retrieve the necessary knowledge. Output would also be in RDF format to improve portability and flexibility.
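As a rough indication of what RDF-style scholarly metadata could look like, the Python sketch below writes a few statements about this article as subject-predicate-object triples in the line-based N-Triples syntax. The source does not specify a vocabulary or a serialisation, so the `http://example.org/` namespace, the property names and the helper functions are hypothetical.

```python
# Hypothetical namespace; the prototype's actual vocabulary is not given.
SCHOLAR = "http://example.org/scholar#"


def uri(term):
    """Wrap an absolute URI in angle brackets, as N-Triples requires."""
    return f"<{term}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialise (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical identifier for the article being described.
paper = uri("http://example.org/papers/web-scholars")

triples = [
    (paper, uri(SCHOLAR + "title"), literal("Web Scholars")),
    (paper, uri(SCHOLAR + "author"), literal("Simon Kampa")),
    (paper, uri(SCHOLAR + "author"), literal("Les Carr")),
]

document = to_ntriples(triples)
print(document)
```

An agent that retrieves such statements from a page obtains the same kind of triples that are currently entered by hand, which is what would allow the manual scripts to be replaced.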


  1. Cameron, R. A Universal Citation Database as a Catalyst for Reform in Scholarly Communication. First Monday. 1997. http://www.firstmonday.dk/issues/issue2_4/cameron/.
  2. Chen, C. and Carr, L. Trailblazing the Literature of Hypertext: Author Co-Citation Analysis (1989-1998). Department of Information Systems and Computing, Brunel University, Uxbridge, UK. 1999. http://www.ecs.soton.ac.uk/~lac/ht99.pdf.
  3. Integrating and Navigating Eprint Archives through Citation-Linking. The Open Citation Project Proposal Paper. 2000. http://www.ecs.soton.ac.uk/~lac/COHSE.html.
  4. Hitchcock, S., Carr, S., Harris, S., Hey, J. and Hall, W. Citation Linking: Improving Access to Online Journals. Open Journal Project, MMRG, University of Southampton. 1997. http://journals.ecs.soton.ac.uk/diglib/acmpaper.html.
  5. Hitchcock, S., Carr, S., Quek, F., Witbrock, A., Tarr, I. and Hall, W. Linking Everything to Everything: Journal Publishing Myth or Reality? Open Journal Project, MMRG, University of Southampton. 1997. http://journals.ecs.soton.ac.uk/IFIP-ICCC97.html.
  6. Southampton Framework for Agent Research (SoFAR). Open Journal Project, MMRG, University of Southampton. 1997. http://www.ecs.soton.ac.uk/~lavm/sofar/doc/sofar.html.