
3  Lecture Server Implementation

The lecture server's job is to convert client queries into appropriate lecture index queries, and to convert the results of index queries into XML to be sent back to the client. The majority of the work is handled by a Java class library called the index, which is described in its own section.

The lecture server is a web application, implemented as an Apache Tomcat servlet. A brief summary of web applications and Apache Tomcat is provided for context before going into details about the server itself.

3.1  Web Application Overview

The client uses HTTP for requests to the server. In HTTP, all requests are initiated by the client, so the server can only provide information to the client in response to a request from the client. The most common HTTP requests are GET and POST requests. In each, the recipient of the request is specified by a URL. The request can include a number of parameters and values.

When you type a URL into a browser, it performs a GET request. GET requests append the parameters and values to the URL, which means that you can perform your own parameterized GET requests by typing the URLs yourself. POST requests carry their parameters and values in the body of the request, so browsers keep them hidden from sight.
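
As an illustration of how parameters end up in a GET URL, the following is a minimal sketch rather than code from the lecture server. The host, port, and class name are hypothetical; the lectures.jsp page and its query parameter are the ones described later in this section.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class GetUrlSketch {
    /** Appends URL-encoded parameters to a base URL, as a browser does for a GET request. */
    static String withParameters(String baseUrl, Map<String, String> params) {
        if (params.isEmpty()) {
            return baseUrl;
        }
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            // Names and values are encoded so that spaces and symbols survive the trip.
            query.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return baseUrl + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("query", "fourier transform");
        // Hypothetical host and port; prints the URL one could also type by hand.
        System.out.println(withParameters("http://localhost:8080/lectures/lectures.jsp", params));
    }
}
```

Typing the printed URL into a browser would issue the same GET request that the sketch describes.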

HTTP is stateless, which means that there is nothing in HTTP to connect a series of requests together. Web applications that need to maintain state from request to request must implement a mechanism on top of HTTP. Typically, the server will include a unique session identifier with its first response, and thereafter the client ensures that it includes that value with each request. The simplest way to do this is for the server to include a Set-Cookie header with its first response, which will cause the client to send the cookie to the server with each subsequent request. When the client does not accept cookies, the client and server can arrange for the session identifier to be included as a parameter in every request.

3.2  Apache Tomcat

The lecture server makes use of a number of facilities provided by Apache Tomcat. Apache Tomcat is a web server implemented in Java. It was originally developed by Sun Microsystems, but was given to the Apache group. Sun continues to provide the specification for the server, and includes a version of the server as part of their Java enterprise software. Sun provides extensive documentation.

Like any web server, Tomcat waits for clients to make HTTP requests, and then responds with some form of content. The response might be the contents of some file, it might be completely computed, or some combination of the two, such as filling in a few values in a template.

All HTTP requests include a URL, which has several components that are used by the web server to determine which content to return. With Tomcat, the host and optional port components of the URL determine which virtual server Tomcat should act like (one server can appear to be many web hosts, and one web host can run on many web servers). Each virtual server has its own configuration, divided into applications, which are associated with the first part of the directory component of the URL. Tomcat applications are deployed (installed) on a virtual server as a file called a web archive, or war file.

In addition to its deployed applications, Tomcat web servers also include a number of configuration files, some of which are automatically updated when applications are deployed. The lecture server makes use of the file context.xml to provide parameter values for the lecture server, such as the locations for databases, media, etc. This makes it possible for one web archive to be used on multiple Tomcat servers, and for multiple lecture servers (with different lectures) to be used on one server. The context.xml file is also used to define the JDBC resource for the transcriber account database.

A web archive is a packaged directory that describes a web application. When the archive name.war is deployed on a server, Tomcat will map name in the beginning of a URL directory to the contents of the archive.
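
The mapping from a URL to a deployed application can be sketched as follows. This is a simplified illustration, not Tomcat's own logic: it assumes a single-segment application name and ignores the root application. The class name and the example host are hypothetical.

```java
import java.net.URI;

final class ApplicationNameSketch {
    /** Returns the first segment of the URL path, which corresponds to the deployed archive name. */
    static String applicationName(String url) {
        String path = URI.create(url).getPath();   // e.g. "/lectures/category.jsp"
        if (path == null || path.length() <= 1) {
            return "";                             // no application segment present
        }
        int end = path.indexOf('/', 1);
        return end < 0 ? path.substring(1) : path.substring(1, end);
    }

    public static void main(String[] args) {
        // With lectures.war deployed, this request would be routed to that archive.
        System.out.println(applicationName("http://localhost:8080/lectures/category.jsp"));
    }
}
```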

The WEB-INF directory in a web archive contains configuration information for the web application. The lecture server uses the file WEB-INF/web.xml to specify how URLs should be handled. In Tomcat, URLs are handled by servlets, which are Java classes that interpret the request and generate the response. The default servlet simply maps the request to a file and sends the contents of the file in the response. Other servlets are specified in the web.xml file, by associating a servlet name with a Java class, and then associating URL patterns with servlet names.

Most servlets in the Lecture Server are generated from Java Server Pages, which are a combination of static content, script, and Java code. These servlets need to appear in the web.xml on the server, but do not appear in the sources because the development environment automatically inserts them. For example, the file names.jsp would be converted into a servlet and the URL pattern */names.jsp would automatically be mapped to the servlet.

Tomcat also provides session management for the web application. By default, every request to the application looks like a new request. Session management allows the web application and the web client to relate a series of requests, either by setting a cookie (if the client permits it) or by adding unique session-identifying information to the URLs embedded in the response. There is no notion of “ending a session” in HTTP, so Tomcat tracks how long it has been since the client last made a request. The web.xml file specifies how long session information should be maintained when there are no client requests.
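
The idea of expiring a session after a period without requests can be sketched as below. This is only an illustration of the bookkeeping, not Tomcat's implementation; the class and method names are hypothetical, and times are passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

final class IdleSessionSketch {
    private final long timeoutMillis;
    private final Map<String, Long> lastRequest = new HashMap<>();

    IdleSessionSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records that the session made a request at the given time. */
    void touch(String sessionId, long nowMillis) {
        lastRequest.put(sessionId, nowMillis);
    }

    /** Discards sessions idle for longer than the timeout, then reports whether this one survives. */
    boolean isAlive(String sessionId, long nowMillis) {
        lastRequest.values().removeIf(last -> nowMillis - last > timeoutMillis);
        return lastRequest.containsKey(sessionId);
    }

    public static void main(String[] args) {
        IdleSessionSketch sessions = new IdleSessionSketch(1000L);
        sessions.touch("a", 0L);
        System.out.println(sessions.isAlive("a", 500L));   // still within the timeout
        System.out.println(sessions.isAlive("a", 2000L));  // idle too long, so discarded
    }
}
```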

Tomcat provides session variables that applications can use to hold session state. The application can specify a listener class in web.xml whose methods are invoked when the application starts or ends, and when sessions start and end. The listener class can be used to initialize session variables.

3.3  Lecture Server Configuration

All Lecture Server application configuration is done in the server's context.xml file. The lecture server web archive can be deployed under multiple names (by renaming the file), so the parameters are keyed by the application name. The names have the format appName/paramName. The parameter names are

indexDirectory
The indexDirectory is the directory that contains the text index and structural index, described in the section on the index. If the actual directory is empty, a new empty index will be created when the server is started.
rpmRoot
The rpmRoot is a prefix added to every media URL. This makes it easy to change which web server should be used for the media.
wavRoot
At one time, we allowed for waveform fragments to be returned. The wavRoot specified the directory that should be prefixed to unrooted .wav file names to determine the complete pathname for the .wav file.
mail.host
For demonstration purposes, we have provided simple transcription editing. After an edit, mail is sent about the edit. The mail.host is the SMTP server used for sending the mail.
mail.from
When mail is sent about an edit, mail.from is the name of the sender of the message. This would typically be an administration mailing list.
mail.recipient
When mail is sent about an edit, mail.recipient is the name of the receiver of the message. This would typically be a mailing list of interested parties.

An example use in context.xml is shown in Figure 14.


<Context>
   ...
  <Parameter
    name="lectures/indexDirectory"
    value="/scratch/segindex"
    description="The location of the lecture index" />
  <Parameter
    name="lectures/rpmRoot"
    value="http://web.sls.csail.mit.edu/lecdata"
    description="Server to use for RealPlayer rpm files" />
  <Parameter
    name="lectures/wavRoot" 
    value="/s/lectures"
    description="Root path for .wav files" />
  <Parameter name="lectures/mail.host"
    value="outgoing.csail.mit.edu"
    description="Host to send mail to" />
  <Parameter name="lectures/mail.from"
    value="lecadmin@csail.mit.edu"
    description="Sender to use for sent mail" />
  <Parameter name="lectures/mail.recipient"
    value="lectrans@csail.mit.edu"
    description="Recipient of sent mail" />
  ...
</Context>
Figure 14: Fragment of context.xml
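
The appName/paramName key format can be sketched in Java as follows. This is an assumption-laden illustration, not the lecture server's code: the class and method names are hypothetical, and a plain Map stands in for the parameters that Tomcat reads from context.xml.

```java
import java.util.Map;

final class ContextParamSketch {
    /** Builds the context.xml key: the application name, a slash, and the parameter name. */
    static String key(String contextPath, String paramName) {
        // A context path such as "/lectures" has a leading slash that the key does not.
        String appName = contextPath.startsWith("/") ? contextPath.substring(1) : contextPath;
        return appName + "/" + paramName;
    }

    /** Looks up a parameter for the application deployed at the given context path. */
    static String lookup(Map<String, String> contextParams, String contextPath, String paramName) {
        return contextParams.get(key(contextPath, paramName));
    }

    public static void main(String[] args) {
        Map<String, String> params = Map.of("lectures/indexDirectory", "/scratch/segindex");
        System.out.println(lookup(params, "/lectures", "indexDirectory"));
    }
}
```

Deploying the same archive under a second name would simply use a second set of keys with a different prefix.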

3.4  Listener Class

The lecture server uses the class Listener as a listener. The contextInitialized method is called by Tomcat when the application is started. This method is used to retrieve the configuration information from context.xml.

The version of Tomcat we are using, Tomcat 5.5, implements version 2.4 of the Servlet specification, which does not provide a way to determine under what name the application was deployed. In other words, it is not possible to determine the prefix to be used for getting parameter values from context.xml. However, version 2.5 of the Servlet specification does provide this ability with the getContextPath method, and, fortunately, Tomcat 5.5 implements this method, so we are able to invoke the method “by hand.”
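
Invoking a method “by hand” is typically done with Java reflection. The following is a minimal sketch of that technique, not the Listener's actual code. FakeContext is a hypothetical stand-in for the container's context object so that the example runs without Tomcat.

```java
import java.lang.reflect.Method;

final class ReflectiveContextPath {
    /** Calls getContextPath() by name; returns null when the object has no such method. */
    static String contextPathOf(Object servletContext) {
        try {
            // Look the method up at run time, since the compile-time interface lacks it.
            Method method = servletContext.getClass().getMethod("getContextPath");
            return (String) method.invoke(servletContext);
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }

    /** Hypothetical stand-in for the container's context object. */
    public static final class FakeContext {
        public String getContextPath() {
            return "/lectures";
        }
    }

    public static void main(String[] args) {
        System.out.println(contextPathOf(new FakeContext()));
    }
}
```

Because the lookup happens at run time, the code compiles against an interface that does not declare the method and still works on a container that implements it.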

The values from context.xml are stored as attributes in the servlet context, which holds the application state, so that they can be accessed by servlets. The mail-related parameters are stored in a Properties object in the form needed by a mail class library.
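
Collecting the mail-related parameters into a Properties object might look like the sketch below. It assumes the mail library reads its settings under the same names once the application prefix is stripped, which the source does not state. The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Properties;

final class MailPropertiesSketch {
    /** Copies every "appName/mail.*" parameter into a Properties object without the prefix. */
    static Properties mailProperties(Map<String, String> contextParams, String appName) {
        Properties props = new Properties();
        String prefix = appName + "/mail.";
        for (Map.Entry<String, String> e : contextParams.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                // "lectures/mail.host" becomes "mail.host".
                props.setProperty(e.getKey().substring(appName.length() + 1), e.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> params = Map.of(
                "lectures/mail.host", "outgoing.csail.mit.edu",
                "lectures/indexDirectory", "/scratch/segindex");
        System.out.println(mailProperties(params, "lectures").getProperty("mail.host"));
    }
}
```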

The listener also implements the contextDestroyed method to cleanly close the lecture index when the application is shut down.

3.5  Initial Request

The initial request is sent to index.html, which is returned verbatim to the client. The index.html file includes references to a style sheet and a number of static script files described in the section on the browser.

There is one dynamically generated script file, urls.jsp. As mentioned previously, Tomcat relates requests to sessions either with cookies or with URLs. The file urls.jsp defines JavaScript variables for all the URLs that the client will use to make requests.


<%@ page contentType="text/javascript; charset=UTF-8" %>
<%@ taglib uri="http://java.sun.com/jsp/jstl/core" prefix="c" %>
<%@ taglib uri="http://java.sun.com/jsp/jstl/functions" prefix="fn" %>
...
<c:url var="lecturesURL" value="lectures.jsp"/>
...
var lecturesURL="${lecturesURL}";
...
Figure 15: urls.jsp Fragment

Figure 15 shows a fragment of the file. The <%@...%> lines are directives to the compiler. The <c:url .../> line defines a JSP variable lecturesURL that will contain the URL for lectures.jsp, with session identification appended if the browser will not accept cookies. Later, an ordinary JavaScript variable lecturesURL is defined and initialized to the value of the JSP variable (which happens to have the same name). As long as the client makes all requests through URLs from variables defined in urls.jsp, the session information will be provided.
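
The effect of appending session identification to a URL can be sketched as follows. This is an illustration of the rewriting idea, not the tag library's implementation; the class name, method name, and example session identifier are hypothetical. Tomcat carries the identifier in a jsessionid path parameter placed before any query string.

```java
final class SessionUrlSketch {
    /** Returns the URL unchanged when cookies carry the session, else appends the session id. */
    static String encode(String url, String sessionId, boolean clientSendsCookie) {
        if (clientSendsCookie || sessionId == null) {
            return url;
        }
        int q = url.indexOf('?');
        String path = q < 0 ? url : url.substring(0, q);
        String query = q < 0 ? "" : url.substring(q);
        // The identifier goes after the path and before the query string.
        return path + ";jsessionid=" + sessionId + query;
    }

    public static void main(String[] args) {
        System.out.println(encode("lectures.jsp", "ABC123", false));
        System.out.println(encode("lectures.jsp", "ABC123", true));
    }
}
```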

3.6  Queries

Once the client has retrieved the initial page, it will make query requests in response to user queries. All lecture query results are returned as XML. Although the client only uses HTTP POST requests, HTTP GET requests are also handled, and can be useful for debugging and experimentation.

3.6.1  Category Queries

The category query has no parameters, and is used to fill in the list of lecture categories.


<%@ page language="java" contentType="text/xml; charset=UTF-8"
    pageEncoding="UTF-8"%>
<%@ taglib uri="http://java.sun.com/jsp/jstl/core" prefix="c" %>
<%@ taglib uri="http://java.sun.com/jsp/jstl/functions" prefix="fn" %>
<%-- index gets initialized by the browser.Listener --%>
<% try {%>
<c:set var="qe" value="${index.query}"/>
<categories>
  <c:forEach var="category" items="${qe.categories}">
    <category name="${fn:escapeXml(category.name)}"
              categoryid="${fn:escapeXml(category.categoryid)}"/>
  </c:forEach>
</categories>
<%} finally {%>
    <jsp:setProperty name="qe" property="closed" value="true"/>
<%} %>
Figure 16: category.jsp

Figure 16 shows the actual contents of category.jsp, which generates the list of categories. You can get a general idea of how the list of categories is generated by comparing Figures 16 and 1. The rest of this section gives a more complete description.

The first line tells the JSP compiler about the page. The contentType is text/xml; charset=UTF-8, which is set by the server in the headers of the response, so that the client knows that the response is XML and the characters are encoded as UTF-8.

The next two lines tell the JSP compiler that two tag libraries are used. Tags whose names begin with c: refer to the “core” library, while those whose names begin with fn: refer to the “functions” library.

The next line is a comment. As noted in the comment, the Listener class initializes context parameters from the context.xml file when the application starts. One of the values it initializes is index, which provides access to the class that manages the index.

The index is a Java object exposed as a “Java Bean,” which is the most convenient way to expose objects to the JSP scripting language. The Java Beans convention lets a Java object expose “properties” by defining public methods for accessing them. For example, to expose a property called height, the object would need getHeight and setHeight methods. The scripting language does not seem to support calling methods on objects, so things that would be more natural to do with a method had to be done as though they were properties.
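
The property convention can be illustrated with a small sketch. Box is a hypothetical bean with the height property from the example above, and readProperty shows, in simplified form, how a script expression such as ${box.height} can be resolved to a call to getHeight. It is not the scripting language's actual resolver.

```java
import java.lang.reflect.Method;

final class BeanPropertySketch {
    /** A hypothetical bean with a readable and writable property named "height". */
    public static final class Box {
        private int height;

        public int getHeight() {
            return height;
        }

        public void setHeight(int height) {
            this.height = height;
        }
    }

    /** Maps a property name to its getter (height -> getHeight) and calls it. */
    static Object readProperty(Object bean, String property) {
        String getter = "get" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        try {
            Method method = bean.getClass().getMethod(getter);
            return method.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No readable property: " + property, e);
        }
    }

    public static void main(String[] args) {
        Box box = new Box();
        box.setHeight(3);
        System.out.println(readProperty(box, "height"));
    }
}
```

Since reading a property is just a method call, a getter is free to do real work, which is how a property can behave like a method.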

All queries must take place in a transaction. Databases use transactions to ensure that clients see consistent views of the data. For read operations, such as this, the transaction acts as though it takes a snapshot of the entire database, so that if any changes are made outside of the transaction (such as adding lectures or modifying transcriptions) they will be isolated from this transaction, so that this transaction does not see the data in an inconsistent state. In a write transaction, if anything goes wrong during the transaction, the database arranges for all changes made during the write to be undone, again preserving the integrity of the data.

The query property of the index bean is an example of a property that is more like a method. Accessing the query property creates a new query bean with its own new transaction. By convention, the Lecture Server stores the query object in the qe JSP variable. When a transaction is started, it is important that it always be finished; if it is not, the database may become very slow or stop working after a while. Thus, the transaction must be wrapped in a Java try/finally. The <% ... %> notation is used to surround arbitrary Java code. Here we use it to add the try/finally form that ensures that the transaction started by index.query always gets finished at the bottom of the file, by setting the closed property of qe to true.
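
The same try/finally pattern can be written in plain Java as the sketch below. Index and Query are hypothetical stand-ins for the index and query beans, reduced to the query and closed properties mentioned above. The fail flag exists only to demonstrate that the query is closed even when rendering throws.

```java
final class QueryLifecycleSketch {
    /** Hypothetical stand-in for the query bean; only the closed property is modeled. */
    static final class Query {
        private boolean closed;

        public void setClosed(boolean closed) {
            this.closed = closed;
        }

        public boolean isClosed() {
            return closed;
        }
    }

    /** Hypothetical stand-in for the index bean: reading "query" opens a new query each time. */
    static final class Index {
        Query last;

        public Query getQuery() {
            last = new Query();
            return last;
        }
    }

    /** Opens a query, renders a response, and always closes the query afterwards. */
    static String render(Index index, boolean fail) {
        Query qe = index.getQuery();
        try {
            if (fail) {
                throw new IllegalStateException("simulated failure while rendering");
            }
            return "<categories/>";
        } finally {
            qe.setClosed(true);   // runs on both the normal and the failing path
        }
    }

    public static void main(String[] args) {
        Index index = new Index();
        System.out.println(render(index, false));
        System.out.println(index.last.isClosed());
    }
}
```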

The <c:set .../> notation is used to set a JSP variable to a value, in this case the result of running the script index.query, which is invoked because it is inside curly braces preceded by a $. As noted above, the <c:set .../> tag is defined by the “core” Java server tag library, whose functionality is included at the top of the file, where the functionality is also associated with c:.

The <categories> tag has not been defined to do anything special, so it just gets sent in the response, as is its close tag, </categories>, a few lines later. Since this is essentially raw text, nothing in JSP prevents you from forgetting a close tag or misspelling a tag name.

The query bean, qe, has a property which is a list of beans for the categories. We need to generate one XML tag for each category, so we use the JSP tag c:forEach, which iterates through a list, setting its variable, category, to each item in the list.

The category bean has two properties that we are interested in, the name and the categoryid. Each of these could have characters that need to be specially quoted in XML, so we use fn:escapeXml to ensure that this happens.
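
The kind of quoting involved can be sketched in Java as follows. This is an illustrative escaper using the five predefined XML entities; the exact character references that fn:escapeXml emits may differ. The class name and the sample text are hypothetical.

```java
final class XmlEscapeSketch {
    /** Replaces the characters that are special in XML text and attribute values. */
    static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A category name containing an ampersand would otherwise yield malformed XML.
        System.out.println("<category name=\"" + escapeXml("Signals & Systems") + "\"/>");
    }
}
```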

3.6.2  Lecture Queries

The most common queries are lecture queries, which use the lectures.jsp URL. Unlike category queries, lecture queries have a number of parameters. The basic outline of lectures.jsp is the same as that of category.jsp, with the try/finally to ensure the transaction is closed.


<c:if test="${!empty param.query}">
  <jsp:setProperty name="qe" property="query" value="${param.query}"/>
</c:if>
Figure 17: Setting a Query Parameter

Figure 17 shows how a query bean property is set to the value of an optional request parameter. JSP will expose all the request parameters as properties of the param bean, and missing properties will be “empty.” When the query bean is created, the properties are all initialized to reasonable defaults, so we only set the query bean properties for those parameters that were set in the query.
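
In plain Java, the same "set only what was supplied" logic might be sketched as below. Query is a hypothetical stand-in for the query bean, its default value is invented for illustration, and a Map stands in for the request parameters.

```java
import java.util.Map;

final class OptionalParamSketch {
    /** Hypothetical stand-in for the query bean, with an invented default. */
    static final class Query {
        private String query = "*";

        public void setQuery(String query) {
            this.query = query;
        }

        public String getQuery() {
            return query;
        }
    }

    /** Copies the "query" request parameter into the bean only when it is present and non-empty. */
    static void applyParams(Query qe, Map<String, String> params) {
        String value = params.get("query");
        if (value != null && !value.isEmpty()) {
            qe.setQuery(value);
        }
    }

    public static void main(String[] args) {
        Query qe = new Query();
        applyParams(qe, Map.of());                       // nothing supplied: default kept
        System.out.println(qe.getQuery());
        applyParams(qe, Map.of("query", "eigenvalues")); // supplied: default overridden
        System.out.println(qe.getQuery());
    }
}
```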

3.7  User Login

The login.jsp page concatenates the nonce it sent to the client with the base-64 string of the hashed salt and password, computes an MD5 sum, and converts it to a base-64 string. If this matches the clientHash, then the server sets session parameters to indicate the permitted roles for the user. All access control is governed by the session parameters in the server, never by what the client claims it is allowed to do.
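
The server-side check can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in login.jsp: the text encoding (UTF-8), the concatenation order (nonce first), and all class and method names are assumptions. The stored value is taken to be the base-64 string of the hashed salt and password, however that was produced.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class LoginHashSketch {
    /** Returns the base-64 string of the MD5 sum of the input's UTF-8 bytes. */
    static String md5Base64(String input) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(input.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Recomputes the hash from the nonce and the stored hash, then compares it with the client's. */
    static boolean matches(String nonce, String storedHash, String clientHash) {
        String expected = md5Base64(nonce + storedHash);
        // MessageDigest.isEqual compares in a way that does not leak where the strings differ.
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                clientHash.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String stored = md5Base64("salt" + "secret");     // assumed form of the stored value
        String client = md5Base64("nonce-1" + stored);    // what a correct client would send
        System.out.println(matches("nonce-1", stored, client));
    }
}
```

Because the nonce changes with each login attempt, a hash captured from one exchange does not match in the next.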

The user account information is stored in an SQL database, which must be described in the server's context.xml file in a Resource tag, as shown in Figure 18. The values DBUSER and DBPWD should be replaced with the database user name and password, DBHOST with the host that runs the database, and DBNAME with the name of the database on that host.


<Resource
 name="jdbc/TrantestDB" 
 auth="Container"
 type="javax.sql.DataSource" 
 username="DBUSER" 
 password="DBPWD"
 driverClassName="org.postgresql.Driver"
 url="jdbc:postgresql://DBHOST/DBNAME"
 maxActive="8"
 maxIdle="4"
 removeAbandoned="true"
 logAbandoned="true" />
Figure 18: User Database Information
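
A resource declared this way is normally obtained through JNDI. The following sketch shows the usual lookup under the java:comp/env prefix that Tomcat uses for such resources. It is an assumption that login.jsp obtains its connection this way, and the class name is hypothetical. Outside a container the lookup fails with a NamingException, since no naming context is configured.

```java
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

final class UserDbLookupSketch {
    /** JNDI name formed from the container prefix and the Resource name in Figure 18. */
    static final String JNDI_NAME = "java:comp/env/jdbc/TrantestDB";

    /** Looks up the pooled data source that the container created from the Resource tag. */
    static DataSource lookup() throws NamingException {
        return (DataSource) new InitialContext().lookup(JNDI_NAME);
    }

    public static void main(String[] args) {
        try {
            DataSource users = lookup();
            System.out.println("Found data source: " + users);
        } catch (NamingException e) {
            // Expected when run outside Tomcat, where the resource is not bound.
            System.out.println("Lookup failed: " + e.getMessage());
        }
    }
}
```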

