IBM Research
 

Proxies and Servers

WBI is an HTTP request processor. This means that WBI receives requests from clients through the HTTP protocol, processes them, and returns results to the client. The way WBI processes these requests is completely programmable.

WBI as Proxy

In many circumstances, WBI is programmed to act as a proxy. A proxy receives HTTP requests and forwards them to the appropriate server (referred to as the origin server) for the request to be satisfied. When the origin server returns the results to WBI, they are then forwarded back to the client. This proxy function is often completely transparent to the client. In other words, the client receives back the same results as if the origin server had been addressed directly. Transparent proxies are commonly used to provide firewall protection for an enterprise and to cache web pages for better network utilization.

Proxies need not be transparent. In other words, the proxy may produce different results for the client than the client would have received if it had addressed the origin server directly. Common uses for non-transparent proxies are:

  • Transcoding: conversion of the content from one form to another, such as rendering it for a limited-capacity display or compressing the content for slow network links;
  • Blocking: blocking access to particular content that the client is not permitted to see; and
  • User Modeling: recording a user's access patterns and then modifying the content the user sees to personalize or customize it for his or her needs.

By programming to the WBI application programming interfaces (APIs), developers can produce both transparent and non-transparent proxies in a straightforward manner.
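As a rough illustration of the non-transparent uses listed above, the following Python sketch shows a proxy-side decision that either blocks a request or transcodes the origin server's content. It does not use the WBI APIs: the names BLOCKED_HOSTS, transcode, and non_transparent_response, the block list entry, and the truncation rule are all assumptions made for this example.

```python
from typing import Tuple

# Hypothetical block list; not part of WBI.
BLOCKED_HOSTS = {"blocked.example.com"}


def transcode(body: str, max_chars: int = 40) -> str:
    """Crude stand-in for transcoding: shorten content for a small display."""
    if len(body) <= max_chars:
        return body
    return body[:max_chars] + "..."


def non_transparent_response(host: str, origin_body: str) -> Tuple[int, str]:
    """Return (status, body) the way a blocking, transcoding proxy might."""
    if host in BLOCKED_HOSTS:
        # Blocking: the client never sees the origin server's content.
        return 403, "Access to this content is not permitted."
    # Transcoding: the client sees a modified form of the content.
    return 200, transcode(origin_body)
```

A transparent proxy, by contrast, would return origin_body unchanged in every case.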

WBI as Server

WBI can also be programmed to act as a server. A server receives HTTP requests and returns an appropriate result. Commonly, this result is simply the contents of a file that resides in the server's file system. However, most HTTP servers can also produce dynamic content. Dynamic content is produced on-the-fly by the server when the client requests it. The most common way to program dynamic content is CGI (common gateway interface). With this technique, the server is configured so that certain requests result in the server executing a script or executable program with the server routing the standard output of that program back to the client. This approach is straightforward, but suffers from some practical limitations in performance and flexibility.
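The CGI technique described above can be sketched in a few lines of Python. This is only an illustration of the idea (run a program per request and return its standard output), not how any particular server implements it. The function name run_cgi and the inline script are assumptions; REQUEST_METHOD and QUERY_STRING are standard CGI environment variables.

```python
import os
import subprocess
import sys

# A tiny stand-in for a CGI script: it writes a header, a blank line,
# and a body to standard output.
SCRIPT = (
    "import os\n"
    "print('Content-Type: text/plain')\n"
    "print()\n"
    "print('query=' + os.environ['QUERY_STRING'])\n"
)


def run_cgi(script_source: str, query: str) -> str:
    """Run the script in a new process and capture its standard output."""
    env = dict(os.environ, REQUEST_METHOD="GET", QUERY_STRING=query)
    completed = subprocess.run(
        [sys.executable, "-c", script_source],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # A server would route this output back to the client.
    return completed.stdout
```

Starting a new process for every request is one source of the performance limitation mentioned above.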

WBI provides an alternative programming interface to CGI (and JavaSoft servlets, etc.) for developing HTTP server applications. WBI's APIs provide more power, flexibility and programming simplicity than many of the other approaches. WBI can easily produce both dynamic and static content and allows complex web-based applications to be developed with ease.

The Difference between a Proxy and a Server

In their simplest forms, proxies and servers appear to be completely different: proxies forward requests to other servers and servers handle requests themselves. However, the line between proxy and server often becomes blurred in practice. For example, when a caching proxy delivers a web page from its cache back to the client, the proxy has performed the function of a server (it has handled the request itself). Likewise, many servers can be configured to forward certain types of requests on to other servers for processing. When they do this, they are acting like proxies. So what is the difference? To answer this question, we will look at it from three different points-of-view: that of the client, the protocol, and the server.

The Client's Point-of-View

The typical client in the HTTP environment is a web browser (though many other types of HTTP clients exist). Most web browsers have a configuration option for setting the browser's proxy. This setting is then used to handle requests that the browser makes. If the setting is left empty, the browser simply contacts the origin server directly. But if the browser is configured to use a proxy, that proxy is contacted for every request that comes from the browser, regardless of the actual origin server specified by the user. For example, if a browser is configured to use proxy.ibm.com as its proxy and then the user types in the URL http://www.ibm.com/java, the browser will contact proxy.ibm.com and ask it to produce the document located at http://www.ibm.com/java. The proxy will then contact www.ibm.com and ask for the document /java. When www.ibm.com produces that document, it returns it to the proxy, which returns it to the browser.

Note that this simple explanation has ignored the fact that many browsers can be configured to use different proxies for different types of URLs. But the basic point remains: a browser is configured to use a particular proxy (or set of proxies) and then those settings are rarely changed. That proxy (or one from a set of proxies) is used for every URL that the browser requests. The requested origin server is changing all the time as the user browses the web, but the fact that the requests are being handled by a proxy is less obvious.

The Protocol's Point-of-View

As far as the HTTP protocol is concerned, the difference between a transaction involving a proxy and one involving an origin server is small. An HTTP request for a proxy includes three basic parts: the protocol to be used to interact with the origin server, the name of the origin server, and the document to be retrieved from the origin server. In the above example, the browser would make a connection to the proxy server and issue the following request: GET http://www.ibm.com/java. This request simply tells the proxy server that the client would like to receive the response from the server www.ibm.com when it is asked for the document /java using the HTTP protocol.

An HTTP request for a server is very similar, but does not contain the protocol specification or the name of the origin server. These parts of the request are unnecessary because 1) the client is already speaking to the server using HTTP, and 2) the client is already talking to the appropriate server. To complete our example, to directly request the document /java, the browser would make a connection to www.ibm.com and issue the following request: GET /java.
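The two request forms can be illustrated from the client's side with a short Python sketch. Given a URL and an optional proxy setting, it works out which host the client connects to and which request it issues. The function name plan_request is an assumption, and the request lines follow the simplified form used in the text, which omits the HTTP version and headers.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def plan_request(url: str, proxy_host: Optional[str] = None) -> Tuple[str, str]:
    """Return (host to connect to, simplified request line) for a URL."""
    if proxy_host is not None:
        # Proxy request: connect to the proxy and send the whole URL, so the
        # proxy learns the protocol, the origin server, and the document.
        return proxy_host, "GET " + url
    # Server request: connect to the origin server and send only the document.
    parts = urlsplit(url)
    document = parts.path or "/"
    if parts.query:
        document += "?" + parts.query
    return parts.netloc, "GET " + document


for proxy in ("proxy.ibm.com", None):
    print(plan_request("http://www.ibm.com/java", proxy))
```

With the proxy configured, the result is the pair proxy.ibm.com and GET http://www.ibm.com/java; without it, the pair is www.ibm.com and GET /java.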

This brief discussion has glossed over some of the other differences between proxy requests and server requests in the HTTP protocol in order to get to the crux of the matter. Other details of the transaction can vary between proxy and server requests.

The Server's Point-of-View

The server's point-of-view is very similar to the protocol's: proxy requests and server requests look much alike. Some programs are designed and configured to process server requests, while others are designed for proxy requests. The principal difference is that proxy requests include a protocol specification, the name of the origin server, and the requested document name, while server requests only include the requested document name. Whether the program then handles the request by "proxying" it to another server or "serving" the resulting document itself is basically irrelevant -- sometimes it will be done one way and sometimes the other, governed by the design of the system.
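Seen from the receiving program, the distinction amounts to inspecting what the request asks for. The following Python sketch classifies a request line as a proxy request or a server request and extracts the parts named above. The function name classify_request and the dictionary keys are assumptions for illustration and are not part of WBI.

```python
from typing import Dict
from urllib.parse import urlsplit


def classify_request(request_line: str) -> Dict[str, str]:
    """Split a request line such as 'GET /java' into its named parts."""
    # The second token is what is being requested; any later token
    # (for example an HTTP version) is ignored in this sketch.
    target = request_line.split()[1]
    parts = urlsplit(target)
    if parts.scheme and parts.netloc:
        # Proxy request: protocol, origin server, and document are all present.
        return {
            "kind": "proxy",
            "protocol": parts.scheme,
            "origin_server": parts.netloc,
            "document": parts.path or "/",
        }
    # Server request: only the document name is present.
    return {"kind": "server", "document": target}
```

Note that the classification says nothing about whether the program will forward the request or answer it itself.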

Is WBI a Proxy or a Server?

Is light a particle or a wave? The correct answer is that it is not a very good question. WBI is both a proxy and a server, and it is neither. WBI is an HTTP request processor. It receives requests through HTTP, whether they are proxy requests or server requests. It then produces responses to these requests. How it produces these responses is completely under the control of the programmer and system administrator. WBI can be programmed to handle all proxy requests and return error messages when it is addressed as a server. Likewise it can be programmed to handle all server requests and return error messages when it is addressed as a proxy. When it handles a proxy request, it can do the obvious thing and forward the request to the requested origin server. However, it can also be programmed to handle proxy requests itself, producing the response like a server. Likewise, it can be programmed to handle server requests in a "normal" fashion, or it can be programmed to forward server requests to other servers, like a proxy would.

It is important for the programmer to understand the difference between functioning like a proxy, i.e. forwarding requests to other servers, and handling proxy requests, i.e. reacting properly to requests that specify a protocol and origin server name, in addition to the requested document name. Similarly, there is a difference between functioning like a server, i.e. producing suitable responses to requests by itself, and handling server requests, i.e. reacting properly to requests that specify only a requested document name.
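The independence of the two ideas (the kind of request received versus the way it is handled) can be made concrete with a small Python sketch. The policy table, the behavior names "forward", "serve" and "error", and the function names are assumptions chosen for this example; they do not correspond to WBI configuration options.

```python
from typing import Dict
from urllib.parse import urlsplit


def request_kind(request_line: str) -> str:
    """Return 'proxy' if the request names a protocol and origin server."""
    parts = urlsplit(request_line.split()[1])
    return "proxy" if parts.scheme and parts.netloc else "server"


def process(request_line: str, policy: Dict[str, str]) -> str:
    """Pick a behavior for the request based only on the configured policy."""
    kind = request_kind(request_line)
    behavior = policy.get(kind, "error")
    return kind + " request -> " + behavior


# A conventional proxy: forwards proxy requests, rejects server requests.
PURE_PROXY = {"proxy": "forward", "server": "error"}
# The less obvious combination: answers proxy requests itself and
# forwards server requests to another server.
INVERTED = {"proxy": "serve", "server": "forward"}
```

Under PURE_PROXY, the request GET /java yields "server request -> error"; under INVERTED, the same request yields "server request -> forward".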

It is also important for the programmer to carefully design the end-to-end system so that users' browsers will make the desired types of requests of WBI when they are using a WBI-based application. Some WBI applications (such as the transcoding example listed above) will want to be involved in every request that comes from the browser. These sorts of applications should then be designed to have the browser's proxy settings configured so that WBI is the user's proxy, and WBI should then handle proxy requests properly. Other WBI applications will want to be involved in only a few types of requests that come from the browser. An example would be an application that shows the user the local weather. This sort of application might best be designed with the user addressing WBI with server requests, e.g. going to the URL http://wbi.ibm.com/weather. However, this application could also be implemented by configuring the browser to use WBI as a proxy and then having WBI perform as a transparent proxy except when the above URL is requested, in which case it acts like a server.
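The two designs for the weather example can be sketched together in Python. The host wbi.ibm.com and the document /weather come from the example above; the function name choose_action, the action names, and the decision to reject other server requests are assumptions made for illustration.

```python
from urllib.parse import urlsplit

LOCAL_HOST = "wbi.ibm.com"       # host from the weather example
LOCAL_DOCUMENTS = {"/weather"}   # documents this application produces itself


def choose_action(request_line: str) -> str:
    """Decide whether to serve a request locally or forward it."""
    target = request_line.split()[1]
    parts = urlsplit(target)
    if parts.scheme and parts.netloc:
        # Proxy request: act as a transparent proxy except for our own URL.
        if parts.netloc == LOCAL_HOST and parts.path in LOCAL_DOCUMENTS:
            return "serve"
        return "forward"
    # Server request: the user addressed this application directly.
    return "serve" if target in LOCAL_DOCUMENTS else "error"
```

Either way of reaching the weather page leads to the same "serve" action; only the browser configuration differs.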

This flexibility in approach should be viewed as added power for the system designer and not as needless confusion. Some choices will be arbitrary, but many systems will be optimally designed by considering all the alternatives and choosing the most appropriate one for the given requirements.


Proxy requests
Figure 1. A browser requesting the URL http://www.ibm.com/java through a proxy server and directly from the origin server (i.e. www.ibm.com). Note the simple request that the browser makes when it talks to the origin server directly. When it uses a proxy server, the browser's request contains enough information that the proxy can make the request on the browser's behalf.