Handling of Protocols and Privacy Information

    The OPA must be able to watch the flow of information during Web transactions. Furthermore it must be able to add privacy information to the stream and extract it from the stream. The two items that are of interest in this context are the standard protocol for using the Web (HTTP) and P3P. In addition to this, as previously mentioned, the user's preferences play an important role during Web transactions. Before going into a detailed description of the OPA, we briefly introduce HTTP, the standard Web communication protocol; P3P, the framework for the exchange of privacy information; and APPEL, the language to express a user's preferences.

Hypertext Transfer Protocol (HTTP)

    The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems [8]. HTTP is a request-response protocol and has been in use by the World Wide Web global initiative since 1990. It became the standard protocol for requesting and sending Web resources. This section describes how Web browsers use HTTP.

    Web browsers issue requests for Web resources. A typical HTTP request looks like:

    Figure 3-1. Sample HTTP request.

    GET http://www.ibm.com/home.html HTTP/1.0
    Accept: */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)
    Host: www.ibm.com

    This request asks for the Web document home.html from IBM's Web site. The name of the Web server to be contacted (www.ibm.com) is given in the "Host:" field. The most commonly used request methods include GET and POST [8]. In order to process this request, the browser establishes a connection to the Web server and sends the request (see Figure 3-1). After IBM's Web server receives the request, it issues an HTTP response. This response includes the requested document and is sent to the Web browser that sent the initial request. The server's response looks like:

    Figure 3-2. Sample HTTP response.

    HTTP/1.0 200 OK
    Content-Length: [size of the document]
    Expires: Sat, 30 Jan 1999 14:00:05 GMT
    Content-Type: text/html

    [HTML document containing IBM's home page]

    After the Web browser receives the requested document, it extracts IBM's homepage from the HTTP response body, closes the connection, and displays the information in the browser window. The OPA monitors the stream of HTTP messages and optionally intercepts it in order to add privacy information to the stream or to extract it from the stream. The privacy information is encoded and exchanged by means of the Platform for Privacy Preferences Project (P3P).

Platform for Privacy Preferences Project (P3P)

    P3P is a project of the W3C, an international industry consortium that specifies protocols which promote the evolution of an open and interoperable Web. P3P is designed to help Web users reach agreements with services, such as Web sites or applications, that declare privacy practices and issue data requests [Reagle99] .

Overview

    First of all, this overview is only a very brief summary of P3P and not intended to cover all aspects of P3P. More detailed information about P3P can be found in Appendix B on page 99 of this document. In addition to this, the P3P home page [5] and [Reagle99] provide background information about P3P as well as the full P3P draft specification and links to related documents.

    Web users can reach agreements with services through P3P transactions. A P3P transaction usually starts with a service sending a machine-readable proposal as a response to a user's request for a Web resource. In such a response, the organization responsible for the service declares its identity and privacy practices. Furthermore, the proposal can also enumerate data elements the service proposes to collect during the transaction. Data elements normally refer to personal information, such as name, phone number, and address. Moreover the proposal specifies, among other things, how each requested data element will be used and with whom it may be shared. For example, the data elements collected by a Web service are given out to advertisers. A proposal can be automatically parsed by user agents, such as Web browsers or proxies. The respective user agent can then check the proposal against the user's personal preferences. This can be done automatically without requiring users to read the privacy policies of every Web site they visit. Depending on the result of the user agent's proposal check against the user's preferences, the user agent has several choices to respond to the service's proposal. Such a response uses a particular P3P message.

Messages

    P3P was designed to exchange privacy information in the headers of a P3P HTTP extension. P3P offers four different types of messages:

  1. Proposal message
  2. Txd message
  3. Ok message
  4. Sorry message

    As mentioned earlier, proposals (see Figure B-2 on page 106) are usually sent by a server to declare the privacy practices of the organization responsible for the service and to list the data elements requested by the service. A Txd message (see Figure B-5 on page 108) may be sent by the user agent after the reception of a proposal. It is used to transmit the data elements requested in the proposal. An Ok message (see Figure B-6 on page 108) is sent when a participant of a P3P transaction agrees to a proposal or when a data transfer has succeeded (response to a Txd message). The fourth type of P3P message is the Sorry message (see Figure B-3 on page 107), which is sent as a response to a proposal or a data transfer (Txd). It indicates that the request could not be processed for a particular reason, which is included in the Sorry message.
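    The message types and the replies described above can be summarized in a small lookup table. The following Python sketch is only an illustration of that description: the names P3PMessage, VALID_REPLIES, and is_valid_reply are hypothetical, and counter-proposals used in negotiation are deliberately left out.

```python
from enum import Enum


class P3PMessage(Enum):
    """The four P3P message types named in the text."""
    PROPOSAL = "Proposal"
    TXD = "Txd"
    OK = "Ok"
    SORRY = "Sorry"


# Replies permitted by the description above (a simplification, not the
# full protocol): a proposal may be answered with Txd, Ok, or Sorry;
# a data transfer (Txd) may be answered with Ok or Sorry.
VALID_REPLIES = {
    P3PMessage.PROPOSAL: {P3PMessage.TXD, P3PMessage.OK, P3PMessage.SORRY},
    P3PMessage.TXD: {P3PMessage.OK, P3PMessage.SORRY},
}


def is_valid_reply(received: P3PMessage, reply: P3PMessage) -> bool:
    """Return True if `reply` is a described answer to `received`."""
    return reply in VALID_REPLIES.get(received, set())
```

    Ok and Sorry end an exchange in this simplified view, so they have no entry in the table.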

A P3P Preference Exchange Language (APPEL)

    Preferences are used during transactions in order to evaluate the services' P3P information and determine the next action to be taken by the OPA. This next action can be the seamless transfer of pieces of the user's personal information or the presentation of information to the user. The OPA supports the use of preferences by means of A P3P Preference Exchange Language (APPEL). The technical details and the APPEL draft specification are summarized in Appendix C on page 109. Appendix C provides information on APPEL's grammar and the process of rule evaluation. In this section, though, we want to explain on a more abstract level how APPEL can be used to organize a user's personal information regarding its use in online transactions.

    Using APPEL, a user specifies his preferences in a collection of preference rules. This collection, the ruleset, contains one or more rules which specify the user's preferences regarding P3P proposals. Basically, a rule consists of three parts:

  1. a collection of data elements,
  2. several attributes regarding a P3P proposal,
  3. and a behavior.

    During a P3P transaction, the OPA compares the Web service's proposal against the ruleset. This process is called rule evaluation. During rule evaluation, the trust engine tries to find a rule that is matched by the Web site's P3P proposal. If a rule can fire (i.e., is matched by the proposal), the behavior as specified in the rule determines the next action to be taken by the OPA. Figure 3-3 shows a sample APPEL rule. This rule specifies that the OPA can seamlessly accept P3P proposals that ask only for the user's first and last name, provided certain conditions hold. These conditions are specified as follows:

    Figure 3-3. Sample APPEL rule (level 1).

    <APPEL:RULE behavior="seamless-accept"
                description="Service collects user's full name">
      <P3P:PROP>
        <P3P:USES>
          <P3P:STATEMENT VOC:purp="0,1" action="r">
            <DATA:REF name="user.name.first"/>
            <DATA:REF name="user.name.last"/>
          </P3P:STATEMENT>
        </P3P:USES>
        <VOC:DISCLOSURE discURI="*"/>
      </P3P:PROP>
    </APPEL:RULE>

  1. The Web service only uses the information to complete or support the current transaction, and for Web site and system administration (VOC:purp="0,1").
  2. The Web service only requires read access and does not want to store information on the user side (action="r").
  3. The Web service specifies a URI where the user can find a natural language privacy statement of the P3P proposal (<VOC:DISCLOSURE discURI="*"/>).

    In addition to the seamless-accept behavior, APPEL provides three other behaviors. The behavior seamless-reject can be used to make the OPA seamlessly reject a P3P proposal (see P3P demo, Figure 2-8 on page 30). The remaining two behaviors are info-prompt and warn-prompt.
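    To make the idea of a rule "firing" more concrete, the following Python sketch models the sample rule and a strongly simplified form of rule evaluation. It is not APPEL's actual matching algorithm (see Appendix C for that); the classes Proposal and Rule, the function evaluate, and the field names are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Proposal:
    """Hypothetical, flattened view of a P3P proposal."""
    data_elements: frozenset      # requested data elements
    purposes: frozenset           # declared VOC:purp values
    action: str                   # e.g. "r" for read access
    disclosure_uri: Optional[str] = None


@dataclass(frozen=True)
class Rule:
    """Hypothetical, flattened view of an APPEL rule."""
    behavior: str
    data_elements: frozenset      # elements the rule covers
    purposes: frozenset           # purposes the rule tolerates
    action: str
    needs_disclosure: bool = True

    def matches(self, p: Proposal) -> bool:
        # Simplified matching: the proposal may only ask for covered
        # elements and purposes, must use the same action, and must
        # name a disclosure URI if the rule requires one.
        return (p.data_elements <= self.data_elements
                and p.purposes <= self.purposes
                and p.action == self.action
                and (p.disclosure_uri is not None
                     or not self.needs_disclosure))


def evaluate(ruleset, proposal: Proposal) -> Optional[str]:
    """Return the behavior of the first rule that fires, or None."""
    for rule in ruleset:
        if rule.matches(proposal):
            return rule.behavior
    return None


# The sample rule from the figure, expressed in this simplified model.
FULL_NAME_RULE = Rule(
    behavior="seamless-accept",
    data_elements=frozenset({"user.name.first", "user.name.last"}),
    purposes=frozenset({0, 1}),
    action="r",
)
```

    In this model a proposal that asks only for user.name.first for purpose 0 with read access and a disclosure URI makes the rule fire, whereas a proposal that asks for any other data element does not. What the agent does when no rule fires is left open here.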

Information Management

    By supporting the previously introduced protocols (HTTP and P3P) and the preferences language (APPEL), the OPA is a useful tool for the management of information. In this section, we will describe how the OPA can manage a user's personal information. In addition to this, we will explain the OPA's ability to keep a user's transaction history and why this is valuable in the context of P3P transactions.

Personal Information

    There are times in many Web transactions when services solicit information from a user. This personal information may be the user's name and home address, or his phone number and email address. Often, the requested information is needed to complete the transaction. We have described how the OPA accesses such information from the user's personal information storage in order to assist the user or act on his behalf. The OPA's ability to access the user's personal information and use it in Web transactions is very useful. As we have seen in the P3P demo (see Section 2.3.3 on page 23 ), among other things, the user benefits from not having to retype information. But the OPA can do more than just lessen the burden on the user regarding the provision of information.

    First of all, the OPA lets the user organize and manage his personal information regarding its use in Web transactions. The OPA's trust engine is responsible for evaluating the terms and conditions of Web transactions by comparing P3P proposals against the user's preferences. These preferences determine the output of the trust engine, i.e., the next action to be taken by the OPA. Secondly, the OPA helps to keep track of information which is used in Web transactions. Web sites might ask for data elements that do not exist in the user's personal information storage and are therefore not known to the OPA. In other cases, Web sites might want to store information on the user's computer instead of storing it on the server side.

    An informational prompt (i.e., info-prompt) can be used in transactions where a user wants to control the release of information rather than let the OPA give it away seamlessly. This is very useful in order to manage the release of highly confidential information, such as credit card information or the social security number. A warning prompt (i.e., warn-prompt) is a useful behavior in transactions where the OPA cannot determine whether to reject or accept a proposal. Although there might exist a potential threat to the user's privacy, it will be the user's decision whether to proceed with the current transaction or not.

    With the ability to define different behaviors as provided by APPEL, the OPA can represent a user in online transactions. For transactions that allow a seamless completion, the OPA acts on behalf of the user. When rule evaluation requires the OPA to inform the user about the current transaction, the OPA offers assistance by summarizing the results of rule evaluation. Moreover, these different behaviors allow the support of different kinds of user profiles. Some users may be more liberal than others regarding the release of information in online transactions. These people are more likely to specify behaviors which allow a seamless completion of the transaction. Others, who are more reluctant to release personal information, can use the informational features of the OPA. Thus, they maintain control over the release of their personal information. Over time, as such users become more familiar and confident with the OPA they can transfer certain responsibilities to the OPA.

Managing New Information

    The APPEL rules specify conditions under which certain data elements can be released or need to be blocked. In order to identify pieces of a user's personal information, P3P defines a base set of commonly used and requested pieces of information, so-called data elements. The base data set is specified in [15] and defines elements familiar to the majority of Web clients and services. A data element is identified by a standard name and uses an assigned data format. Among others, base data elements exist for a user's first name (user.name.first), email address (user.home.online.email), and postal address (user.home.postal.*).

    Although the base data set is sufficient for most Web transactions, Web sites are likely to collect information not contained in the base data set. P3P includes a mechanism for Web services to define their own data sets and ask the user to add these elements to their personal information storage. The OPA was designed to support the retrieval of new information. In case a Web site asks for information that cannot be found in the user's personal information storage, the OPA would ask the user to provide this missing piece of information (see Figure 2-6 on page 27 ). When the user fills in the missing information and chooses to finish the transaction, the OPA stores this information for later reference. This is made possible by the fact that the OPA monitors all input to and output from the Web browser. Storing Web site specific information on the client side offers several benefits, to users as well as services. The user benefits from this mechanism as the OPA lessens the burden on the user to remember information (e.g., site-specific user-identifiers) and keep track of it. Web sites benefit from it, by receiving consistent information from a user each time he visits the site. Moreover, services do not need to keep large databases in order to store the user information on their side.

Transaction History

    In addition to the OPA's ability to manage personal information in online transactions, the OPA is very valuable for keeping track of completed online transactions. This is especially important in the context of P3P, where a transaction ends with an agreement. This agreement states that the user accepted the services' proposal and that the service delivered the requested resource. Each agreement has an agreement identifier which is computed from the proposal, using the MD5 Message Digest Algorithm [18] . When the OPA completes a transaction, it stores the respective agreement identifier. In addition to this, it stores the proposal itself. This information represents a user's transaction history which is stored in the user's personal information storage. The following two scenarios describe why it is useful and necessary to keep a transaction history.
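    As a rough illustration of such a history, the following Python sketch derives an identifier from a proposal with MD5 and stores the proposal under it. Which exact bytes of the proposal are hashed is not stated here, so hashing the UTF-8 encoded proposal text is an assumption, and the names agreement_id and TransactionHistory are hypothetical.

```python
import hashlib


def agreement_id(proposal_text: str) -> str:
    """Compute an MD5-based identifier from the proposal text.

    Assumption: the proposal is hashed as UTF-8 text, exactly as received.
    """
    return hashlib.md5(proposal_text.encode("utf-8")).hexdigest()


class TransactionHistory:
    """Minimal in-memory history: agreement identifier -> proposal."""

    def __init__(self):
        self._agreements = {}

    def record(self, proposal_text: str) -> str:
        """Store a completed agreement and return its identifier."""
        aid = agreement_id(proposal_text)
        self._agreements[aid] = proposal_text
        return aid

    def lookup(self, aid: str):
        """Return the stored proposal, or None if no agreement is known."""
        return self._agreements.get(aid)
```

    Because the identifier is computed from the proposal itself, both sides can recompute it independently and compare it with the value the other side refers to.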

    Assume a user visited a Web site and reached an agreement with the site. Also assume that it required the user's approval and the release of data. The OPA stored all of this information in the user's transaction history. When the user visits the Web site again, the OPA refers to the agreement and sends the appropriate data elements with the initial request for the Web resource. If the Web site also keeps a transaction history, it can verify that the request refers to an earlier agreement. If the agreement has not expired yet, the service can satisfy the user's request without sending a proposal. In this scenario, both sides benefit from having access to a transaction history, because it reduces traffic and computational effort (e.g., no rule evaluation is necessary).

    The second scenario shows how a transaction history can be used to enhance a user's privacy protection. Assume the user has rules that allow mutually exclusive access to subsets A and B of the user's personal information, i.e., a Web site may obtain elements of A or elements of B, but not both. Assume the user once visited a Web site that collected elements of A. On another occasion, the user visits the same Web site, which then asks for parts of B in a second proposal. If the OPA cannot consult a transaction history, it will accept the second proposal according to the user's preferences. Although the user wanted to limit the information a Web site can obtain, the Web site has thus found a way to get all of it. Unfortunately, APPEL does not provide a way to express such constraints, but they can be implemented in the user agent. By keeping a transaction history, the OPA can determine during rule evaluation that the Web site already collected elements of subset A. Therefore, the proposal asking for data elements of subset B needs to be rejected. Both scenarios demonstrate that keeping a transaction history is useful for reducing Web traffic and computation and for enhancing a user's privacy protection.
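    The check described in the second scenario can be sketched in a few lines of Python. This is one possible way to implement such a constraint in a user agent, not the OPA's actual code; the function name violates_exclusive_access and its parameters are hypothetical.

```python
def violates_exclusive_access(collected, requested, subset_a, subset_b):
    """Return True if granting `requested` would give a single Web site
    elements from both mutually exclusive subsets.

    collected -- data elements the site already obtained (from the history)
    requested -- data elements asked for in the current proposal
    """
    combined = set(collected) | set(requested)
    touches_a = bool(combined & set(subset_a))
    touches_b = bool(combined & set(subset_b))
    return touches_a and touches_b
```

    With A = {user.name.first} and B = {user.home.online.email}, a site that already collected user.name.first and now asks for user.home.online.email triggers the check, so the second proposal would be rejected.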

Transfer of Information

    We now want to take a closer look at how the OPA handles HTTP, P3P, and APPEL. We want to explain these aspects in detail using the functional model of the OPA. A functional model describes and visualizes the flow of information between tasks of a particular system. Moreover, a functional model can be used to illustrate a sequence of events and tasks that occur in a system. Figure 3-4 on page 45 shows such a functional model which visualizes the flow of information in a sample P3P transaction (see the first step of the P3P demo in Section 2.3.3 on page 23 ). This functional model and the OPA's generic functional model ( Figure 3-5 on page 47 ) visualize three of the OPA's characteristics.

    First of all, they demonstrate graphically how the OPA is positioned between a user and a Web site. On the left side one can see the user and the user agent (Web browser) which is used to request and display Web resources. All traffic from the Web browser has to go through the OPA. The Web site on the right side is represented by a Web server which analyzes requests and creates responses in order to satisfy these requests.

    Secondly, the two models show the OPA's two major components (P3P module, trust engine) and the tasks that are carried out in these modules. The tasks are represented by ovals which contain a roman numeral followed by the name of the task (e.g., IX. Rule Evaluation).

    Last but not least, they illustrate the flow of control and information between the different tasks, data stores, the Web browser, and the Web server. The control flow and the flow of information are visualized through the use of dashed and solid arrows. A solid arrow describes the flow of information from one unit (task, browser, server) to another after a task was completed. Similar to the solid arrows, dashed arrows visualize the flow of data as well as the flow of control. Data is sent from one unit to another if certain conditions (text on the arrow) are met. Both models, as shown in Figure 3-4 and Figure 3-5, are based on the Functional Model Notation. This notation is part of the Object Modeling Technique (OMT) as defined and illustrated by [Rumbaugh91].

Sample P3P Transaction

    Before we go into the details of the OPA's generic functional model (Figure 3-5), we want to explain one sequence of events that occur in a sample P3P transaction (see Figure 3-4). This transaction comprises sixteen steps, starting with a user's initial request and ending with the display of the requested Web resource. The sequence of these steps is denoted by arabic numerals (1 to 16) on the arrows that connect the individual units.

    The P3P transaction shown in Figure 3-4 describes a transaction where the user requests a Web resource, which is covered by a P3P proposal. The Web server sends the respective proposal back to the client which then checks the proposal against the user's preferences. Because the proposal can be accepted seamlessly, the OPA sends the appropriate message to the server which then sends the initially requested document. Figure 3-4 visualizes the individual steps carried out by the OPA during this transaction.

    The P3P transaction starts with the user's initial request for a Web resource (1), which is then intercepted by the OPA (task I). The OPA analyzes this request (task II) and sends (3) it to task V because the request is not a user response. After the P3P information has been added to the request, the modified request is sent to the Web server (4). The server responds to the modified request by sending a P3P proposal in an HTTP response (5). This response is intercepted by the OPA and forwarded (6) to task VI, which looks for P3P information included in the HTTP response. It detects a P3P proposal and invokes (7) task VIII, which extracts the proposal from the response and forwards (8) it with additional information (transaction evidence) to the rule evaluation task (IX). Rule evaluation accepts the evidence and returns (9) its results to task X, which then invokes (10) task XII in order to create the respective P3P response (Txd message). The newly created Txd message is then forwarded (11) to task IV, which submits (12) a new HTTP request (including the Txd message) to the Web server. The Web server recognizes that the proposal was accepted and that the requested data elements were sent. It then satisfies the initial request by sending (13) an HTTP response including an Ok message and the requested document. This response is recognized by the OPA as an HTTP response that satisfies the initial request. Thus, the OPA simply forwards the response to the Web browser (14, 15, and 16).

    Figure 3-4. Sequence of events in sample P3P transaction (functional model).

    This sample P3P transaction explicitly shows how the OPA can handle Web transactions on behalf of its user. During this transaction, the user only gets involved once by requesting the particular Web resource.

Generic Functional Model

    We now want to take Figure 3-4 on page 45 a step further and use a second functional model to describe the OPA in general. This generic functional model of the OPA is shown in Figure 3-5 on page 47. It covers all possible variations of P3P transactions and uses a somewhat different notation than Figure 3-4. Furthermore, the generic functional model contains a set of extensions which we want to explain here before going into the details of the OPA's individual tasks.

    Figure 3-5. Generic functional model of the Online Privacy Agent (OPA).

    As mentioned earlier, a solid arrow describes the flow of information from one unit (task, browser, server) to another. In Figure 3-5, the kind of information sent is denoted by text or numbers in brackets (e.g., [HTTP req. (1)] or [1]). Numbers in brackets are references to information that originated at another unit. For example, the initial HTTP request issued by a Web browser ([HTTP req. (1)]) is simply forwarded by task I or II. Dashed arrows denote the information sent from one unit to another in the same way. For example, if task II does not find a user response, it simply forwards the user's request ([1]) to task V. All the tasks shown in Figure 3-4 are also part of the generic functional model. In addition to this, the generic functional model also shows data stores (e.g., preferences, negotiation knowledge base, and personal information) and additional tasks (e.g., Negotiation). Solid arrows from a data store to a task indicate that the respective task uses information from the data store. The next subsections refer to the generic functional model (see Figure 3-5) and explain the OPA's individual tasks.

Interception of the HTTP Stream

    One of the key features of the OPA is the ability to intercept the HTTP stream (see task I in Figure 3-5). This ability allows the OPA to enhance a Web browser's functionality by making it P3P-compliant. The OPA's current implementation is based on WBI, which provides methods and objects to access HTTP requests and responses as they are routed through WBI. As shown in Figure 3-5, task I intercepts all outgoing requests from the Web browser as well as all incoming responses from Web servers. The first step in task I is to determine whether the intercepted HTTP message is a request or a response. This can be done easily by looking at the respective message header (see Figure 3-1 and Figure 3-2): a request starts with a method such as GET, whereas a response starts with the HTTP version and a status code. The next step is the invocation of the subsequent task, II or VI, depending on the type of the intercepted HTTP message.
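    The distinction made in task I can be sketched as follows. The Python code below does not use WBI's API; it works on the raw message text, and the function names classify_http_message and next_task are hypothetical.

```python
def classify_http_message(message: str) -> str:
    """Classify a raw HTTP message by inspecting its first line.

    A response starts with the HTTP version ("HTTP/1.0 200 OK"),
    a request with a method, a URI, and the HTTP version.
    """
    start_line = message.split("\n", 1)[0].strip()
    if start_line.startswith("HTTP/"):
        return "response"
    parts = start_line.split()
    if len(parts) == 3 and parts[2].startswith("HTTP/"):
        return "request"
    raise ValueError("not an HTTP message: %r" % start_line)


def next_task(message: str) -> str:
    """Task I: requests go to task II, responses to task VI."""
    return {"request": "II", "response": "VI"}[classify_http_message(message)]
```

    Applied to the sample request and response shown earlier, next_task yields "II" and "VI" respectively.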

Analysis of Intercepted Requests
Handling Requests

    If task I intercepts a request, it forwards this request to task II, which looks for a user response. First, we must explain what a user response means in the context of a P3P transaction. A user response occurs when the OPA receives a P3P proposal and the rule evaluation determines that the user has to decide how to proceed with the current transaction. The user is contacted by task III, which sends an HTTP response to the Web browser. This response includes information about the transaction and offers the user one or more options (links, buttons) of how to proceed. If the user chooses one of the options, the Web browser issues the user response. This user response is technically an HTTP request which contains additional information.

    Now, if task II finds a user response that indicates that the OPA can accept the respective P3P proposal and finish the transaction, it forwards the information from the user's response to task XII. This task creates the appropriate P3P message to be sent to the respective Web server. If the user's response indicates a rejection of the current transaction, task II invokes task III which will complete the transaction (informing the user, sending rejection to Web site). If task II cannot identify the intercepted request as a user response, it forwards the request to task V which then adds the P3P information to the HTTP request. This modified request is then forwarded to the Web server.

Handling Responses

    If task I intercepts an HTTP response, it automatically invokes task VI by forwarding the intercepted response. Task VI looks for privacy information (P3P) contained in the response and then forwards the response to either task VII or task VIII based on the kind of P3P information found. Task VII is invoked if no P3P information was found or the P3P information found indicated that this is the last step in the current P3P transaction. In case of the latter, the respective response contains the initially requested Web resource. Task VII will then forward the response to the Web browser, which displays the Web resource contained in the response. This ends the current P3P transaction. If the intercepted response contains P3P information but does not satisfy the user's initial request, task VI will forward the response to task VIII. This task extracts the P3P information from the response and invokes the rule evaluation task (IX).
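    The routing decisions of tasks II and VI described in this section can be condensed into two small functions. This Python sketch only mirrors the prose; the function names and boolean parameters are hypothetical, and how the OPA actually recognizes a user response or a final step is not modeled.

```python
def route_request(is_user_response: bool, user_accepted: bool = False) -> str:
    """Task II: decide which task handles an intercepted request."""
    if not is_user_response:
        return "V"        # ordinary request: add P3P information
    if user_accepted:
        return "XII"      # create the appropriate P3P message
    return "III"          # complete the rejected transaction


def route_response(has_p3p: bool, is_final_step: bool = False) -> str:
    """Task VI: decide which task handles an intercepted response."""
    if not has_p3p or is_final_step:
        return "VII"      # forward the response to the Web browser
    return "VIII"         # extract the P3P information
```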

Modification of HTTP Requests

    As mentioned in the previous section, each HTTP request that does not represent a user response in the context of a P3P transaction will be forwarded to task V. This task modifies the incoming HTTP request and adds P3P information to it before forwarding the modified request to the respective Web server. In order to indicate that the user's Web browser is P3P compliant, task V adds additional information to the original HTTP request (see Figure 3-1 on page 36). The modified request is shown in Figure 3-6. The additional information (the final OPT line in Figure 3-6) is inserted after the request's message header and represents an extension to the existing header.

    Figure 3-6. Modified HTTP request indicating P3P compliance.

    GET http://www.ibm.com/home HTTP/1.0
    Accept: */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)
    Host: www.ibm.com
    OPT:"http://www.w3.org/TR/1998/WD-P3P-19981109/"; ns-42
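    A modification of this kind could look roughly like the following Python sketch. It operates on the raw request text rather than on WBI objects, reproduces the OPT line exactly as printed in the figure, and assumes CRLF line endings; the function name add_p3p_option and the default prefix value are illustrative.

```python
P3P_EXTENSION_URI = "http://www.w3.org/TR/1998/WD-P3P-19981109/"


def add_p3p_option(request: str, ns_prefix: int = 42) -> str:
    """Append the OPT line after the last header of a raw HTTP request.

    Assumption: header lines end with CRLF and the header block is
    terminated by an empty line.
    """
    opt_line = 'OPT:"%s"; ns-%d' % (P3P_EXTENSION_URI, ns_prefix)
    head, sep, body = request.partition("\r\n\r\n")
    if not sep:
        # No terminating empty line found: treat everything as headers.
        head = head.rstrip("\r\n")
        sep = "\r\n\r\n"
    return head + "\r\n" + opt_line + sep + body
```

    The original headers are left untouched, so a server that does not understand the extension still sees a well-formed request.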

Extraction of P3P Messages

    Whereas task V adds P3P information to HTTP requests, tasks VI and VIII are responsible for detecting and extracting P3P information from HTTP responses. Such HTTP responses differ from ordinary responses (see Figure 3-2 on page 36) as they contain additional message header fields. Figure 3-7 on page 50 shows a response including P3P information.

    Figure 3-7. HTTP response containing a P3P proposal.

    HTTP/1.1 409 Agreement required
    Server: Marvin/2.0.1
    OPT: "http://www.w3.org/TR/1998/WD-P3P-19981109/"; ns-1492
    1492-P3P1.0:
      <P3P xmlns="http://www.w3.org/.../"
           xmlns:VOC="http://www.w3.org/.../vocab"
           xmlns:DATA="http://www.w3.org/.../basedata">
        <STATES><PROP realm="http://www.CoolCatalog.com/"
                      entity="http://www.CoolCatalog.com" >
          <USES>
            <STATEMENT VOC:purp="2" VOC:id="0"
                       consq="personalized site!">
              <DATA:REF name="User.Name.First"/>
              <DATA:REF name="User.Bdate.Year" optional="1"/>
            </STATEMENT>
          </USES>
          <VOC:DISCLOSURE .../>
          <ASSURANCE ... />
        </PROP>
        </STATES>
      </P3P>
    Content-Type: text/html
    Content-Length: 110

    <html><body>
    ...
    </body></html>

    Similar to a P3P request, the response shown in Figure 3-7 contains the line

    OPT:"http://www.w3.org/..."; ns-1492

    in order to indicate that this response contains P3P information. The P3P message in this response is a P3P proposal. Task VIII extracts this proposal from the response by using WBI's standard library. This library provides the necessary functionality to access the individual fields and values of an HTTP message header. After extracting the P3P proposal, task VIII verifies whether the extracted proposal represents valid P3P. If so, the rule evaluation task (IX) is invoked by forwarding the extracted proposal.

    If the extracted proposal does not represent valid P3P, the OPA aborts the current transaction by sending a Sorry message to the respective Web server. This represents an example of an error which can occur during a P3P transaction. For clarity, these special error cases were omitted in the functional model as shown in Figure 3-5 on page 47 .
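    The extraction step can be illustrated with a short Python sketch. It does not use WBI's library; it assumes the response headers are already available as a dictionary with the header names spelled as in the figure. The functions extract_p3p and is_well_formed are hypothetical, and the check shown covers XML well-formedness only, which is weaker than validating against P3P.

```python
import re
import xml.etree.ElementTree as ET


def extract_p3p(headers: dict):
    """Return the P3P message carried in the headers, or None.

    The OPT header announces a numeric prefix (ns-1492 in the figure);
    the P3P message is stored in the header "<prefix>-P3P1.0".
    """
    opt = headers.get("OPT")
    if opt is None:
        return None
    match = re.search(r"ns-(\d+)", opt)
    if match is None:
        return None
    return headers.get(match.group(1) + "-P3P1.0")


def is_well_formed(xml_text: str) -> bool:
    """Check XML well-formedness only (not validity as P3P)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

    A caller would abort the transaction with a Sorry message when extract_p3p returns a message for which the check fails.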

Rule Evaluation and Negotiation

    The rule evaluation task IX is invoked by task VIII. This task extracts P3P proposals from the HTTP response and forwards them with additional transaction information to the rule evaluation task. The additional transaction information can contain information about the connection (e.g., whether it is cryptographically secure), URIs, or other transaction related information that can be gathered during P3P transactions. The rule evaluation takes all the information and tries to determine how to proceed with the current transaction. The input to the rule evaluation task is called the evidence.

    The rule evaluation task is based on APPEL's rule evaluation algorithm. It takes the evidence and checks it against the user's preferences. These preferences are defined in a set of APPEL rules which can be used by the OPA to make automated or semi-automated decisions regarding the exchange of data with P3P enabled Web sites.

    For example, during a P3P transaction a Web site sends a proposal which asks for the user's first name and his year of birth (see P3P proposal in Figure 3-7 on page 50). In addition to the request for these two data elements, the proposal specifies what the Web site is going to do with the data (VOC:purp="2"). Moreover, the proposal contains a natural language description (consq="personalized site!"), and a statement of whether the data will be used in an identifiable manner (VOC:id="0"). This evidence is checked against the user's APPEL rules. The rules define which subset of personal information can or cannot be released under certain circumstances. Examples of APPEL rulesets can be found in Figure C-1 on page 113, Figure C-2 on page 117, and Appendix D on page 119. Rule evaluation tries to find a rule that is matched by the evidence. Once a rule fires (i.e., matches the evidence), this rule specifies the next action to be taken by the OPA. The result of the rule evaluation is forwarded to task X, which analyzes the results and performs one of the following steps depending on the results of the analysis:

  1. The rule evaluation determined that the user needs to manually approve that the OPA can finish the transaction. In this case, task III is invoked which will then inform the user. There are two different cases when user approval is needed. Firstly, the terms and conditions as stated in the proposal are acceptable but the user's approval is needed. Secondly, the proposal indicates that there are potential threats to the user's privacy but the OPA must not reject the proposal seamlessly. Instead, it is supposed to inform the user and have him decide how to proceed (see info-prompt and warn-prompt behaviors in Section C.2.1.1 on page 111 ).
  2. The proposal represents an obvious threat to the user's privacy and negotiation cannot be performed (see footnote 9). Thus, the proposal can be rejected seamlessly (see seamless-reject behavior in Section C.2.1.1). In order to do so, task III is invoked to automatically abort the current transaction and inform the user.
  3. The proposal represents an obvious threat to the user's privacy but the Web site indicated its readiness to negotiate the terms and conditions of the current P3P transaction. In this case, task XI is invoked in order to produce a counter-proposal that satisfies the user's preferences and that is reasonably close to the rejected proposal. The results of this task are returned to task X. If the negotiation task found a counter-proposal, task X invokes task XII, which creates the appropriate P3P response.
  4. The proposal is acceptable and the OPA can complete the transaction seamlessly. In order to finish the transaction, task XII is invoked and creates the appropriate P3P response.

    When task X determines that the OPA can complete the transaction without notifying the user, it invokes task XII, which is responsible for creating the appropriate P3P response (see Section 3.3.2.7 on page 52); when the user has to be informed, task III is invoked instead. The negotiation task (XI) either produces a counter-proposal or signals to task X that no reasonably close counter-proposal can be produced. In the latter case, task X treats the negotiation result as a rejection of the proposal and invokes task III. The concept of negotiation and the process of finding a reasonably close counter-proposal are laid out in the next chapter, Negotiation of Personal Information, starting on page 59.
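
    The dispatch performed by task X can be summarized in a few lines. The following Python sketch is only an illustration of the cases listed above, not the OPA's actual code: the function name, the round counter, and the behavior name seamless-accept (chosen by analogy with seamless-reject) are assumptions.

```python
from enum import Enum


class Behavior(Enum):
    """Outcome of rule evaluation (names follow the APPEL behaviors)."""
    SEAMLESS_ACCEPT = "seamless-accept"  # assumed name, by analogy
    INFO_PROMPT = "info-prompt"
    WARN_PROMPT = "warn-prompt"
    SEAMLESS_REJECT = "seamless-reject"


def next_task(behavior, site_negotiates, rounds_used, max_rounds):
    """Return the Roman numeral of the task that task X invokes next."""
    # Cases 1: the user has to approve or is warned, so inform the user.
    if behavior in (Behavior.INFO_PROMPT, Behavior.WARN_PROMPT):
        return "III"
    if behavior is Behavior.SEAMLESS_REJECT:
        # Case 3: negotiate only if the site is willing and the
        # (hypothetical) round limit has not been reached.
        if site_negotiates and rounds_used < max_rounds:
            return "XI"
        # Case 2: reject and inform the user.
        return "III"
    # Case 4: acceptable proposal, create the P3P response directly.
    return "XII"
```

    For instance, next_task(Behavior.SEAMLESS_REJECT, True, 0, 3) selects the negotiation task, while the same call with rounds_used equal to max_rounds falls back to task III.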

Informing the User

    As mentioned in the previous sections, there are several cases in which the OPA has to inform the user about the current transaction (task III). In the first case, the proposal is compatible with the user's preferences but the release of the requested information requires the user's approval. This behavior is useful when users want to maintain control over the exchange of highly confidential pieces of their information (e.g., credit card numbers or social security numbers).

    Another case in which task III is invoked is when the proposal is not acceptable but the OPA should ask the user before rejecting it. This behavior is valuable when, for example, a user wants to be informed whenever he is asked for his home address under conditions he usually does not agree with. In some cases, though, it can be useful to ignore the warning and finish the transaction anyway.

    Last but not least, task III is invoked when the OPA can automatically reject a transaction because obvious threats to the user's privacy exist. In this case, the user will be informed that the transaction was rejected. In addition to this, task XII is invoked to create the P3P reject message and forward it on to the Web server (task IV).

Creating and Sending P3P Messages

    The functional model (see Figure 3-5 on page 47) shows two tasks which are responsible for creating P3P responses (task XII) and submitting these responses to the Web server (task IV). To send a response, task IV creates a new HTTP request which includes the P3P message (from task XII) and submits this request to the respective Web server. Examples of such requests can be found in Section B.3 on page 105, which illustrates a sample P3P transaction.

    Task XII is invoked when the OPA can automatically send messages or finish a transaction. It can create four different kinds of P3P responses. In each case, after creating the message, task XII forwards it to task IV, which submits the response as described in the previous paragraph. Task XII creates a reject message when it is invoked by task III, which has informed the user about the rejection. This reject message is a P3P Sorry message and contains a reason code which explains why the proposal is rejected. In addition to this, it includes a reference number which identifies the proposal referenced by the reject message (see Figure B-3 on page 107).

    If task XII is invoked by task X, it can create three different types of P3P responses, depending on the information received from task X. If this information indicates that a negotiation step was performed and that this step produced a counter-proposal, task XII creates a P3P proposal. This proposal is then forwarded to task IV, which inserts the proposal into an HTTP request as shown in Figure 3-8. Similar to the reject message, the proposal contains a reference to the Web server's proposal which was rejected during rule evaluation. The other two kinds of messages that are produced by task XII are Txd and Ok messages. A Txd message is produced when the rule evaluation determined that the Web server's proposal can be accepted seamlessly and when this proposal requests data elements. If the proposal is compatible with the user's preferences and does not request data elements, an Ok message is produced. The same messages are produced when this task is invoked by task II. In this case the OPA intercepted a user response which contains the user's approval to finish the transaction. Again, depending on whether the respective server proposal requested data elements, task XII will produce either a Txd or an Ok message.
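
    The choice among the four message kinds can be written down compactly. The following Python sketch is a hedged illustration of the rules just described; the function name and its parameters are hypothetical and do not correspond to identifiers in the OPA.

```python
def choose_p3p_message(invoked_by, counter_proposal_found=False,
                       requests_data=False):
    """Pick the kind of P3P message that task XII creates.

    invoked_by is the Roman numeral of the calling task ("II", "III", "X").
    """
    if invoked_by == "III":
        # The user has been informed about a rejection.
        return "Sorry"
    if invoked_by == "X" and counter_proposal_found:
        # Negotiation produced a counter-proposal.
        return "Proposal"
    if invoked_by in ("X", "II"):
        # Seamless acceptance (X) or user approval (II):
        # Txd transfers data, Ok is used when no data is requested.
        return "Txd" if requests_data else "Ok"
    raise ValueError(f"task XII is not invoked by task {invoked_by}")
```

    A call such as choose_p3p_message("II", requests_data=True) yields a Txd message, which mirrors the case in which the user approved a proposal that asks for data elements.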

    P3P counter proposal as a HTTP request.

    GET http://www.ibm.com/home HTTP/1.0

    Accept: */*

    Accept-Language: en-us

    Accept-Encoding: gzip, deflate

    User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)

    Host: www.ibm.com

    OPT: "http://www.w3.org/TR/1998/WD-P3P-19981109/"; ns-42

    42-P3P1.0: <P3P ...><STATES><PROP ... > ... </PROP></STATES></P3P>
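
    A request of this shape could be assembled as in the following Python sketch. It is a minimal illustration, not the code of task IV: the function name is hypothetical, only the headers relevant to P3P are emitted, and the sketch writes the namespace parameter as ns=, the spelling of the HTTP Extension Framework, which is an assumption about the intended syntax of the OPT header. The P3P message is collapsed onto one line because line breaks are only allowed after the closing P3P element.

```python
# Extension identifier taken from the OPT header in Figure 3-8.
P3P_EXTENSION = "http://www.w3.org/TR/1998/WD-P3P-19981109/"


def build_p3p_request(url, host, p3p_message, ns=42):
    """Return an HTTP/1.0 GET request that carries a P3P message."""
    # Collapse indentation and line breaks so the message fits one header line.
    one_line = " ".join(p3p_message.split())
    lines = [
        f"GET {url} HTTP/1.0",
        f"Host: {host}",
        # Assumed spelling (ns=); binds the header prefix to the extension.
        f'OPT: "{P3P_EXTENSION}"; ns={ns}',
        f"{ns}-P3P1.0: {one_line}",
    ]
    # Header lines end with CRLF; an empty line terminates the header block.
    return "\r\n".join(lines) + "\r\n\r\n"
```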

Additional Comments

    For clarity, a few aspects of the OPA and P3P transactions were omitted in the generic functional model as shown in Figure 3-5 on page 47 . First of all, the functional model does not cover error cases which can occur at several stages of a P3P transaction. An example of such an error case is:

Task VIII will abort the current P3P transaction if the proposal received from the Web server is not valid P3P (e.g., because it contains syntax errors). In such a case, this task invokes task XII, which creates a reject message (with the appropriate error code) that is then sent back to the server.
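
    Since P3P messages are verified by parsing them with an XML parser, such a check can be sketched as follows. This Python fragment uses the standard library parser and is only an approximation: it tests well-formedness and the name of the root element, not the full P3P grammar, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed_p3p(message):
    """Return True if message parses as XML with a P3P root element."""
    try:
        root = ET.fromstring(message)
    except ET.ParseError:
        # Syntax error (or an undeclared namespace prefix): reject.
        return False
    # Strip a namespace of the form {uri} from the tag, if present.
    return root.tag.rsplit("}", 1)[-1] == "P3P"
```

    Note that this parser is namespace-aware, so prefixed attributes such as VOC:purp are accepted only if the prefix is declared in the message.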

    Besides error cases, the functional model does not cover the learning and information tracking capabilities of the OPA. As mentioned earlier, besides reading data from the personal information data store, the OPA can also store transaction information (e.g., user identification, password) or proposals for later reference. In addition to this, the OPA could add rules to the user's preferences and the negotiation knowledge base. Moreover, the model does not show how the OPA keeps track of accepted and rejected proposals. These aspects were left out to keep the complexity of the functional model at a reasonable level.

Current Implementation (WBI)

    Now that we have covered the functional model of the OPA, it is time to explain the OPA's implementation in finer detail. We implemented the OPA by mapping its functional model onto WBI, which we introduced briefly in Section 2.3 on page 18 . WBI provides a platform to build intermediary Web applications which are called WBI plugins.

WBI Plugins

    A WBI plugin (e.g., the HTTP cookie manager shown in Figure 2-3 on page 20 ) is constructed from five basic building blocks:

  1. Request Editors (RE) to modify outgoing requests.
  2. Generators (G) to produce documents in response to requests.
  3. Document Editors (DE) to modify incoming documents.
  4. Monitors (M) to observe transactions without affecting them.
  5. Autonomous Functions (A), which run independently of any transaction and perform background tasks.

    WBI's five basic building blocks.

    These five building blocks are collectively referred to as MEGs (Monitor/Editor/Generator). Each MEG is connected to WBI through a rule that specifies what type of Web transaction it should be involved in (see Figure 3-9). WBI's rule language allows programmers to place constraints on the invocation of MEGs. An intermediary usually consists of a number of MEGs which collaborate to produce a new function. MEGs can be grouped together as WBI plugins. WBI's plugin definition fits the Java Beans [11] component model, such that WBI plugins contain MEG beans. When WBI starts up, each registered plugin is initialized and registers its individual components (MEGs) as transaction listeners. When WBI receives a request from the client, it performs the following steps:

  1. The original request is compared to the rules for all REs. The REs whose rule conditions are satisfied by the request are allowed to edit the request in the order of their priority.
  2. The request resulting from the RE-chain is compared to the rules of all Generators. The request is sent to the highest priority Generator whose rule is satisfied. If that Generator rejects the request, subsequent valid Generators are called in priority order until one produces a document.
  3. The request and response are used to determine which Editors and Monitors should see the document on its way back to the client. The document is modified by each Editor whose rule is satisfied, in priority order. Monitors are also configured to monitor the document either (a) as it is produced from the generator, (b) as it is delivered back to the client, or (c) after a particular editor.
  4. Finally, the response is delivered to the requester.
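
    The processing steps above can be sketched in Python as follows. This is a simplified model for illustration and not WBI's API: the Meg class, the process function, and the convention that a larger number means a higher priority are assumptions, Monitors are omitted, and document editor rules are evaluated against the request only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Meg:
    """Hypothetical stand-in for a MEG: a rule plus a handler."""
    name: str
    priority: int
    rule: Callable[[dict], bool]
    handle: Callable


def by_priority(megs, request):
    """MEGs whose rule matches the request, highest priority first."""
    matching = [m for m in megs if m.rule(request)]
    return sorted(matching, key=lambda m: m.priority, reverse=True)


def process(request, request_editors, generators, document_editors):
    # Step 1: matching REs (selected on the original request) edit in turn.
    for editor in by_priority(request_editors, request):
        request = editor.handle(request)
    # Step 2: try generators until one produces a document (not None).
    document = None
    for generator in by_priority(generators, request):
        document = generator.handle(request)
        if document is not None:
            break
    if document is None:
        raise LookupError("no generator produced a document")
    # Step 3: matching document editors modify the document in turn.
    for editor in by_priority(document_editors, request):
        document = editor.handle(request, document)
    # Step 4: the result is delivered to the requester.
    return document
```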

The OPA's MEGs

    Based on WBI's plugin concept (i.e., collections of MEGs), we implemented the OPA by mapping its functional model onto a group of MEGs. We defined four new MEGs (see footnote 10) to carry out the OPA's tasks as shown in Figure 3-5 on page 47.

    The first goal was to implement a MEG that can intercept the HTTP requests issued by the Web browser. Thus, we created a new request editor called P3PRequestEditor which intercepts requests and adds P3P information to them in order to indicate that the browser is P3P-enabled. Without further constraints, WBI would invoke the P3PRequestEditor on every request issued by the Web browser, including user responses to the current P3P transaction. To distinguish between user responses and ordinary HTTP requests, we had two choices: put the respective functionality into the P3PRequestEditor, or create a second request editor. We chose the latter because WBI already offers the functionality to distinguish between the two kinds of requests.

    We defined a second request editor called P3PInteractionRE. As we mentioned earlier, each MEG has a rule which specifies under which conditions WBI is supposed to invoke the MEG. For the P3PInteractionRE, we set up a rule which specifies that it is only invoked by WBI when the incoming request is a user response. A user response is a special HTTP request in the OPA's context. The OPA informs the user by sending a document to the Web browser. This document offers the user several choices for how to proceed with the current transaction. These choices are hypertext links which the user can click on to tell the OPA how to proceed. When the user clicks on one of these links, the browser issues a new request using the URL defined for that link. We defined a special URL for these types of requests:

http://_p3p/P3PInteractionFormHandler

    Knowing that user responses are requests with a special URL, we defined the rule for the P3PInteractionRE, such that it will only be invoked if the request contains the special URL:

host _p3p & path /P3PInteractionFormHandler

    The rule refers to the host and path parts of the special user response URL. To prevent the other request editor, P3PRequestEditor, from also being invoked on user responses, we gave it a rule which is simply the negation of the P3PInteractionRE rule. With just two new MEGs we are now able to intercept two different kinds of requests in the context of P3P. After an ordinary request has been modified by adding P3P information to it, a standard generator forwards the modified request to the particular Web server. When a user response is intercepted, the P3PInteractionRE modifies the request according to the user's response. This includes two things:

  1. P3PInteractionRE adds the appropriate P3P response to the request.
  2. P3PInteractionRE changes http://_p3p/P3PInteractionFormHandler into http://_p3p/RedirectBrowser, which leads to the invocation (see footnote 11) of the third MEG we created, the P3PRedirectionGenerator. This MEG then requests the initially requested document, including the user's P3P response.

    The URL has to be changed in order to get to the initially requested Web resource. Note that the URL is not changed directly to the URL of the initially requested document. When the P3PRedirectionGenerator is invoked to forward the request to the respective Web server, it uses a standard MEG, the PageMovedGenerator. This generator produces a document which instructs the browser to go to another URL to find the desired document. This happens all the time in normal Web activity, but most users never notice. The URL location field in the Web browser then reflects the new URL rather than the one the user originally requested. This could not have been done inside P3PInteractionRE: had the request been forwarded directly to the Web server without invoking the standard generator, the browser would display the document returned by the Web server but would still show the URL of the user response. This could lead to confusion when the displayed document contains relative links to other resources.
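
    The two request editor rules and the URL change can be illustrated with a short Python sketch. It is not WBI's rule engine: the function names are hypothetical, and it assumes that the path condition of the rule means an exact match.

```python
from urllib.parse import urlsplit, urlunsplit


def is_user_response(url):
    """Mirror of the rule: host _p3p & path /P3PInteractionFormHandler."""
    parts = urlsplit(url)
    return (parts.hostname == "_p3p"
            and parts.path == "/P3PInteractionFormHandler")


def select_request_editor(url):
    """P3PRequestEditor's rule is the negation of P3PInteractionRE's rule."""
    return "P3PInteractionRE" if is_user_response(url) else "P3PRequestEditor"


def redirect_url(url):
    """Rewrite a user response URL so the redirection generator is invoked."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path="/RedirectBrowser"))
```

    With these definitions, redirect_url("http://_p3p/P3PInteractionFormHandler") gives http://_p3p/RedirectBrowser, and any URL on another host is routed to P3PRequestEditor.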

    Besides the interception of requests, we had to implement the ability to intercept HTTP responses coming from a Web server. Thus, we created a document editor called P3PResponseEditor. WBI always invokes this MEG after it has received an HTTP response from a Web server. This MEG then extracts the P3P information from the response and evaluates it. When the user needs to be contacted, the P3PResponseEditor dynamically produces an HTML document which is then sent to the Web browser using a standard MEG. This standard MEG was specifically designed to produce HTTP responses containing HTML documents. If the user does not need to be involved in the current transaction (e.g., the OPA can seamlessly accept a proposal), P3PResponseEditor issues a second request for the requested Web resource, including the appropriate P3P message. This request is issued differently from browser requests, so that it is not routed through other MEGs. In this case, the WBI feature FetchURL is used to request the Web resource. The response to this request is returned directly to the requester without being routed through any other MEGs. After the P3PResponseEditor receives the response to the FetchURL request, it forwards the initially requested document to the browser using a standard generator.
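
    The extraction step can be sketched as follows. This Python fragment assumes that a server response carries its P3P message in the same way as the request in Figure 3-8, i.e., an OPT header that announces a numeric prefix and a header named with that prefix followed by -P3P1.0; the function name is hypothetical, and the namespace parameter is accepted in both the ns-42 and the ns=42 spelling.

```python
import re


def extract_p3p(headers):
    """Return the P3P message found in HTTP headers, or None."""
    # HTTP header names are case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}
    opt = lowered.get("opt")
    if opt is None:
        return None
    # Find the numeric prefix announced by the OPT header.
    match = re.search(r"ns\s*[=-]\s*(\d+)", opt)
    if match is None:
        return None
    return lowered.get(f"{match.group(1)}-p3p1.0")
```

    A response without an OPT header yields None, in which case there is nothing for rule evaluation to examine.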

    WBI plugin: Online Privacy Agent (OPA).

    The OPA as a WBI plugin is shown in Figure 3-10, which only shows the special MEGs. For clarity, the standard WBI MEGs are omitted. The figure also shows the data and control flow. To summarize, the OPA as a WBI plugin required the creation of four special MEGs, besides the implementation of P3P, APPEL, and negotiation. WBI already provided the interfaces and sources to access the HTTP stream and create requests and responses, which helped to reduce the time needed for development.


1. If the Web document contains images, the images are obtained through subsequent HTTP requests. Images in Web documents are referenced by links.

2. The information regarding APPEL is based on APPEL level 1; please refer to Appendix C on page 109 for information about APPEL level 2.

3. The different behaviors have different effects on the process of rule evaluation. This is explicitly laid out in [6] and briefly summarized in Section C.2.1.1 on page 111 .

4. The asterisk (*) indicates that the postal address is a structured data type and consists of several basic data elements. See [15] for details.

5. Section 3.3.2.2 on page 48 explicitly describes the definition of a user response in the context of P3P transactions.

6. The last step is usually a P3P Ok message combined with the initially requested document. Please see Figure B-6 on page 108 for details.

7. For readability, we used indentation and split the proposal in Figure 3-7 across multiple lines; over the network, a carriage-return line-feed (CRLF) pair would be added only after the </P3P> element.

8. P3P messages are verified by parsing them with an XML parser.

9. Negotiation cannot be performed when the Web site does not want to negotiate or the number of negotiation rounds during the current transaction exceeded a certain limit.

10. A new MEG is defined by means of inheritance. WBI's standard library can be used to define new MEGs by defining subclasses (see Java inheritance in [Flanagan97]).

11. Note that the Generator with the highest priority will be invoked after the chain of Request Editors has been completely handled.


April 9, 1999 · Jörg Meyer · jmeyer@almaden.ibm.com