This diploma thesis describes how to manage, negotiate, and transfer personal information on the World Wide Web (WWW, or Web). This chapter provides a brief overview of the Web and how personal information is used in online transactions. It also describes some of the current issues and problems related to privacy on the Web. The end of this chapter explains the goals of this thesis and gives an overview of the remainder of this document.
As the popularity of the Web increases, the Web will continue to evolve from a means of providing an easy way of accessing (and publishing) information on the Internet to a virtual marketplace where everything can be bought or sold, just like in the physical world. As in the early days of the Web, Web browsers are still used to retrieve and display information from Web servers. Nowadays though, Web browsers can also be used to purchase goods or send electronic mail (email). This way of using the Web is becoming more and more popular. A recent survey [eMarketer98]¹ shows that the Web is currently used by approximately 36 million people worldwide and that this number is expected to increase to 142 million by the year 2002. Though the estimated total online consumer revenues for Electronic Commerce on the Web (eCommerce) were $1.5 billion in 1997, this pales in comparison to sales in the home shopping, catalog and retail industries. Currently, the growth of eCommerce cannot keep up with the growth of the Web itself. The reasons why people do not purchase on the Web vary, but two major issues are privacy and security [GTRC98].
"Privacy: ... 1. the state of being private; retirement or seclusion. 2. freedom from the intrusion of others in one's private life or affairs: the right of privacy. 3. secrecy ..." [Webster97]
When a person enters into an online transaction he is usually asked to give out personal information. When releasing personal information over the Internet, several threats to privacy exist (unless the user provides false information):
"... Information is secure if the owner of information can control that information. Information is private if the subject of information can control that information. Anonymous information has no subject, and thus ensures that information is private. Anonymity requires security and guarantees privacy, but is neither. ..." [Camp97]
In order to ensure privacy, there needs to be security. The market already provides several security tools, such as the Secure Sockets Layer (SSL) protocol developed by Netscape. Another example is Pretty Good Privacy (PGP)³. Both use Rivest-Shamir-Adleman (RSA) public key cryptography. Such security tools can help protect privacy by preventing unauthorized parties from accessing the information. But privacy requires more than that. There also need to be ways of controlling the access to and the distribution of information. The following example illustrates why privacy requires more than security:
Person B orders a book at an online book store by filling out a form with his name, address, and credit card information. If this information is sent using an SSL connection between B's computer and the online book store, the information will be secure during the transaction. Nobody can spy on the connection or alter the information during the transaction; it can be read only by the online book store. The online book store will then use this information to complete the transaction, which may include the release of parts of B's information to a third party (who actually ships the book to B) over a secure connection.
If the third party in the above-mentioned example also sells this information to advertisers or other companies, then B's privacy might be violated even though there was plenty of security during the transaction.
These findings agree with the statement that, as computers are used for more tasks and are integrated with more services, people will need help with the information and work overload [Maes97].
The question now is how to provide such help. Information about the privacy practices of Web sites is needed as well as an infrastructure to get to it. The World Wide Web Consortium (W3C) is currently working on this problem with its Platform for Privacy Preferences Project (P3P). P3P provides a framework for informed online interactions. Its goal is to enable Web users to exercise preferences over the use of their personal information. P3P-compliant applications inform users about Web sites' privacy practices and allow them to delegate decisions and tasks to their computer agents. Such tasks could include the automated transfer of personal information during an online transaction. This is supported by the P3P protocol. The W3C believes that P3P can help increase people's confidence in online transactions by presenting them with useful and understandable information about Web sites' privacy practices [Reagle99]. Parallel to P3P, the W3C has a project called A P3P Preference Exchange Language (APPEL). This language can be used to express a person's preferences. Appendix B on page 99 and Appendix C on page 109 at the end of this document provide short introductions to P3P and APPEL respectively.
Now, in real life there is more to business transactions than just saying yes or no and providing information. People negotiate contracts with special terms and conditions. In order to find out how this behavior can be implemented in online transactions, our survey included the following question (see also Section A.6 on page 96):
Assume you could use a system that can automatically obtain Web sites' privacy policies and check and evaluate them against your personal needs and preferences. How would you configure such a system, i.e., what would your preferences look like regarding the release of personal information?
The two most common answers were to give out only the minimum set of information that is needed for the transaction, and to accept a transaction only if the Web site promises not to resell the user's information. Otherwise, the system should abort the transaction or warn the user. Keeping in mind these preferences and the desired functionality of our basic implementation, we envisioned a software agent,⁴ the Online Privacy Agent (OPA), that can negotiate the terms and conditions of online transactions. In addition to this, an OPA would be able to keep track of online transactions and their respective terms and conditions.
One of the features of using P3P in combination with APPEL is the automated transfer of personal information. Many online transactions require the input of the same information, such as email address, name, and home address. Using an OPA, the terms and conditions of the transactions can be checked against the user's preferences. If this check fails, the OPA will notify the user with a warning that participation in the current transaction can violate his privacy. On the other hand, if the check succeeds, then the OPA can go ahead and seamlessly send the required information back to the Web site. This would lessen the burden on users to type in the same information multiple times. However, it is still necessary to give the user control over the release of information. (Some people might feel uncomfortable with the fact that a piece of software is giving out personal information.) The following chapters will describe how both of these goals can be accomplished with the Online Privacy Agent.
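The check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the OPA's implementation: the names `SitePolicy`, `UserPreferences`, `check_policy`, and `handle_request` are hypothetical, and the policy and preference structures are simplified stand-ins that do not follow P3P or APPEL syntax. The optional `confirm` callback models the requirement that the user keeps control over the release of information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SitePolicy:
    """Simplified stand-in for a site's terms (hypothetical, not P3P syntax)."""
    requested: frozenset      # data elements the site asks for
    resells_data: bool        # whether the site may resell the information


@dataclass(frozen=True)
class UserPreferences:
    """Simplified stand-in for a user's preferences (hypothetical, not APPEL)."""
    releasable: frozenset     # data elements the user is willing to release
    allow_resale: bool = False


def check_policy(policy, prefs):
    """Return a list of violations; an empty list means the check succeeded."""
    problems = []
    extra = policy.requested - prefs.releasable
    if extra:
        problems.append("requests data not released by the user: "
                        + ", ".join(sorted(extra)))
    if policy.resells_data and not prefs.allow_resale:
        problems.append("site may resell the user's information")
    return problems


def handle_request(policy, prefs, profile, confirm=lambda data: True):
    """Warn on a failed check; otherwise release the requested data."""
    problems = check_policy(policy, prefs)
    if problems:
        return {"action": "warn", "reasons": problems}
    # Requested elements the agent has no stored value for must come from the user.
    missing = sorted(k for k in policy.requested if k not in profile)
    if missing:
        return {"action": "ask_user", "missing": missing}
    data = {key: profile[key] for key in sorted(policy.requested)}
    # The user keeps the final say over the release of information.
    if not confirm(data):
        return {"action": "declined"}
    return {"action": "send", "data": data}
```

Passing a `confirm` function that prompts the user would give cautious users a veto over every release, while the default accepts automatically once the check has succeeded.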
In many online transactions, Web sites ask for personal information. As mentioned earlier, the kind of information requested is not always relevant to the transaction itself. With the emergence of P3P and APPEL, we believe that soon a Web site might want to grant a visitor access to its Web resources based on the amount of information it can get from the visitor. Another example would be a Web site that offers discounts on purchases in its online store. The Web site might offer higher discounts if it is allowed to sell users' address information to advertisers. In both cases, an OPA would be a helpful assistant to automatically negotiate the terms and conditions for a user when registering with or purchasing goods from Web sites. The OPA would apply the user's preferences in the online transaction and try to negotiate an agreement with the Web site in one or multiple rounds.
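To make the idea of negotiating in one or multiple rounds more concrete, the following Python sketch walks through a sequence of offers from a Web site and accepts the first one that satisfies the user's preferences. It is an illustration only and does not represent the negotiation framework of Chapter 4: the `Offer` structure, the `acceptable` and `negotiate` functions, and the assumption that the site proposes one offer per round in a fixed order are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One proposal from the Web site (hypothetical structure)."""
    requested: frozenset      # data elements the site wants in exchange
    resells_data: bool        # whether the site may resell the information
    discount_percent: int     # what the site offers in return


def acceptable(offer, releasable, allow_resale):
    """An offer is acceptable if it asks only for releasable data
    and respects the user's resale preference."""
    return offer.requested <= releasable and (allow_resale or not offer.resells_data)


def negotiate(site_offers, releasable, allow_resale=False, max_rounds=5):
    """Consider one offer per round, in the order the site proposes them.

    Returns (round_number, offer) for the first acceptable offer,
    or None if no agreement is reached within max_rounds.
    """
    for round_no, offer in enumerate(site_offers[:max_rounds], start=1):
        if acceptable(offer, releasable, allow_resale):
            return round_no, offer
    return None


offers = [
    Offer(frozenset({"email", "address"}), True, 20),   # high discount, resale
    Offer(frozenset({"email", "address"}), False, 10),  # lower discount, no resale
    Offer(frozenset({"email"}), False, 0),              # minimum information
]
print(negotiate(offers, releasable=frozenset({"email", "address"})))
```

With the preferences shown, the first offer is rejected because it permits resale, and agreement is reached in the second round. A user who allows resale would instead accept the first offer and receive the higher discount.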
When performing online transactions, there is currently no automated way of keeping track of the online transactions in which a user participated (registering with an airline Web site, purchasing a book). In some cases, the user might get a confirmation number at the end of the transaction, a user identification and password pair, or a confirming email. The user currently has to store this information in a notebook or write it down in his organizer in case he needs it later. The online transaction information ends up in many places, whereas it should be stored in one place where it is easy to find. An OPA can help keep track of such information and store it together with the terms and conditions of the transaction. This information can be used and transferred seamlessly in subsequent transactions with Web sites. A good example is a subsequent visit to a Web site which requires a user password. With an OPA, there is no need for the user to remember the user identification and password pair for a particular Web site. This will be especially useful with the increasing number of Web sites requiring user identification.
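One way to picture such a single store is the following Python sketch. It is an assumption-laden illustration rather than a description of the OPA's implementation: `TransactionRecord` and `TransactionLog` are hypothetical names, the record fields are chosen only to mirror the items mentioned above, and the log is kept in memory without any protection, whereas a real agent would have to store credentials securely.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TransactionRecord:
    """One online transaction together with its terms (hypothetical fields)."""
    site: str
    description: str                                   # e.g. "book purchase"
    terms: dict                                        # agreed terms and conditions
    credentials: dict = field(default_factory=dict)    # user id and password, if any
    confirmation: str = ""                             # confirmation number, if any
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


class TransactionLog:
    """Keeps all transaction records in one place (in memory, unprotected)."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def for_site(self, site):
        """Return all records for a site, oldest first."""
        return [r for r in self._records if r.site == site]

    def credentials_for(self, site):
        """Return the most recently stored credentials for a site, or None."""
        for record in reversed(self._records):
            if record.site == site and record.credentials:
                return record.credentials
        return None


log = TransactionLog()
log.add(TransactionRecord("airline.example", "registration",
                          {"resale": False},
                          credentials={"user": "b", "password": "secret"}))
log.add(TransactionRecord("books.example", "book purchase",
                          {"resale": False}, confirmation="ABC123"))
```

On a later visit to `airline.example`, the agent could call `log.credentials_for("airline.example")` instead of asking the user to remember the pair.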
This thesis illustrates and describes the Online Privacy Agent (OPA), a software agent to manage, negotiate and transfer personal information on the Web. As described in the previous sections, an OPA covers various aspects of privacy and personal information in online transactions. This thesis will focus on the aspects of negotiation and transfer of personal information using P3P and APPEL and discuss them in detail. Other aspects such as tracking online transactions will be discussed briefly throughout this document.
Chapter 2, Agent Architecture, describes the OPA's architecture and its components. It also illustrates the OPA's usage and its current implementation. At the end of Chapter 2, a short overview of agent technology is given, including a comparison of the OPA to existing agent technology.
Chapter 3, Management and Transfer of Personal Information, provides an overview of the OPA's context, and briefly introduces the Hypertext Transfer Protocol (HTTP), P3P, and APPEL. Furthermore, Chapter 3 explains how the OPA can be used to manage a user's personal information and to keep track of online transactions. This document then goes into finer detail and describes explicitly, with a functional model, how the OPA monitors, manipulates and transfers information.
Chapter 4, Negotiation of Personal Information, describes the OPA's ability to negotiate during online transactions. We will describe the concept of negotiating⁵ personal information by introducing a framework for the automated negotiation of sets of information. The chapter closes with a description of how the framework was applied in the context of online transactions by means of P3P and APPEL.
In each of the chapters, this document refers to the OPA's current implementation. We describe the OPA's usage (Chapter 2) and its architecture (Chapters 2 and 3), as well as the implementation of individual components (Chapter 4). Several appendices at the end of this thesis provide summaries, brief introductions and overviews of various aspects of this thesis. See Appendix A for details about the survey which was conducted as a part of this thesis to find out about Web users' experiences with privacy and personal information in online transactions. A brief introduction to P3P can be found in Appendix B. Appendix C provides an overview of and information about APPEL. Appendix D illustrates a sample APPEL ruleset.
1. Throughout this document, books, papers, articles, or magazines are referenced by the author's last name or the company's name, and the year of publication in brackets (see Bibliography on page 123).
4. The term agent is a commonly used term in computer science, although there is no consensus definition for it. [Nwana96] and [Bradshaw97] provide overviews of the types and characteristics of existing agent technologies.
5. The aspect of (multi-round) negotiation was recently removed from the current P3P specification because the W3C considered it to be too complex and to be a reason for Web sites not to deploy P3P. However, this document will show that multi-round negotiation is useful to the user (and the service), and can be implemented with reasonable amounts of effort.