Page Filter
Try It Out
To see what the PageFilter plugin does, you should view
some web pages twice: first with the plugin disabled, and then
with the plugin enabled.
- Set Up the Plugin
- Start WBI. Set up your web
browser to use WBI as a
proxy.
- Register the PageFilter
plugin. At the WBI console, type (on one line):
register com/ibm/wbi/examples/pagefilter/pagefilter.reg
- Check whether the plugin is registered and enabled. Go to the
WBI Setup page. The PageFilter plugin
should be listed in the table with a checkmark next to its name. If the plugin
is not listed, try registering it again. If the checkmark is not there, click
on the box to the left of the plugin name.
- Open another browser window. Use that window to try out the plugin, and use
this window to display the documentation. (To open another window using
Microsoft Internet Explorer, go to File -> New -> Window. To open a window
using Netscape Navigator, go to File -> New -> Navigator Window.)
- View Web Pages Without the Plugin
- Now disable the plugin so that you can see what the original web
pages look like. Go to the WBI Setup page.
Disable the PageFilter plugin by clicking on the box to the left of its name.
- Access the IBM Homepage, the
IBM Support Homepage, and
the WBI Homepage. Notice
that they all get displayed in the browser and contain working links.
- View Web Pages With the Plugin
- Now enable the plugin so that you can see how it changes the pages
that are displayed in your browser. Go to the WBI
Setup page. Enable the PageFilter plugin by clicking on the box to the
left of its name.
- Access the IBM Homepage. Notice that it
appears the same way as before. Now access the
IBM Support Homepage.
Notice that some of the links that were visible before have been replaced
by normal (non-link) text. Finally, try to access the
WBI Homepage. This should
take you to a page of links, rather than to the WBI Homepage.
What It Does
The PageFilter plugin blocks a web browser from displaying sites that
are not on a list of approved sites. When a user tries to access an
unapproved site, the browser displays a web page that contains links
to all of the approved sites. If a web page contains a link to an
unapproved site, the link text is replaced by normal (non-link)
text. A web form can be used to add to the list of approved sites.
How It Works
Architecture
The plugin maintains a database of approved web sites/pages. New
sites/pages can be added to the database by using a web form. Each
time the browser makes a request, the database is consulted as to
whether the requested page is approved. To be approved, the
requested URL must match at least one of the entries in the database.
If the request is approved, then the page is retrieved and is
displayed to the user. If the request is not approved, then the
browser is diverted to a web page that contains a list of approved links.
When an approved page is retrieved from the web, each link is checked
to determine whether it points to an approved page. If it points
to an approved page, then the link is not edited. If not, then the
link is removed (i.e., the anchor tag is removed but the anchor text
remains).
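The link-editing rule described above can be illustrated with a minimal, self-contained Java sketch. It does not use the WBI API. The class name, the method names, the prefix-based matching rule, and the example URLs are all assumptions made for illustration, and the regular expression only handles simple anchor tags with a double-quoted href.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the PageFilter source.
class ApprovedLinkSketch {

    // Assumption: a database entry approves a URL when the URL starts with it.
    static boolean isApproved(String url, List<String> approvedPrefixes) {
        for (String prefix : approvedPrefixes) {
            if (url.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Matches a simple anchor element: group 1 is the href, group 2 the anchor text.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\s+href=\"([^\"]*)\"[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Keeps approved links untouched; for unapproved links, drops the
    // anchor tag and keeps only the anchor text.
    static String stripUnapprovedLinks(String html, List<String> approvedPrefixes) {
        Matcher m = ANCHOR.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = isApproved(m.group(1), approvedPrefixes)
                    ? m.group(0)
                    : m.group(2);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> approved = Arrays.asList("http://approved.example.com/");
        String html = "<a href=\"http://approved.example.com/a\">Kept</a> and "
                + "<a href=\"http://other.example.org/\">Unlinked</a>";
        System.out.println(stripUnapprovedLinks(html, approved));
    }
}
```

A production filter would more likely rely on an HTML parser than on a regular expression; the regular expression is used here only to keep the sketch short.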
MEG Model
The PageFilter plugin consists of two generators and two editors. The following
diagram illustrates the way a request gets processed by the PageFilter plugin.
In the diagram, "RE" refers to the one request editor, "G" refers to either
of the two generators, and "DE" refers to the one document editor.
- Step 1 (RE):
When the browser makes a request, the request editor checks
to see whether the requested web page is in the database of approved
pages. If the requested page is in the database, then the request
editor does not change the request. If the page is not in the
database, then the request editor changes the URL in the request.
- Step 2 (G):
The PageFilter plugin uses two generators, one for each of the cases
in Step 1 (request approved vs. request not approved). Requests for
approved pages are handled by WBI's default generator. This generator
acts as a transparent proxy, retrieving the requested page from its
server. Requests for unapproved pages are handled by a special
pagefilter generator. This generator creates a web page containing links to
approved sites.
- Step 3 (DE):
Once a page has been received from a generator, a document editor
checks whether the page contains any links to unapproved sites. If
there are any such links, the document editor replaces these links
with normal text. Once the document editor is done, the page is ready
to be displayed in the browser. Note that the document editor does
not actually need to be run on pages that come from the pagefilter generator,
as the links on these pages always point to approved sites.
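The request side of this flow (Steps 1 and 2) can be sketched in plain Java as follows. This is only an illustration of the routing decision and does not use the WBI API: the class, the methods, the prefix-matching rule, and both URLs (including the made-up internal URL that stands for the page of approved links) are assumptions.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of Steps 1 and 2; it does not use the WBI API.
class RequestRoutingSketch {

    // Assumption: approved entries are plain URL prefixes.
    static final List<String> APPROVED = Arrays.asList("http://approved.example.com/");

    // Assumption: a made-up URL that stands for the page of approved links.
    static final String LINKS_PAGE_URL = "http://pagefilter.invalid/links";

    static boolean isApproved(String url) {
        for (String prefix : APPROVED) {
            if (url.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Step 1 (RE): leave approved requests alone, redirect the rest.
    static String editRequest(String requestedUrl) {
        return isApproved(requestedUrl) ? requestedUrl : LINKS_PAGE_URL;
    }

    // Step 2 (G): pick the generator that handles the edited request.
    static String chooseGenerator(String editedUrl) {
        return editedUrl.equals(LINKS_PAGE_URL)
                ? "pagefilter generator"
                : "default generator";
    }

    public static void main(String[] args) {
        List<String> requests = Arrays.asList(
                "http://approved.example.com/page", "http://other.example.org/");
        for (String url : requests) {
            String edited = editRequest(url);
            System.out.println(url + " -> " + edited + " (" + chooseGenerator(edited) + ")");
        }
    }
}
```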
Implementation Details
- Filtering out unapproved pages:
The PageRequestEditor (a request editor) checks whether the requested
page is approved or not. If it is approved, then the PageRequestEditor
throws a RequestRejectedException, declining to handle the request so
that it passes through unchanged. If not, a new request is created
(with a different URL), which will make the browser display a web page
of links to approved pages/sites.
- Editing links:
The PageFilterEditor (a document editor) changes approved pages by
editing out links to unapproved web pages/sites. A
LinkAnnotationEditor is used to remove links to
unapproved sites. More precisely, PageFilterEditor
extends LinkAnnotationEditor. This means that each link
(i.e., each anchor tag and its anchor text) is passed to the
editLink method. If the href of the tag
matches a pattern in the database (the link points to an approved
site), then editLink does nothing. Otherwise,
editLink sets the link tag to null,
indicating that the link is to be removed. Note that only the anchor
tag is removed in this case, and not the anchor text.
- Retrieving approved pages:
WBI's DefaultHttpGenerator retrieves approved web pages.
- Creating page of approved links: The
PageFilterGenerator handles requests which were created
by the PageRequestEditor. The generator creates a
String of HTML code containing links to approved sites. Then, it
uses a StaticHtmlGenerator to display the HTML code in
a web page.
- Storing approved pages/sites:
Each approved web page or web site is represented by a
Page object. A Page contains a
pattern to match URLs against and a sample
URL and title. The sample is used to generate the page
of links to approved sites. The plugin maintains a small database of
these Page objects.
- Changing the "approved" database:
The PageFilterFormGenerator deals with the web form to add
Page objects to the "approved" database. The generator
can create the form and process any data that was entered into the
form. The database is actually an instance of Section.
- Some key WBI classes that were used:
Source Files
- Page.java
- Contains the class definition for Page (data structure for approved page).
- pagefilter.ini
- Initial database of approved sites/pages.
- PageFilter.java
- Contains the class definitions for PageFilter (the plugin itself),
PageFilterGenerator (page of links to approved pages), PageFilterFormGenerator
(form to change list of approved sites), PageFilterEditor (edit out links to
unapproved sites), and PageRequestEditor (change request URL if destined for
an unapproved site).
- pagefilter.reg
- Contains the code necessary to register the plugin.
- Pattern.java
- Contains the class definitions for Pattern and PatternPart (data
structures used to match requested URLs against entries in the approved database).
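As a rough illustration of how an approved-page entry (a pattern plus a sample URL and title) might be modeled, here is a minimal Java sketch. It is not taken from Page.java or Pattern.java. The class name, the wildcard syntax (where `*` matches any run of characters), and the example values are assumptions, and the real Pattern and PatternPart classes may match URLs quite differently.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; the real Page, Pattern, and PatternPart classes may differ.
class ApprovedPageSketch {
    final String pattern;   // assumed wildcard syntax: '*' matches any run of characters
    final String sampleUrl; // link target shown on the page of approved links
    final String title;     // link text shown on the page of approved links

    ApprovedPageSketch(String pattern, String sampleUrl, String title) {
        this.pattern = pattern;
        this.sampleUrl = sampleUrl;
        this.title = title;
    }

    // Splits the pattern at each '*' and checks that the literal parts
    // occur in the URL in order, anchored at the start and at the end.
    boolean matches(String url) {
        String[] parts = pattern.split("\\*", -1);
        if (parts.length == 1) {
            return url.equals(pattern);
        }
        if (!url.startsWith(parts[0])) {
            return false;
        }
        int pos = parts[0].length();
        for (int i = 1; i < parts.length - 1; i++) {
            int idx = url.indexOf(parts[i], pos);
            if (idx < 0) {
                return false;
            }
            pos = idx + parts[i].length();
        }
        String last = parts[parts.length - 1];
        return url.length() - last.length() >= pos && url.endsWith(last);
    }

    // Builds the list of links from each entry's sample URL and title.
    // HTML escaping is omitted to keep the sketch short.
    static String linksPage(List<ApprovedPageSketch> pages) {
        StringBuilder html = new StringBuilder("<ul>\n");
        for (ApprovedPageSketch page : pages) {
            html.append("<li><a href=\"").append(page.sampleUrl).append("\">")
                .append(page.title).append("</a></li>\n");
        }
        return html.append("</ul>\n").toString();
    }

    public static void main(String[] args) {
        ApprovedPageSketch page = new ApprovedPageSketch(
                "http://approved.example.com/*", "http://approved.example.com/", "Example");
        System.out.println(page.matches("http://approved.example.com/docs"));
        System.out.print(linksPage(Arrays.asList(page)));
    }
}
```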