Every day the Web gains more importance for consumers and corporations
alike. However, while access to Web-based networks brings tremendous value, the number one
complaint among Web users is how long it takes to download Web pages. Whether those pages
are delivered from informational Web sites or enterprise applications, the delays are
unacceptable.
Throwing bandwidth at the problem will not help significantly. Slow Web
pages are caused by a combination of distance between users and servers, and congestion of
heavily trafficked networks beyond the control of any one organization.
The only possible solution for fast Web page response time is to move
the data closer to end users. Network caching, which stores fresh copies of the most
frequently requested Web objects near groups of users, can deliver Web pages with the same
response time people have come to expect from traditional applications.
Within the scope of my degree work, I designed and implemented a
caching proxy server that handles HTTP traffic. The proxy features:
- A robust caching mechanism that is optimized to minimize the time end users wait for
documents to load. It gives preference to documents that must be loaded over the slowest
Internet links, that come from servers that take a long time to connect to, that have
been referenced more frequently, and that are small. This is achieved by incorporating
estimates of the bandwidth between the local server and remote servers, estimates of
connection times to remote servers, the frequency of references to cached documents, and
cached document sizes into the cache removal algorithm. The weights of the parameters are
all fully customizable.
- Multithreading, to allow concurrent handling of several HTTP requests. Great care has
been taken with concurrency control, to ensure mutual exclusion between threads where
needed and to avoid deadlocks.
- Robust initialization and shut-down routines. These enable recovery of all data (cached
documents, server statistics, configuration of the proxy server) on a restart that
follows a proper shut-down, and of all consistent data on a restart that follows a crash.
- The ability to define a list of servers whose documents will not be cached by the proxy,
and the ability to define a proxy to which all unfulfilled requests will be forwarded.
- A graphical user interface that features a log area, a cache hit/miss pie chart, and
a cache occupancy chart. Customization of the proxy is done via the graphical interface.
The application is implemented in Java, and as such it benefits from cross-platform
portability.
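
The cache removal policy described in the feature list can be illustrated with a
minimal sketch. The abstract does not give the actual formula, so everything below is an
assumption made for illustration: the class and field names, the linear combination of
the four parameters, and the rule of evicting the lowest-valued documents first. It only
shows how customizable weights could favor documents from slow links and slow-to-connect
servers, frequently referenced documents, and small documents.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of a weighted cache removal policy; not the project's code. */
final class CacheEvictionSketch {

    /** Per-document statistics consulted by the policy (names are assumptions). */
    static final class Entry {
        final String url;
        final long sizeBytes;              // cached document size
        final double bandwidthBytesPerSec; // estimated bandwidth to the remote server
        final double connectMillis;        // estimated connection time to the remote server
        final int hits;                    // number of references to the cached document

        Entry(String url, long sizeBytes, double bandwidthBytesPerSec,
              double connectMillis, int hits) {
            this.url = url;
            this.sizeBytes = sizeBytes;
            this.bandwidthBytesPerSec = bandwidthBytesPerSec;
            this.connectMillis = connectMillis;
            this.hits = hits;
        }
    }

    /** One customizable weight per parameter. */
    static final class Weights {
        final double bandwidth, connect, frequency, size;

        Weights(double bandwidth, double connect, double frequency, double size) {
            this.bandwidth = bandwidth;
            this.connect = connect;
            this.frequency = frequency;
            this.size = size;
        }
    }

    /** Assumed scoring rule: a higher value means the document is more worth keeping. */
    static double retentionValue(Entry e, Weights w) {
        // Guard against division by zero for unknown bandwidth or empty documents.
        double bandwidth = Math.max(1.0, e.bandwidthBytesPerSec);
        double size = Math.max(1.0, (double) e.sizeBytes);
        return w.bandwidth / bandwidth        // slower link gives a larger term
             + w.connect * e.connectMillis    // slower connection setup gives a larger term
             + w.frequency * e.hits           // more references give a larger term
             + w.size / size;                 // smaller document gives a larger term
    }

    /** Removes the lowest-valued entries until the cache fits; returns what was evicted. */
    static List<Entry> evictToFit(List<Entry> cache, long capacityBytes, Weights w) {
        long used = 0;
        for (Entry e : cache) {
            used += e.sizeBytes;
        }
        List<Entry> byValue = new ArrayList<>(cache);
        byValue.sort(Comparator.comparingDouble((Entry e) -> retentionValue(e, w)));

        List<Entry> evicted = new ArrayList<>();
        for (Entry e : byValue) {
            if (used <= capacityBytes) {
                break;
            }
            cache.remove(e);
            used -= e.sizeBytes;
            evicted.add(e);
        }
        return evicted;
    }
}
```

In this sketch, setting a weight to zero removes the corresponding parameter from the
decision, which is one simple way the "fully customizable" weights could be exposed.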