A Tool for Statistical Analysis of Network Traffic
Rosabel Hernandez
Master of Science, June 1997
INTRODUCTION
Modern communication networks are highly dynamic entities that undergo constant
change (e.g., in network topology, user population, services and applications,
network technologies, and protocols). The dramatic growth of the user population
and the resulting increase in the use of network applications (the web,
multimedia services, electronic mail, file transfer protocols, etc.) cause enormous
congestion, and network equipment may rapidly become obsolete (buffers too
small to store incoming data, switching equipment too slow to handle large
traffic volumes).
The main objective of traffic analysis and traffic modeling is to gain a good
understanding of the actual dynamics of network traffic and to apply this
knowledge when designing, managing and controlling existing or future networks.
There are three main steps in learning how network traffic behaves:
(i) collecting network traffic measurements (monitoring the network),
(ii) extracting the relevant information from the measurements (data mining),
and (iii) building a model (statistical analysis).
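
The three steps above can be illustrated with a minimal Python sketch. It is
not the Madero tool or the Bellcore analysis; the trace, the function names
bin_bytes and summarize, and the one-second bin width are all assumptions
made for illustration. Step (i) is replaced by a small hand-written trace,
step (ii) aggregates packet sizes into bytes per time bin, and step (iii) is
reduced to its simplest form, summary statistics of the binned series.

```python
from statistics import mean, pvariance


def bin_bytes(records, bin_width):
    """Step (ii): aggregate (timestamp, size) records into bytes per time bin."""
    if not records:
        return []
    start = min(t for t, _ in records)
    end = max(t for t, _ in records)
    n_bins = int((end - start) // bin_width) + 1
    bins = [0] * n_bins
    for t, size in records:
        bins[int((t - start) // bin_width)] += size
    return bins


def summarize(series):
    """Step (iii), simplest form: mean and population variance of the series."""
    return {"mean": mean(series), "variance": pvariance(series)}


# Step (i) stand-in: a hypothetical trace of (timestamp in seconds, packet size in bytes).
trace = [(0.0, 100), (0.4, 200), (1.2, 300), (2.5, 100), (2.9, 300)]

per_second = bin_bytes(trace, 1.0)   # bytes observed in each one-second bin
stats = summarize(per_second)
print(per_second, stats)
```

Changing bin_width is one simple way to look at the same measurements at
different time scales; a real traffic model would go well beyond these two
summary statistics.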
One of the main ongoing projects in the Network Design and Traffic Research
Group in the Applied Research area at Bellcore, Morristown, N.J., is the
"data mining," statistical analysis and mathematical modeling of enormous
amounts of traffic measurements from a variety of working high-speed packet
networks. The results of the analysis effort are used for the development of
mathematical models of network traffic that are not only useful in practice
but also aid in the design, management and control of modern high-speed
communication networks. One of the most critical challenges that analysts
have had to face is the almost complete lack of readily available and
user-friendly tools for dealing with this type of data in such large volume.
I actively participated in this project by supporting the data mining work
through the development of a graphical user interface (GUI), called Madero,
which allows traffic analysts and network engineers to extract relevant
information at various levels of interest from numerous traffic measurements
containing hundreds of megabytes of data.
Research supported by the Minnesota Center for Industrial
Mathematics (MCIM)