Sets, Bags, and Rock and Roll: Analyzing Large Data Sets of ...
Source: www.cert.org
Topic: Rock and Roll
Short Description: The mantra “Sex, Drugs, and Rock and Roll” enjoyed currency in the 1960s. To the ears of an older generation, Rock and Roll was just a particularly ...
Content Inside:

Sets, Bags, and Rock and Roll: Analyzing Large Data Sets of Network Data

John McHugh, CERT Coordination Center, Carnegie Mellon University, Pittsburgh, PA 15313, USA, jmchugh@cert.org

Abstract. As network traffic increases, the problems associated with monitoring and analyzing the traffic on high speed networks become increasingly difficult. In this paper, we introduce a new conceptual framework, based on sets of IP addresses, for coming to grips with this problem. The analytical techniques are described and illustrated with examples drawn from a dataset collected from a large operational network.

1 Introduction

It is not unusual for relatively modest networks today to exhibit transborder flows on the order of megabits per second. Monitoring even a small network with a few hundred hosts can generate many gigabytes of TCPDUMP data per day. Capturing only headers can reduce the volume somewhat, and more compact formats based on abstractions such as Cisco’s NetFlow can reduce the volume further. Even so, the volume of data collected is sufficient to overwhelm many analysis tools and techniques. In general, the problem is one of grouping and classifying the data in such a way that uninteresting phenomena can be pushed aside, allowing the investigator to extract and further scrutinize data that is of interest.

Recently, CERT has been involved in the analysis of large sets of NetFlow data. To support this effort, they have developed a set of tools, collectively known as the SiLKtools.

In the remainder of the paper, we begin by sketching our thesis and analysis approach. We then digress to describe the NetFlow data collected, noting that the analysis can be applied equally well to tcpdump or other data forms with a bit of preprocessing. The basic functionality of the SiLKtools suite and some of the extensions made in support of our analysis efforts are then described. The remainder of the paper will present examples of the analyses that we c ...
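
The idea of grouping traffic by sets of IP addresses, so that uninteresting sources can be pushed aside, can be illustrated with a minimal Python sketch. It is not the SiLKtools interface and is not taken from the paper: the flow records, the `known_benign` set, and the helper names `source_set` and `source_bag` are all hypothetical, and a "bag" is modeled here simply as a count of flows per source address.

```python
from collections import Counter
from ipaddress import ip_address

# Hypothetical flow records: (source IP, destination IP, destination port).
flows = [
    ("10.0.0.5", "192.0.2.10", 80),
    ("10.0.0.5", "192.0.2.11", 80),
    ("10.0.0.7", "192.0.2.10", 22),
    ("10.0.0.9", "192.0.2.10", 80),
    ("10.0.0.5", "192.0.2.12", 443),
]


def source_set(records):
    """Return the set of distinct source addresses in the records."""
    return {ip_address(src) for src, _dst, _port in records}


def source_bag(records):
    """Return a bag (multiset): the number of flows per source address."""
    return Counter(ip_address(src) for src, _dst, _port in records)


# Sources already judged uninteresting (an assumption for illustration).
known_benign = {ip_address("10.0.0.7")}

# Set difference pushes the uninteresting sources aside.
interesting = source_set(flows) - known_benign

# Restrict the bag to the sources that remain of interest.
bag = source_bag(flows)
interesting_counts = {ip: n for ip, n in bag.items() if ip in interesting}

for ip, n in sorted(interesting_counts.items()):
    print(ip, n)
```

The set answers "which hosts were seen", while the bag keeps volume information per host, so the two can be combined: filter with set operations first, then rank what remains by count.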