
Towards Improved Functionality and Performance of Intrusion Detection Systems

by

Sunjeet Singh

B.E. Computers, University of Pune, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (Computer Science)

The University of British Columbia (Vancouver)

January 2011

© Sunjeet Singh, 2011

Abstract

Based on analysis of collected network traces, a decade of literature in the field of intrusion detection, experiences shared by people in the network security domain, and some new heuristics, this thesis explores several directions in which to extend the functionality and performance of existing Intrusion Detection Systems (IDS). We first present a new method for detecting a whole range of TCP attacks, and an extension of that method for detecting Distributed Denial of Service attacks. We then analyze two directions for enhancing performance: using cloud services to flexibly scale to higher IDS throughput, and leveraging hardware functionality in modern network cards for efficient multi-core processing.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Glossary
Acknowledgments
1 Introduction
2 Analysis of Networks
  2.1 Network Administration Tools
    2.1.1 Tcptrace
    2.1.2 Argus
    2.1.3 Picviz
  2.2 Available Datasets
    2.2.1 Publicly Available
    2.2.2 Private Datasets
  2.3 Composition of Networks at Layers 2, 3, 4
3 Survey of Intrusion Detection
  3.1 Intrusion Detection Techniques
    3.1.1 Type of Input
    3.1.2 Granularity of Input
    3.1.3 Detection Techniques
    3.1.4 Anomalies Addressed
  3.2 Open-source Intrusion Detection Systems
    3.2.1 Bro
    3.2.2 Snort
  3.3 Denial-Of-Service Detection Techniques
4 Proposed TCP Attacks Detection Method
  4.1 Definition of Good Connection
  4.2 Application of Definition to Traffic
  4.3 How to Detect Attacks in Bad Traffic
  4.4 Evaluation
5 Augmenting Existing IDS With DOS Detection
  5.1 Bro Intrusion Detection System
  5.2 Augmenting Bro with New Detection Mechanism
  5.3 A Closer Look at the Bro Filter Stage
6 Improving the Performance of Existing IDSes
  6.1 Using Cloud Computing
  6.2 At the Operating-System Level
7 Conclusion
Bibliography

List of Tables

Table 3.1: Classification of Intrusion Detection Systems based on granularity of input. The numbers in the References column index the bibliography.
Table 3.2: Different Denial of Service (DOS) attack detection approaches
Table 6.1: Number of packets successfully received by the kernel without TNAPI support
Table 6.2: Number of packets successfully received by the kernel with TNAPI support

List of Figures

Figure 2.1: Example tcptrace output
Figure 2.2: Example argus output
Figure 2.3: Example Picviz output. The parallel axes are Time, Protocol, Source IP address and Destination IP address; the given curves represent the values of these attributes for all packets.
Figure 2.4: Analysis of the Lawrence Berkeley National Laboratory (LBNL) dataset
Figure 2.5: Hypertext Transfer Protocol (HTTP) connection types analysis of the LBNL dataset
Figure 2.6: Closer look at the unidirectional HTTP connections in the LBNL dataset
Figure 2.7: Unidirectional connection analysis heat-map of the LBNL dataset
Figure 3.1: Parallel co-ordinates representation of different technique-input combinations explored in literature
Figure 3.2: Bro vs. Snort feature comparison
Figure 4.1: Layout of a confusion matrix
Figure 4.2: The Transmission Control Protocol (TCP) state machine
Figure 4.3: Cumulative Distribution Function (CDF) of packet loss incurred by destination hosts
Figure 4.4: Packet loss density distribution over time for victim d3 (top row). d3 shares lost packets with s1, s2, s3 and s4
Figure 4.5: Denial of Service attack analysis results for The Cooperative Association for Internet Data Analysis (CAIDA) dataset
Figure 5.1: Bro architecture
Figure 6.1: IDS cluster
Figure 6.2: Bro cloud configuration
Figure 6.3: Architecture of multi-core Bro
Figure 6.4: Conventional network architecture
Figure 6.5: Proposed network architecture
Figure 6.6: Output of the PF_RING-aware user application pfcount
Figure 6.7: Bro, Bro Multiple and Bro RING for varying packet rates. The x-axis represents the incoming packet rate on the network wire, and the y-axis represents the packet rate received by Bro.

Glossary

TCP: Transmission Control Protocol
DOS: Denial of Service
DDOS: Distributed Denial of Service
IDS: Intrusion Detection System
CAIDA: The Cooperative Association for Internet Data Analysis
MIT: Massachusetts Institute of Technology
DARPA: Defense Advanced Research Projects Agency
LBNL: Lawrence Berkeley National Laboratory
PGDL: Picviz Graph Description Language
UDP: User Datagram Protocol
IMAPS: Secure Internet Message Access Protocol
SSH: Secure Shell
HTTP: Hypertext Transfer Protocol
FPGA: Field-Programmable Gate Array
ANI: Active Network Interface
IP: Internet Protocol
SQL: Structured Query Language
VLAN: Virtual Local Area Network
IPS: Intrusion Prevention System
PCA: Principal Component Analysis
ROC: Receiver Operating Characteristic
NIDS: Network Intrusion Detection System
CDF: Cumulative Distribution Function
DPD: Dynamic Protocol Detection
NUMA: Non-Uniform Memory Access

Acknowledgments

I had a wonderful experience doing my Master's degree, and I owe it completely to my supervisors, Dr. William Aiello and Dr. Patrick McDaniel. I am extremely grateful to both of them for providing invaluable guidance throughout my degree despite their extremely busy schedules. I stand inspired by their technical and leadership skills and hope to meet more people like them in the future.

I thank Dr. Andrew Warfield for being my second reader and for providing valuable feedback, and I also thank him and Dr. Charles Buck Krasic for their input during the development of part of this thesis. I thank the members of the Networks, Systems and Security Laboratory at UBC, and in particular Shriram Rajagopalan and Mahdi Tayarani Najaran, for all the help they offered. I would also like to thank NSERC's ISSNET (Internetworked Systems Security Network) for funding this research and for providing me several opportunities to learn and to interact with people researching in my field. I would also like to extend my gratitude to the technical staff at the University of British Columbia, Michael Sanderson and Sean Godel, for sharing their relevant experiences in network administration and for helping me acquire data from the university network.

Finally, a big thank you to my parents, brother and sister for their continuous love and support, which braced me for the daily challenges of grad school and helped make it a truly memorable experience.

Chapter 1

Introduction

A computer network is a collection of computers and devices interconnected by communication channels that facilitate communication among users and allow them to share resources. Networks are prevalent in the home and in the workplace today. For any enterprise, networks provide information and communication, two of the most important strategic assets for its success. However, the networks that enterprises and homes so heavily rely upon are far from perfect. Along with the information that is to be shared, and the overhead required to exchange this information in a consistent and efficient manner, networks typically carry varying amounts of unnecessary traffic that at the very least causes inefficiencies, but can also cause damage to these networks, their computers and their users.
This unwanted traffic can arise for multiple reasons, including malicious users, inconsistencies in the network, simple lack of adherence to standards, or just faulty network protocol design. While this traffic may or may not be of malicious intent, it contradicts the overall purpose of the network. In order to recognize and restrain it, organizations employ Intrusion Detection Systems (IDS). These systems come in different shapes and sizes, so to say, and play an important role in keeping networks available, efficient and secure for their users.

As attacks on networks grow in complexity, size and number just as networks do, and with the prevalence of zero-day attacks, research in the area of networks and intrusion detection has proven to be critical. Intrusion detection has been shown to be an NP-hard problem [5], which means that detecting all types of intrusion in every case is computationally intractable. Thus, Intrusion Detection Systems constantly improve and employ new techniques to cope with evolving attacks.

The work presented in this thesis is aimed at improving existing Intrusion Detection Systems in terms of both functionality and performance. We present a new attack detection technique, explore its effectiveness, and try to add its functionality to an existing IDS. In the process, we gain a good understanding of the architecture of IDSes, and explore some ideas on how to improve their performance and scale to greater network speeds.

Chapter 2

Analysis of Networks

Ever since the origin of computer networks in the 1960s, the number of network protocols has only grown. As the technology underlying the network protocols (hardware) and the technology utilizing them (applications) evolve, new protocols that make the best use of these technologies are adopted. While some of these protocols are short-lived, others have proven to be persistent and are now tightly integrated into the Internet infrastructure. As attacks on networks grow more vicious, networks are in a constant attempt to improve, with the effect that ineffective protocols either improve or are replaced by new ones. As a result, security is a growing concern, and there is increasing focus on adherence to network standards. So, overall, well administered networks are expected to be a lot healthier than they used to be in the past. Still, a recent study [9] states that 34% of the flows in an enterprise's traffic were found to be non-useful, meaning that they either explicitly failed or did not elicit a response. With this as motivation, we try to understand what an enterprise's network looks like and how one can go about solving the problem of reducing unhealthy flows.

This chapter lays the foundation for this thesis by presenting an in-depth analysis of an enterprise's network trace: we first look at some state-of-the-art tools in the area of network administration and analysis in Section 2.1, and then at the data-sets that are available to researchers in this field in Section 2.2.

2.1 Network Administration Tools

The task of network analysis involves operating on large data-sets, and whenever data is large, effective summarization plays a vital role in making it easy to understand. Several tools exist that make the task of network administration and analysis easier, available as both open-source and commercial software. Three popular open-source tools that were also used for this research are discussed below.
2.1.1 Tcptrace

tcptrace can be used to generate various statistics and plots from a raw network pcap trace. Instead of displaying packet-level information like tcpdump, tcptrace aggregates the packets to a connection level. Every line of output contains information about a single connection: the source and destination addresses and ports, the duration of the connection, the bytes of data exchanged, etc. tcptrace can be run on a network dump-file (referred to as dumpfile in the following example) trivially as follows:

    tcptrace dumpfile

tcptrace understands various network dumpfile formats like tcpdump, snoop, etherpeek, etc. For the User Datagram Protocol (UDP), which is a connection-less protocol, every line in the output shows a request-reply pair. tcptrace is also compatible with other useful companion programs like tcpslice, xplot, tcpurify, etc. An example of default tcptrace output is shown in Figure 2.1.

Figure 2.1: Example tcptrace output

2.1.2 Argus

argus, just like tcptrace, is an open-source tool that processes packet data and generates summary network flow data. An example of argus output is shown in Figure 2.2. Many sites use argus to generate audits from their live networks.

Figure 2.2: Example argus output

argus beats tcptrace not only in functionality (more detailed statistics about connections) but also in its ability to handle larger volumes of traffic. It comes with a suite of tools that allow analysis of argus data (ratop), integration with a Structured Query Language (SQL) database (rasql), network traffic anonymization (ranonymize), etc. argus is in many ways similar to Cisco's Netflow. The main difference between the two is that argus is a bi-directional flow monitor: it tracks both sides of a network conversation when possible, and reports the metrics for the complete conversation in the same flow record. Netflow, in contrast, is a uni-directional flow monitor, reporting on the status and state of each half of each conversation independently.

2.1.3 Picviz

When using visualization, data is often represented as a pie chart, histogram, or 3D plot. While this can give initial insights into large data, it is a severe reduction of all the dimensions that each event carries. As an example, often the time is not part of the graph, and when it is, only one or two other dimensions are left. Picviz helps get around this limitation of conventional visualization methods. Picviz helps in understanding networks by visualizing events in multiple dimensions with the help of parallel coordinates plots. In a parallel co-ordinates plot, each property of the data is assigned a (parallel) vertical axis, and a data item is represented by a curve that intersects each of those vertical axes at the right value, as shown in Figure 2.3.

Figure 2.3: Example Picviz output. The parallel axes are Time, Protocol, Source IP address and Destination IP address; the given curves represent the values of these attributes for all packets.

Picviz can handle millions of events and its input language is easily scriptable. It follows a two-step process in which it first converts packet files into an intermediate format called the Picviz Graph Description Language (PGDL), and then generates the plot. Picviz can be used to analyze any kind of log file, including network traffic, and is able to spot patterns and outliers easily in large datasets.
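To make the connection-level aggregation these tools perform concrete, the sketch below (our own illustration, not code from tcptrace or argus) groups the packets of a pcap file into bidirectional connections using the third-party scapy library; the trace filename is hypothetical.

    # Sketch: tcptrace-style aggregation of packets into connection summaries.
    from collections import defaultdict
    from scapy.all import rdpcap, IP, TCP

    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "start": None, "end": None})

    for pkt in rdpcap("dumpfile.pcap"):
        if IP in pkt and TCP in pkt:
            ip, tcp = pkt[IP], pkt[TCP]
            # Sort the endpoints so both directions map to the same key.
            key = tuple(sorted([(ip.src, tcp.sport), (ip.dst, tcp.dport)]))
            f = flows[key]
            f["packets"] += 1
            f["bytes"] += len(pkt)
            if f["start"] is None:
                f["start"] = float(pkt.time)
            f["end"] = float(pkt.time)

    for (a, b), f in flows.items():
        print(a, "<->", b, f["packets"], "pkts", f["bytes"], "bytes",
              round(f["end"] - f["start"], 3), "s")

The real tools additionally track per-direction state, retransmissions and timing, which is what the analyses in the rest of this chapter rely on.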
2.2 Available Datasets

Input data-sets for network analysis or intrusion detection can either be a real-time feed of network traffic (called active analysis), or a file that contains previously recorded network traffic (passive analysis). For research purposes, the availability of data-sets poses a significant challenge for the following reasons:

1. There is not one representative data-set. Networks are diverse and vary depending upon size, organization, underlying hardware, administration and protocols in use. No single data-set can be representative of all networks. Thus, the fact that a new approach has been tested on a particular data-set has limited implications for the effectiveness of the approach.

2. Packet capture limitations. The speed at which traffic can be captured without incurring any packet losses is limited by the hardware that is used for capture. When traffic for a whole network is captured, it is generally done at a central location to avoid the risk of missing packets that take different paths. For high-speed networks, the amount of traffic flowing can be overwhelming even for state-of-the-art capture apparatus. These missing packets cause ambiguity in analysis.

3. Privacy issues. Although traffic capture is a common procedure among organizations that monitor their networks, release of that traffic for research purposes is rare. Captured traffic traces can provide information about an organization's network structure, services, addresses, user data, operating systems, etc., which makes their release bounded by legal agreements or simply by commercial concerns, where an organization does not want to release any internal structural information to its competitors. Some organizations take the path of anonymizing the dataset by encrypting or altogether truncating sensitive fields of the trace, thereby increasing the difficulty of certain types of analysis or making them impossible. Hence, anonymization comes at the price of a loss of information for analysis, while not anonymizing would mean giving away sensitive information, so choosing the right degree of anonymization is a tough decision. The threat of newly devised reverse-engineering approaches that were not expected at the time of anonymization causes anonymization policies to be chosen on the conservative side, which further reduces the information that can be extracted for research.

In choosing which data-set to use, researchers have two options. First, there are some existing publicly available legacy datasets that either were artificially synthesized or were anonymized, and that have deficiencies as discussed earlier. Second, they may resort to private datasets that are privacy-protected, in which case only the results of the analysis approach can be published to the community but not the details of the data set, which makes it hard to argue about the validity of the approach. This section discusses the important characteristics of the publicly available datasets and the private ones that were used for this research.
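As a concrete illustration of the anonymization trade-off discussed above, the following sketch (ours, not any organization's actual policy) truncates IPv4 addresses to a fixed prefix: host identities are hidden, but any analysis that needs exact endpoints becomes impossible.

    # Sketch: truncation-based anonymization of IPv4 addresses.
    import ipaddress

    def truncate_ip(addr, prefix_len=16):
        # Zero out the host bits, keeping only the network prefix.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        return str(net.network_address)

    print(truncate_ip("192.168.37.12"))  # -> 192.168.0.0

Choosing prefix_len is exactly the "degree of anonymization" decision: a longer prefix preserves more analytical value but leaks more structure.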
2.2.1 Publicly Available

MIT/DARPA

The first publicly released data-set, it was released in the year 1998 by the Information Systems Technology Group of the Massachusetts Institute of Technology (MIT) Lincoln Laboratory, under Defense Advanced Research Projects Agency (DARPA) and Air Force Research Laboratory sponsorship. Network traffic was collected on a simulation network. The trace was later improved upon by adding several weeks of data from the years 1999 and 2000. A portion of the trace was labelled as training data and contained no attack traffic. The remaining data contained several types of attack instances, some documented and others left for analysis. Some known issues exist in this data-set and it has been criticized in several publications; nevertheless, it appears to be the data-set of choice and was used extensively for the evaluation of several intrusion detection approaches in the 1990s and 2000s.

LBNL Enterprise Tracing Project

The Berkeley Enterprise Tracing Project was started with the aim of releasing data-sets that would facilitate network research. The objective of the project was to arrive at the right techniques and level of anonymization to make network data safe to share, while at the same time preserving the information within the data that is important for research. 11 GB of traffic headers, spanning more than 100 hours of activity from October 2004 to January 2005, were collected at the Lawrence Berkeley National Laboratory (LBNL) in the United States, and an anonymization policy was applied.

CAIDA

The goal of The Cooperative Association for Internet Data Analysis (CAIDA) is to collect several different types of data at geographically and topologically diverse locations, and to make this data available to the research community to the extent possible (again, while preserving the privacy of the individuals and organizations who donate data or network access). The data-sets are available on a per-request basis. Among its many available data-sets, the one used for this research was its 2007 Distributed Denial of Service (DDOS) Attack dataset. This contains approximately one hour of anonymized traffic traces from a DDOS attack on August 4, 2007. This type of denial-of-service attack attempts to block access to the targeted server by consuming computing resources on the server or by consuming all of the bandwidth of the network connecting the server to the Internet. The total size of the dataset is 21 GB.

2.2.2 Private Datasets

University of British Columbia

At the Computer Science department of the University of British Columbia, a virtual routing layer is set up on top of the physical routing layer, which allows one virtual router to route all traffic going to, from and between the different Virtual Local Area Networks (VLANs) that are set up in the department building. Four hours of traffic were collected using Argus on a machine with a 10 Gbps Intel 82599 Ethernet card, an Intel quad-core processor at 2.27 GHz, an Intel X58 chipset, 8 GB of RAM and a 250 GB Seagate hard disk at 7200 RPM.

Pennsylvania State University

At the Computer Science department of Pennsylvania State University, 24-hour traces were collected with tcpdump at 3 different hosts: two on a lab subnet and one on the faculty subnet. Unlike the earlier traces, these traces concern host traffic as opposed to network traffic. All of these hosts were inside the Computer Science network and hence behind a firewall. The hosts were off-the-shelf Windows and Mac computers, and no packet drops were reported by tcpdump.

2.3 Composition of Networks at Layers 2, 3, 4

In an attempt to learn the composition of networks from the LBNL trace, some basic plots that summarize the trace are presented first. Figure 2.4(a) shows a timeline plot of the number of packets: the graph shows the number of packets captured every second, over a period of 10 minutes.
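A timeline such as the one in Figure 2.4(a) can be computed in a few lines; the sketch below (ours; the filename is hypothetical and the trace is assumed to fit in memory) buckets packet timestamps by whole seconds.

    # Sketch: per-second packet counts for a timeline plot.
    from collections import Counter
    from scapy.all import rdpcap

    per_second = Counter()
    for pkt in rdpcap("lbnl_subset.pcap"):
        per_second[int(pkt.time)] += 1

    for second in sorted(per_second):
        print(second, per_second[second])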
In Figure 2.4(b) we see the use of the Internet Protocol (IP) along with some other contemporary network-layer protocols like Novell-Ethernet and DECNET, some of which are obsolete. A port histogram showing the distribution of packets among different ports is shown in Figure 2.4(c); this analysis assumes that port numbers are statically identified by packet headers. A transport-layer protocol histogram in Figure 2.4(d) shows the fraction of Transmission Control Protocol (TCP) traffic against other transport-layer protocols.

Figure 2.4: Analysis of the LBNL dataset

As the next step, the Hypertext Transfer Protocol (HTTP) is analyzed in more detail with the help of tcptrace in order to understand its connection behaviour. Six different types of connections were reported by tcptrace, and the share of those connections in the trace is shown in Figure 2.5(a) and (b). The majority of connections were 'Complete' connections, which by definition are connections that were terminated by TCP FIN messages from both sides. Second in number to complete connections were 'Reset' connections, which were terminated by one end sending a TCP Reset packet after the connection was idle for a long time. 'Unidirectional' connections are those in which no reply was received for requests, and hence communication was in one direction only. These typically arise from scanning activity, down time or misconfigurations, but could also occur due to an inability to capture packets flowing in the other direction. The other types of connections were 'Not terminated', where normal communication took place but no connection termination packets were captured; 'Reset, unidirectional', where only a TCP RST packet was seen between two hosts; and 'Complete, reset', in which both sides requested FIN for termination but RST packets were then sent.

Figure 2.5: HTTP connection types analysis of the LBNL dataset (total TCP connections = 261133). The legend defines the states as follows. Complete: connection was terminated by issuing FIN segments on both sides. Complete, reset: connection was ended by the client; both sides issue FIN, the client sends the first FIN and then issues RST twice. Reset: the client sends RST after no packets were exchanged for some time (e.g. 17 sec) or after the server sends FIN. Reset, unidirectional: only a RST packet was seen. Unidirectional: communication in one direction, with no reply received. Not terminated: normal communication, but no termination packets captured.
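The six states can be approximated from the control flags observed in each direction. The following is our own rough reconstruction of the classification logic, not tcptrace's actual source:

    # Sketch: assigning a tcptrace-style state to a connection, given
    # whether FINs were seen from both sides, whether any RST was seen,
    # and whether any reply packets were observed at all.
    def classify(fin_both, rst_seen, replies_seen):
        if fin_both:
            return "Complete, reset" if rst_seen else "Complete"
        if not replies_seen:
            return "Reset, unidirectional" if rst_seen else "Unidirectional"
        if rst_seen:
            return "Reset"
        return "Not terminated"

    print(classify(fin_both=True, rst_seen=False, replies_seen=True))   # Complete
    print(classify(fin_both=False, rst_seen=False, replies_seen=False)) # Unidirectional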
The 'Complete' connections are perfectly healthy connections from a TCP standpoint, because these connections were initiated, packets were exchanged between both sides, and the connections were then terminated successfully. Even the 'Reset' connections are deemed harmless, as the only difference is that instead of following the standard FIN termination approach, a RST packet was used; these can also be classified as healthy connections. But for 'Unidirectional' connections, we go a step further and analyze where they originate and which hosts they target in order to understand their cause.

In Figure 2.6(a), we analyze where the unidirectional connections originate. For those machines that initiate >90% unidirectional connections, we look at the top 10 according to the total number of HTTP connection attempts. We see that most of the unidirectional connections originate from one machine, but several other machines are also responsible for unidirectional connections. Figure 2.6(b) lists the top 10 servers in terms of connection volume, and shows the distribution of connection types among the incoming connections of each of these servers. In Figure 2.6(c) and Figure 2.6(d), we take a closer look at the unidirectional connections from the client's and the server's side respectively. At the client, we see that >90% of the total HTTP clients have zero unidirectional connections, while 7-8% of clients make only unidirectional connection attempts, and 1% of clients have both unidirectional and bidirectional connections. On the server side, we see that 65% of the HTTP servers suffer from all connection requests being unidirectional, meaning that none of their incoming HTTP requests were accepted. Approximately 32% of the HTTP servers accepted all incoming requests, and 5% of the HTTP servers accepted some requests while ignoring others.

Figure 2.6: Closer look at the unidirectional HTTP connections in the LBNL dataset

Finally, Figure 2.7 shows a heat-map of the unidirectional connections being directed out of different nodes in the network. On the left-hand side, we have the top 9 clients that make unidirectional connection attempts, and on the right-hand side, all the different server IP addresses to which connection attempts are made. Due to the excessive number of such destinations, only one is labelled. A line between a client and a server denotes a unidirectional connection attempt. Black lines represent <10 unidirectional connections, blue 10-50, green 50-100, yellow 100-200 and red ≥200 such unidirectional connections between the same pair of hosts.

Figure 2.7: Unidirectional connection analysis heat-map of the LBNL dataset

A similar analysis of different protocols like the Secure Internet Message Access Protocol (IMAPS) yields different results. For example, a majority of IMAPS connections were 'Not terminated' because, as a property of the protocol, connections are long-lasting.
Chapter 3

Survey of Intrusion Detection

Intrusion detection is the process of detecting actions that attempt to compromise the confidentiality, integrity or availability of a resource. When applied to networks, the resource can be one or more entities on the network: computers, users, or network devices. The concept of intrusion detection has been around since as early as 1980, and since then the idea has evolved from simply inspecting network audit data to detect simple attacks, to today's complex dedicated IDSes that identify patterns in large datasets and detect sophisticated attacks.

Intrusion detection can be performed both manually and automatically. Automatic intrusion detection is much more effective than manual detection, and a system that performs automated intrusion detection is called an IDS. An IDS can be host-based, if it is deployed on hosts and monitors host activity like system calls, network logs, etc.; or network-based, if it is deployed to cater for a network and monitors the flow of network packets to several hosts. Modern IDSes are usually a combination of these two approaches. IDSes help defend against various types of attacks such as Denial of Service (DOS), DDOS, scanning, etc. When a probable intrusion is discovered by an IDS, typical actions are logging relevant information to a file or database, generating an email alert, or generating a message to a pager or mobile phone. An IDS may go a step further and block suspicious traffic, in which case it can be called an Intrusion Prevention System (IPS).

This chapter presents a detailed classification of various Intrusion Detection Systems proposed in literature based on their unique characteristics in Section 3.1, introduces two popular open-source IDSes in Section 3.2, and then presents a brief comparison of several existing DOS/DDOS attack detection approaches in Section 3.3.

3.1 Intrusion Detection Techniques

Many different approaches to intrusion detection have been proposed, which can broadly be divided into two categories: signature detection and anomaly detection. In signature detection, network traffic is examined for preconfigured and predetermined attack patterns known as signatures. Many attacks have distinct signatures and, as good security practice, the collection of signatures used in the detection process must be constantly updated to mitigate emerging threats. In anomaly detection, a baseline is first established based on evaluations of normal traffic. The current network traffic is then compared against this baseline, and if it falls outside the baseline parameters, an alarm is triggered.

This section presents a detailed survey of anomaly-based intrusion detection techniques proposed in the period from 2000 to 2009. Intrusion detection approaches differ from each other in one or more of the following characteristics:

1. Type of input
2. Granularity of input
3. Detection technique
4. Anomalies addressed
5. Computational complexity
6. Dataset used for evaluation
7. Evaluation technique
8. User involvement

We will look at the first four in detail.

3.1.1 Type of Input

The decision of what data is input to an IDS is made by taking into account the requirements of the anomaly detection algorithm, the policy of the enterprise and the sensitivity of the data. The IDSes surveyed look at one or more of the following:

1. Volumes of flows or link data (aggregated in the form of a matrix)
2. Connection statistics or flow-level data
3. Attributes of packet headers
4. Payload

3.1.2 Granularity of Input

Input from a network can be collected at different granularities:

1. Packet level
2. Byte level
3. Flow level
4. Traffic matrix

Due to the anomalous behaviour of some network events, it is not viable to draw conclusions about maliciousness or benignity from a single network packet.
Thus it is necessary to aggregate data at higher levels, with the additional advantage of considerably reducing the size of the input to the IDS and hence making it much more practicable, as was seen in the University of British Columbia Computer Science dataset collected at flow level using Argus. A classification of IDSes based upon input granularity is shown in Table 3.1.

Table 3.1: Classification of Intrusion Detection Systems based on granularity of input. The numbers in the References column index the bibliography.

    Input Granularity    References
    Byte Level           [16]
    Packet Level         [16], [1]
    Flow Level           [2], [12], [11], [13], [25]
    Traffic Matrix       [24], [26]

3.1.3 Detection Techniques

Four different types of anomaly detection approaches were observed:

1. Volumetric analysis followed by some sort of heuristics ([2], [11], [25])
2. Principal Component Analysis (PCA) followed by outlier decision methods ([16], [12], [26])
3. Entropy estimation followed by thresholds ([13], [1])
4. Kalman filter followed by statistical methods ([24])

Each of the above techniques has its advantages and pitfalls. Volumetric analysis techniques work solely upon the number of exchanges between entities on the network, so they are in general fast, but they falsely report some common anomalies like flash crowds and fail to detect sophisticated attacks. The PCA technique does a good job of modelling normal traffic but commonly suffers from false positives, because of the anomalous nature of network traffic; PCA techniques that are oblivious to the payload may not be able to distinguish worms from point-to-multipoint transfers. Similarly, entropy-estimation and Kalman-filter based techniques suffer from false positives and false negatives. Yet, these techniques can be used in combination with each other and with signature-based detection techniques to effectively detect intrusive behavior.
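As an illustration of the PCA family, the toy sketch below (ours, not any of the surveyed systems) models a traffic matrix with its top principal components and scores each time bin by its residual energy; the injected anomaly and the choice of k are illustrative.

    # Sketch: PCA-based anomaly scoring of a traffic matrix.
    import numpy as np

    def pca_residuals(X, k=3):
        # X: time bins x links traffic matrix.
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        normal = Vt[:k]                      # the "normal" subspace
        residual = Xc - Xc @ normal.T @ normal
        return np.linalg.norm(residual, axis=1)  # anomaly score per time bin

    X = np.abs(np.random.default_rng(0).normal(100, 10, size=(200, 20)))
    X[120] += 400                            # inject a volume anomaly
    print(pca_residuals(X).argmax())         # -> 120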
It is interesting to visualize how these different techniques can be employed on different levels of input granularity, as demonstrated in the parallel co-ordinates representation in Figure 3.1.

Figure 3.1: Parallel co-ordinates representation of different technique-input combinations explored in literature

3.1.4 Anomalies Addressed

Anomalies in a network can occur due to outages, configuration changes, flash crowds or abuse. According to [16], the types of network-wide anomalies are:

1. Alpha (high-rate point-to-point byte transfer)
2. DoS, DDoS
3. Flash crowd
4. Scan
5. Worm
6. Point-to-multipoint
7. Outage
8. Ingress shift (traffic is moved from one ingress point to another)

There are other parameters that can be used to differentiate between IDS approaches. For example, based upon computational complexity, IDSes can be classified as being either offline or online. Another parameter for classification is the dataset that was used for evaluation, which can give an idea of the strengths and weaknesses of an approach as well as of the reliability of its evaluation. Approaches can also be classified by evaluation technique, of which Receiver Operating Characteristic (ROC) curves were found to be the most popular. And while some IDSes required manually labelled data-sets to learn, others were unsupervised.

Overall, anomaly-based intrusion detection seems to have been a hot topic of research in the period 2003-2007. From this study of 10 years of literature on the topic, it can be concluded that network activity can be ambiguous enough to not be easily distinguishable. Often, differentiating between benign network activity and malicious intrusion attempts requires knowledge and understanding of not only a particular packet or connection, but also of the behaviour of the concerned host and network. No one technique alone is sufficient for effectively identifying all kinds of intrusion attempts. Within the limited computational resources that are available for intrusion detection, an attempt is made to approximate decisions by employing heuristics and probabilistic models. This gives rise to false positives and negatives in any solution, which these solutions seek to minimize.
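The entropy family surveyed above can be sketched just as briefly; here the empirical entropy of destination ports in a time window is compared against a baseline, with all numbers illustrative (the code is ours, not from any surveyed paper).

    # Sketch: entropy estimation followed by a threshold.
    import math
    from collections import Counter

    def entropy(values):
        counts, total = Counter(values), len(values)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    # A port scan flattens the destination-port distribution (entropy rises);
    # a flood aimed at one service collapses it (entropy falls).
    baseline, threshold = 4.2, 1.5
    window = [80, 443, 80, 22, 80, 443, 993]
    if abs(entropy(window) - baseline) > threshold:
        print("anomalous window")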
3.2 Open-source Intrusion Detection Systems

3.2.1 Bro

Proposed in 1999 and developed at LBNL and the International Computer Science Institute, Bro is an open-source, Unix-based Network Intrusion Detection System (NIDS) that actively monitors network traffic and looks for suspicious activity. Bro detects intrusions by first parsing network traffic to extract its application-level semantics and then executing event-oriented analyzers that compare the activity with patterns deemed troublesome. Its analysis includes detection of specific attacks (including those defined in terms of events, but also those defined by signatures) and unusual activities (e.g., certain hosts connecting to certain services, or patterns of failed connection attempts). Bro provides a signature mechanism similar to Snort's, and includes Snort-compatibility support (see Section 3.2.2 below). Bro can analyze network traffic at a higher level of abstraction than just signature matching, and has powerful facilities for storing information about past activity and incorporating it into the analysis of new activities. We go into the details of Bro's architecture and working in Section 5.1.

3.2.2 Snort

Originally released in 1998, Snort is an open-source network intrusion prevention and detection system (IDS/IPS) developed by Sourcefire. Combining the benefits of signature-, protocol- and anomaly-based inspection, Snort is one of the most popular IDS/IPS solutions, with over 205,000 registered users [23]. Just like Bro, it is capable of performing real-time traffic analysis and packet logging on IP networks. It can perform protocol analysis and content searching/matching, and can be used to detect a variety of attacks and probes, such as buffer overflows, stealth port scans, CGI attacks, SMB probes, OS fingerprinting attempts, and much more. Snort has three primary uses: it can be used as a straight packet sniffer like tcpdump, as a packet logger (useful for network traffic debugging), or as a full-blown Network Intrusion Prevention System. Figure 3.2 shows a brief feature comparison between the two systems.

Figure 3.2: Bro vs. Snort feature comparison

3.3 Denial-Of-Service Detection Techniques

An IDS tries to maximize functionality by detecting as many different types of traffic anomalies as it can, and different IDSes specialize in detecting different types of anomalies. This section presents a survey of the various approaches that have been proposed in literature for the detection of one very specific type of attack: the flooding Denial of Service attack. In this type of attack, the target host is flooded with requests for resources from malicious sources, which leads to saturation of resources such as bandwidth, processor or memory at the target computer, which then cannot service legitimate requests.

Table 3.2 classifies these approaches based on the input signal taken from the traffic, and then describes the detection technique adopted, along with parameter values where specified in the original publication. In later sections, a new DOS detection technique is proposed that uses the number of lost packets in a connection as a signal for the detection of denial-of-service activity.

Table 3.2: Different DOS attack detection approaches

Signal: Fan-in, packet rate. Technique: Attack ramp-up, spectrum and spectral analysis. Check for two thresholds: (a) the number of sources that connect to the same destination within 1 sec exceeds 60, or (b) the traffic rate exceeds 40K packets/s. Reference: [10]

Signal: Packet rate, deviation score. Technique: A moving 3-hour local deviation window. The signal is divided into low-band, mid-band and high-band; attention is focused on the low band. Reference: [3]

Signal: Packet rate. Technique: Estimate the power spectral density of the signal to reveal information about periodicity. A normal TCP flow exhibits strong periodicity around its round-trip time in both flow directions; an attack flow usually does not. Reference: [19]

Signal: Packet rate. Technique: Each network device maintains a data structure, MULTOPS (MUlti-Level Tree for Online Packet Statistics), that monitors certain traffic characteristics: a tree of nodes that contains packet rate statistics for subnet prefixes at different aggregation levels. Reference: [8]

Signal: Packet rate. Technique: They show what attack bandwidth dynamics look like for different kinds of flooding DOS attacks: constant-rate attacks, pulsing attacks, increasing-rate attacks and gradual-pulse attacks. Reference: [17]

Signal: Aggregates. Technique: Identify aggregates and rate-limit them. An aggregate is a collection of packets from one or more flows that have some property in common, ranging from a source or destination address prefix to a certain application type. Reference: [14]

Signal: Packet address, timestamp. Technique: Backscatter analysis. "Sample" the backscatter of DOS activity at a random address space to get an idea of the DOS activity taking place on the Internet, by assuming that each host on the Internet is equally likely to receive backscatter traffic. Reference: [18]

Signal: Packet size. Technique: Observe the time series of the entropy of packet size. Reference: [7]
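To make one of these approaches concrete, here is a sketch of the first row of the table, using the two thresholds quoted from [10]; the code itself is ours.

    # Sketch: flag destinations whose one-second fan-in exceeds 60 distinct
    # sources, or whose packet rate exceeds 40K packets/s.
    from collections import defaultdict

    FAN_IN_LIMIT, RATE_LIMIT = 60, 40_000

    def detect(packets):
        # packets: iterable of (second, src, dst) tuples.
        sources = defaultdict(set)
        rate = defaultdict(int)
        for sec, src, dst in packets:
            sources[(sec, dst)].add(src)
            rate[(sec, dst)] += 1
        return {key for key in rate
                if len(sources[key]) > FAN_IN_LIMIT or rate[key] > RATE_LIMIT}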
Chapter 4

Proposed TCP Attacks Detection Method

This chapter presents a general solution for detecting all TCP attacks. The solution is intuitive and is based on the adherence of TCP connections to the TCP protocol. We first separate out bad TCP connections from the good ones based on a certain definition of goodness. The bad connections are then analyzed for TCP attacks, based solely on information gathered from TCP headers: source and destination addresses, ports, time, sequence numbers and flags. By applying some heuristics to this information (as discussed in later sections), we try to detect the occurrence of attacks as well as identify the sources of these attacks. This approach is different from previously explored approaches in the field because:

1. The definition of goodness is intuitive and has not been used in literature.

2. The solution detects the whole range of TCP protocol attacks instead of focusing on a subset or on a single type of attack.

We believe that with increasing emphasis on security and adherence to standards, networks today behave a lot better at the TCP protocol level compared to networks from 4-5 years ago. Using data gathered from Pennsylvania State University, it was observed that well administered networks are very clean and have strong compliance to TCP standards. We evaluate our approach by
discussing how effective it was in the detection of DOS/DDOS attacks in the datasets available to us, by reporting the resulting values of a confusion matrix laid out as in Figure 4.1. The following subsections explain the definition of goodness, the results of its application to real-world traffic, and how to detect attacks.

Figure 4.1: Layout of a confusion matrix (cells: true positives, true negatives, false positives, false negatives)
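For reference, the four confusion-matrix counts translate into the usual detection metrics as follows (a small helper of our own, not part of the proposed method):

    # Sketch: metrics derived from a confusion matrix.
    def metrics(tp, tn, fp, fn):
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0  # detection rate
        fpr = fp / (fp + tn) if fp + tn else 0.0     # false-alarm rate
        return {"precision": precision, "recall": recall, "fpr": fpr}

    print(metrics(tp=1, tn=97, fp=2, fn=0))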
4.1 Definition of Good Connection

A TCP connection is termed good if it satisfies the following two conditions:

1. There are no illegitimate TCP state transitions, and

2. There are no lost or retransmitted packets.

Several definitions of goodness can be thought of, but the one presented here is intuitive and practical. It is also a rather conservative approach, because the aim is to be able to detect all types of TCP attacks, and these two criteria cover all of them. A conservative approach should be practical in this case, because well administered networks were found to be quite clean and contained "manageable" bad traffic when this definition was applied.

Justification of the connection criteria. The range of TCP protocol attacks that we try to detect is DOS/DDOS attacks, port scanning, network scanning, SYN flood, insertion and evasion attacks [15], and reset spoof. A network trace is treated as a set of TCP connections, and every connection is treated as a pair of TCP state machines with an exchange of packets between them. Ideally, both state machines should follow all three TCP phases: connection initialization, data transfer and connection termination. At any given TCP state, depending upon the input, the deterministic TCP protocol allows transition to a fixed set of states. If a TCP connection witnesses any illegitimate state transitions, there could be a possible attack. For example, a transition from the TCP SYN state to the FIN-RECEIVED state would be illegitimate (it could be caused by an invalid combination of TCP flags in a packet), as would something simpler such as a transition from SYN-RECEIVED directly to the CLOSED state (which could be indicative of scanning activity). The TCP state diagram is depicted in Figure 4.2.

Figure 4.2: The TCP state machine

The reason for having the condition of no lost or retransmitted packets in the goodness criteria is to ensure that every packet is acknowledged and that no unnecessary acknowledgements were sent. Packets that were sent but not acknowledged could be insertion attack attempts, where a packet was accepted by an intermediate IDS but not by the end-host, and hence no ACK was sent back. Unnecessary ACKs could mean that the original packet was not accepted by the IDS but the ACK was, which is an evasion attack. Lost packets could also point towards possible denial-of-service activity.
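Condition 1 can be checked mechanically by driving a state machine with the observed flags. The sketch below is our own and abbreviates the full TCP state machine of Figure 4.2 to a handful of states; any flag for which no transition is defined marks the connection as bad.

    # Sketch: detecting illegitimate TCP state transitions.
    LEGAL = {
        ("CLOSED", "SYN"): "SYN_SENT",
        ("SYN_SENT", "SYNACK"): "ESTABLISHED",
        ("ESTABLISHED", "FIN"): "FIN_WAIT",
        ("FIN_WAIT", "FINACK"): "CLOSED",
        ("ESTABLISHED", "RST"): "CLOSED",
    }

    def follows_state_machine(events, state="CLOSED"):
        for ev in events:
            state = LEGAL.get((state, ev))
            if state is None:
                return False  # illegitimate transition
        return True

    print(follows_state_machine(["SYN", "SYNACK", "FIN", "FINACK"]))  # True
    print(follows_state_machine(["SYN", "FIN"]))  # False: never established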
4.2 Application of Definition to Traffic

It is intuitive that an ideal TCP connection will obey these two criteria of goodness. From our experience, fairly well protected networks today are very clean. For traces collected from 3 hosts on 2 different subnets within the Penn State University network, we saw that 93-94% of the network connections were good according to our definition. We attributed 4-5% of the not-good connections to imperfections in data collection, since the capture was started on machines that were already running and was ended abruptly after 24 hours. So we had to take into account the connections that started at the Connection Established state in the recorded trace, were terminated successfully and occurred within the first 5 minutes of capture, as well as the ones that were established successfully and saw no termination but occurred within the last 5 minutes of capture. Naturally, these connections were removed from the 'bad' category. Thus, 98-99% of the TCP connections of the traces collected at hosts were good.

On the other hand, the connections that violate at least one of these two conditions are termed 'bad' connections, and are potentially malicious. These connections may contain the various TCP attacks that we discussed: network scanning, port scanning and TCP reset spoof attacks (which are all deviations from the state machine), and DOS, DDOS, insertion and evasion attacks (which all lead to lost/retransmitted packets).

4.3 How to Detect Attacks in Bad Traffic

Now that there is only 1-2% of the total traffic left to analyze, a variety of analysis techniques can be applied to these bad connections. We propose the use of volumetric techniques. The connections are first grouped according to source host: for every host, we look at how many bad connections it makes, how many good connections it makes, and what sets of destination hosts and ports are contacted by that host. We apply some basic volume heuristics here to identify whether or not this host is a scanner. This scanner detection can be improved further, and other attacks like TCP reset spoof, insertion and evasion can be inferred by analyzing the addresses of the lost or retransmitted packets, but the focus of this section and the scope of the thesis is limited to the details of detecting DDOS attacks, although our approach does allow detection of all the other TCP attacks as well.

DDoS Attack Detection

Based on the hypothesis that packet loss can be used as an indicator for DDOS attacks, this section discusses an approach to detect DDOS victims and the sources associated with causing these attacks, using the MIT/DARPA dataset. In our approach, bad connections are grouped by destination host: for every destination host d, the set of hosts {s1, s2, ..., sn} that were the sources of bad connections to host d is maintained. Then, for every (si, d) pair, a count of the number of incoming bad connections that destination d received from source si, and a sum of the number of lost packets from each of these connections, are maintained.

We use lost packets as an indicator of a DDOS attack at the victim. So first, the MIT/DARPA data set was used to visualize the distribution of lost packets at destination hosts, and the results are reported as a Cumulative Distribution Function (CDF) in Figure 4.3. In the figure, d1, d2, d3, ... represent destinations, and along with each destination is a list of the sources that made connections to it. The values next to the source names indicate the number of lost packets, the total number of packets, and the number of connections between the two hosts, respectively.

Figure 4.3: CDF of packet loss incurred by destination hosts
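The per-destination bookkeeping described above amounts to the following sketch (the names are ours):

    # Sketch: per (source, destination) counts of bad connections and lost packets.
    from collections import defaultdict

    by_destination = defaultdict(
        lambda: defaultdict(lambda: {"conns": 0, "lost": 0}))

    def record_bad_connection(src, dst, lost_packets):
        entry = by_destination[dst][src]
        entry["conns"] += 1
        entry["lost"] += lost_packets

    def top_victims(n=5):
        # Destinations ranked by total packet loss, as plotted in Figure 4.3.
        totals = {d: sum(e["lost"] for e in srcs.values())
                  for d, srcs in by_destination.items()}
        return sorted(totals, key=totals.get, reverse=True)[:n]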
By looking at the CDF, we establish a threshold of the first five destinations, because these are the major contributors to the total packet loss. For these potential DDOS victims, we look at the distribution of lost packets over time and identify all the sources that shared a connection with the victim at the time of packet loss. Thus, we wish to divide the time into bins and, for each bin, look at how connections from individual sources contribute to overall packet loss at the victim. Except that, since our bins would have hard boundaries, it is necessary to smooth the values in the bins; so instead of calculating the number of packets lost per time bin, we calculate the packet loss density, which is a weighted function of the values of that bin and its neighbouring bins, using the following formula-

$$\sum_{k=0}^{N-1}\left[\frac{(b_{i-k})^2}{2^k} + \frac{(b_{i+k})^2}{2^k}\right]$$

where N is a parameter and $b_i$ represents the value of bin number i, which is nothing but the number of packets lost in the i-th time-bin ($b_i = 0 \;\forall\; i < 0$). The parameter N decides how many surrounding bins affect the value of the current bin.

The result is visualized in Figure 4.4. This plot of loss-density distribution over time for host d3 and all the sources associated with d3 allows us to visualize which sources were associated with DOS/DDOS at d3 at any particular point of time. For every potential DDOS victim, we calculate the set of these associated sources and identify the common sources out of these. The sources that appear to cause multiple destinations to lose packets are very likely to be DOS/DDOS sources. Note that it is important to consider all sources that were connected to the victim at the time of packet loss, and not only the ones that witnessed lost packets in their connections to the victim, because it might be the case that the DDOS-causing connections do not witness any packet loss while legitimate connections to the victim suffer.

Figure 4.4: Packet loss density distribution over time for victim d3 (top row). d3 shares lost packets with s1, s2, s3 and s4

A similar analysis was applied to the CAIDA dataset. The destination loss CDF is shown in Figure 4.5(a), and a packet loss over time graph for d1 was produced, as shown in Figure 4.5(b). All the packets that destination d1 lost are attributed to connections from source s1, which would happen in a DOS attack.

Figure 4.5: Denial of Service attack analysis results for the CAIDA dataset
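For illustration, the smoothing can be computed directly from the bin counts. The following Python sketch transcribes the formula as written above; the bin values in the example are made up.

def loss_density(bins, N=3):
    # bins[i] is b_i, the number of packets lost in the i-th time bin;
    # out-of-range bins contribute 0, matching b_i = 0 for i < 0.
    def b(i):
        return bins[i] if 0 <= i < len(bins) else 0

    return [sum(b(i - k) ** 2 / 2 ** k + b(i + k) ** 2 / 2 ** k
                for k in range(N))
            for i in range(len(bins))]

# A burst of losses in bin 3 is spread over the neighbouring bins, so a
# source contributing to the burst stands out even if the bin boundaries
# split its losses.
print(loss_density([0, 0, 1, 9, 2, 0, 0], N=3))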
4.4 Evaluation

The DDOS attack detected in the MIT/DARPA trace was a false positive. This trace contained several labelled attacks, but none of the attack types was a flooding DoS attack. Reportedly, a mail-bomb attack, in which an E-mail server is bombarded with large amounts of incoming email, was started, but denial of service could not be achieved. This was detected by the proposed technique, so we report it as a false positive. However, this false positive reflects well on our approach because it shows that we can provide an early warning for an attack. In Section 6.2, as we see the packet loss characteristics of an application as load increases and use packet loss as an indicator of application scalability, we can think of it as an early indicator of application failure.

The DOS attack detected in the CAIDA dataset was a true positive. The CAIDA trace contained only one such instance, and that was detected, so there were no false positives or false negatives. The lack of publicly available data sets containing instances of such attacks was an impediment to a thorough evaluation, as there were no other flooding DOS/DDOS attack instances in the traces we analyzed. Even emulation of such an attack trace using a tool like Emulab would require considerable resources and effort for it to be a close simulation.

Chapter 5

Augmenting Existing IDS With DOS detection

Denial of Service attacks are an important threat to today's networks. A recent example was seen in November 2010 when WikiLeaks was targeted just before the website was expected to release classified U.S. documents [20]. In the previous section, we established that our approach for detecting DDOS attacks was effective and successfully detected the one attack instance that was known to us. The Bro Intrusion Detection System, introduced in Section 3.2 as a commonly used Intrusion Detection System, contains several attack-detection modules that augment Bro's functionality by detecting specific types of attacks in the network traffic, but it lacks a DOS/DDOS detection module. In this section, we gain an understanding of Bro's architecture and then try to port our DDOS detection module to Bro. We find that there are some challenges in doing so, and we discuss why Bro's architecture is not optimal for adding DDOS detection functionality.

5.1 Bro Intrusion Detection System

Bro is conceptually divided into three layers, as shown in Figure 5.1.
Figure 5.1: Bro architecture

The lowest layer, ”libpcap”, is also called the filtering layer, because a tcpdump filter is applied to all traffic and only the traffic that Bro is interested in analyzing is allowed to pass through to higher layers. The resulting filtered packet stream is then handed up to the next layer, the Bro event engine.

The ”event engine” layer first performs several integrity checks to assure that the packet headers are well-formed, and if the checks succeed, then the event engine looks up the connection state associated with the tuple of the two IP addresses and the two TCP or UDP port numbers, creating new state if none already exists.
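The connection-state lookup can be pictured as a dictionary keyed by a canonical form of the address/port tuple, so that both directions of a connection resolve to the same state. The following is a simplified sketch under that assumption, not Bro's actual data structure.

conn_table = {}

def connection_state(src_ip, src_port, dst_ip, dst_port):
    # Canonicalize the endpoints so that packets travelling in either
    # direction of the same connection resolve to one state record.
    key = tuple(sorted([(src_ip, src_port), (dst_ip, dst_port)]))
    if key not in conn_table:
        conn_table[key] = {"state": "NEW", "events": []}  # create new state
    return conn_table[key]

a = connection_state("10.0.0.5", 51234, "10.0.0.9", 80)
b = connection_state("10.0.0.9", 80, "10.0.0.5", 51234)  # reverse direction
print(a is b)  # True: both directions share one connection record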
Bro does not rely on static port numbers in the packet header to identify which protocol is within the payload, but instead runs Dynamic Protocol Detection (DPD) by inspecting the payload of successive packets of a connection until a definitive protocol is identified. Based on this protocol and the incoming packets, several events are raised, which are queued and passed on to the next upper layer, the ”event handler” layer. Events might look like ”SYN Packet Received” or ”Connection Established”, for example.

The ”Policy Script Interpreter” or ”event handler” layer receives a set of events, and for each event, it runs an associated set of event handlers. These event handlers depend upon a site's security policy and implement attack detection logic. As an example, event handlers could be ”http connection attempt”, or ”check scan”, or ”check synflood”, etc. Some data structures maintain state across these three layers and are used to look up and update connection-related information.

The lower-most layers process the greatest volume of data, and hence must limit the work performed to a minimum. As we go higher up through the layers, the data stream diminishes (as packets are converted to events, and events to alerts), allowing for more processing per data item. This basic design reflects the need to conserve processing as much as possible, in order to meet the goals of monitoring high-speed and large-volume traffic flows without dropping packets.

Another key facet of Bro's design is the clear distinction between the generation of events versus what to do in response to the events. With this separation of mechanism from policy, Bro achieves a highly modular architecture that allows flexibility in functionality by simply adding or removing modules, and leaves many options for optimization in order to handle high-speed traffic streams.

5.2 Augmenting Bro with New Detection Mechanism

In its default configuration, Bro's Policy Script Layer contains several different attack detection modules that each analyze different portions of the incoming network traffic. Our DDOS detection module would be added as another attack module; it would be interested in all TCP activity in the trace, right from receiving every packet and aggregating packets to the connection level, to tracking the bad connections and lost packet counts.
It would involve writing a DDOS event handler in the Policy Script layer that performs the attack detection logic, writing corresponding events at the Event Engine layer that are raised when packets of a connection are lost, and manipulating the data structures to track lost packets. Finally, the filter layer would have to be modified to allow all TCP packets to pass through.

5.3 A Closer Look at the Bro Filter Stage

As discussed earlier, the DDOS detection module would need Bro to accept every TCP packet. Bro's Policy Script Layer contains several different attack detection modules that each specify a specialized filter to the Bro filtering layer, to allow the traffic that each is concerned with to pass through. The Bro filtering layer calculates a logical OR of all these specialized filters and applies an optimization step to generate one universal filter. This universal filter is applied to the traffic at the filtering layer. The default Bro filter expression is-

((((((((((port telnet or tcp port 513) or (tcp dst port 80 or tcp dst port 8080 or tcp dst port 8000)) or (port 111)) or (tcp port smtp or tcp port 587)) or (port 6667)) or (tcp src port 80 or tcp src port 8080 or tcp src port 8000)) or (port ftp)) or (tcp[13] & 7 != 0)) or (port 6666)) or ((ip[6:2] & 0x3fff != 0) and tcp)) or (udp port 69)

If a DDOS module were to be added, the filter expression would be logically ORed with a (tcp) term, and the net expression would allow all TCP traffic to pass through. With this in place, the filter layer is rendered ineffective in doing its job of reducing the data forwarded to upper layers, and this is where a layered architecture has its pitfalls, as now all analyzers in the Event Engine layer receive all TCP traffic.

Another architectural complication observed in the Bro IDS was related to support for its DPD feature. This feature is needed because statically reading port numbers from packet headers is not a definite indicator of the protocol present in the payload, because of the common use of unconventional port numbers and port forwarding. Bro employs DPD where it looks at successive packets' payloads and applies heuristics to identify the protocol in question. Meanwhile, it spawns several analyzers (that generate events) for all the prospective protocols, and as and when some of these choices get eliminated, kills the analyzers for those. This processing takes place at the Event Engine layer, only after the Filtering Layer has already filtered packets based upon static port numbers. So the DPD feature is useful only if the filtering layer is disabled, and the filtering layer step makes sense only if static ports are known and DPD is disabled. In the default Bro configuration both are enabled, which is fundamentally inconsistent.
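The effect described in this section can be seen with a toy composition of the module filters. This is an illustration of the OR semantics only, not Bro's actual optimization step.

# Toy illustration of the universal-filter construction: a logical OR of the
# per-module filters. Once a bare 'tcp' term is ORed in, every TCP-specific
# term becomes redundant and the filter stage stops winnowing TCP traffic.
module_filters = [
    "port telnet or tcp port 513",
    "tcp dst port 80 or tcp dst port 8080 or tcp dst port 8000",
    "tcp[13] & 7 != 0",
    "udp port 69",
]

def universal_filter(filters):
    return " or ".join(f"({f})" for f in filters)

print(universal_filter(module_filters))
print(universal_filter(module_filters + ["tcp"]))  # the DDOS module's term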
Chapter 6

Improving the Performance of Existing IDSes

Threats are real, and so attack detection time is crucial. The risk of damage having already been inflicted before a threat has been detected makes intrusion detection in live network monitoring desirable. For live or active intrusion detection, it is necessary that an IDS scale to variations in packet speeds. This makes performance of IDSes an important concern, and this section presents two approaches that make IDSes (specifically Bro) faster and allow them to scale better.

Currently, Bro follows a single-threaded model where a packet enters, passes through its three stages of operation- the filter stage, event analysis and event handling- and a decision on whether or not that packet is to be allowed is taken. All this processing is done by a single thread of execution, and once the processing of one packet is finished, Bro moves on to the next arriving packet. A problem here is that if the incoming packet rate is such that the time between the arrival of two packets is less than the processing time for the first packet, Bro will start to lag and be unable to cope with the network traffic speed. In such situations, Bro will lose packets and as a result fail to protect the network against intrusion attempts. Thus, it is important to speed up Bro and to make it scale to be able to handle higher packet rates.

This is a well-known problem, and the most popular solution adopted by the Bro community is the Bro Cluster. Instead of running a standalone instance of Bro, a Bro Cluster is set up that helps Bro scale by splitting the work among a number of back-end nodes and aggregating the analysis results. However, a cluster-based solution faces some challenges, mainly in the setup and maintenance cost, and does not make for a practical solution, especially for small organizations.

This section approaches the goal of higher performance for intrusion detection at two different levels. First, we try to make use of cloud computing to effectively distribute IDS processing between different nodes to make the overall intrusion detection process faster and thus able to handle higher packet rates. And secondly, we approach the problem at an Operating Systems level, where we make use of software programs and modern network cards to effectively distribute intrusion detection between different cores of a single machine.

6.1 Using Cloud Computing

Extension of the Bro Cluster to a cloud would be a valuable addition to Bro because of the dynamic scaling capability of the cloud architecture. In an enterprise where the amount of traffic flowing across the boundary could vary a great deal, a cloud architecture also provides the flexibility that would be needed in such a scenario. Moreover, clouds have proven to be an economical and practical solution for enterprise environments.

The existing Bro Cluster

The IDS cluster is designed in terms of four main components (see Figure 6.1). Front-end nodes distribute the traffic to a set of Backend nodes. The Backends run Bro instances that perform the traffic analysis and exchange state via communication Proxies. Finally, a central Manager node provides the cluster's centralized user interface.

Figure 6.1: IDS cluster

A cluster-based approach serves several benefits-

1. Transparency. The system conveys to the operator the impression of interacting with only a single Bro, producing results as a single Bro would if it could cope with the total load.

2. Scalability. Since network traffic volumes grow with time, we can easily
add more nodes to the cluster to accommodate an increased load.

3. Commodity hardware. In general, we leverage the enormous flexibility and economies-of-scale that operations on commodity hardware can bring over the use of custom hardware (e.g., ASICs or FPGAs). However, for monitoring very high-speed links, we may need to resort to specialized hardware for the Frontends, as these need to process packets at full line-rate.

4. Ease of Use. The operator interacts with the system using a single host as the primary interface, both for accessing aggregated analysis results and for tuning the system's configuration.

However, clusters face some challenges in cost and maintenance, and sometimes do not make for a practical solution, especially for small organizations. A cloud-computing-based solution is more appealing, and has the following advantages over the cluster-

1. Scalability. As compared to a cluster, the cloud is more scalable and easier to manage. There are endless amounts of storage and computing capacity. Service costs are based on consumption, and customers pay for only what they use.

2. Easy Implementation. Without the need to purchase hardware, a company can get its cloud-computing arrangement off the ground in record time - and for a fraction of the cost of an on-premise solution.

3. Manageability. By placing storage and server needs in the hands of an outsourcer, a company essentially shifts the burden placed on its in-house IT team to a third-party provider. The result is that in-house IT departments can focus on business-critical tasks without having to incur additional costs in manpower and training.

4. Quality of Service. Network outages can send an organization's IT department scrambling for answers. But in the case of cloud computing, it is up to a company's selected vendor to offer 24/7 customer support and an immediate response to emergency situations. That's not to suggest that outages don't occur. For example, in February 2008, Amazon.com's S3 cloud-computing service experienced a brief outage that affected a number of companies. Service was restored within three hours.

Bro on the Cloud

There are many ways in which an IDS can be thought of in a cloud setting. It could mean that the whole IDS runs on one or more machines of the cloud, and a copy of all of an organization's network traffic is sent to this IDS, which utilizes the cloud's superior processing capabilities to perform its operation.
This approach is useful especially when the organization's network is running on the same cloud; otherwise it is costly to transfer the great volume of traffic flowing to an enterprise's network to the cloud, and there could be privacy issues with such a transfer. Another scenario could be to perform basic processing on-site at the organization and then send a reduced data stream to the cloud; for example, in the case of Bro, carrying out the filtering and analysis steps locally but performing the attack detection step on the cloud network by sending information about raised events in batches. The problem with such a solution is that it still involves the cost of transferring data to and from the cloud, raises privacy concerns, and also involves modifying the source code of the original IDS.

In this work, we assume that the whole IDS is running on the cloud. The Amazon EC2 cloud already has a Snort Amazon Machine Instance (AMI) available for customers [22]. This instance can sit in front of the cloud network to perform intrusion detection. We go a step further and, instead of installing stand-alone Bro on a cloud image, configure the Bro Cluster on the Amazon EC2 cloud. An IDS cloud implementation will help realize all the aforementioned benefits of cloud computing, but at the same time also invite some challenges. This section presents the challenges that are faced by any cloud-based Intrusion Detection System.

Challenges in the Cloud

One inconvenience in the cloud setting is that the machines on a cloud do not have fixed IP addresses or domain names like they do in a cluster setting. Every time an instance starts, its updated IP address needs to be registered with the IDS Cloud Manager. But the main challenge is the implementation of the front-end. While in a cluster, the job of the front-end (which is to load-balance across the worker nodes in the cluster) can be effectively performed by installing specialized hardware, this option does not exist for the cloud. Let's take a look at this challenge more closely.

In the Bro cluster, the front-end node load-balances the incoming traffic by distributing it among the back-end worker nodes in a way that minimizes inter-node communication, to maximize performance. Typically, all packets belonging to a particular connection get sent to the same worker node. This forwarding of packets is achieved at the front-end node by simply rewriting the source and destination MAC address fields of the packet and sending it out on the cluster's network, while keeping the remaining fields of the packet intact so as to retain all packet information for intrusion detection. The front-end operation can be performed either by using dedicated hardware like a CISCO 3750 E switch or P10's Field-Programmable Gate Array (FPGA), or in software by using a software-based load balancer like the Click Modular Router [4].

In the Bro Cloud, MAC address rewriting is not enough, simply because nodes in the cloud may not be within the same subnet. When the modified packet reaches a router, it is dropped because of the destination IP address and destination MAC address mismatch. Hence, one has to either encapsulate the original packet in a new packet that contains the source IP and MAC addresses of the front-end and the destination IP and MAC addresses of the worker, or one has the option of configuring a Virtual Private Cloud, in which the two nodes would act as if they were on the same subnet.
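As an illustration of the cluster front-end's dispatch rule described above (connection-consistent hashing plus MAC rewriting), here is a hedged sketch using Scapy. The interface names and MAC addresses are hypothetical placeholders, and a production front-end would not be written in Python; the point is only the per-connection mapping.

from zlib import crc32
from scapy.all import Ether, IP, TCP, sniff, sendp

WORKER_MACS = ["02:00:00:00:00:01", "02:00:00:00:00:02"]  # hypothetical
FRONTEND_MAC = "02:00:00:00:00:aa"                        # hypothetical

def worker_index(pkt):
    ip, tcp = pkt[IP], pkt[TCP]
    # Sort the endpoints so both directions of a connection hash identically,
    # keeping all of a connection's packets on one worker node.
    flow = tuple(sorted([(ip.src, tcp.sport), (ip.dst, tcp.dport)]))
    return crc32(repr(flow).encode()) % len(WORKER_MACS)

def dispatch(pkt):
    if IP in pkt and TCP in pkt:
        # Only the Ethernet addresses are rewritten; everything above layer 2
        # stays intact so the workers see the original packet.
        pkt[Ether].src = FRONTEND_MAC
        pkt[Ether].dst = WORKER_MACS[worker_index(pkt)]
        sendp(pkt, iface="eth3", verbose=False)

sniff(iface="eth0", prn=dispatch, store=False)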
Either way, there is encapsulation overhead involved, which is detrimental to the performance of the front-end. An optimal solution would still be for the cloud provider to provide specialized hardware support for the purposes of load-balancing.

In an experiment, a Bro cluster was successfully configured in the Amazon EC2 cloud. This involved fixing the IP address challenge mentioned above, as well as configuring seamless Secure Shell (SSH) access between all nodes using EC2's authentication method. The front-end, which Bro does not consider a part of the Bro Cluster, still remained a challenge. The output of the cluster's Manager for the cloud-based Bro deployment is shown in Figure 6.2.

Figure 6.2: Bro cloud configuration

6.2 At the Operating-System Level

As we saw in Section 6.1, cloud implementations of IDSes face a critical challenge in the implementation of an effective load-balancer and don't provide a practical solution over the existing cluster idea. Instead, another initiative with the goal of scaling Bro's performance has been underway- the Bro multi-core project. In 2007, an architecture was proposed [21] that involved exploiting multi-processor functionality for scaling Bro. The idea here was fairly simple- since all packets across a single connection share state, they should all be processed by the same core in order to optimize memory access at the local caches. Similarly, results from analysis at different cores should be aggregated in a memory-efficient way. The job of mapping packets to the right core is done by a hardware device sitting in between the processor and the network wire, called the Active Network Interface (ANI). Since 2007, the multi-core Bro code has remained under development and the ANI device never saw production.

This section describes an attempt to substitute the ANI with software by making use of functionality that exists on many modern network cards- multi-queues. The software components that allow us to take advantage of this hardware functionality are the existing TNAPI and PF RING libraries [6], which have already been demonstrated to drastically improve the performance of several multi-threading applications.
Thus, the goal of this section is to apply TNAPI and PF RING to Bro in order for it to scale.

Multi-threaded Bro

Due to the combination of the rising sophistication of attacks, which requires more complex analysis to detect them, and the relentless growth in the volume of network traffic that we must analyze, it is becoming increasingly difficult for uni-processor systems to implement effective systems for preventing network attacks, given their failure in recent years to sustain the exponential performance gains that CPUs enjoyed for so many years. Thus, taking advantage of multi-core processors has become a necessity. However, this requires an in-depth approach, or otherwise it can prove to be counter-productive.

In [21], the authors extend the simple single-threaded Bro architecture described in Figure 5.1 to frame an architecture customized for parallel execution of network attack analysis in Bro. This architecture is shown in Figure 6.3. At the lowest layer of the architecture is an Active Network Interface, a custom device based on an inexpensive FPGA platform that does the job of dispatching packets to the analysis components executing in different threads. The ANI derives its dispatch decisions from a large connection table indexed by the packet header five-tuple. The role of the ANI becomes crucial in that, in order to optimize memory access and thus performance, the ANI should forward all packets belonging to a connection to the same core. An important point is that, unlike for the rest of the architecture, the authors made the presumption that the ANI can be custom hardware, specialized for the task. However, although this hardware may be easily implementable using FPGAs [21] and not very costly, it does not make for a practical solution since it was never sent to production. Instead, the ANI can be substituted by leveraging some of the existing hardware features on modern network cards to provide the same functionality, with the required performance gains, as described ahead in this section.

Figure 6.3: Architecture of multi-core Bro

Existing Network Architecture

As can be seen in Figure 6.4, contemporary network architecture is centered on a network adapter that tries to improve network performance by splitting a single RX Queue into several queues, each mapped to a processor core. The idea is to balance the load, both in terms of packets and interrupts, across all cores to improve the overall performance. However, device drivers are unable to preserve
this design up to the application, as they merge all queues into one (as used to happen with legacy adapters featuring only one queue), because in most operating systems packets are fetched by a single thread using packet polling techniques. This becomes the first bottleneck in the architecture.

On the memory management front, in most operating systems, captured packets are moved to user land via mapped memory. As it turns out, for most network monitoring or intrusion detection applications, these packets don't need to make the journey from the network driver to the user land through the kernel network stack. Thus, incoming packets need not be copied into a memory area that holds the packet until it gets copied to user land via memory map, thus avoiding unnecessary kernel memory allocation and de-allocation. Instead, zero-copy could start directly at the driver layer and not just at the networking layer. This would get rid of our second bottleneck, which is haunted by the problem of efficient memory management for multi-core systems.
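The queue-per-thread consumption model that the next subsection's changes enable can be sketched as follows. Here open_queue is a hypothetical stand-in for a capture handle bound to a single RX queue (PF RING exposes these as virtual devices such as eth1@2); it is not a real library call, and the stubs exist only to make the sketch self-contained.

import threading

NUM_QUEUES = 8

def open_queue(device):
    # Hypothetical stand-in: a real handle would block on the NIC's RX queue.
    # Here it simply yields nothing so the sketch runs as-is.
    return iter(())

def analyze(packet):
    pass  # placeholder for per-packet IDS analysis

def consume(queue_id):
    ring = open_queue(f"eth1@{queue_id}")  # one handle per RX queue
    # Ideally this thread is also pinned to the core that services the
    # queue's interrupts, so packet data stays warm in that core's cache.
    for packet in ring:
        analyze(packet)  # this thread never touches another queue's packets

threads = [threading.Thread(target=consume, args=(q,)) for q in range(NUM_QUEUES)]
for t in threads:
    t.start()
for t in threads:
    t.join()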
Figure 6.4: Conventional network architecture

Proposed Changes- TNAPI and PF RING

The two bottlenecks discussed in the previous section are treated using two existing concepts- TNAPI (Threaded New Application Programming Interface) and PF RING. In a nutshell, TNAPI polls all available RX Queues simultaneously via kernel threads (one kernel thread per queue) and passes the packets to multiple PF RINGs. Depending on the PF RING configuration, these packets can then follow the standard journey into the kernel or be pushed directly to a PF RING-aware application. This can be visualized as shown in Figure 6.5.

TNAPI is implemented directly in the NIC driver, as it is necessary to change the mechanism that notifies the kernel when an incoming packet has been received. The TNAPI driver is responsible for spawning one thread per RX queue per port, and for binding it to the same core where interrupts for that queue are received.

PF RING has been adapted to read the queue identifier from TNAPI before upper layers invalidate its value. In this way, the authors implemented inside PF RING the concept of a ”virtual network adapter”, a feature provided only by high-end capture accelerators. Monitoring applications can either bind to a physical device (e.g. eth1) for receiving packets from all RX queues, or to a virtual device (e.g. eth1@2) for consuming packets from a specific queue. The latter solution allows applications to be easily split into several independent threads of execution, each receiving and analyzing a portion of the traffic.
Figure 6.5: Proposed network architecture

As both the NIC RX ring and the PF RING RX ring are allocated statically, TNAPI does not copy packets into socket buffers (skb) as vanilla Linux does; instead, it copies the packet payload from the NIC RX ring to the PF RING RX ring. This means that for each incoming packet, costly skb memory allocation/deallocation, as well as memory mapping across the PCI bus, is avoided.

This solution involves changes at the driver level (TNAPI) and the kernel level (PF RING), as well as at the user-application level. Thus, Bro will have to be modified to use the RX Queues exposed by PF RING in an efficient manner.

Results

Several experiments were performed to test the functionality and performance of the existing Bro IDS with and without TNAPI and PF RING. For the experiments, two machines of similar configuration were used- Intel Xeon 8-core processors with 32 GB RAM running Ubuntu Linux, with 10Gbps Intel 82598EB Network Interface Cards. To measure performance, one machine was used as an experiment machine to test out different software configurations, while the other one was used to launch network traffic at varying speeds using the tool tcpreplay.

As a first experiment, TNAPI and PF RING were installed and configured. PF RING was set to the mode of operation where packets are not forwarded to the kernel to pass through the network stack but are instead discarded immediately.
This experiment was a success, and the result was verified by trying to (unsuccessfully) SSH into the machine running PF RING, which was hijacking the packets.

For the next experiment, a simple user application called pfcount was deployed on top of PF RING, which simply counted the number of packets reaching the different threads that were polling different RX Queues. Figure 6.6 shows the output of this application, which proves that all RX Queues were being used and that the PF RING-exposed queues were successfully short-circuited to the NIC's actual RX Queues.

Figure 6.6: Output of the PF RING-aware user application pfcount

Flow Director is a hardware feature provided by several network cards that routes all packets belonging to the same connection to the same RX Queue. This is desirable from Bro's point of view because packets of a connection share state. So the next experiment involved making use of the Flow Director feature provided by the network card, but the experiment was unsuccessful: the TNAPI driver for that particular network card did not support this feature.

For the next experiment, the performance gain achieved by using TNAPI alone was established. For this, the number of packets successfully received by the kernel was monitored using /proc/net/dev in two cases- where the TNAPI driver was either present or absent.
As can be seen in Table 6.1 and Table 6.2, a performance gain of 65 percent was observed with TNAPI, which is attributed to the simple fact that all RX Queues are getting polled simultaneously by the kernel.

Table 6.1: Number of packets successfully received by kernel without TNAPI support

Trial no.    Received Packets    Dropped Packets
1            634,929             734,188
2            668,941             700,029

Table 6.2: Number of packets successfully received by kernel with TNAPI support

Trial no.    Received Packets    Dropped Packets
1            1,095,874           273,171
2            1,069,673           299,459

In another experiment, Bro's performance was tested between two versions- Bro and Bro_new, where Bro is the conventional Bro installation while Bro_new means a Bro instance running with TNAPI and PF RING enabled. Between the two configurations, it was noticed that Bro outperformed Bro_new; although a performance degradation, this was considered to be a positive result because the phenomenon is in accordance with the PF RING concept. The Bro installation runs on a single user thread, whereas packets from different RX Queues arrive on different user threads. Owing to the Non-Uniform Memory Access (NUMA) architecture of multi-processors, for the packets that do not arrive on the thread running Bro, memory-access time is substantially worse in a TNAPI setting than in a non-TNAPI setting where packets arrive at the same user thread. Thus, a performance gain is not possible until the user application, in this case Bro, has been adapted to run multiple user-level threads that handle packets incoming on different RX Queues.

In the final experiment, in an attempt to simulate multi-threaded Bro and truly evaluate the performance of our approach, we ran Bro in three different configurations- Bro, Bro_Multiple and Bro_RING. In Figure 6.7, Bro is the default Bro configuration, wherein one instance of single-threaded Bro is running on the 8-core test machine, and the software modules PF RING and TNAPI are disabled. For both Bro_Multiple and Bro_RING, we have one single-threaded Bro instance running per core, so a total of 8 Bro instances, each on a different core, the difference between the two being that PF RING and TNAPI are disabled for Bro_Multiple and enabled for Bro_RING. We believe that, although this setup does not exactly simulate multi-threaded Bro (since there should be a process that aggregates attack detection across all these individual Bro instances), the Bro_RING configuration should at least provide us an upper bound on the performance.

Figure 6.7: Bro, Bro_Multiple and Bro_RING for varying packet rates: The x-axis represents the incoming packet rate on the network wire, and the y-axis represents the packet rate received by Bro.

The plot in Figure 6.7 demonstrates the results obtained. It is clear that Bro_RING is closer to the ideal behavior than Bro. Due to the efficient transfer of packets to a particular Bro instance with the help of TNAPI and PF RING, Bro_RING reduces the processing time for each packet and allows Bro to handle higher packet rates. The poor performance of Bro_Multiple is attributed to single-threaded polling on the RX queues, and to the synchronization required for 8 Bro user threads that compete for each incoming packet.
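The kernel-side counters behind Tables 6.1 and 6.2 come from /proc/net/dev, whose per-interface receive columns include packet and drop counts. A minimal reader is sketched below; the interface name is an assumption.

def rx_counters(iface="eth2"):
    # /proc/net/dev receive columns: bytes packets errs drop fifo frame
    # compressed multicast (followed by the transmit columns).
    with open("/proc/net/dev") as f:
        for line in f:
            if ":" in line and line.split(":")[0].strip() == iface:
                fields = line.split(":")[1].split()
                return int(fields[1]), int(fields[3])  # (packets, drops)
    raise ValueError(f"interface {iface} not found")

received, dropped = rx_counters("eth2")
print(f"received={received} dropped={dropped}")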
Conclusion

Since uni-processor systems have failed to maintain the trend of exponential speedup, it has become necessary to make the shift to multi-processor architectures. To adapt Bro to multi-core architectures, the Bro multi-threading project is already underway, but it proposes an FPGA hardware device, the Active Network Interface, that is not readily available. This project eliminates the need for the ANI device by leveraging some of the hardware features that modern network cards provide, along with existing tools at the driver and kernel level- TNAPI and PF RING- and argues for an overall speed-up because of a more parallelized architecture, better suited for applications such as network monitoring and intrusion detection where the packets' path through the kernel's network stack need not be taken.

In this section, it was established that TNAPI and PF RING are well suited for Bro, and thus the ANI hardware can be successfully substituted with a software solution. The results demonstrate this speed-up. Since the multi-threaded Bro source code is still under construction, it could not be used to test for the exact gain in performance, but we were able to conduct an experiment to simulate multi-threaded Bro and evaluate an upper bound on the performance.

Chapter 7

Conclusion

In a detailed analysis of a network trace from the past, we understood the composition and connection behaviour in a network, with emphasis on the TCP protocol. From this exercise and a survey of the intrusion detection techniques proposed over the past decade, we devised a novel and general technique that detects all types of TCP attacks.

We delve deeper into the details of detecting a particular class of TCP attacks- bandwidth-based Denial of Service (DoS) attacks. Using packet loss as an indicator of Denial of Service, we devised an algorithm that detects the occurrence of Denial of Service attacks. After evaluating the effectiveness of this approach, we tried to add its functionality to an existing Intrusion Detection System (IDS), but faced some challenges in doing so which severely limit the performance of the IDS. After taking a closer look at the IDS architecture and exploring why its architecture is not suited to the addition of a DoS detection module, we also suggest two other ideas for improving the performance of existing IDSes- one by using cloud computing and the other by optimizing for multiple cores. A cluster-based implementation of the Bro IDS was successfully deployed in a cloud setting, but we concluded that IDS performance in cloud computing is restricted by the performance of the load-balancer, which poses a significant implementation challenge in the cloud environment. The other idea, making use of multi-queues in modern network cards to fully utilize multi-core functionality, proved to improve the scalability of the Bro IDS significantly.

Overall, this thesis has explored several ideas that contribute to the functionality and performance of existing Intrusion Detection Systems.

Bibliography

[1] Y. G. Andrew. Detecting anomalies in network traffic using maximum entropy estimation, 2005. → pages 18

[2] P. Barford and D. Plonka. Characteristics of network traffic flow anomalies. In Proceedings of the ACM SIGCOMM Internet Measurement Workshop, 2001. → pages 18

[3] P. Barford, J. Kline, D. Plonka, and A. Ron. A signal analysis of network traffic anomalies. In Internet Measurement Workshop, pages 71–82, 2002. → pages 23

[4] Bro. Cluster frontends. http://www.bro-ids.org/wiki/index.php/ClusterFrontends. → pages 43

[5] F. Cohen. Computer viruses: Theory and experiments, 1984. → pages 2

[6] L. Deri and F. Fusco.
Exploiting commodity multicore systems for network traffic analysis, July 2009. [Online]. Available: http://ethereal.ntop.org/multicorepacketcapture.pdf. → pages 44

[7] P. Du and A. S. Detecting DoS attacks using packet size distribution. In Bionetics, 2007. → pages 23

[8] T. M. Gil and M. Poletto. MULTOPS: A data-structure for bandwidth attack detection. In Proceedings of the 10th Usenix Security Symposium, pages 23–38, 2001. → pages 23

[9] S. Guha, J. Chandrashekar, N. Taft, and K. Papagiannaki. How healthy are today's enterprise networks? → pages 3

[10] A. Hussain, J. Heidemann, and C. Papadopoulos. A framework for classifying denial of service attacks. In Proceedings of ACM SIGCOMM, pages 99–110, 2003. → pages 23

[11] T. Karagiannis, K. Papagiannaki, and M. Faloutsos. BLINC: Multilevel traffic classification in the dark. In Proceedings of ACM SIGCOMM, pages 229–240, 2005. → pages 18

[12] A. Lakhina, M. Crovella, and C. Diot. Diagnosing network-wide traffic anomalies. In ACM SIGCOMM, pages 219–230, 2004. → pages 18

[13] A. Lakhina, M. Crovella, and C. Diot. Mining anomalies using traffic feature distributions. In ACM SIGCOMM, pages 217–228, 2005. → pages 18

[14] R. Mahajan, S. M. Bellovin, S. Floyd, J. Ioannidis, V. Paxson, and S. Shenker. Controlling high bandwidth aggregates in the network. ACM Computer Communication Review, 32:62–73, 2002. → pages 23

[15] G. R. Malan, D. Watson, F. Jahanian, and P. Howell. Transport and application protocol scrubbing. In Proceedings of INFOCOM 2000, pages 1381–1390. IEEE, 2000. → pages 25

[16] A. Lakhina, M. Crovella, and C. Diot. Characterization of network-wide anomalies in traffic flows. In ACM/SIGCOMM IMC, pages 201–206, 2004. → pages 18, 19

[17] J. Mirkovic, G. Prier, and P. Reiher. Attacking DDoS at the source, 2002. → pages 23

[18] D. Moore, G. Voelker, and S. Savage. Inferring internet denial-of-service activity. In Proceedings of the 10th Usenix Security Symposium, pages 9–22, 2001. → pages 23

[19] C.-M. Cheng, H. T. Kung, and K.-S. Tan. Use of spectral analysis in defense against DoS attacks. In Proceedings of the IEEE GLOBECOM, pages 2143–2148, 2002. → pages 23

[20] S. Musil. WikiLeaks: We are under denial-of-service attack. http://news.cnet.com/8301-1023_3-20023932-93.html. → pages 33

[21] V. Paxson, N. Weaver, and R. Sommer. An architecture for exploiting multi-core processors to parallelize network intrusion prevention. In Proceedings of the IEEE Sarnoff Symposium, 2007. → pages 44, 45

[22] Snort. Snort now available on the Amazon cloud. http://www.snort.org/news/2010/07/07/snort-now-available-on-the-amazon-cloud/. → pages 42

[23] Snort. Best IDS/IPS solution. http://www.scmagazineus.com/best-idsips-solution/article/130871/. → pages 21

[24] A. Soule, K. Salamatian, and N. Taft. Combining filtering and statistical methods for anomaly detection. In Proceedings of IMC, 2005. → pages 18

[25] D. Whyte, P. van Oorschot, and E. Kranakis. Tracking darkports for network defense. In ACSAC, 2007. → pages 18

[26] Y. Zhang, Z. Ge, A. Greenberg, and M. Roughan. Network anomography. In IMC, 2005. → pages 18
