PERG-Rx: An FPGA-based Pattern-Matching Engine with Limited Regular Expression Support for Large Pattern Databases

by

Johnny Tsung Lin Ho
B.A.Sc., University of British Columbia, 2006

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Applied Science in The Faculty of Applied Science (Electrical and Computer Engineering)

The University of British Columbia (Vancouver)
September 2009
© Johnny Tsung Lin Ho 2009

Abstract

Pattern-matching is a fundamental technique found in applications such as network intrusion detection systems (NIDS) and antivirus software. By comparing an input data stream against a set of predefined byte-string patterns, malicious activities can be detected. Due to increasing pattern database sizes, hardware engines are frequently used to accelerate pattern-matching because software-based solutions can no longer keep up. This thesis presents PERG, an FPGA-based pattern-matching engine targeted at the large pattern database found in Clam AntiVirus (ClamAV). Previous hardware pattern-matching engines have targeted the Snort database used in NIDS. PERG is the first hardware pattern-matching engine designed for the ClamAV pattern database, which is several times larger than Snort in both the number of patterns and the range of pattern lengths. Among hash-based pattern-matching hardware, PERG is the first to integrate limited regular-expression support. In ClamAV, this feature is needed for polymorphic virus detection. Finally, among hash-based hardware pattern-matching engines, PERG is the first to implement Bloomier filters. With Bloomier filters, the process of exact matching to verify a false positive in PERG is much simpler and faster than in other hash-based pattern-matching engines, which use traditional Bloom filters. To reduce hardware resources, PERG uses a pattern compiler to transform the pattern database into an optimized hardware implementation. The compiler packs as many pattern strings as possible into each Bloomier filter unit, a hardware module, by consolidating several different pattern lengths into the same filter unit. To further save hardware resources, limited regular-expression support for wildcard patterns is provided using a lossy scheme, which trades off a slight increase in false-positive probability for significant savings in hardware resources. The false-negative probability in PERG remains zero. Using the ClamAV antivirus database, PERG fits 84,387 patterns containing over 8,645,488 characters into a single modest FPGA chip with a small (4 MB) off-chip memory. It uses just 26 filter units, resulting in roughly 24.7x improved density (characters per memory bit) compared to the next-best NIDS pattern-matching engine, which fits only 1/126th as many characters. PERG can scan at roughly 170 MB/s and match the speed of most network or disk interfaces.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Chapter 1. Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis Outline
Chapter 2. Background
  2.1 Pattern Matching Overview
  2.2 Existing Solutions
    2.2.1 Finite State Machine (FSM)
    2.2.2 Hash Checksum
    2.2.3 Bloom Filter
    2.2.4 Perfect Hash
    2.2.5 Cuckoo Hash
  2.3 Bloomier Filter
  2.4 Shift-Add-XOR Hash Function
Chapter 3. PERG Overview
  3.1 Clam AntiVirus Signature Database
    3.1.1 Type of Patterns
    3.1.2 Database Characteristics
  3.2 The PERG System
    3.2.1 Configurable Hardware
    3.2.2 Pattern Compiler
Chapter 4. Pattern Compiler
  4.1 Initialization
  4.2 Segmentation
  4.3 Special Case Removal
  4.4 Filter Consolidation
  4.5 Filter Mapping
  4.6 Or-Expansion
  4.7 Metadata Generation
    4.7.1 Unique Segment
    4.7.2 Common Segment
    4.7.3 Rule ID Assignment
  4.8 BFU Generation
  4.9 RTL Generation
Chapter 5. PERG Hardware
  5.1 Inspection Unit
    5.1.1 Bloomier Filter Unit
  5.2 Metadata Unit
  5.3 Fragment Reassembly Unit
    5.3.1 Cache
    5.3.2 Circular Speculative Buffer
    5.3.3 Exact-Match Request to Host
  5.4 Wildcard Support
    5.4.1 Wildcards
    5.4.2 Wildcard Table
Chapter 6. Experimental Results
  6.1 Experiment Setup
    6.1.1 Cycle-Accurate Simulator
    6.1.2 RTL Implementation
    6.1.3 Pattern Database
    6.1.4 Test Setup
  6.2 Performance Results
    6.2.1 Throughput
    6.2.2 Resource Consumption
    6.2.3 Comparison
  6.3 Impact of Filter Consolidation
  6.4 Scalability and Dynamic Updatability
    6.4.1 Scalability and Dynamic Updatability using ClamAV
    6.4.2 Scalability and Dynamic Updatability using Random Patterns
Chapter 7. Conclusions
  7.1 Future Work
    7.1.1 Hardware Interface
    7.1.2 Support for Interleaving Filestreams
    7.1.3 Update and Expansion Options
    7.1.4 Alternative Databases
    7.1.5 Integration with Antivirus Software
    7.1.6 Eliminating Special Cases
References

List of Tables

Table 3.1: Ratio of different types of ClamAV virus signatures
Table 3.2: Regular expression operators in ClamAV
Table 3.3: Statistical properties of ClamAV pattern database (basic only)
Table 3.4: User definitions
Table 4.1: Type flag encoding format
Table 5.1: Conversion of different ClamAV wildcard operators
Table 6.1: PERG system parameters
Table 6.2: Simulated performance results
Table 6.3: Overall resource consumption
Table 6.4: Resource utilization and performance comparison
Table 6.5: LP, MP, TLP, and TMP comparison
Table 6.6: PERG vs ClamAV throughput
Table 6.7: Impact of filter consolidation
Table 6.8: BFU utilization and insertion
Table 6.9: Baseline configurations
Table 6.10: Impact of database change in Config. 1
Table 6.11: Impact of database change in Config. 2
Table 6.12: Impact of database change in Config. 3
Table 6.13: Impact of database change in Config. 4
Table 6.14: Impact of utilization on insertion for Config. 1
Table 6.15: Impact of utilization on insertion for Config. 2
Table 6.16: Impact of utilization on insertion for Config. 3
Table 6.17: Impact of utilization on insertion for Config. 4

List of Figures

Figure 1.1: An illustration of the pattern matching problem
Figure 1.2: Pattern length distribution of Snort database [3]
Figure 1.3: Pattern length distribution of ClamAV database
Figure 2.1: An AC automaton
Figure 2.2: A Bloom filter
Figure 2.3: Construction of cuckoo hash scheme
Figure 2.4: Example Bloomier filter construction in Phase 1 (left), Phase 2 (middle) and Phase 3 (right)
Figure 2.5: Structure with hash pipeline sharing
Figure 3.1: Pattern length distribution of ClamAV database
Figure 3.2: High-level architectural dataflow
Figure 3.3: PERG hardware decision tree
Figure 4.1: Flow diagram of the PERG pattern compiler
Figure 4.2: Offset adjustment
Figure 4.3: Simplified C code for filter consolidation step
Figure 4.4: Byte-Or expansion
Figure 4.5: Metadata format for unique fragment
Figure 4.6: Metadata format for common fragment
Figure 4.7: Rule ID partition
Figure 5.1: Top-level architectural diagram of PERG
Figure 5.2: Inspection Unit
Figure 5.3: Bloomier Filter Unit (BFU)
Figure 5.4: Pipelined binary tree
Figure 5.5: Circular Speculative Buffer (CSB)
Figure 5.6: Attributes of Wildcard Table entry format
Figure 5.7: Wildcard state diagram
Figure 5.8: Pseudo Code describing behavior of Wildcard Table

Acknowledgements

First and foremost, I have to thank my supervisor, Dr. Guy Lemieux, for everything he has done to help his students. My own adventure as his graduate student began with Guy recruiting me in front of Starbucks while I was chatting with a mutual friend about how I was bored and looking for cool engineering projects to work on. Of all the things I would like to thank him for, my sincerest appreciation is for his trust in my ability to make the right decisions. Guy allowed me to decide on my own research interest and topic, as well as the detailed design decisions of my work. Even when I had to be away for a good portion of my graduate studies for work, Guy was always supportive of my decisions, and I simply cannot thank him enough for it. The completion of this thesis would not have been possible without my parents and their unconditional love. I also have to thank all my friends, particularly Johnny Lin, Jack, Orion, and Julia, for never ceasing to invite me to social events even though I turned them down 90% of the time due to a suffocating personal schedule. Moreover, I have to give a special thank-you to Caleb for rescuing me whenever I got stranded at the Seattle airport! Finally, I should probably thank Wynne, who has always been an [edited] to me since we met, but I did have fun, I suppose. Other people I have to thank are Dr. John Meech, Mr. Jean Chapoteau, Ms. Julie Lee, my old boss Larry Hung, Sean Chan, and the rest of the Xilinx SEG team, my old landlord Edward and his lovely family, Sam (who probably knows I will never become religious but never stops preaching nonetheless!), and everyone at SoC Lab (especially Darius and Cindy for thanking me!).

Chapter 1. Introduction

Computer virus protection is a ubiquitous part of every modern computer. However, the security of antivirus software comes at a hefty cost in system performance. Tests [1] demonstrate that even on a modern PC powered by multi-core processors, antivirus protection contributes to a significant slowdown (a several-times increase in runtime), particularly for disk I/O intensive applications. Antivirus software utilizes a plethora of different detection heuristics that vary from vendor to vendor, but matching a file stream against a known database of virus patterns is a common fundamental technique that can be easily implemented in hardware. This process, known as pattern-matching, compares an input data stream byte by byte against a list of predefined patterns known as virus signatures.
The simplest pattern is a string of bytes, but more complex patterns are also possible. If the input stream matches any pattern in the database, a virus is said to be present in the input. Figure 1.1 illustrates an example of pattern-matching input data against a database of n patterns. Input data arrives one character at a time, and the pattern matcher scans the data in a sliding-window fashion. In this simple example, all patterns in the database have the same length, but usually a mixture of lengths is needed.

Figure 1.1: An illustration of the pattern matching problem (a scan window slides over the input data stream and is compared against every pattern in the database)

While scanning the input for a single pattern is simple in concept, performing this task against a database of patterns at a consistently high data rate is not. A typical virus signature database contains tens of thousands of virus signatures, and each virus signature may have a different pattern length. The input stream must be inspected against every pattern in the database in order to guarantee its integrity. As a result, even with sophisticated pattern-matching algorithms, pattern matching is the performance bottleneck in many antivirus software programs [2].

1.1 Motivation

Hardware acceleration is one approach to solving the slowdown of software applications. However, in the context of antivirus applications, the use of hardware acceleration is still relatively unexplored. On the other hand, hardware acceleration for pattern-matching in network intrusion detection system (NIDS) applications has been studied extensively due to the application's high-throughput requirement [3-10]. A NIDS protects a network from known malicious threats by pattern-matching incoming packets against a database of predefined patterns; this process is referred to as deep packet inspection (DPI) in the field of networking. To keep up with the network data rate, DPI can be performed on a dedicated hardware pattern-matching engine.

Recent work on pattern-matching engines has been based on field-programmable gate array (FPGA) technology. In comparison with application-specific integrated circuits (ASIC), FPGA devices offer comparable parallelism at significantly lower non-recurring engineering (NRE) cost. Most importantly, the FPGA's unique re-programmability is particularly well-suited to the dynamic nature of pattern-matching applications, whose databases are typically subject to periodic updates. In the case of antivirus applications, the database is updated on a daily basis. FPGA technology allows the hardware to keep up with these frequent database changes.

While pattern matching in NIDS and antivirus applications is fundamentally the same problem, the scale of the antivirus problem is significantly more difficult. The de facto pattern database for NIDS is Snort [11]. Figure 1.2 shows the pattern length distribution of the Snort pattern database used by [3], which implements the latest version of Snort among the published work on NIDS hardware pattern-matching engines. In comparison, Figure 1.3 shows the same distribution for the pattern database used in this thesis: Clam AntiVirus (ClamAV) [12]. ClamAV is the most popular open-source antivirus software. Clearly, the ClamAV database is much larger in both the number of patterns and the range of pattern lengths.
Since existing NIDS hardware architectures are designed with only a few thousand patterns in mind, fitting a database over ten times larger onto a similar FPGA platform is significantly more challenging.

Figure 1.2: Pattern length distribution of Snort database [3] (number of patterns, log scale, versus pattern length in bytes)

Figure 1.3: Pattern length distribution of ClamAV database (number of patterns, log scale, versus pattern length in bytes)

Another challenge with existing pattern-matching solutions is the support for regular-expression patterns. Simple patterns are merely continuous character strings. Regular expressions allow patterns to be written in a more flexible manner by including features like wildcards. Although most patterns in both Snort and ClamAV are still simple patterns, the use of regular expressions has become increasingly important. A single regular expression can replace a large number of simple patterns, or it can identify a range of viruses which may change rapidly. Existing hardware solutions that support regular expressions tend to have poor area efficiency. On the other hand, hardware solutions offering superior area efficiency generally lack regular expression support. To handle the ClamAV database, a solution providing the best of both worlds is needed.

This thesis proposes PERG, an FPGA-based pattern-matching system designed to accelerate pattern-matching against the large database found in ClamAV. The proposed solution aims for the following design targets:

- Area efficiency to support a large number of patterns on a single FPGA
- Support for the regular expression patterns found in ClamAV
- Dynamic updatability to keep up with daily database updates
- Sufficient throughput to keep up with most disk and network interfaces

1.2 Contributions

Through PERG, this thesis demonstrates the feasibility of antivirus acceleration in hardware and opens up a new, highly lucrative potential market for FPGA devices. An initial implementation of PERG in Verilog shows that roughly 80,000 patterns can be fit onto a single low-cost Xilinx Virtex II-Pro VP100 FPGA with 4 MB of external SRAM. In comparison, previous works [3-10] deal with only up to a few thousand patterns on similarly-sized FPGAs. The primary contributions of this thesis are:

- A novel hardware architecture that handles pattern matching in a multi-staged manner without resorting to a high-bandwidth off-chip memory requirement
- A novel filter consolidation algorithm that significantly reduces the hardware resources required by packing Bloomier filter units to high capacity, thus reducing the number of filter units needed
- The Circular Speculative Buffer, an efficient hardware mechanism capable of supporting multiple traces of multi-segmented patterns with zero false-negative probability
- Detection of polymorphic virus signatures through limited regular expression support, particularly support for wildcard characters

PERG is also the first hardware implementation to apply the Bloomier filter algorithm [13] to a pattern-matching application. The only known prior hardware to apply Bloomier filters was for longest-prefix IP lookup [14].

The work in this thesis has appeared in three publications. The first [29] described the overall PERG architecture and pattern compiler. The second [30] reported more details about the pattern compiler and mapping results of the Verilog code.
The third [31] added support for the regular expressions in ClamAV and changed the name to PERG-Rx to signify the importance of this change. In this thesis, we describe the final PERG-Rx implementation, but we use the shorter name PERG for simplicity.

1.3 Thesis Outline

The rest of this thesis is organized as follows. Background information, including an overview of the pattern-matching problem, existing hardware pattern-matching approaches, and the Bloomier filter algorithm, is introduced in Chapter 2. Chapter 3 discusses the targeted ClamAV database and presents a high-level description of the overall PERG system. The PERG pattern compiler and the PERG hardware architecture are discussed in Chapters 4 and 5, respectively. Experimental results are summarized in Chapter 6. Finally, conclusions and future work are given in Chapter 7.

Chapter 2. Background

This chapter is divided into three sections. The first section provides an overview of the pattern matching problem. The second section introduces existing hardware pattern-matching solutions. The last section introduces the Bloomier filter, the pattern-matching algorithm upon which PERG is built.

2.1 Pattern Matching Overview

Pattern matching, in the context of this thesis, refers to the act of matching an input string of characters against one or more predefined strings of characters. In NIDS and antivirus applications, each character is a byte; the two terms are used interchangeably throughout this thesis. Each predefined string is called a pattern or rule. A set of patterns is called a database. Each pattern in a database is assumed to be unique. While patterns in traditional applications such as a spell-checking dictionary are simply continuous strings of characters, patterns in newer applications like NIDS and antivirus can also contain regular expressions. Regular-expression support allows patterns to be described more generally and therefore reduces the number of patterns required in the database. Moreover, for antivirus, the flexibility enabled by regular expressions is essential for the detection of polymorphic viruses; a polymorphic virus is a type of computer virus that can alter its signature appearance through techniques such as encryption to avoid antivirus detection.

Pattern-matching algorithms can be categorized into two types: single-pattern search and multi-pattern search. Single-pattern search matches an input against a single pattern, while multi-pattern search is optimized for comparison against multiple patterns. Both types generally require the patterns to be preprocessed, so that advance information about the patterns can be gathered to accelerate the actual pattern-matching process. An example of such information is whether or not any pattern in the database shares a common prefix with others. In some algorithms, such as Aho-Corasick [15], the patterns are preprocessed into completely different data structures.

Single-pattern search algorithms, such as Boyer-Moore [16] or Knuth-Morris-Pratt [17], use heuristics to reduce the number of character comparisons required to determine whether a match exists. By iteratively applying it to each pattern in a database, a single-pattern search algorithm can also perform multi-pattern matching. However, such an approach does not scale to larger databases, as a single-pattern search treats each pattern independently; the algorithm neither shares insight gained through each search nor takes advantage of characteristics of the database.
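As a rough illustration of why this iterated, per-pattern approach does not scale, the following minimal C sketch scans the input in a sliding-window fashion and compares every database pattern at every position. The patterns and input bytes are made-up examples used only for illustration; they are not taken from any real database.

    #include <stdio.h>
    #include <string.h>

    /* Brute-force multi-pattern scan: compare every pattern at every window
     * position.  Work grows with input length x number of patterns x pattern
     * length, which is why this baseline cannot handle databases with tens of
     * thousands of signatures at high data rates. */
    static void naive_scan(const unsigned char *input, size_t input_len,
                           const char *patterns[], size_t num_patterns)
    {
        for (size_t pos = 0; pos < input_len; pos++) {
            for (size_t p = 0; p < num_patterns; p++) {
                size_t len = strlen(patterns[p]);
                if (pos + len <= input_len &&
                    memcmp(input + pos, patterns[p], len) == 0)
                    printf("pattern %zu found at offset %zu\n", p, pos);
            }
        }
    }

    int main(void)
    {
        /* Hypothetical database and input stream, for illustration only. */
        const char *db[] = { "00C31112", "DEADBEEF", "CAFEBABE" };
        const unsigned char input[] = "A0F10000C311123700C3";

        naive_scan(input, sizeof(input) - 1, db, 3);
        return 0;
    }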
Multi-pattern search algorithms are designed for pattern matching against a database of patterns. 6  Software implementations of multi-pattern search typically consume significantly more system resources than a single-pattern search. When presented with a large number of patterns, the sequential nature of the underlying processor-centric software architecture can be slow. As a result, many efforts have been made to implement multi-pattern algorithm on hardware to speed up the process. The following sections cover major types of multi-pattern search algorithms implemented on hardware.  2.2  Existing Solutions  This section discusses existing approaches for pattern-matching hardware. Traditional pattern-matching hardware solutions are based on content-addressable memory (CAM) or application-specific instruction processor (ASIP) platforms. Newer approaches use FPGAs. In CAM, data is used for inquiry and an address is returned if data is found. Comparison between input and all CAM entries is done in parallel by brute force. As a result, a large CAM can be fast (singlecycled) and very power-hungry. Ternary content addressable memory (TCAM) is an extension of CAM where each storage unit can represent logic 0, 1, and a special third state known as “don’t care”. With this third state, TCAM enables support for various pattern lengths and wildcard. While CAM-based patternmatching solutions offer simple and deterministic performance, CAM suffers from poor scalability. In terms of transistors required per memory bit, CAM requires 10 transistors while TCAM requires 17; in comparison, SRAM requires only 6 transistors. As a result, CAM is significantly more expensive in cost and power consumption. Finally, the parallel comparison in CAM can limit its operating frequency due to high capacitance. An application-specific instruction-set processor (ASIP) operates much like a general-purpose processor but also comes with custom-instruction extensions to accelerate the specifically targeted application. Examples of ASIP for pattern-matching include Cavium Networks’ OCTEON processor family (based on MIPS64) [18] and Lionic ePavis (based on Altera NIOS-II) [19]. Cavium Networks claims a throughput of 10 Gbps on its OCTEON processor, while Lionic quotes 1.6 Gbps on its fastest ePavis. Since these processors can also perform functions such as encryption and packet-header classification, the exact meaning of the reported throughputs is unknown. Neither Cavium Networks nor Lionic provides any detail about their products or benchmark setups. However, since these solutions are processor-centric, they are unlikely to achieve the level of parallelism offered by FPGAs. FPGA technology has become the de facto platform for pattern-matching acceleration. An FPGA’s parallelism is comparable to ASIC while its low non-recurring cost is well suited for the current niche market of pattern-matching accelerator. Most importantly, the re-programmability is ideal for the dynamic nature of pattern databases in most pattern-matching applications. With FPGAs, hardware can be 7  optimized for the targeted database which is subject to frequent future updates. Popular FPGA-based approaches include finite-state machine (FSM), Bloom filter, and perfect hashing. The recently proposed cuckoo hashing [3] is another noteworthy approach; we believe cuckoo hashing is the best existing pattern-matching approach prior to this work. 
2.2.1  Finite State Machine (FSM)  The finite state machine-based approach, as the name suggests, uses a state machine to pattern-match the input. This approach is particularly well-suited for FPGAs due to the abundance of flip flops available for the state machine, which can grow quite large. In general, the FSM approach is based on the AhoCorasick (AC) algorithm [15] or its variations [4, 24]. The AC algorithm works as follows. The algorithm begins by transforming the pattern database into a finite automaton. In its simplest form, each edge in an AC automaton is a character (byte) of a pattern in the database. A simple AC automaton is shown in Figure 2.1.  Figure 2.1: An AC automaton  Scanning begins at the default state (root) of the automaton. As characters from the input are fed to the automaton one at a time, each character will cause a state transition if it matches the character assigned to that edge. If a pattern from the database does exist in the input string, the state transition will eventually lead to a terminating state that indicates a complete match. Otherwise, the automaton transits to the state whose trie-path represents the longest prefix of the current potential match. If the incoming character does not satisfy any prefix path, the automaton transits back to the default state. Using the example automaton in Figure 2.1, if the input sequence is “US”, the automaton would be at State 2b on the “USC” trace. If the next incoming character is B, then the automaton transits back to the default state since neither “B”, “SB”, nor “USB” is a prefix of any pattern in the database. If however the next incoming character is F, the state will transit to State 2c since “SF” is the prefix of pattern “SFU”. 8  There are two ways to implement an automaton: deterministic finite automaton (DFA) or nondeterministic finite automaton (NFA). In DFA, there is only one active state. In NFA, there can be multiple active states; i.e., a state may have multiple edges of the same character. This characteristic theoretically reduces the number of states of the resulting automaton and hence the memory storage required. On the other hand, having only one active state makes DFA much simpler to implement (especially on software) than NFA. The AC algorithm has two advantages. First, its time complexity is linear, dependent only on the length of the pattern and independent of the size of the database. While on average the AC algorithm performs slower than heuristic-based algorithms, its worst case scenario remains constant. The second advantage of the AC algorithm is that most regular expressions map naturally to an AC automaton. As described earlier, this characteristic is particularly useful for antivirus application. However, a wildcard regular expression leads to exponential growth in FSM edges (NFA) or states (DFA). In terms of drawbacks, the AC algorithm has two main disadvantages. First, it typically uses much more memory than hash-based approaches like Bloom filter, particularly when the pattern database has a long average pattern length. Second, inserting new pattern updates requires reconstruction of the automaton. This particular drawback is significant for hardware implementations when the state machine is constructed with flip-flops. The need to construct a new state machine with each database update demands the need for reprogrammable hardware. While FPGAs can be used, generating and verifying a new bitstream for each update is a rather costly process in both time and money. 
The daily antivirus updates lead to another problem: the database may change again before the bitstream for the earlier database is even released.

Many variants of the AC algorithm have been proposed to improve the algorithm's performance and memory density. Examples include state partitioning [5] and bitmap compression [6]. Another interesting proposal is the B-FSM [4], which features small collision-bounded transition tables for higher performance and memory efficiency. Even with these improvements, however, AC-based hardware solutions in general can still fit no more than a few thousand Snort patterns on a modern FPGA, as the memory consumption is proportional to the number of characters in the database.

2.2.2 Hash Checksum

Hash checksum is a hash-based scheme that identifies a pattern by comparing the hash key (checksum) of the input against those of the patterns. The scheme starts by choosing an arbitrary (ideally evenly-distributed) hash function and pre-computing the hash keys of all the patterns in the database. The resulting hash keys are then stored. Using the same hash function, a pattern in the input data can then be identified by simply hashing the input and comparing the resulting hash key against the pre-computed ones. A hash key match indicates that the input is likely the pattern corresponding to the matched hash key.

The key advantage of a hash-based pattern-matching approach is the simplicity of the algorithm, which makes it extremely easy to implement in both software and hardware. In addition, because only hash keys are stored, hash-based solutions are generally much more storage-efficient than others; hash keys do not depend on the length of the pattern and are structurally simpler than complex data structures such as the state machine used in the FSM-based approach.

There are two main downsides to hash checksum. First, unlike FSM-based approaches, which perform exact matching, hash checksum (like other hash-based schemes) performs approximate matching with a small chance of false positives; once in a while, an input which does not contain any pattern is falsely reported as one that does. These false positives are the result of hash aliasing and are inevitable for all hash-based solutions, not just hash checksum. Consequently, another stage of exact matching with an algorithm like AC is still required upon a reported match in a hash-based scheme. Second, a hash checksum such as the common Message-Digest algorithm 5 (MD5) used in ClamAV is intended for single-string matching and is performed on a per-file basis (as opposed to searching for a specific byte string in the input stream). A pattern (virus) can then easily slip through the scanner if the file integrity is changed even slightly.

2.2.3 Bloom Filter

A Bloom filter [20] is a hash-based scheme that uses a Boolean hash table to pattern-match an input against a set of patterns. Like the hash checksum approach discussed in the previous section, a Bloom filter is structurally simpler than an FSM. Unlike hash checksum, however, a Bloom filter is designed for multi-pattern search using multiple hash functions as opposed to one. Another unique characteristic of a Bloom filter is that it only provides membership information. If the membership is false, the input is guaranteed to be free of any pattern in the database; i.e., a Bloom filter has zero false-negative probability. If the membership is true, the input should match one of the patterns in the database. However, exactly which pattern matched cannot be determined from the membership information alone.
Upon a hit in a Bloom filter, an exact-match process is needed to perform a complex multi-pattern search and determine which, if any, pattern was detected.

Bloom filters and other hash-based approximate-matching approaches excel in applications where the input stream is expected to be generally free of patterns. Under such an environment, their average throughput should be on par with, if not better than, FSM-based approaches. With a good set of hash functions [21], Bloom filters and other hash-based approaches can yield much simpler hardware, and hence higher operating frequency and lower cost.

Construction of a Bloom filter for a database of n patterns is as follows. First, a Boolean hash table containing m entries is created. Each entry in the table is initialized to 0 (false). Second, k different hash functions are chosen. Although not required, typically k is greater than 1 to give a better balance between storage and false-positive probability, which can be approximated by Equation 2.1. Moreover, while the choices of hash functions are arbitrary, their outputs must fall within the range 0 to m-1. The last step involves hashing each pattern from the database into the hash table. The entry at each resulting hash key is set to 1 (true). Because k hash functions are used, each iteration sets up to k entries in the hash table. Note that only one hash table is shared among the k hash functions.

P(false positive) ≈ (1 - e^(-kn/m))^k     (2.1) [8]

Pattern-matching an input with a Bloom filter is done in two steps. The first step is to hash the input with the same k hash functions used during construction. The second step is to AND together the Boolean values stored at the resulting k hash table entries. If the outcome of the AND is false, the input is definitely not a pattern in the database. If the outcome is true, the input is likely a pattern in the database. The operation of a Bloom filter is illustrated in Figure 2.2.

Figure 2.2: A Bloom filter

A Bloom filter has three main advantages over FSM-based approaches. First, computationally it is much simpler, as a query in a Bloom filter involves merely hashing the input, accessing k memory locations, and performing a logical AND operation. Second, a Bloom filter is extremely storage-efficient. Because a Bloom filter only stores one-bit membership information in a Boolean hash table, its memory usage is independent of the pattern length. In addition, at a fixed false-positive probability setting, the size of the hash table needed grows slightly sub-linearly with the number of patterns. Third, by storing the hash tables in embedded memory, dynamic update of the pattern database is easy and does not require generating a new bitstream.

Despite these advantages, a Bloom filter has its downsides. First, because each Bloom filter hash operates on a specific pattern length, each pattern length requires its own Bloom filter. This characteristic can lead to a large number of poorly-utilized filter units for ClamAV, whose database has a wide range of pattern lengths. Second, a complex exact-matching computation is still required upon a Bloom-filter match, because a Bloom filter only provides membership information and comes with false positives. Finally, with approximate matching, supporting regular expressions in a Bloom-filter approach is much more difficult than in the AC-based approach, which blends naturally with regular expressions thanks to its automaton structure.
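To make the construction and query steps concrete, the following self-contained C sketch implements a toy Bloom filter with k = 2 hash functions over an m = 1024-entry Boolean table. The two hash functions, the table size, and the example pattern are arbitrary illustrative choices, not the functions or sizes used in PERG.

    #include <stdio.h>
    #include <stdint.h>

    #define M 1024                      /* Boolean hash table entries         */
    #define K 2                         /* number of hash functions           */

    static uint8_t table[M];            /* hash table, all entries start at 0 */

    /* A simple seeded string hash restricted to the range 0..M-1.  Any
     * reasonably uniform hash functions could be substituted here.           */
    static uint32_t hash(const uint8_t *s, size_t len, uint32_t seed)
    {
        uint32_t h = seed;
        for (size_t i = 0; i < len; i++)
            h = h * 31 + s[i];
        return h % M;
    }

    /* Construction: set the K entries addressed by each database pattern.    */
    static void bloom_insert(const uint8_t *pat, size_t len)
    {
        for (uint32_t j = 0; j < K; j++)
            table[hash(pat, len, j + 1)] = 1;
    }

    /* Query: AND the K entries addressed by the input window.  0 means
     * "definitely not in the database"; 1 means "probably in the database".  */
    static int bloom_query(const uint8_t *window, size_t len)
    {
        int hit = 1;
        for (uint32_t j = 0; j < K; j++)
            hit &= table[hash(window, len, j + 1)];
        return hit;
    }

    int main(void)
    {
        const uint8_t pattern[] = "EXAMPLEVIRUS";          /* made-up pattern */
        bloom_insert(pattern, sizeof(pattern) - 1);

        printf("%d\n", bloom_query(pattern, sizeof(pattern) - 1));        /* 1 */
        printf("%d\n", bloom_query((const uint8_t *)"CLEANCONTENT", 12)); /* 0, barring a false positive */
        return 0;
    }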
Some drawbacks of the Bloom filter can be addressed by combining Bloom filters with the AC algorithm. In [7], patterns are split into s-byte-long segments, which are then mapped to s Bloom filters covering lengths from s down to one byte. Doing so reduces the range of pattern lengths and therefore the number of filters needed. To link the segments back into a complete pattern, an AC automaton is constructed from these segments (instead of from individual characters). The Bloom filter tables are stored in on-chip memory, but the state transitions and the actual string segments are stored in off-chip memory. Whenever a hit is reported by a Bloom filter, exact matching is performed by loading the segment information from off-chip memory. In [7], 2,259 strings required several megabytes of QDRII SRAM. This works for small databases of short patterns like Snort, but it does not scale to large databases of long patterns like ClamAV.

2.2.4 Perfect Hash

The perfect hash [9, 10] aims to maintain the advantages of a Bloom filter while addressing the Bloom filter's drawbacks of false-positive probability and membership-only information. Unlike a Bloom filter, which suffers from hash table collisions, in a perfect-hash scheme each hash table entry is uniquely associated with one single pattern. Consequently, a pattern may be stored at each hash location and exact matching can be performed immediately through simple direct comparison.

In theory, a perfect hash is superior to the Bloom filter, offering all the advantages of a Bloom filter yet enabling a very simple exact-matching step to eliminate false positives. However, in reality, a perfect hash is difficult to achieve even with a large hash table. Patterns are not completely random and hash functions are not evenly distributed. If hashing any pattern results in a collision, the perfect hash construction fails. In this case, a larger table and/or a different hash function must be attempted until construction succeeds without collisions.

The difficulty of constructing a perfect hash table leads to several drawbacks. First, for the same number of patterns, a perfect hash scheme generally needs a larger hash table than a Bloom filter. Second, daily updates to a large set of patterns make it increasingly difficult to generate a perfect hash function every time. Third, like a Bloom filter, a dedicated perfect-hash unit is needed for each pattern length. Finally, the ability to perform exact matching immediately after a hash query assumes that it is possible to store the entire pattern string in each hash entry; this is generally not the case if the hash table needs to reside in on-chip memory. While a checksum of the pattern can be stored instead, the perfect hash then becomes prone to false positives just like a Bloom filter. In such a case, the false-positive probability can be approximated by Equation 2.2, where n is the number of patterns, m is the size of the hash table, and c is the bit length of the checksum.

P(false positive) ≈ (1 - e^(-n/m)) / 2^c     (2.2)

Another alternative is to keep the patterns off-chip and store memory pointers in the on-chip hash table entries. This approach, however, suffers from degraded performance and may not scale well, as the pointer width grows with the size of the off-chip memory.

2.2.5 Cuckoo Hash

Cuckoo hashing [3, 22] is a unique hash-based approach that we believe provides the best tradeoff among existing hardware pattern-matching solutions. Cuckoo hashing relies on k hash functions like the Bloom filter, but it uses k distinct hash tables as opposed to one.
When preparing the tables and inserting a new pattern R, aliasing with a previously mapped pattern S in Table 0 with hash function H0 may occur, i.e., H0(R) = H0(S). In this case, S is kicked out and moved to Table 1 at position H1(S). If a collision occurs there as well, the previous occupant T is kicked out and moved back to Table 0 at position H0(T), and so on. The operation is illustrated in Figure 2.3.

Figure 2.3: Construction of cuckoo hash scheme

It is possible to get caught in an infinite loop while setting up a cuckoo hash scheme; when that happens, the setup process fails and a different set of hash functions needs to be used. The probability of setup failure can be reduced by increasing the number of hash functions or the size of each hash table. Because cuckoo hashing tolerates some hash collisions, it has, in theory, a much higher setup success rate than perfect hashing.

During lookups, the input string R is hashed, using H0(R) to access Table 0 and H1(R) to access Table 1. In [3], the hash functions and table lookups proceed in parallel using BlockRAM. Like a perfect hash, the exact-matching stage in cuckoo hashing is simple. Pattern strings can be stored in the hash tables, allowing a direct comparison against the input string to be made immediately after the hash query. However, it should be emphasized that cuckoo hashing is not a perfect-hash scheme. If a checksum comparison is used, the cuckoo hash is unable to uniquely identify the matched pattern because more than one of the k hash entries may return the same checksum value.
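The following C sketch illustrates the kick-out insertion and two-table lookup described above for k = 2. The table size, hash functions, and fixed 8-byte key length are simplifications made purely for illustration; a failed insertion chain is simply reported here, whereas a real setup would retry with new hash functions.

    #include <stdio.h>
    #include <string.h>

    #define SLOTS   8                   /* entries per table (toy size)       */
    #define KEYLEN  8                   /* keys are assumed exactly 8 bytes   */
    #define MAXKICK 16                  /* give up after this many evictions  */

    typedef struct { char key[KEYLEN + 1]; int used; } Entry;
    static Entry table0[SLOTS], table1[SLOTS];

    /* Two arbitrary hash functions H0 and H1 over the key bytes. */
    static unsigned H0(const char *k) { unsigned h = 0; for (int i = 0; i < KEYLEN; i++) h = h * 131 + (unsigned char)k[i]; return h % SLOTS; }
    static unsigned H1(const char *k) { unsigned h = 7; for (int i = 0; i < KEYLEN; i++) h = h * 257 + (unsigned char)k[i]; return h % SLOTS; }

    /* Insert with cuckoo displacement: a colliding occupant is kicked out and
     * re-inserted into the other table, possibly kicking out another pattern
     * in turn.  Returns 0 on success, -1 if the kick chain grows too long
     * (the setup would then fail and new hash functions would be chosen).    */
    static int cuckoo_insert(const char *key)
    {
        char cur[KEYLEN + 1];
        strcpy(cur, key);
        for (int kicks = 0; kicks < MAXKICK; kicks++) {
            Entry *t = (kicks % 2 == 0) ? &table0[H0(cur)] : &table1[H1(cur)];
            if (!t->used) { strcpy(t->key, cur); t->used = 1; return 0; }
            char evicted[KEYLEN + 1];
            strcpy(evicted, t->key);    /* kick out the current occupant      */
            strcpy(t->key, cur);        /* store the pattern being placed     */
            strcpy(cur, evicted);       /* and try to re-place the evictee    */
        }
        return -1;                      /* probable cycle: setup fails        */
    }

    /* Lookup needs at most two probes, one per table. */
    static int cuckoo_lookup(const char *key)
    {
        return (table0[H0(key)].used && memcmp(table0[H0(key)].key, key, KEYLEN) == 0) ||
               (table1[H1(key)].used && memcmp(table1[H1(key)].key, key, KEYLEN) == 0);
    }

    int main(void)
    {
        cuckoo_insert("PATTERNA");
        cuckoo_insert("PATTERNB");
        printf("%d %d\n", cuckoo_lookup("PATTERNA"), cuckoo_lookup("INNOCENT"));
        return 0;
    }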
2.3 Bloomier Filter

This section covers the Bloomier filter algorithm used in PERG. The algorithm was first proposed by [13] and implemented in hardware by [14] for IP lookup. PERG is the first to use the Bloomier filter in a pattern-matching hardware application.

The Bloomier filter is an extension of the Bloom filter. Structurally, the two schemes are very similar. However, instead of providing a Boolean membership result, the Bloomier filter uses a perfect hash to indicate exactly which pattern matched. The one-to-one association allows PERG to perform a checksum comparison to rule out most false positives without further verification. The final exact-matching stage is also simple because only a single pattern needs to be compared.

Bloomier filters operate as follows. When an input string appears, k independent hash functions are computed on the input to point to multiple table entries. In a traditional Bloom filter, all table entries must be 1 to indicate a match. In a Bloomier filter, there is the additional requirement that a match must indicate which pattern is matched. To determine which pattern, exactly one hash function is selected and, through the one-to-one mapping produced at filter construction time, a direct comparison with the pattern stored at that hash entry is performed. The goal during construction of a Bloomier filter is to determine which of the k hash functions points to the correct table location of the matched pattern. This is done by XORing all k table entries. When the table entries are constructed properly, this XOR result produces a hash select s(x) which chooses the right hash function.

Construction of a Bloomier filter begins by creating a Bloom filter. Each pattern x is mapped to k table entries by the hash keys Y1(x), Y2(x), ..., Yk(x); this set of k locations is the neighborhood of the pattern. As patterns are mapped, their neighborhoods are recorded. Neighborhoods may or may not overlap with each other. A pattern whose neighborhood contains a hash entry not found in any other pattern's neighborhood is called a singleton.

Figure 2.4: Example Bloomier filter construction in Phase 1 (left), Phase 2 (middle) and Phase 3 (right)

The ultimate goal is to construct a one-to-one mapping of patterns to table entries. Consider a simple example of mapping four patterns R, S, T, and U onto a table with 8 entries using hash functions H0 and H1, as shown in Figure 2.4 (left). Being singletons (highlighted in light grey), R and S can clearly be uniquely mapped to entry 0 and entry 2, respectively.

Once the Bloom filter is constructed, the neighborhoods of the patterns are removed from the hash table one by one. The process starts with the singletons (i.e., R and S) because they are readily mapped to unique locations; which singleton is removed first does not matter. Each time a pattern's neighborhood is removed from the table, the pattern and its mapped location are pushed onto a stack. After removing a singleton, more singletons may appear. For example, with S removed, T becomes a singleton mapped to entry 3. In the unlikely scenario where no new singleton is available, the setup fails and a new hash table must be created with a new set of hash functions. The probability of setup failure is bounded by Equation 2.3, where k is the number of hash functions used, m is the number of hash table entries, and n is the number of patterns to be mapped. To construct a Bloomier filter, m >= k*n.

P(setup failure) <= ...     (2.3) [14]

Figure 2.4 (middle) shows a snapshot of the stack and the hash table after three iterations. At this point, entries 1 and 5 can both be uniquely associated with U; which one is used does not matter. Assume U is associated with entry 5 and pushed onto the stack, making the hash table completely empty with all entries equal to zero.

The patterns can now be popped from the stack and mapped back into the hash table. Each pattern x is associated with a unique table location, and the value V(x) stored there is used to compute the hash select s(x) during an inquiry. The formula for calculating V(x) is given in Equation 2.4, where V1(x) to Vk-1(x) are the values currently stored at the other entries of the pattern's neighborhood (i.e., excluding its unique entry), and s(x) is the value that indicates which hash function points to the unique entry.

V(x) = s(x) XOR V1(x) XOR V2(x) XOR ... XOR Vk-1(x)     (2.4)

Initially, all entries in the table store V(x) = 0. In our example, U is popped first and stored in entry 5. Since entry 5 is given by H1, s(x) = 1. To determine V(x) for U, s(x) is XORed with the values at all remaining entries in U's neighborhood (entry 1). Because entry 1 is still initialized to zero, V(x) for U equals 1. The process iterates until new V(x) values have been calculated for all patterns, as shown in Figure 2.4 (right). At this point, the Bloomier filter construction is complete.

To look up a rule, the input is hashed with H0 and H1, giving hash keys H0(x) and H1(x). To determine whether the matching table entry should be selected from H0(x) or H1(x), the hash select s(x) is calculated by XORing the values stored at all k entries of the neighborhood, as shown in Equation 2.5:

s(x) = V1(x) XOR V2(x) XOR ... XOR Vk(x)     (2.5)

For R, s(x) = 0 XOR 0 = 0. Therefore, if the input equals R, it will be compared with the rule associated with hash key H0(R) = 0. Similarly for S, since s(x) = 1 XOR 1 = 0, if the input equals S it will be identified with the rule associated with H0(S) = 2.
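As an illustration of this construction (singleton peeling onto a stack, then popping to assign V values) and lookup (computing s(x) and making a single direct comparison), the following self-contained C sketch builds a toy Bloomier filter with k = 2 and a 32-entry table. The hash functions, table size, and pattern strings are arbitrary; checksum storage is omitted, and a real implementation would retry with new hash functions when construction fails.

    #include <stdio.h>
    #include <string.h>

    #define M 32                         /* hash table entries                */
    #define N 4                          /* number of patterns                */

    static const char *patterns[N] = { "VIRUS_R", "VIRUS_S", "VIRUS_T", "VIRUS_U" };

    static unsigned V[M];                /* table of hash-select values       */
    static int owner[M];                 /* pattern uniquely tied to an entry */

    static unsigned hash(const char *s, unsigned seed)
    {
        unsigned h = seed;
        while (*s) h = h * 131 + (unsigned char)*s++;
        return h % M;
    }
    static unsigned H0(const char *s) { return hash(s, 1); }
    static unsigned H1(const char *s) { return hash(s, 2); }

    /* Construction: repeatedly peel off singletons onto a stack, then pop
     * them back and store V so that XORing a pattern's two entries yields
     * the hash select pointing at its unique entry (Equations 2.4 and 2.5). */
    static int build(void)
    {
        int stack_pat[N], stack_sel[N], removed[N] = { 0 }, top = 0;

        for (int i = 0; i < M; i++) { V[i] = 0; owner[i] = -1; }
        while (top < N) {
            int progress = 0;
            for (int i = 0; i < N && top < N; i++) {
                if (removed[i]) continue;
                unsigned e[2] = { H0(patterns[i]), H1(patterns[i]) };
                if (e[0] == e[1]) return -1;           /* degenerate neighborhood  */
                for (int s = 0; s < 2; s++) {
                    int shared = 0;
                    for (int j = 0; j < N; j++)        /* entry also used by a     */
                        if (j != i && !removed[j] &&   /* remaining pattern?       */
                            (H0(patterns[j]) == e[s] || H1(patterns[j]) == e[s]))
                            shared = 1;
                    if (!shared) {                     /* pattern i is a singleton */
                        stack_pat[top] = i; stack_sel[top] = s; top++;
                        removed[i] = 1; progress = 1; break;
                    }
                }
            }
            if (!progress) return -1;                  /* no singleton: setup fails */
        }
        while (top-- > 0) {                            /* pop and assign V values   */
            int i = stack_pat[top], s = stack_sel[top];
            unsigned uniq  = s ? H1(patterns[i]) : H0(patterns[i]);
            unsigned other = s ? H0(patterns[i]) : H1(patterns[i]);
            V[uniq] = (unsigned)s ^ V[other];          /* Equation 2.4 with k = 2   */
            owner[uniq] = i;
        }
        return 0;
    }

    /* Lookup: s(x) = V[H0(x)] XOR V[H1(x)] selects which entry to compare.   */
    static int lookup(const char *input)
    {
        unsigned s = V[H0(input)] ^ V[H1(input)];      /* Equation 2.5 with k = 2   */
        unsigned loc = s ? H1(input) : H0(input);
        if (owner[loc] >= 0 && strcmp(patterns[owner[loc]], input) == 0)
            return owner[loc];                         /* index of matched pattern  */
        return -1;                                     /* no match                  */
    }

    int main(void)
    {
        if (build() != 0) { puts("setup failed; new hash functions needed"); return 1; }
        printf("%d %d\n", lookup("VIRUS_T"), lookup("HARMLESS"));  /* pattern index, then -1 */
        return 0;
    }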
The features offered by a Bloomier filter are similar to those of cuckoo hashing. Both enable simpler exact matching than a Bloom filter and offer a higher setup success rate than the perfect-hash scheme. However, one striking difference is that the Bloomier filter is a true perfect-hashing scheme, making it possible to filter out false positives with a checksum yet still uniquely identify the potential match. In the case of cuckoo hashing, it is not possible to guarantee that the checksums returned from the k entries are all unique. As a result, the one-to-one association would be lost in a cuckoo hash if a checksum is used.

2.4 Shift-Add-XOR Hash Function

This section discusses the Shift-Add-XOR (SAX) hash function used in PERG to construct its Bloomier filters. As the discussion of hash-based pattern-matching approaches such as cuckoo hashing and the Bloomier filter makes clear, the choice of hash functions has a direct impact on the overall system. The ideal hash function would be universal (uniformly distributed) to yield a higher setup success rate with the same number of hash entries, and therefore higher memory utilization. At the same time, the hash function should remain logically simple in order to save logic gates and increase the achievable operating frequency.

PERG uses the SAX hash proposed by [21] and implemented by [3] in its cuckoo hash-based pattern-matching engine. SAX is a class of hash functions that hashes string inputs as shown in Equation 2.6:

H_i = H_{i-1} XOR ( (H_{i-1} << SL_i) + (H_{i-1} >> SR_i) + C_i )     (2.6)

H_i is the partially-formed hash value after seeing i bytes, C_i is the i-th byte of the input string, SL_i is the amount to shift left, and SR_i is the amount to shift right. By choosing the initial hash value H_0 and the shift amounts (SL_i and SR_i) at each byte position randomly, the resulting hash function is relatively universal [21].

SAX is well-suited to hardware implementation. With mere shift, add, and XOR operations, a hardware implementation of SAX involves relatively simple logic gates without the need for any expensive multiplier or multi-cycle operation. In addition, Equation 2.6 can be fully unrolled into a hardware pipeline. By selecting an overall pipeline length equal to the longest pattern length, different Bloomier filters (as well as any other hash scheme) can share the same hashing pipeline by picking off the appropriate intermediate hash. This structure is shown in Figure 2.5.

Figure 2.5: Structure with hash pipeline sharing
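The following short C sketch models the byte-at-a-time SAX recurrence of Equation 2.6 in software. The seed and the per-byte shift amounts below are arbitrary example values; PERG's compiler would choose them randomly, and in hardware each loop iteration becomes one pipeline stage whose intermediate value can be tapped by filters for shorter pattern lengths.

    #include <stdio.h>
    #include <stdint.h>

    /* Software model of the SAX recurrence (Equation 2.6):
     *   H_i = H_{i-1} XOR ((H_{i-1} << SL_i) + (H_{i-1} >> SR_i) + C_i)      */
    static uint32_t sax_hash(const uint8_t *data, size_t len, uint32_t seed,
                             const uint8_t *sl, const uint8_t *sr)
    {
        uint32_t h = seed;                                   /* H_0 */
        for (size_t i = 0; i < len; i++)
            h = h ^ ((h << sl[i]) + (h >> sr[i]) + data[i]);
        return h;
    }

    int main(void)
    {
        /* Example shift schedules and seed; a real design picks these randomly. */
        const uint8_t sl[8] = { 5, 3, 7, 2, 6, 4, 1, 5 };
        const uint8_t sr[8] = { 2, 6, 3, 5, 1, 7, 4, 2 };
        const uint8_t window[] = "ABCDEFGH";                 /* one 8-byte scan window */

        uint32_t h = sax_hash(window, 8, 0x12345678u, sl, sr);
        printf("hash = 0x%08X, table index = %u\n", (unsigned)h, (unsigned)(h % 1024));
        return 0;
    }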
A ClamAV virus signature consists of the name of the virus and its unique pattern. Each signature may also contain header information like the specific file extension the pattern should only be matched against. Virus header information is not supported in the PERG system presented; such support requires higher-level software support, which is left as future work in this thesis. 3.1.1  Type of Patterns  Each virus signature pattern in ClamAV can be categorized as one of three types: Message-Digest algorithm 5(MD5) checksum, basic pattern, and regular-expression pattern. Table 3.1 shows the distribution of each type in the ClamAV 0.93.1 main database, which contains a total of 231,800 patterns. Table 3.1: Ratio of different types of ClamAV virus signatures MD5 Checksums Basic Patterns Regular Expression Patterns 146,214 (63.1%) 80,262 (34.6%)  5,363 (2.3%)  MD5 checksum is a simple 128-bit hash value that can be used to identify malicious files by computing the MD5 checksum of the entire input file and comparing it against MD5 checksums of known malicious files. The advantage of using MD5 checksum is that any user can generate and submit as MD5 18  checksum of malicious files easily, providing an immediate solution to react against new threats. On the downside however, any slight variation in the identified malicious file generally requires a new MD5 checksum signature, because the new MD5 checksum most likely is completely different. A basic pattern is a continuous byte string that presumably only exists in an infected file. A virus can therefore be detected by scanning through the input file for the basic pattern. In comparison with MD5 checksum, a basic pattern provides a more flexible way to describe the virus without being restricted to a fixed 128-bit length. Although much less sensitive to file integrity than MD5 checksum, a basic pattern is still prone to changes in the known malicious file. A regular-expression pattern is an extension of the basic pattern which includes regular expression operators. Regular expression support allows a virus pattern to be described more generically, hence reducing the total number of patterns needed in the database. Most importantly, a regular-expression pattern is necessary for detection of polymorphic viruses, where a portion of the virus undergoes mutation such as encryption or compression to avoid being matched by an MD5 checksum or basic pattern. A pattern of this type contains one or more regular expression operators shown in Table 3.2. Note in comparison with regular expressions found in the Perl-Compatible Regular Expression (PCRE) de facto standard [26], the regular expressions in ClamAV form a more limited set. Operators like “match one or more times” or “match the end of line” are not supported. On the other hand, a single-nibble (half-byte) wildcard is found only in ClamAV. The main reason for the differences is that PCRE operators are intended for ASCII text processing while ClamAV operates on raw bit stream. Table 3.2: Regular expression operators in ClamAV Symbol  Definition  (X|X)  Byte-Or  ?  Single-Nibble Wildcard  ??  Single-Byte Wildcard  *  (Any-Number-of-Byte) Wildcard  {n}  n Byte Displacement  {n-}  At-Least (n-Byte) Wildcard  {-N}  Within (n-Byte) Wildcard  {n-N}  Range Wildcard  The pattern-matching engine presented in this thesis deals only with basic and regular-expression patterns. 
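For illustration only, the two hypothetical signatures below (written in a simplified name-and-hex-pattern form rather than the exact ClamAV file syntax) show the difference between a basic pattern and a regular-expression pattern:

Example.Virus-A:deadbeef0102030405060708090a0b0c
Example.Virus-B:deadbeef{4-12}c0ffee??41*424344

The first signature is a fixed byte string. The second matches the bytes deadbeef, then 4 to 12 arbitrary bytes (a range wildcard), the bytes c0ffee, one arbitrary byte, the byte 41, any number of bytes, and finally 424344.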
MD5 checksum, whose computation differs greatly from the pattern-matching problem described in Chapter 2, is completely left out from this thesis. While MD5 checksum pattern represents 19  the largest portion of the ClamAV database, due to the algorithm’s simplicity, MD5 checksum computation only accounts for 0.64% of entire software runtime [23]. As a result, there is little motivation to accelerate MD5 checksum with dedicated hardware. 3.1.2  Database Characteristics  Figure 3.1 shows the length distribution of basic patterns in ClamAV. Statistical information is summarized in Table 3.3. Regular–expression patterns are not included here because most of them contain wildcard or range operators and therefore do not have fixed lengths.  100000 10000  # of Patterns  1000 100 10 1  Pattern Length (Bytes)  Figure 3.1: Pattern length distribution of ClamAV database Table 3.3: Statistical properties of ClamAV pattern database (basic only) Measurement Length (Mode) Length (Mean) Shortest Length Maximum Length # of Unique Lengths  Value (Byte) 150 102 4 347 220  Even without accounting for regular-expression patterns, basic patterns alone already contain over 8,236,243 characters (bytes) in total. Using the density in [24], whose density of 8.7 bit/char is the best known among existing FSM-based solutions, 8,236,243 characters would require more than 8.57 MB of on-chip memory. On the other hand, if a hash-based approach (Bloom filter, perfect hash, or cuckoo hash) is used, 220 unique lengths demand 220 individual hash units to operate in parallel. Assuming the 20  availability of arbitrarily-sized memory blocks, for a Bloom filter with a moderate theoretical 0.00001% false positive probability (twice that of PERG), the best configuration using two hash functions (a BlockRAM on a Xilinx/Altera FPGA has two read ports) would result in a memory requirement of 6.00 MB for the hash table alone. If cuckoo hashing is used, assuming an extremely high 50% utilization for each cuckoo hash filter and again the availability of arbitrarily-sized memory blocks, the resulted memory requirement would still require an enormous 17.14 MB. This leaves perfect hash with checksum as the only remaining option. However, as discussed in Chapter 2, with the challenge of low setup success rate, perfect hash generally results in a poorly utilized hash table. In addition, the optimal number of bits needed for each checksum is still in question. In conclusion, even with the conservative estimations, the resulting memory requirement from existing approaches exceeds the available resources on the FPGA available today, which tops at about 4 MB of on-chip memory. A new approach is needed clearly.  3.2  The PERG System  PERG is a pattern-matching system designed to accelerate ClamAV by performing the pattern-matching process on FPGA hardware. The pattern-matching system is designed to operate as an offloading engine for antivirus application on a personal computer or a network file server. The FPGA would communicate with CPU through PCI-Express or Front-Side Bus technology like Intel QuickAssist [25]. The underlying operating system can then direct all file traffic, whether from hard disk or network interface, through PERG for virus scan. Antivirus software protection is still needed for exact matching, but only called upon (through hardware interrupt) when a highly-potential match is reported by PERG. 
As a result, most data traffic and scan are performed without interfering the user applications other than possible increase in latency. The PERG system is actually made up of two main parts: a configurable hardware and a pattern compiler. The configurable hardware, designed for an FPGA platform, is the heart of the patternmatching engine. Given a database, the pattern compiler preprocesses the patterns to generate parameters for the configurable hardware to target different databases. Data input to the PERG hardware is assumed to be raw bytes of a sequential file stream. A virus is detected if the input data stream contains any pattern in the ClamAV database. To fit the enormous ClamAV database on a single FPGA, PERG uses a resource-efficient approximate-matching technique to scan the input stream for defined patterns. If a match is detected, this is to be followed by exact matching. The exact matching stage is not part of the PERG hardware and assumed to be performed in software at a slower rate. However, the overhead due to exact matching 21  however is kept minimal to a single-pattern comparison. In a real environment, most input data traffic is presumably free of any viruses. By keeping false positive probability sufficiently low, the proposed architecture is able to keep up with the input line speed of a modern disk or network interface (approximately 200 MB/s). The remainder of this chapter provides a high-level overview of the configurable hardware and pattern compiler. The two components are closely intertwined and difficult to comprehend individually without a basic knowledge of the counterpart. Full implementation details of the pattern compiler and the configurable hardware are discussed in Chapter 4 and Chapter 5, respectively. 3.2.1  Configurable Hardware  The configurable hardware pattern-matches input data for predefined patterns through multiple filtering stages. Throughout the remainder of this thesis, the terms approximate matching and filtering are used interchangeably. When multiple filtering techniques are performed in series, no new false positive is introduced in subsequent stages, since only the intersection of previous stages is passed down. While false positives may still slip through, false positive probability decreases with each additional filtering stage. A high-level architectural diagram of the pattern-matching engine and its decision tree are shown in Figure 3.2 and Figure 3.3, respectively. Unlike a traditional single-stage architecture, in PERG’s multistage paradigm, each stage can operate at a lower rate than its previous stage because the hit rate decreases with each consecutive stage (as explained in previous section, the majority of file inputs is presumably clean and hit traffic is mostly contributed by false positives). FIFOs are connected between stages to buffer input data in the unlikely scenario of a burst of match hits in the input stream.  Figure 3.2: High-level architectural dataflow  22  Inspection Unit  Primary Filtering  Hit?  No  Data Clean  Yes  Input Data  Secondary Filtering  Metadata Unit No  Data Clean  Yes  Hit?  Exact Match  Match?  Yes  Fragment Reassembly Unit  No  Yes  Fragmented?  Host Software (unimplemented)  Reassemble  Data Malicious  Exact Match  Match?  Yes  Yes  No  Last Piece?  Update State  Data Malicious  No  Data Clean  No  Data Clean  Figure 3.3: PERG hardware decision tree  The configurable hardware is partitioned into three units: Inspection, Metadata, and Fragment Reassembly Units. 
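The way a single filter hit flows through these three units mirrors the decision tree of Figure 3.3 and can be sketched in C as below; the types and helper functions are purely illustrative, since the real design implements these steps as pipeline stages rather than sequential software.

#include <stdint.h>
#include <stdbool.h>

/* Illustrative per-hit decision flow (compare with Figure 3.3). */
typedef struct {
    uint16_t rule_id;
    uint32_t byte_count;            /* stream position of the reported hit  */
    bool     checksum_match;        /* secondary checksum comparison result */
    bool     complete_pattern;      /* single-segment (unfragmented) rule   */
    bool     all_fragments_seen;    /* reassembly finished in correct order */
} filter_hit_t;

static void interrupt_host_for_exact_match(uint16_t rule_id, uint32_t pos)
{
    (void)rule_id; (void)pos;       /* host software performs exact matching */
}

static void handle_hit(const filter_hit_t *h)
{
    if (!h->checksum_match)
        return;                     /* false positive filtered out: clean    */
    if (h->complete_pattern || h->all_fragments_seen)
        interrupt_host_for_exact_match(h->rule_id, h->byte_count);
    /* otherwise the Fragment Reassembly Unit only updates the trace state   */
}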
Each unit is accessed in a pipelined manner and in the order as shown in Figure 3.3. The control input informs the hardware the arrival of a new file stream and resets the internal states of PERG hardware at the start of each file stream. Data flow begins at the Inspection Unit which scans the input data one byte per clock cycle with a set of parallel Bloomier filters discussed in Section 2.3. A checksum of the pattern is stored at each hash entry to enable immediate comparison between the input and suspected pattern. This checksum is referred to as the primary checksum. Together, the filters and primary checksums provide the first level of defense; majority of activities in PERG hardware ends here as most data are clean and filtered out safely.  23  In the rare event where a pattern hit is reported by any of the filters, additional metadata information about the suspected pattern is retrieved from the Metadata Unit, which resides on external memory. Metadata includes three pieces of information. The first is the identification (ID) of the virus signature. The second piece provides linkage information: whether this pattern is a complete pattern or a fragment of a pattern, and if it is the latter, where should this fragment reside in sequence in the original pattern. The last piece of information is a cyclic-redundancy checksum (CRC), used to match against that of the reported input string to reduce false positives even further. This checksum is also referred to as the secondary checksum. The secondary checksum comparison is performed in the Fragment Reassembly Unit. If the result returns true, two scenarios can happen. If the reported string is one complete pattern, an exact-matching request is sent out immediately. If the reported string is a fragment of a pattern, then the Fragment Reassembly Unit keeps track of the state information of the corresponding pattern. Only when all fragments of this pattern arrive in the exact order specified by the virus signature is an exact-matching request sent. 3.2.2 Pattern Compiler The pattern compiler preprocesses the pattern database to generate database-dependent parameters and Metadata information used by the configurable hardware. The design flow, shown in Figure 3.4, begins by inputting the targeted pattern database and user definitions to the pattern compiler.  User  Virus Signature Database  Patterns  Pattern Compiler  Parameters  Configurable Hardware  FPGA and SRAM  Bitstream  External Synthesis Tool  Figure 3.4: PERG design flow 24  User definitions define the boundaries of the pattern database and underlying hardware. In concept, these definitions work much like the user constraints in FPGA synthesis; similarly, patterns can be thought of as hardware description files. These parameters have a direct effect on the performance and resource consumption of the resulting hardware. As an example, increasing the number of checksum bits reduces the false positive probability but requires more memory storage and wider comparators in the inspection unit. The complete list of user defined parameters is summarized in Table 3.4. Their values used in this thesis are shown as well as a reference of their appropriate ranges. The significance of each parameter will become clear when the pattern compiler and hardware architecture are explained in detail in Chapter 4 and 5. 
Table 3.4: User definitions Parameter Value Maximum # of bytes per fragment 256 bytes Minimum # of bytes per fragment 4 bytes Maximum # of fragments per pattern 16 Maximum # of OR operator per pattern 6 operators Utilization threshold of each hash table 90% Maximum byte displacement between two 512 bytes consecutive segments # of bits used for primary checksum 8 bits # of bits used for secondary checksum 32 bits After a series of data-parsing and optimizing processes, the compiler outputs two types of information. The first set of information is the hardware parameters used by the configurable hardware. Such information is recorded as header files to be used along with the register-transfer-level (RTL) source code of the baseline hardware in the synthesis tool. Although hardware parameters are used to optimize the hardware against the targeted database, it is not necessary ot change these parameters unless there is a significant change to the targeted database distribution (i.e., significant change in number of patterns or length distribution). The second set of information from the compiler is the memory initialization data, which can be further divided to two types. The first type is the filter hash table data stored in the block memory on FPGA. The second type is the Metadata information and stored in off-chip memory due to its larger storage requirement. Unlike the hardware parameters, memory initialization data is database-specific; an addition or removal of any pattern in the database would require the memory initialization data to be updated accordingly. 25  Chapter 4. Pattern Compiler The pattern compiler takes in the pattern database along with the user constraint definitions to outputs hardware parameters and memory initialization data. The transformation is performed through a series of processes as shown in Figure 4.1. This chapter walks through each stage one by one in the order they are executed in the compiler.  Figure 4.1: Flow diagram of the PERG pattern compiler  4.1 Initialization Initialization fetches patterns from the appropriate ClamAV database files (main.db and main.ndb) and eliminates any duplicate pattern encountered. While most signatures in ClamAV apply to all file extensions, sometimes certain extensions (such as *.exe and *.dll files) share identical pattern strings. As mentioned in previous chapter, the PERG hardware itself does not distinguish file extension or the purpose of the file access, so it checks all files regardless of extension. As a result, duplicate patterns that apply to different file extensions can be removed (111 out of 85,626 rules). When a potential match is reported, host software can decide what action to take and whether file extensions matter.  26  4.2 Segmentation In segmentation, regular-expression patterns containing displacement and/or wildcard operators (see Table 3.2) are divided into segments. The point of separation is at where the displacement/wildcard operator appears in the pattern. For example, pattern “ABCD{10}DEF”, containing the displacement operator “{10}”, which means a gap of 10 arbitrary bytes, and would be segmented into segment “ABC” and “DEF”. For the remainder of this thesis, the terms segment is used to refer to a string of a bytes. In addition to byte string, each segment in PERG also comes with an overhead (metadata) including information like whether this segment is followed by a wildcard. A pattern consisting of one single segment is called single-segmented pattern. 
A pattern with more than one segment is called a multisegmented pattern. As a result of segmentation, metadata is uniquely associated to a segment rather than a pattern. The order in which each segment appears in a pattern is preserved by assigning a pair of link numbers to each segment. A segment identifies itself with the current link number and uses the expected link number to identify which segment shall follow. Segments of the same rule can then be linked together by comparing the expected link number of the most recently discovered segment with the current link number of a newly arriving segment. Current link numbers start at zero for the initial (prefix) segment and increment by one with each consecutive segment until the last (suffix) segment, which has a special terminal of zero for its expected link number. In addition to a link number, it is also necessary to keep track how far consecutive segments are apart from each other. This information can be directly extracted from the displacement operator. Because a segment of n bytes can only appear and become detectable (hashed) after n bytes have passed, the actual displacement value in the metadata must go through offset adjustment. In PERG, a displacement amount is the distance in bytes between the last characters of adjacent segments. However in ClamAV, displacement in its original format is only the gap length between two segments. As a result, offset adjustment is needed in PERG to sum the original displacement value with the byte length of the segment that immediately follows the displacement operator. Using the example in Figure 4.2, the original pattern has a displacement of 10 bytes between two segments. Since the second segment is 3 byte long, displacement value is recorded as 10+3=13 bytes in PERG’s metadata information.  27  Figure 4.2: Offset adjustment  For segments separated by a wildcard, in addition to range information, a wildcard flag is used to record the type of wildcards in metadata. Due to the complexity and close interaction with hardware components, detail of how wildcard support is done in PERG is saved for Chapter 5.  4.3 Special Case Removal This stage removes a relatively small number of patterns from the database for different reasons. Some removed patterns can be handled more efficiently by alternative pattern-matching approaches. Others simply occur so rarely that they do not justify dedicated hardware resource or architectural change. In PERG, a pattern is removed if it contains one or more segments that meet any of the following criteria: 1) Equal or shorter than four bytes long: regardless of the false-positive probability, such short segments would have high hit rates and saturate the fragment reassembly process which is sequential in nature. Since these types of rules are relatively scarce, they are bettered handled by alternative methods like CAM or FSM-based approach. Alternatively, a separate mini version of PERG’s hardware engine using on-chip memory for its Metadata Unit can be used. 2) Contain seven or more Byte-Or operators: See Section 4.5. 3) Contain half-byte wildcard operator(s) (?) or odd number of half-bytes: Like other patternmatching engines, PERG operates only operate on byte boundaries. 1128 patterns are removed based on criteria above. A total of 84,387 patterns remain.  4.4 Filter Consolidation The next stage determines the number of Bloomier filter units (BFUs) needed in the configurable hardware and the hash length for each one. 
Like Bloom filters, each BFU works by hashing a specific string length into a dedicated hash table. However, it is unrealistic to create a BFU for every length due to 28  the wide range of segment lengths and limited FPGA resources. Instead, the compiler selects a small number of distinct filter lengths and splits any segments of the wrong length into two overlapping fragments of the correct length. This process consolidates several BFUs of differing lengths and differing utilization levels into a single highly-utilized BFU. The actual process of splitting segments is performed in the next stage (Section 4.5). The C code to select filter lengths is simplified and shown in Figure 4.3. In the code, basic_patterns[i] is the number of basic patterns with just one segment of length i, and segments[i] is the number of segments of length i produced in the segmentation stage. The algorithm starts from the longest segment length down to the shortest length supported; both parameters are user defined. Xilinx’s 18kb BRAM can be configured to 9-bit mode with 2048 entries, so PERG uses 8 bits for a primary checksum and 1 bit for the Bloomier hash selection data. Moreover, the BRAM has two read ports, allowing two hash functions to index the table in parallel. A Bloomier filter table requires n*k entries for n patterns and k hash functions. Consequently, BFU_TABLE is set to 1024 as that is the theoretical maximum number of patterns that can be mapped on a table of 2048 entries with two hash functions. The algorithm tries to accumulate as many segments as possible (set by UTILIZATION in term of percentage) without over-filling (<1024) a Bloomier filter table of 2048 entries. If the accum_fragment exceeds BFU_TABLE, a larger table will be created by consolidation multiple BRAMs to support more than 2048 entries. In the actual implementation (not shown in Figure 6), this multiple is forced to be a power of 2 to simplify address decoding for the consolidated filter table. The maximum number of BRAMs allowed to be consolidated into one filter is user-defined. A new filter is created whenever the current utilization exceeds the UTILIZATION constant while remaining within 100%. A new filter is also created if the last-created filter uses a length that is twice the current length – this ensures no information is lost through segment splitting (see next stage in Section 4.5). Another exception that will result in a new BFU is if the basic_patterns[i] is exceedingly large relatively to BFU_TABLE; this is done to avoid the overhead from splitting each of them into two segments. Finally, the algorithm has a lower UTILIZATION value as i approaches close to the minimal length of 4 bytes. In the ClamAV database, the most common fragments are relatively short in length. Creating more filter units at those lengths reduces the need to break up these short strings and therefore reduces cache size and average burst length (see Section 4.7).  
29  for( i=longest_segment_or_fragment_length; i>=min_length; i--) { new_segments = basic_patterns[i] + segments[i]; if( new_segments > 0 ) { accum_fragments += new_segments; if ((accum_fragments > UTILIZATION*BFU_TABLE && BFU_TABLE > accum_fragments ) || (filter_cnt>0 && filter_length[filter_cnt- 1]+1>=2*i) ) { // create a new Bloomier filter unit for length i filter_length[filter_cnt++] = i; accum_fragments = 0; //don’t carry forward fragments } else { // carry forward all new segments by splitting in 2 fragments // (i.e., count the second fragment here) accum_fragments += new_segments; } } } Figure 4.3: Simplified C code for filter consolidation step  4.5 Filter Mapping The next stage maps all the patterns to the BFU lengths and sizes selected during the filter consolidation stage. This stage performs the actual splitting of segments into two overlapping fragments to match the chosen BFU lengths. If a segment is split in this stage, its link numbers are adjusted accordingly to reflect the new arrangement. Previous approaches [2, 4] have also suggested splitting long patterns into smaller segments. However, in their case, splitting is done to create two disjoint (non-overlapping) segments. In comparison, the novelty of splitting to two overlapping segments as presented in this thesis has three main advantages. First, this greatly simplifies the filter consolidation’s cost function; a split always results in two segments of equal length and therefore doubles the cost for the next filter length. Second, overlap also reduces the number of unique filter lengths needed. For example, if the pattern “ABCDEFG” is split to “ABCDEF” and “G” instead of two overlapping “ABCDEF” and “BCDEFG”, two filter lengths (7 and 1 byte long) are needed as oppose to one (7 byte long). Third, two long overlapping segments have lower hit rate than one long segment with a short segment. Using the same example, “G” would have a very high hit rate than “BCDEFG” due to the probability of its natural appearance in a string. A high hit rate in a filter can slow down the overall system performances by increasing the number of unnecessary memory and exactmatching requests.  30  4.6 Or-Expansion In this stage, each segment containing one or more Byte-Or operators is expanded to all possible string combinations as demonstrated in Figure 4.4. The newly created segments will have their own metadata as different string combinations should result in different checksums. Their link numbers however will not change since their order within a pattern is not affected by Or-expansion. As a result, each of the expanded segments will have the same link numbers.  Figure 4.4: Byte-Or expansion Expanding a pattern with n Byte-Or operators results in 2n patterns in total. The pattern compiler enforces a user-defined limit of six Byte-Or operators per pattern. Patterns containing more than six ByteOr operators are removed in the special-case removal stage as discussed in Section 4.3. With a cap of six operators, the Or-expansion of the ClamAV database generates 492 additional segments. Due to exponential growth, a cleverer scheme may be needed for other database. One technique is breaking down any segment with larger number of Or-operators into smaller  segments such that  individually, each of them have only a portion of the Or-operators and overall they contribute less expansion overhead in total.  4.7 Metadata Generation This stage is responsible for the creation of the metadata information for each segment. 
Metadata contains the following attributes:   Rule ID    Current and expected link numbers    Displacement    Wildcard prior    Wildcard type    Secondary checksum 31  The meanings of link numbers, displacement, and secondary checksum have already been defined in previous sections. Wildcard prior indicates whether the segment is immediately preceded by a wildcard operator in its original pattern form. Similarly, a wildcard type indicates whether the segment is followed by a within wildcard, at-least wildcard, or neither. How these are actually used to support wildcard patterns is detailed in Chapter 5. Rule ID is simply a substitution of the original virus name. Instead of keeping the virus’s name in string form, an ID number is assigned and used to identify each virus pattern for computation and storage reasons. Naturally, a rule ID is common for metadata within the same pattern and unique among patterns. Segments expanded from the same segment during Or-expansion share the same metadata as well. 4.7.1  Unique Segment  PERG implements the Metadata Unit in an off-chip memory that is 64-bits wide. Under such a configuration, the metadata for each segment has the encoding shown in Figure 4.5. This encoding is for a segment that uniquely appears in just one multi-segmented rule. Type Flag (3 bits)  Link # Byte Distance Rule ID Secondary Checksum (4 bits) (9 bits) (16 bits) (32 bits) Figure 4.5: Metadata format for unique fragment  Type flag encodes the type of metadata and wildcard information using the scheme in Figure 4.6. The Cache Pointer will be explained in the next section.  Regular Fragment refers to fragment of a  multi-segmented pattern that is separated by a fixed byte distance (due to either displacement operator or overlap-splitting). Table 4.1: Type flag encoding format Binary Encoding (3 bits) 011 001 010 000 101 100 110 111  Preceded by Followed by Wildcard Wildcard - Within Wildcard Regular Fragment Wildcard Wildcard – At-least Cache Address Pointer Regular Fragment Wildcard – Within Regular Fragment Wildcard – At-least Regular Fragment Regular Fragment Single-segmented Pattern  32  To improve storage efficiency, Link # stores only the current link number; expected link number is always one plus the current number. Suffix segment is indicated by a Byte Displacement of zero bytes. Displacement is always non-zero for prefix and substring segments, and zero for a suffix segment since nothing follows it. A different encoding format is used for single-segmented patterns. Single-segmented patterns account for 40% (33,674) of the final database. Clearly, a 16-bit rule ID does not give enough range (0 to 65,535) for both multi-segmented and single-segmented patterns (84,387 in total). Since this type of pattern does not need overhead information like Byte Distance (displacement), some of those bits can be reallocated to Rule ID. The format for single-segmented patterns is shown in Figure 4.6. Type flag and Link # bits are always high in this case. Type Flag (3 bits)  Link # Reserved (4 bits) (1 bit)  Rule ID Secondary Checksum (24 bits) (32 bits)  Figure 4.6: Metadata format for unique fragment  4.7.2  Common Segment  In some situations (1303 cases), a segment string is shared by multiple patterns. For example, if pattern “ABCD” and “ABCF” are to be mapped to a BFU whose hash length is three characters long, after filtermapping stage, both patterns would contain the fragment “ABC”. Each instance however needs its own Rule ID, Link #, and etc. 
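Before turning to how such shared segments are stored, the unique-fragment word of Figure 4.5 can be made concrete with the following C sketch of how the compiler might pack it; the bit ordering and names are assumptions for illustration, and only the field widths are taken from the figure.

#include <stdint.h>

typedef struct {
    uint8_t  type_flag;        /* 3 bits: linkage/wildcard type (Table 4.1)  */
    uint8_t  link_no;          /* 4 bits: current link number                */
    uint16_t byte_distance;    /* 9 bits: displacement to the next fragment  */
    uint16_t rule_id;          /* 16 bits                                    */
    uint32_t secondary_crc;    /* 32 bits                                    */
} unique_metadata_t;

/* Pack the 64-bit metadata word of Figure 4.5 (3+4+9+16+32 = 64 bits).
 * Placing the type flag in the most-significant bits is an assumption.      */
static uint64_t pack_unique_metadata(const unique_metadata_t *m)
{
    return ((uint64_t)(m->type_flag     & 0x7u)    << 61) |
           ((uint64_t)(m->link_no       & 0xFu)    << 57) |
           ((uint64_t)(m->byte_distance & 0x1FFu)  << 48) |
           ((uint64_t)(m->rule_id       & 0xFFFFu) << 32) |
            (uint64_t) m->secondary_crc;
}

For single-segmented patterns, the same 64-bit word is used but the Byte Distance field is dropped and the Rule ID widened to 24 bits, as described above.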
For each common segment, its secondary checksum, common to all patterns sharing this segment, is stored off-chip in the Metadata Unit. The unique per-pattern data is stored on-chip in a small yet fast storage called cache. Cache is read once for each pattern sharing the segment. Consequently, if a common segment shared by two patterns is detected, the cache will be read twice to retrieve the data for each pattern. The number of cache accesses, which is equivalent to the number of patterns sharing this segment, is referred to as cache burst length. Like Metadata Unit, cache is read-only and written only during database compilation.  33  Since most information for a common segment is stored on cache, its metadata on the Metadata Unit has a different set of attributes than those of unique segments as shown below:   Address pointer to cache memory location    Cache burst length    Secondary checksum  The encoding format is shown in Figure 4.6. Type flag and Link # bits are always low. The format for information stored in the cache is exactly the same as Figure 4.5 but without the secondary checksum. Type flag Link # Reserved Cache Burst Length Cache Address Pointer (3 bits) (4 bits) (1 bit) (8 bits) (16 bits)  Secondary Checksum (32 bit)  Figure 4.6: Metadata format for common segments 4.7.3  Rule ID Assignment  Rule ID is assigned according to the type of segment into different ID partition ranges. The process first divides single-segmented and multi-segmented patterns into two partitions. This division ensures that each multi-segmented pattern, which has only a 16-bit rule ID, is unique within this bit range. As a result of filter-mapping, it is possible for a single-segmented pattern to be also a common fragment of another multi-segmented pattern. For the purpose of rule ID assignment, these special patterns are categorized as multi-segmented patterns as their information are stored in cache. They can be still recognized as single-segmented patterns by their values of Link # and Byte Distance, which are both zero (i.e. it is both a prefix and suffix). The multi-segmented pattern partition can be further partitioned into two sub partitions: one for any pattern with one or more wildcards and one for those without. This division is done so Rule ID can directly address the Wildcard Table; details about this mechanism are discussed in Section 5.4.2. The resulting Rule ID partition map in PERG using the ClamAV database is shown in Figure 4.7.  34  Multisegmented Pattern  With Wildcard Without Wildcard  Singlesegmented Pattern  Figure 4.7: Rule ID partition While the 64-bit encoding presented in this section works fine for the database used in this thesis, the current format has a few limitations. For example, each pattern cannot contain more than 16 segments since Link # is only 4 bit wide. As a result, a larger database may demand a wider metadata format. However, the same scheme can be used; the only adjustment needed is widening the bit widths of Link #, Rule ID, and Cache Address Pointer.  4.8 BFU Generation This stage generates the Bloomier Filter Units (BFUs), starting from the shortest length to the longest. For each BFU, the construction creates two hash functions based on the SAX hash discussed in Chapter 2 and shown again in Equation 4.1. In this equation, Hi is the partially-formed hash value after seeing i bytes and H0 is an initial constant seed. Ci is the ith byte of the input string, SLi is the amount to shift left, and SRi is the amount to shift right. 
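In software, the same recurrence can be written as the short C routine below; the seed and the per-position shift amounts are arbitrary examples here, since the pattern compiler picks them randomly for each hash function.

#include <stdint.h>
#include <stddef.h>

/* SAX hash of a byte string: Hi = Hi-1 XOR ( (Hi-1<<SLi) + (Hi-1>>SRi) + Ci ).
 * h0 is the initial seed; sl[i] and sr[i] are the shift amounts for byte i
 * (assumed to be less than the word width).                                 */
static uint32_t sax_hash(const uint8_t *c, size_t len, uint32_t h0,
                         const uint8_t *sl, const uint8_t *sr)
{
    uint32_t h = h0;
    for (size_t i = 0; i < len; i++)
        h ^= (h << sl[i]) + (h >> sr[i]) + c[i];
    return h;
}

A Bloomier filter of length L simply takes the value of h after L input bytes, which is why a single unrolled pipeline can serve every BFU.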
This is fully unrolled in a hardware pipeline with overall length equal to the longest segment length. All BFUs share the same hashing pipeline by picking off the appropriate intermediate hash. The initial hash H0 as well as SLi and SRi at each stage uniquely determine the two hash functions needed. Hi = Hi-1 XOR ( (Hi-1<<SLi) + (Hi-1>>SRi) + Ci )  (4.1)  As mentioned in Section 2.3, Bloomier filter construction may fail because a unique one-to-one mapping cannot be produced. The failure probability is low due to the use of multiple hash functions [14]. Nevertheless, if failure does occur, new hash functions have to be used. If the failure occurs while the number of constructed BFUs is low, a new H0 is generated and all constructed filters are reconstructed. This ensures a good H0 selection. However, if the failure occurs later after many filters have been constructed, a complete reconstruction of all filters is tedious. Instead, construction algorithm modifies 35  the shift values for that filter length without disturbing the previous generations. This incremental hashing significantly reduces the time required for filter generation. The process is completely automated and takes only few minutes to complete. Each entry in the Bloomier table also contains an 8-bit primary CRC checksum to reduce false positives. These are computed in this step as well. Checksum entries in the table that do not map to a pattern are initialized to all 1’s. Our experimental observations show that it is common to receive a chain of 0’s in a file stream but not 1’s. Initializing checksum entries to 1’s therefore reduce the false positive probability.  4.9 RTL Generation The last stage generates the Verilog instantiation of BFUs: the hash functions used, its byte length, the size of its hash table, etc. In addition, the bus interface for BFUs to the Metadata Unit is also generated at this stage since it is dependent on the number of BFUs. Another hardware component generated at this stage is the memory decoder for the Metadata Unit. The memory decoder is dependent on both Rule ID partition and the number of BFUs. In PERG, each metadata entry is stored in a unique location in the Metadata Unit. By assigning a unique ID number to each BFU generated, the memory address for the metadata entry can be derived by concatenating the hash value of each stored segment with its BFU ID. The resulting number is guaranteed to be unique, because the hash value is uniquely associated with each segment within each BFU and the BFU ID is unique among the BFUs. These hardware configurations are recorded as Verilog header files and are synthesized along with the rest of the configurable hardware architecture.  36  Chapter 5. PERG Hardware This chapter covers in detail the PERG hardware architecture. The PERG hardware architecture, shown in Figure 5.1, is divided into Inspection, Metadata, and Fragment Reassembly Units. The Inspection Unit and Fragment Reassembly Unit (FRU) reside in the FPGA while the Metadata Unit is stored in an offchip memory. A high-level description of what these units are and how they interact with each other is already covered in Chapter 3. This remainder of this chapter walks through the implementation detail of each unit, in the order of the data flow. Due to the complexity of the design, regular-expression support is  COMP  Metadata Unit  Inspection Unit  covered in a section of its own in Section 5.4.  Figure 5.1: Top-level architectural diagram of PERG.  
5.1 Inspection Unit The block diagram of the Inspection Unit is shown in Figure 5.2. At each cycle, a new data byte from the input file stream arrives at the Inspection Unit. A 32-bit Byte Counter counts the number of bytes in the current file stream. Hash values as well as primary and secondary checksums are calculated with each arriving input byte. The SAX hashing pipeline is fully unrolled and shared by all parallel BFUs in the same way as [3], resulting in a sliding-window hash.  37  The primary and secondary checksums use the same pipelined structure. The primary check is based on cyclic-redundancy check 8 (CRC) while the secondary checksum is based on 2’s complement sum.  Figure 5.2: Inspection Unit.  5.1.1  Bloomier Filter Unit  The internal structure of a BFU is shown Figure 5.3. The hash table is stored in Xilinx 18kb BRAMs. Each 18kb BRAM can be configured to have 2,048 entries. Under this configuration, each entry is 9-bits wide. Eight bits are used for the primary checksum. While the 9th bit is originally intended as a parity bit, it serves nicely as storage for the Bloomier filter select bit V(x). Each BRAM also comes with two true access ports. Both ports are used for read only. One of the ports is multiplexed to allow writing to the hash table to facilitate quick initialization and online database updates. Each port is addressed by the output of the hash pipelines. Each data output contains a V(x) and primary checksum. Using V(x), the hash select can be used to select the following:   Primary checksum to compare against the input data checksum    Hash value to pass to memory decoder to derive metadata address  38  Figure 5.3: Bloomier Filter Unit (BFU)  Multiple BFUs can report hits in the same cycle, so each BFU has a small BFU FIFO (16 stages to map with Xilinx SRL16). Burst of near-simultaneous hits need to be serialized by a bus interface and processed in byte count order. Because PERG contains 26 BFUs, traditional arbiter-based bus architecture would scale poorly with such a large number of bus masters. Instead, a pipelined binary tree is used as shown in Figure 5.4. Requests flow from leaf to root node; the leaf nodes takes input from the output of the BFU FIFOs, and root node is connected to the memory interface of the Metadata Unit. Each node (circle in Figure 5.4) is a mini state machine consisting of mainly a register and a comparator. Comparison of byte count values are always between two nodes only. The pipelined structure also forms a natural FIFO to buffer additional requests to the off-chip Metadata Unit. Additional FIFO can be added at the output of the tree if needed.  Figure 5.4: Pipelined binary tree 39  5.2 Metadata Unit Metadata Unit is a small external memory used to store metadata. The unit receives metadata addresses from the Inspection Unit and outputs the requested data to the FRU. Writing to Metadata Unit is only performed during initialization or database updates. Because the request (hit) rate is relatively low from the Inspection Unit, the underlying memory can operate at a lower clock rate than the Inspection Unit. However, faster memory can help reduce the size of FIFOs needed inside the Inspection Unit. Metadata Unit memory is always less than 1/N utilized, where N is the number of hash functions used in the Bloomier filters, because of the way the metadata address is derived as described in Chapter 4. 
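Concretely, since each metadata address is formed by concatenating the BFU ID with the hash value of the hit (Section 4.9), the derivation amounts to the following; the function name is illustrative.

#include <stdint.h>

/* Metadata address: BFU ID concatenated with the hash value of the hit.
 * Each BFU owns a window of 2^hash_bits addresses, but a table built with k
 * hash functions stores at most (2^hash_bits)/k patterns, which is why the
 * external memory is always less than 1/k utilized.                         */
static uint32_t metadata_address(uint32_t bfu_id, uint32_t hash_value,
                                 unsigned hash_bits)
{
    return (bfu_id << hash_bits) | (hash_value & ((1u << hash_bits) - 1u));
}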
While it is possible to form compact metadata addresses through an on-chip pointer table as in [14] to improve utilization, this option increases latency, adds complexity, and uses expensive on-chip FPGA memory to save relatively cheap external memory.  5.3 Fragment Reassembly Unit The Fragment Reassembly Unit (FRU) reconstructs full patterns from fragments or segments. Inputs to the FRU include metadata, byte count of the request, and secondary checksum of the input string. Both byte count and secondary checksum of the input are received directly from Inspection Unit, while metadata comes from the Metadata Unit. The first stage compares secondary checksums from Metadata and Inspection Units. A mismatch allows the request to be dropped. A match with a single-segmented pattern generates an interrupt to the host for exact matching. A match of a fragment of a multi-segmented pattern results in the fragment’s metadata sent to the Reassembly Controller. Matching a suffix fragment of a pattern is insufficient; all prior fragments to the pattern must also have matched in the correct order at their expected positions, forming a trace of the pattern. The metadata indicates how fragments link together in a trace. When the Reassembly Controller receives metadata, it passes the same metadata to both Circular Speculative Buffer (CSB) and Wildcard Table (WT) in parallel. CSB is used to reconstruct segments separated by fixed byte gaps and is discussed in Subsection 5.3.2. WT is accessed whenever the metadata indicates it is preceded or followed by a wildcard operator. Details about WT and wildcard support are presented in Section 5.4.  40  For both CSB and WT, the metadata process can be divided to two conceptual phases: a verification phase for determining if the incoming string fragment is currently expected, and a speculation phase for recording the next expected segment of the rule trace. As a result, it is possible that both WT and CSB are involved in the same metadata process. For instance, metadata expected by an ongoing trace in CSB may indicate it is followed by a wildcard. In such a scenario, CSB is used to verify that the incoming fragment is expected by an ongoing trace. Upon the verification, CSB will send a control signal to the WT so it will record the expectation for the next segment. 5.3.1  Cache  Cache, described in Chapter 4, is also an important part of the FRU. Unlike typical caches, cache in PERG is simply a fast read-only memory for storing metadata of segments shared in common by multiple patterns. For each pattern sharing the segment, one cache access is needed to retrieve its metadata, and one update to the pattern’s overall progress is made in either the Circular Speculative Buffer or Wildcard Table, or both. Because metadata is always processed in time (byte count) order, metadata output from the cache are processed as a continuous burst until all metadata of patterns sharing the common segment are served. In comparison, when serving only metadata from the Metadata Unit, not all components in the FRU pipeline are active because the FRU, like Inspection Unit, is expected to operate many times faster than the Metadata Unit. While the Reassembly Controller is directing metadata from cache to CSB and WT, new arriving metadata from Metadata Unit must be buffered until the cache burst finishes. However, since cache operates at a much higher speed, the cache burst is generally transparent. 
In PERG, the average cache burst length is two patterns, and most cache burst lengths are less than four patterns. 5.3.2  Circular Speculative Buffer  The Circular Speculative Buffer (CSB) as shown in Figure 5.5 tracks the progress of multiple traces for various multi-segmented patterns. The CSB operates just like a time wheel in an event-driven simulator to schedule future events. In this case, a future event is a speculated match of the next fragment for this rule. This speculated match can be scheduled to occur at a specific Byte Count in the input stream. Like a time wheel, multiple events can be scheduled at the same Byte Count in the future using buckets or, in this case, Data Columns. The number of used buckets used is called the Occupancy. Unlike a time wheel, however, CSB schedules events up to only N-1 bytes into the future, where N is the number of entries (rows) in the circular buffer (512 here). This limitation is done to conserve hardware resource.  41  Figure 5.5: Circular Speculative Buffer (CSB)  Operation of the CSB is best explained by example. Initially, all entries are empty. Suppose a singlebyte fragment is immediately matched, making the current Byte Count equal to 0, the position of the last byte in the fragment. The metadata indicates the displacement is three bytes, meaning the last byte of the next fragment for this rule will appear 3 bytes into the future. This three is added to the current Byte Counter to form the Speculation Pointer (SP), which is used to index the CSB. If necessary, the SP wraps around the end of the table, creating a circular buffer. The occupancy of the SP row is 0, so a new entry is added in Data Column 0, the Occupancy is incremented, and the future Byte Count for that row is written as 3. The Data Column stores 2 pieces of information: the Link # of the next fragment and the current Rule ID. As additional input is scanned, additional fragments may be matched and the current Byte Counter is incremented. If a matching fragment is found at a Byte Count of 3, that row of the CSB is checked to ensure the Byte Count matches; it will not match if the row holds only stale dated speculations. The Occupancy indicates how many rule traces are in-flight and expecting a fragment match at this time. The Rule ID and Link # of the current matched fragment are compared to those active traces in Data Columns 0 through Occupancy-1. For each Data Column (active trace) that matches the current fragment, action is taken. The type of action can be either (1) adding a new entry to the CSB for the next expected fragment, or (2) if the matched fragment is a suffix, then the entire rule ID is matched and an interrupt is sent to the host for exact matching. In the CSB implementation presented in this thesis, the number of Data Columns is fixed at four columns. It is possible, though improbable, that more than four traces are expected to have a fragment match at exactly the same Byte Count. To handle this, the Occupancy is incremented once more (to five) to signify this condition. On a fragment match at this particular Byte Count, CSB ignores the Data Column contents and always assumes the fragment was expected. This is not harmful because it merely means a match may be generated when no match should have occurred, meaning a potential false positive 42  is created. If the entire rule matches, the host will do an exact match to verify the pattern. 
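The scheduling and checking just described can be sketched in C as follows. The structure layout and the stale-row handling are illustrative assumptions; in hardware the four Data Column comparisons are performed in parallel rather than in a loop.

#include <stdint.h>
#include <stdbool.h>

#define CSB_ROWS     512
#define CSB_COLUMNS  4

typedef struct {
    uint16_t rule_id;
    uint8_t  next_link;            /* link number of the expected fragment   */
} csb_col_t;

typedef struct {
    uint32_t  byte_count;          /* stream position this row speculates on */
    uint8_t   occupancy;           /* 0..4 used columns; 5 marks overflow    */
    csb_col_t col[CSB_COLUMNS];
} csb_row_t;

static csb_row_t csb[CSB_ROWS];

/* Schedule the next fragment of a rule `displacement` bytes into the future. */
static void csb_schedule(uint32_t byte_count, uint16_t displacement,
                         uint16_t rule_id, uint8_t next_link)
{
    uint32_t future = byte_count + displacement;
    csb_row_t *row = &csb[future % CSB_ROWS];

    if (row->byte_count != future) {           /* row held stale speculations */
        row->byte_count = future;
        row->occupancy  = 0;
    }
    if (row->occupancy < CSB_COLUMNS) {
        row->col[row->occupancy].rule_id   = rule_id;
        row->col[row->occupancy].next_link = next_link;
        row->occupancy++;
    } else {
        row->occupancy = CSB_COLUMNS + 1;      /* overflow: treat all as expected */
    }
}

/* Is this fragment expected by an in-flight trace at this byte count? */
static bool csb_expected(uint32_t byte_count, uint16_t rule_id, uint8_t link)
{
    const csb_row_t *row = &csb[byte_count % CSB_ROWS];

    if (row->byte_count != byte_count)
        return false;                          /* only stale speculations here */
    if (row->occupancy > CSB_COLUMNS)
        return true;                           /* overflowed: assume expected  */
    for (uint8_t i = 0; i < row->occupancy; i++)
        if (row->col[i].rule_id == rule_id && row->col[i].next_link == link)
            return true;
    return false;
}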
While this scheme does introduce more false positives, it guarantees detection of all rules without any false negative probability. In our experimental results, we find the probability of this happening is extremely low even with just four Data Columns. It is possible to modify the CSB to behave more like a true time wheel and schedule events far into the future. To do this, each Data Column must include a tag field to store the higher-order bits of the Byte Count. This obsoletes the CSB Byte Count column, but complicates the addition of new events. 5.3.3  Exact-Match Request to Host  Upon a match in the FRU, request for exact matching is sent to the host in the form of interrupt. PERG also provides two additional information: the Rule ID and Byte Count of where the virus pattern is detected. Since the value Rule ID is bounded, it can be used to derive memory pointers to the full virus pattern in the host memory. The Byte Count indicates the byte position in the file stream where the last byte of the suspected malicious pattern resides; the exact matcher of the host system can then patternmatch in reverse byte order for the reported pattern.  5.4  Wildcard Support  This section describes how wildcards of different types are supported, in particular with the Wildcard Table (WT) in the FRU. Before getting to wildcards, below is a summary of regular-expression operators in Table 3.2 that are already supported by the mechanisms presented up to this point (Note: single-nibble wildcard is removed by the pattern compiler as described in Section 4.3).   Byte-Or: Support for the byte-or operator is done through the Or-expansion phase in the pattern compiler as discussed in Section 4.6.    Displacement: Displacement operator is supported through the CSB as explained in Section 5.3.1. Because the number of entries in the CSB is limited by the available hardware resources, a userdefined limit of 512 byte displacement is enforced by the pattern compiler. Displacement operators greater than 512 bytes are converted by the compiler to an at-least wildcard to be discussed in Subsection 5.4.1.    Single-byte wildcard: A single-byte wildcard is simply a displacement of one byte. Single-byte wildcard is then treated exactly the same as the displacement operator in the segmentation stage of the pattern compiler.  43  With the above list, the remaining operators are at-least, any-number-of-byte, within, and range wildcards. The treatments of these operators are discussed next. 5.4.1  Wildcards  The remaining wildcards can all be converted to at-least or within wildcards. This conversion is possible because PERG is allowed to generate false positive and rely upon an exact matching done by the host. As a result, by supporting at-least and within wildcards, all the remaining wildcard types are also supported by PERG. As described in Chapter 4, a pattern with wildcards is partitioned into segments at the position where a wildcard operator is found. Consequently, the resulting segments’ metadata goes through the same offset adjustment described in Section 4.4. Regardless of the type of wildcard, if a segment is n bytes long, at least n bytes must have passed since the prior segment before it is possible to detect this segment. A within wildcard of N bytes between two segments means in order for the pattern to match, the appearance of the segment that immediately follows the wildcard operator must appear within N bytes or less after the appearance of the segment immediately before the wildcard operator. 
If the second segment has not arrived after N bytes of the first segment, this particular pattern trace shall be terminated as a mismatch even if the second segment eventually arrives later in the same input stream. Due to offset adjustment in PERG as described in Section 4.2, within wildcards actually are converted to range wildcards. The newly-introduced lower bound equals to the byte length of the segment prior the within wildcard. The upper bound is increased by the length of the segment following the within wildcard. This conversion exists only in intermediate stage of the pattern compiler however as the lower bound shall be dropped. The reason for this conversion is discussed in the later part of this section. An at-least wildcard of n bytes between two segments means n bytes must have passed after the last byte of the first segment before the pattern matcher needs to match for the second segment in the input. In other words, any appearance of the second segment in the input before n bytes has passed has absolutely no effect on the progress of the trace. However, once n bytes have passed, the pattern matcher shall always look out for the arrival of the second segment. A consequence of this behavior is that once the first segment is detected, there is no need to create a new trace if another instance of the first segment ever appears again in the same input stream; this will be reset when a new file is presented to PERG.  44  An any-number-of-bytes wildcard between two segments means the second segment can arrive any time after the appearance of the first segment. Once the first segment is detected in the input, the pattern matcher shall always look out for the second segment. As a result of offset adjustment, any-number-ofbyte wildcard is simply an at-least wildcard with n bytes, where n is the byte length of the segment that follows immediately after the wildcard operator. This transformation is perfect; it does not introduce any new false positives. A range wildcard of n-to-N bytes between two segments has two conditions that must be satisfied in order for the pattern match to progress forward. First, n bytes must have passed between the appearances of the two segments in the input. Second, the following segment must appear within N cycles after the appearance of the prior segment. The fact that each range wildcard contains two values (lower and upper bounds) as opposed to one raises two implementation problems. The first is hardware simplicity: it is difficult to justify the tradeoff of additional complex logic just for range wildcards, which have relatively few occurrences in the actual ClamAV database. Second, an additional value means extra storage requirements, higher external memory bandwidth demand, and possibly a wider memory datapath; again with the sparse usage of range wildcards in the database, these costs are not justifiable. Instead supporting range wildcards directly, this type of wildcard is converted to within wildcard of N bytes by discarding the lower bound information (n). For example, the pattern “ABC{10-20}DEF” indicates a lower bound of n=10 bytes and an upper bound of N=20 bytes. This now becomes “ABC{20}DEF”. This conversion is performed at the segmentation stage in the pattern compiler. The above approach is lossy as it introduces false positives that would otherwise be filtered out by the lower bound condition. 
However, the system remains with zero false negative probability as whatever satisfies the range condition of n-N bytes must also satisfy the up-to N byte condition. Using the previous example, if the input satisfies the pattern “ABC{-20}DEF”, it must also return true when matched against “ABC{10-20}DEF”. The upper bound is kept instead of the lower bound because an upper bound implies a life time to the pattern trace and therefore has a lower overall increase in false positive probability. In comparison, if a range wildcard is converted to an at-least wildcard, the system will remain prone to false positives after the detection of the first segment until the very end of the file stream.  45  In summary, the remaining wildcards can be converted to either at-least or within wildcards as shown in Table 5.1. The next section describes how at-least and within wildcards are handled in with the Wildcard Table hardware. Table 5.1: Conversion of different ClamAV wildcard operators  5.4.2  Symbol  Original  After Conversion  ??  Single-Byte Wildcard  Displacement  *  (Any-Number-of-Byte) Wildcard  At-Least Wildcard  {n}  n Byte Displacement  Displacement  {n-}  At-Least (n-Byte) Wildcard  At-Least Wildcard  {-N}  Within (n-Byte) Wildcard  Within Wildcard  {n-N}  Range wildcard  Within Wildcard  Wildcard Table  The Wildcard Table (WT) is essentially a one-dimensional array used to record the state of each pattern containing one or more within or at-least wildcards. Each every entry is mapped to one unique pattern with one or more wildcards and indexed directly by the rule ID of the incoming metadata. For this reason, the wildcard patterns are assigned the first block of Rule IDs starting at 0. The contents of each WT entry are shown Figure 5.6. Valid (1 bit) Wildcard Type (1 bit) Byte Range (9 bits) Link # (4 bits) Figure 5.6: Attributes of Wildcard Table entry format  The Valid bit is used to indicate the relevance of the data in the WT entry. A WT entry shall only be interpreted if its valid bit is set; otherwise, a WT entry behaves as if all bits are zeroes. The Valid bit, stored as a flip flop, is cleared at the start of each input stream and set whenever the WT writes to it. The remaining attributes of the WT entry are stored in a Xilinx 18kb BRAM; while these attributes are never reset, they can be safely overwrittenb when the entry’s Valid bit is cleared. Wildcard type is a flag that indicates whether the segment is followed by an at-least or within wildcard; When a trace is started and the Valid bit is set, the Wildcard type bit is determined from the Type flag attribute in the metadata. The type of wildcard decides whether lower (at-least) or upper (within) bound condition needs to be satisfied for the incoming segment in order for the trace to progress. 46  Byte Range is the displacement value, n or N. This value is also determined from the byte distance attribute in the metadata; whether the value is interpreted as the lower (n) or upper (N) bound depends on the metadata’s type flag. Link # is used to track the arrival sequence order among multiple segments in a pattern. This is exactly the same as in CSB. Because each pattern may contain multiple wildcards, multiple traces of the same pattern may arise while scanning an input stream. Since there is only one WT entry per pattern, the WT can only track one trace per pattern. 
In other words, within the same input stream, different traces of the same pattern are mapped or compressed in a lossy fashion into one storage location to eliminate the need to keep and maintain separate state information for a potentially exponential number of traces of the same pattern. To compress, a trace of a pattern only progresses forward (i.e., advances in Link #) and never backtracks. Each entry remains static only until the next expected segment arrives. The repeated arrival of current or previous segment of the same pattern is ignored. The effective state diagram is shown in Figure 5.7.  Figure 5.7: Wildcard state diagram  Moving forward not only saves storage but also ensures false negative probability remains at zero. In comparison, if a traditional AC automaton is used here instead, a false negative is possible since the automaton may be misled by false positives (emitted by the BFUs) to a false state and overlook the segment it should have been looking out for. For at-least wildcards, the proposed WT mechanism works without any increase in false positive probability. Take the example pattern “ABC{3-}DEF{3-}EFG” and assume its WT entry has already detected up to the second fragment “DEF”. Regardless of when and how many instances of “ABC” and  47  “DEF” appear later, what matters is the appearance of “EFG”, which can appear any time after three bytes have passed after the first “DEF” is detected. In the case where a fragment is preceded by an at-least wildcard and followed by a regular fragment (as the “OMG” in the pattern “LOL{3-}OMG{3}UBC”), its corresponding WT entry is left unchanged as the speculation phase is performed in the CSB not in the WT. As a result, the WT continue to watch for the arrival of “OMG” while the CSB waits for the next fragment “UBC”. In this case, the simultaneous use of the CSB and WT allows two traces of the same pattern to be followed. For within wildcards, everything works the same as at-least wildcards with one exception: an update to the byte range. Unlike the at-least condition, the within condition expires after the given number of bytes has passed since detection of the last segment. One way to get around such complexity is to simply remove the upper limit by converting the within wildcard back to an at-least wildcard. However this approach would increase false positives so it is not used. Instead the WT keeps the upper bound and refreshes the Byte Range condition every time the current segment is matched again. With this approach, the WT is monitoring two traces: both the current and expected link numbers of the incoming metadata. Whenever the current segment (whose expected link number is the same as the link number in the WT entry) reappears in the input stream, the WT entry’s byte range, which represents the upper bound for the next segment, is updated accordingly. The new value should be the sum of the metadata’s Byte Count (when the hit occurred in the BFU) and its byte distance attribute; calculation of this value can be computed ahead of time as metadata flows through the FRU FIFO. The operation of wildcards is summarized in the pseudo code in Figure 5.8. In any true trace of a within wildcard (i.e., the real pattern does actually appear in the input stream), the current segment must always arrive before the appearance of the next fragment at a byte position satisfying the bound condition. In the proposed lossy WT trace, the range attribute will always be refreshed just in time in a true trace. 
However, additional false positives are introduced when the Byte Range is refreshed by a current segment that appears but does not satisfy all of the proper preceding segments in the overall rule.

//Is the incoming metadata a segment preceded by a wildcard operator?
if (metadata.preced == WILDCARD) { //Yes
    //Is the incoming metadata expected by an ongoing trace?
    if (WildcardTable[metadata.ruleID].link == metadata.link) { //Yes
        //What type of wildcard operator is it?
        if (WildcardTable[metadata.ruleID].type == WITHIN) { //Within
            //Did the metadata arrive within the ByteRange condition?
            if (WildcardTable[metadata.ruleID].byte_cnt > metadata.byte_cnt) {
                //What type of metadata to expect next?
                if (metadata.follow == WILDCARD) {
                    Update_WT(metadata);
                } else {
                    Update_CSB(metadata);
                }
            }
        } else { //At-least
            //Did the metadata satisfy the ByteRange condition?
            if (WildcardTable[metadata.ruleID].byte_cnt < metadata.byte_cnt) {
                //What type of metadata to expect next?
                if (metadata.follow == WILDCARD) {
                    Update_WT(metadata);
                } else {
                    Update_CSB(metadata);
                }
            }
        }
    }
    //Special case for within wildcards:
    //if the incoming metadata is a Within type, check if it is the same as the previous metadata
    else if (WildcardTable[metadata.ruleID].link == metadata.next_link) {
        if (WildcardTable[metadata.ruleID].type == WITHIN) {
            //Update ByteRange to extend the trace's lifetime
            WildcardTable[metadata.ruleID].byte_cnt = metadata.range + metadata.byte_cnt;
        }
    }
} else { //No, it is an ordinary displacement segment
    //Verify with CSB
    if (Verify_CSB(metadata) == EXPECTED) {
        if (metadata.follow == WILDCARD) {
            Update_WT(metadata);
        } else {
            Update_CSB(metadata);
        }
    }
}

Figure 5.8: Pseudo code describing the behavior of the Wildcard Table

One issue with the WT is that forward-only lossy progress may lead to a large number of verifications. For example, for the pattern “ABC{-3}DEF{-3}EFG”, consider that DEF may have been detected by a false positive. Once progress is made to the last fragment EFG, verification must be done whenever another EFG segment is matched in the data stream. The only time this history state is cleared is when a new file begins. However, our experiments show that with the Bloomier filters and two levels of checksum, it is very unlikely a trace would progress that far in the first place. Alternatively, host software can roll back the WT for a rule when verification by the host fails. This is left as future work.

Chapter 6. Experimental Results

This chapter is divided into three main sections. The first section discusses the experiment setup. The second section evaluates PERG’s performance and resource density and compares them with existing pattern-matching solutions. The third section analyzes the scalability and updatability of the PERG architecture.

6.1 Experiment Setup

PERG is evaluated in two ways: using 1) a cycle-accurate simulator and 2) a synthesizable Verilog model. The cycle-accurate simulator is used to measure performance with a real file stream and to monitor false positives, FIFO utilization, and the back-pressure flow control. The Verilog model is used to evaluate FPGA resource consumption and the operating frequency of the hardware. Both the simulator and the Verilog model have the same system parameter configurations, as shown in Table 6.1. Note that there is a minor difference in operation between the Verilog and the simulator in the Metadata arbitration: a simple FIFO is modeled in the simulator as opposed to a binary tree in the Verilog, since there is no fan-out issue involved in simulation.
Table 6.1: PERG system parameters

Compiler Parameters  (User Defined?)  (Value)
Minimum Segment Length (bytes)  Yes  4
Maximum Segment Length (bytes)  Yes  256
# of Hash Functions  Yes  2
# of Data Columns in CSB  Yes  4
Maximum Byte Displacement  Yes  512

Hardware System Parameters  (User Defined?)  (Value)
Shortest BFU Length (bytes)  Generated  4
Longest BFU Length (bytes)  Generated  150
Number of Bloomier Filter Units  Generated  26
Maximum Hash Table Size (entries)  Generated  32,768
BFU FIFO Depth  Yes  16
Maximum # of Cache Entries  Generated  512
State FIFO Depth  Yes  64
Arbiter FIFO Depth  Yes  64
Secondary Checksum (bits)  Yes  32

6.1.1  Cycle-Accurate Simulator

The cycle-accurate simulator, written in C, captures the expected behavior of the underlying hardware it is mimicking. The simulator allows collection of internal statistical information from a large amount of input data, which cannot be done easily in hardware. The statistics collected include information such as the false positive probability and the number of off-chip memory requests. In addition, the simulator also provides an early functional model for development purposes, allowing early evaluation of the feasibility of the proposed algorithm and architecture. Finally, the simulator is also used to generate test vectors which are then fed to a Verilog testbench for the verification of the Verilog implementation.

6.1.2  RTL Implementation

PERG is implemented in synthesizable Verilog targeted for a Xilinx Virtex-II Pro VP100 FPGA on the Amirix VP1000 board. The circuit has been verified through Modelsim simulation and implemented with Xilinx ISE 9.1 with a core operating frequency of 180 MHz. For the Metadata Unit, the model assumes a 4 MB CY7C1380 SRAM with a 64-bit data width running at a quarter of PERG’s operating frequency (45 MHz) with a 2-cycle read latency. This choice of off-chip memory is also based on the Amirix VP1000 board and is simulated with a Verilog model from Cypress Semiconductor. The Verilog model here is intended as a proof of concept without much optimization effort. For example, the current bottleneck (worst timing slack failure) is the 32-bit comparator used to compare the input Byte Count with the Byte Range of a Wildcard Table entry.

6.1.3  Pattern Database

The pattern database used is from main.db and main.ndb in ClamAV 0.93.1, containing 85,625 basic and regular expression rules in total. Of these, 1,128 patterns are removed as special cases by the pattern compiler, resulting in a total of 84,387 rules containing 164,864 fragments and 12,049,565 bytes (characters). A total of 1,230 segments/fragments are identified as shared by two or more rules.

6.1.4  Test Setup

To evaluate the average performance as well as the false positive probability on a realistic file stream, the Ubuntu 7.10-Desktop i386 ISO image is used as the data source. The file is scanned with the ClamAV software to ensure no virus matches in this data. Hence, all matches found by PERG are false positives. The use of an Ubuntu ISO enables two types of tests: a Single File test and an Extracted Files test. In the Single File test, the entire ISO image file (696 MB) is scanned. In the Extracted Files test, the ISO image is extracted into 274 individual files totaling 694 MB (files shorter than 4 bytes are excluded).

6.2 Performance Results

This section evaluates the throughput and resource density of the PERG hardware using the setup described in Section 6.1.

6.2.1  Throughput

The performance parameters for PERG are summarized in Table 6.2.
In both tests, four false positives have been reported. Three out of four are actually true pattern matches, but ClamAV considers them safe due to mismatched file extensions. It is worthwhile to note that there are no false positives resulting from the Wildcard Table.

Table 6.2: Simulated performance results
(Metric:  Single File (Ubuntu7_10_x86.iso)  /  Extracted Files (274))
# of Bytes Scanned:  729,608,192  /  727,677,929
# of False Positives:  4  /  4
False Positive Probability for Each Byte Scanned:  0.0000005%  /  0.0000005%
# of Off-chip Memory Requests:  82,499,591  /  80,500,329
Probability of Off-chip Memory Request for Each Byte Scanned:  11.31%  /  11.07%
Off-chip Memory Throughput:  19.4 MB/s  /  19.0 MB/s
# of Secondary Check Hits:  5,931,478  /  4,005,953
Probability of Secondary Check Hits for Each Byte Scanned:  0.81%  /  0.55%
Average Throughput:  166 MB/s (0.922 B/cycle)  /  168 MB/s (0.933 B/cycle)
Modeled Frequency:  180 MHz  /  180 MHz

PERG usually accepts one byte every clock cycle from a continuous file stream. However, due to internal multi-cycle operations and limited FIFO space, PERG sometimes (roughly 7% of the time) stops accepting a new byte and applies backpressure flow control. Backpressure can be reduced with larger FIFOs and faster off-chip memory.

6.2.2  Resource Consumption

PERG’s resource consumption on the Virtex-II Pro VP100 is shown in Table 6.3. While PERG’s LUT consumption fits comfortably within most modern FPGAs, its block memory usage requires the higher-end members of families such as Spartan-6 or Cyclone III.

Table 6.3: Overall resource consumption
Resource  Value
LUTs  32,870
Flip Flops  13,624
BRAMs  168

6.2.3  Comparison

This section compares PERG’s performance with existing pattern-matching solutions. The candidates chosen are those that published quantitative information regarding the database used and the resources consumed; many previous publications on pattern-matching engines lack such information. One immediate challenge in comparing results is the lack of a common database input. PERG is the only known FPGA-based pattern-matching engine to use the ClamAV database. Among the previous works targeting Snort, each uses a dramatically different Snort database. For instance, [27], one of the most recently published works as of 2009, still uses Snort 2.4, released over three years earlier. Others like [28] use a proprietary pattern generator to evaluate the effectiveness of their architectures. Attempts to reproduce the databases used by previous work were unsuccessful. The latest and only publicly available version of Snort (2.8) from SourceFire contains a myriad of text-specific operators from the PCRE standard, such as case-insensitivity, that are currently not supported in PERG. While lossy support for such operators may be possible through modification of the pattern compiler, the resulting impact on false-positive probability cannot be evaluated without a network testbench. As a result, mapping Snort onto PERG for comparison with previous work is left as future work. For the reasons described above, observations on the results shown in Table 6.4 should emphasize throughput and density, while the total numbers of characters and patterns are purely for reference. Even among the Snort-targeted designs, the actual pattern sets used appear to vary greatly. To address this lack of a standard database, two papers [9, 10] use a Performance Efficiency Metric (PEM), which is now used by several other works in this area. PEM is defined as the ratio of throughput (Gbps) to logic cells per character; i.e., PEM = Throughput/(LC/Chars) [9].
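As a quick worked example of this metric, using PERG's own figures reported in Table 6.4 (a throughput of 1.3 Gbps, 42,809 logic cells, and 8,645,488 characters):

$$\mathrm{PEM} = \frac{\text{Throughput}}{\text{LC}/\text{Chars}} = \frac{1.3}{42{,}809 / 8{,}645{,}488} = \frac{1.3}{0.00495} \approx 263,$$

which is consistent with the 262.62 reported for PERG in Table 6.4; the small difference comes from rounding the throughput to 1.3 Gbps.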
The assumption with PEM is that this ratio is relatively independent of the targeted database. A higher PEM means the system offers better performance and character density ratio. Table 6.4 summarizes the performance and resource consumption of PERG and prior solutions. The results are meant to be a quantitative comparison; some of these candidates offer different features that others do not, such as regular expression support. As a result, the chosen list of candidates includes those that stand out in terms of quantitative results and provide sufficient quantitative information about their test setups. The best value at each metric is highlighted in bold. Another worthy note is that other than PERG and [24], which is an FSM-based approach, none of the hash-based solutions shown [3, 9, 10] support regular expressions. It is unclear whether regular expression patterns are included in the number of patterns claimed in [3, 9, 10]. The density shown in Table 6.4 is derived from the assumption that only patterns actually supported are reported when it is unclear. Note the throughput presented here for PERG has already factored in the average throughput shown in Table 6.2. Moreover, because of PERG’s pattern compiler design, the actual number of characters and string mapped to PERG is greater than the number presented in Table 6.4. Finally, memory consumption reported only accounts for BRAMs, since on-chip memory is the limiting factor while off-chip memory is abundant in size and inexpensive in cost. Although PERG lags behind in throughput performance, most of this is due to the lack of hardware optimization effort on the Verilog model. Nevertheless, PERG’s current throughput is sufficient to keep up with most data interfaces, assumed to be in the range of 1 Gbps (125 MB/s). Most importantly, PERG outperforms the nearest competitor by 8.7x better in logic cell density and by 24.7x better in memory density.  54  Table 6.4: Resource utilization and performance comparison System  PERG (raw) Cuckoo Hashing [3] HashMem [9] PH-Mem [10] ROM+Coproc [24]  Device Freq. # of Chars # of LCs Mem (kb) LCs per (Xilinx) (MHz) Char  Mem. per Throughput PEM Char (Gbps) (bits/char) 1.3 0.354 262.62  XC2 VP100 XC4 VLX25  180  8,645,488  42,809  3,024  0.00495  285  68,266  2,982  1,116  0.043  16.7  2.28  53.02  XC2 V1000 XC2 V1000 XC4 VLX15  338  18,636  2,570  630  0.140  34.6  2.70  19.60  263  20,911  6,272  288  0.300  14.1  2.11  7.03  250  32,384  8,480  276  0.260  8.73  2.08  8.00  In terms of PEM, PERG also leads the nearest competitor by nearly 5 times. However, all these achievement may be due to the use of the much larger ClamAV database. One major problem with using LC/char, bits/char, and PEM is that they all are proportional to the number of characters in the database used. As a result, this approach strongly favors hash-based solutions whose consumption depends little on the number of characters in databases used. Instead, this thesis proposes two new metrics: Logic cells per Pattern (LP) and Memory bits per Pattern (MP). With LP and MP, two additional metrics can be derived similar to PEM: Throughput to Logic cells per Pattern (TLP) and Throughput to Memory bits per Pattern (TMP). Higher TLP or TMP means the system offers better performance and pattern density ratio. Comparison using the new metrics is shown in Table 6.5. 
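To make the new metrics concrete, the following is a worked example using PERG's own numbers already reported in this chapter (84,387 patterns mapped, 42,809 logic cells, 3,024 kb of block memory, and 1.3 Gbps of throughput); it simply applies the definitions above and is not an additional result:

$$\mathrm{LP} = \frac{42{,}809}{84{,}387} \approx 0.507,\qquad \mathrm{MP} = \frac{3{,}024}{84{,}387} \approx 0.0358\ \text{kb/pattern},$$
$$\mathrm{TLP} = \frac{1.3}{0.507} \approx 2.56,\qquad \mathrm{TMP} = \frac{1.3}{0.0358} \approx 36.3,$$

which agrees with the PERG row of Table 6.5.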
Note for patterns mapped in PERG, the number here refers strictly to the actual number of ClamAV signatures mapped to the system as opposed to the total number of segments generated by the pattern compilers and mapped to BFUs. Table 6.5: LP, MP, TLP, and TMP comparison System  Patterns mapped  PERG Cuckoo Hashing [3] HashMem [9] PH-Mem [10] ROM+Coproc [24]  84,387 5,026  LC per Memory TLP TMP Pattern per Pattern (kb/pattern) 2.56 36.28 0.5073 0.0358 0.5933 0.2220 3.84 10.27  1,474 2,200 2,031  1.7436 2.8509 4.1753  0.4410 0.1309 0.1359  1.55 0.74 0.50  6.12 16.12 15.31  55  While PERG continues to lead in TMP, in terms of TLP, Cuckoo Hashing actually outperforms PERG by nearly 50%. The superiority of [3] over PERG is interesting considering that the two solutions share the same underlying SAX hash algorithm. The result is likely to be due to the extra logic required by the CRC-8 and 2’s complement checksum pipelines in PERG. Fragment Reassembly Unit is another area that likely contributes to PERG’s logic use. Finally, Cuckoo Hashing logic is simpler and runs at a higher throughput than PERG. In addition to FPGA-based pattern matching engines, PERG’s throughput is also compared against the ClamAV 0.93.1 software running on a modern PC powered by an Intel Core Duo E4500 2.2 GHz processor with 3GB of RAM. In this experiment, ClamAV performs recursive scan on the extracted files. The ClamAV scan uses the same database as PERG and therefore excludes the MD5 checksums. The throughputs of both solutions are shown in Table 6.6. The throughput of ClamAV is determined by dividing the size of the scanned data with its reported run time. Note unlike PERG, the ClamAV software can distinguish file extensions and therefore match against only the necessary patterns for each file. On the other hand, ClamAV does perform exact-matching, whereas PERG ignores the timing overhead required by the exact-matching of the four false positives detected. Table 6.6: PERG vs ClamAV throughput Average Throughput Speedup 166 MB/s 15.76x PERG 1x ClamAV 0.93.1 10.5 MB/s  6.3  Impact of Filter Consolidation  This section evaluates the impact of the filter consolidation technique introduced by this thesis. Comparison results are summarized in Table 6.7. The emphasis here is on the resource saving. While filter-mapping does increase the number of segments by 51,724 segments (57.8%), the use of filter consolidation results in 194 less BFUs, which translates to 88 fewer BRAMs. Table 6.7: Impact of filter consolidation Without Filter Consolidation Total # of Segments Mapped 89,423 to BFUs 220 Total # of BFUs Total # of BRAMs used by 256 BFUs 132 # of Cache Entries  With Filter Consolidation 141,147 26 168 3823  56  In terms of performance impact, additional BFUs has little impact; as indicated later in Table 6.8, the hit rate of BFU has no obvious if any relation with the utilization of each BFU. Due to the pipeline structure, operating frequency is unlikely to change as well; additional resource usage however will have an indirect impact through placement and routing contention.  6.4  Scalability and Dynamic Updatability  Scalability refers whether the architecture can achieve high density/utilization of patterns in each BFU without reporting a large number of false positives. Dynamic update refers to the ability to add new rules to previously generated hardware without the need to recompile the RTL. 
This is important because recompiling and re-verifying the RTL can take hours or even days; this allows immediate virus protection during that time period. This section evaluates the scalability and dynamic updatability of the PERG architecture. Given the rapid growth and frequent updates to virus databases, it is important to verify PERG scalability and updatability. The first part of this section evaluates PERG scalability realistically using a daily release of the ClamAV database. The second part explores the theoretical scalability and dynamic updatability; this part uses random patterns to provide avoid any dependency on the specific database used. 6.4.1  Scalability and Dynamic Updatability using ClamAV  This part evaluates scalability and dynamic updatability of the PERG architecture by generating a baseline system with ClamAV 0.93 and updating this baseline with new additional patterns from the ClamAV daily release updates. Note that this subsection uses a slightly modified pattern compiler compared to the rest of this thesis. While experimenting with the filter consolidation step, we decided to force a new BFU to be created for all segments less than 8 bytes in length. This result in slightly more BFUs (27 vs 26), but avoids breaking up very short strings into even shorter fragments; these short fragments run the risk of increased BFU hit rate due to the higher probability of true appearances in the data stream. After compiling the main database, the utilization and configuration of each BFU is shown in Table 6.8. Recall from Chapter 4 that the filter consolidation algorithm attempts to pack each BFU above 90%. The exceptions to this are: the utilization bar is lowered greatly for short lengths, and a new BFU length is forced whenever the current length is equal to half of the last-assigned BFU length (for example, the 32/16 split) or if the current length has an exceeding number of fragments that is much greater than the BFU_TABLE size. Since the maximum number of patterns that can be mapped onto a Bloomier hash table of m entries with two hash functions is m/2, theoretically, given there are 1.7 fragments per rule on 57  average, with 86.7% utilization in filter table entries, the current system is still capable of accepting 12,935 more rules with the exact same hardware resource usage. This estimation does not account for the fact that these new rules may contain common fragments, which utilize cache and do not need to be hashed separately into a BFU, nor does it account for the fact cache may already be full. Table 6.8: BFU utilization and insertion Main Database Update test 1 Update test 2 BFU Frag Max # # of Frag Frag Frag Len of Frags Util. Hit Rate Frag Succ. Frag Fail Succ. 
Fail 150 16384 12994 79.3% 3% 50 0 50 0 149 148 144 140 134 131 129 128 125 116 101 97 95 76 69 57 52 39 35 32 16 11 7 6 5 4  1024 2048 4096 16384 4096 2048 1024 2048 2048 8192 16384 4096 2048 16384 4096 8192 2048 16384 8192 2048 16384 4096 2048 1024 1024 1024  975 95.2% 1915 93.5% 4041 98.7% 11305 69.0% 3939 96.2% 1999 97.6% 101 99.5% 1939 94.7% 1845 90.1% 7511 91.7% 15103 92.2% 4066 99.3% 1926 94.0% 14868 90.7% 3991 97.4% 7710 94.1% 1961 95.8% 15109 92.2% 7536 92.0% 1828 89.3% 14024 85.6% 3620 88.4% 1202 58.7% 145 14.2% 156 15.2% 147 14.4%  3% 3% 3% 3% 3% 3% 3% 3% 3% 3% 3% 3% 3% 3% 3% 5% 3% 3% 3% 3% 6% 6% 3% 3% 3% 3%  3 4 12 13 11 0 3 6 2 18 56 128 10 70 31 81 50 268 79 56 575 229 202 26 33 24  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0  3 4 12 13 11 0 3 6 2 18 56 128 10 70 31 81 50 268 79 56 575 229 202 26 33 24  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0  One possible counter argument to the scalability claim is that as a deviation of the Bloom filter, the false positive probability in each Bloomier filter will increase with higher level of utilization. In reality however, as shown in Table 6.8, the hit rate for each BFU, which is represented in percentage of total BFU hits, is relatively evenly distributed at 3% except for bursts at length 11, 16, and 57. While these bursts suggest one cannot conclude the hits in BFUs are evenly distributed, they also suggest that there is no obvious correlation between the hit rate and utilization. The reason is PERG relies mostly on the CRC8 checksum to immediately reduce for false positive hits. 58  Another possible argument against the scalability claim is that the probability of perfect-hashing a new pattern into the BFUs would be low. To determine this behavior, the dynamic updatability of PERG is tested experimentally using the daily signature update (Sept. 29 2008) from ClamAV.net as a test subject. After preprocessing through the compiler flow, 1033 rules were identified and broken down into 2,040 unique fragments. The probability of successful insertion to the BFUs is tested follows. First, the BFUs were pregenerated with the original database from main.ndb and main.db. Then, two tests were run. In Test 1, each fragment is inserted one at a time, and removed before insertion of the next fragment so they do not interfere each other. For Test 2, the fragments are inserted cumulatively and never removed unless it fails to map; the order they are inserted is arbitrary without any optimization. In both tests, only one fragment from one rule cannot be perfectly-hashed by the hard-coded hash functions. To insert the last few un-mapped fragments, the hash functions in PERG needs to be reprogrammable without recompiling the RTL. This can be done by making the initial hash constant H0 a user-configurable value. Also, the shift amounts SLi and SRi of the SAX hash can be made reprogrammable at each stage as well. However, making them fully programmable would require two full barrel shifters in each stage, which is probably too much area overhead. Instead, each shifter can select between two constant shift amounts, giving each stage four different configurations. Adding re-programmability to the hash functions is left as future work. 6.4.2  Scalability and Dynamic Updatability using Random Patterns  This part evaluates the theoretical scalability and dynamic updatability of the PERG architecture using randomly generated databases. 
In comparison with the previous part, this part focuses on evaluating the PERG architecture’s sensitivity to database pattern changes (scalability) and ability to accept new (dynamic updatability). These aspects are explored in two separate tests. In both scalability and dynamic updatability tests, four baseline BFU configurations are used and outlined in Table 6.9. Like PERG, the hash function used is also based on SAX. These baseline configurations are reconstructed at the start of each test trial to ensure independent results; each trial will test a parameter value at a configuration for 50 trials. The BFU in all baseline configurations have 10,000 hash table entries and have two hash functions; i.e., each BFU has a maximum theoretical capacity of 10,000 patterns. This number is chosen arbitrarily based on observations of BFUs size made while using the ClamAV database.  59  Table 6.9: Baseline configurations Baseline Name # of Bloomier filters  Length of each Bloomier filter (characters)  Config. 1  2  8, 16  Config. 2  4  8, 16, 24,32  Config. 3  8  8, 16, 24, 32, 40, 48, 56, 64  Config. 4  16  8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88, 96, 104, 112, 120, 128  Both tests share the same random pattern generator, which can generate an arbitrary number of random byte strings. All generated patterns are unique and map perfectly with one of the predefined BFU lengths of the chosen baseline. There are two reasons not to include patterns of random lengths into each test. First, it is reasonable to assume that the general distribution of whatever targeted application database will not change dramatically on a daily basis (which uses the dynamic updatability). Second, the overlapping technique described in Section 4.3 would effectively eliminate the different lengths; introducing an unfitted pattern is no different from adding two patterns that fit perfectly with existing BFU lengths. Note this reasoning is based on three assumptions: 1) existing BFU lengths are no more than two times longer or shorter than its neighbors’, 2) whatever new length introduced must be greater than the current shortest BFU length, and 3) common segments are stored in cache. To limit the scope of exploration, two limits are set. First, once a baseline is constructed, the initial hash constant H0 is the only variable to create new hash keys during dynamic update. The limitation also stresses the dependency among the BFUs in each configuration. Second, a hard limit of up to 50 rehashes is allowed for each pattern insertion or database reconstruction. Without these two limitations, results can be tuned quite favorably by exhausting each test iteration; it is unreasonable in a real application scenario to expect an exhaustive test for each update. The first test determines how sensitive each baseline platform is to changes in the database patterns. Utilization of each BFU is set to be the same as PERG’s default minimum threshold value at 90%, translating to 9,000 randomly generated patterns in each BFU. After construction, each BFU has 9,000 patterns mapped to it. Each test iteration overwrites a portion of the 9,000 patterns of all the BFUs in the configuration with randomly generated patterns. The same portion is overwritten in each trial so patterns outside of the portion are unchanged. The results are shown in Table 6.10 to 6.13. As the number of portion changed has minor influence on the success of the BFU reconstruction. 
This finding suggests that other than storage limitation, having a highly utilized BFU will have negligible impact on the scalability of the system. It is worth pointing out that Config. 3 in Table 6.12 indeed do not have any setup failure in the trials conducted. 60  As the number of BFUs changes in the configuration, the number of rehashes needed for reconstruction as well as the number of failed reconstructions increased rapidly. The trend suggests the need have more flexibility in the SAX hash parameters, as suggested in Subsection 6.3.1.  Table 6.10: Impact of database change in Config. 1 Number of BFUs Total number of patterns Utilization Change % Average number of rehashes Number of setup failures (out of 50)  2 180000 90% 10% 3.08 0  2 180000 90% 25% 4.66 0  2 180000 90% 50% 4.78 0  2 180000 90% 75% 4.34 0  2 180000 90% 100% 5.04 0  4 360000 90% 75% 20.08 8  4 360000 90% 100% 21.5 6  8 720000 90% 75% 22.48 0  8 720000 90% 100% 23.38 0  16 1440000 90% 75% 16.56 30  16 1440000 90% 100% 15.72 29  Table 6.11: Impact of database change in Config. 2 Number of BFUs Total number of patterns Utilization Change % Average number of rehashes Number of setup failures (out of 50)  4 360000 90% 10% 20.62 7  4 360000 90% 25% 18.44 9  4 360000 90% 50% 18.56 6  Table 6.12: Impact of database change in Config. 3 Number of BFUs Total number of patterns Utilization Change % Average number of rehashes Number of setup failures (out of 50)  8 720000 90% 10% 20.98 0  8 720000 90% 25% 21 0  8 720000 90% 50% 20.16 0  Table 6.13: Impact of database change in Config. 4 Number of BFUs Total number of patterns Utilization Change % Average number of rehashes Number of setup failures (out of 50)  16 1440000 90% 10% 13.98 32  16 1440000 90% 25% 11.36 36  16 1440000 90% 50% 15 31  The second test evaluates how close each configuration can reach to its theoretical max capacity by inserting 100 new patterns (1% of the theoretical max of each BFU capacity) to a BFU at each iteration. A new BFU is chosen per iteration in a round-robin fashion. The number of new patterns inserted that the system can absorb until one of its BFU fails to construct is measured. 61  The results for the second test are shown in Table 6.14 to 6.17. When existing BFUs are already highly utilized, they actually have better probability of reaching the theoretical maximum capacity of the BFU. What is more interesting is that this trend continues at configuration with larger number of BFUs. A simple explanation of this behavior is that higher-utilized BFUs need fewer insertions to reach to its theoretical max. However, the important takeaway form this result is that a good (closeness to even distribution) set of hash parameters (shift values) at the start is crucial. Poorly utilized BFUs are more likely to have a bad set of hash parameters since the initial construction did not need to be stressed enough at all. Table 6.14: Impact of utilization on insertion for Config. 1 Number of BFUs Total number of patterns Utilization Average number of patterns inserted Average number of insertions until failure % of theoretical max reached  2 10000 50% 9086 181.72 95.43  2 12000 60% 7510 150.2 97.55  2 14000 70% 5938 118.76 99.69  2 16000 80% 3600 72 98.00  2 18000 90% 1946 38.92 99.73  4 32000 80% 6412 128.24 96.03  4 36000 90% 2716 54.32 96.79  8 64000 80% 8340 166.8 90.425  8 72000 90% 2470 49.4 93.0875  16 128000 80% 3892 77.84 82.4325  16 144000 90% 48 0.96 90.03  Table 6.15: Impact of utilization on insertion for Config. 
2 Number of BFUs Total number of patterns Utilization Average number of patterns inserted Average number of insertions until failure % of theoretical max reached  4 20000 50% 16464 329.28 91.16  4 24000 60% 13292 265.84 93.23  4 28000 70% 9508 190.16 93.77  Table 6.16: Impact of utilization on insertion for Config. 3 Number of BFUs Total number of patterns Utilization Average number of patterns inserted Average number of insertions until failure % of theoretical max reached  8 40000 50% 23110 462.2 78.8875  8 48000 60% 22620 452.4 88.275  8 56000 70% 13478 269.56 86.8475  Table 6.17: Impact of Utilization on Insertion for Config. 4 Number of BFUs Total number of patterns Utilization Average number of patterns inserted Average number of insertions until failure % of theoretical max reached  16 16 80000 96000 50% 60% 37466 27774 749.32 555.48 73.41625 77.35875  16 112000 70% 17892 357.84 81.1825  62  The results from both the first and second tests in this part all reemphasize the importance and effectiveness of the filter consolidation algorithm, which not only reduces the number of BFUs but also increases their utilization. In conclusion, the filter consolidation stage is not only critical in compressing the resource usage of the initial baseline but also has a positive impact on future scalability and dynamic updatability of the system.  63  Chapter 7. Conclusions PERG is a novel FPGA-based pattern matching engine designed to pack a large number of patterns into a small amount of resources. Through the use of Bloomier filters and the pre-processing compiler, PERG can fit over 80,000 patterns ranging up to hundreds bytes long in a single FPGA with a single 4 MB offchip memory. Contributions of this thesis are summarized below:   The novel use of Bloomier filters in a pattern-matching application to simplify exact matching    Filter consolidation to reduce the number of Bloomier filters and improve density    The use of a Circular Speculative Buffer (CSB) as a means to track the state of patterns that have fixed-length gaps or that have been divided into multiple fragments due to consolidation    The support of wildcard operators using a lossy but storage-efficient and zero-false positive mechanism  In terms of memory density, PERG shows over 24x improvements over the best NIDS patternmatching engine. While PERG lags somewhat in throughput, it still provides nearly 16x improvements over software-based virus scanning. With PERG, this thesis demonstrates the feasibility of hardwarebased virus scanning. Most importantly, the amount of on-chip memory required can be found in current low-cost Spartan 6 and Cyclone III FPGAs. PERG is the first hash-based pattern matching engine to support limited regular expressions, namely the various types of wildcards found in the ClamAV database. The support is added at a very small hardware cost. Despite the lossy state tracking mechanism which maps multiple traces for each rule with wildcard operators to a single trace, experimental results show that the false positive rate remains comparably low to the original PERG design. In terms of scalability and dynamic updatability, this thesis has shown that PERG can reach very close to the theoretical maximum capacity of m/k patterns. The filter consolidation algorithm introduces a controlled amount of slack for future updates. The Bloomier perfect hash functions generated by the pattern compiler work very well, allowing a large number of updates even when the filter unit is nearly full.  
64  7.1 Future Work While this thesis has described a pattern-matching engine hardware and its pattern compiler, several aspects are left as future work. 7.1.1  Hardware Interface  As a pattern-matching engine designed to scan for computer viruses, the exact hardware interface for PERG and how it would interact with the host system is never specified. Currently, PERG is assumed to operate as a co-processor over a PCI bus interface, while antivirus software would monitor and intercept I/O operations and direct the file stream to PERG. While bandwidth of the bus is unlikely to become the bottleneck, latency and communication overhead may turn out to be a significant factor to the usability of PERG. Unlike previous pattern-matching engines designed for NIDS, PERG’s application in computer virus protection goes beyond network equipment to every personal computer. It is natural to explore the possibility of integrating PERG onto the south bridge of a chipset much like what nVidia has done by integrating firewall onto their chipset. Most importantly, in such a scenario, PERG would need to be implemented as an ASIC to be integrated with a south bridge. Additional resources must be reserved to support future updates and hash reconfiguration. 7.1.2  Support for Interleaving Filestreams  In our test, the filestream is assumed to arrive one by one in order. In practice, file streams usually are interleaved and governed by the underlying operating system. To support multiple ongoing streams, additional registers would be required to maintain state information. Multiple Wildcard Tables and Circular State Buffers are required to allow fast switching between streams. When the number of ongoing streams exceeds the number of WT and CSB pairs available, WT and CSB contents must be backed up on to external memory temporarily and restored when appropriate. In NIDS, designers assume TCP packets are reassembled by a TCP reassembly engine to detect malicious activities that would span over multiple packets. A similar mechanism may also be applied for file streams. 7.1.3  Update and Expansion Options  Virus vendors have witnessed rapid growth of viruses – it jumped 137% in 2007 alone. Generally, most of the new viruses are identified by file-integrity hash functions like MD5 checksum. In addition, older 65  virus signatures (e.g. DOS-based viruses) may be removed or replaced with more generic descriptions. As a result, PERG’s high rule/resource density PERG architecture should be able to keep up with such growth. With advancement in process technology, newer generations of PERG may be released according to the general product support life-cycle of antivirus software, which is about 2 years on average. However, depending on how PERG is integrated to the system, it may be impractical to expect users to upgrade their hardware (e.g. mounted on the motherboard). At the same time, partial protection would be just as inadequate as no protection at all. One possible way for users to retain protection would be adding new PERG engines through the PCI-Express bus that could work in conjunction or independently with the existing PERG much like the PC graphic cards today. Alternatively, full protection can be retained with the same PERG at the cost of latency and performance. Since the BFU tables in PERG are memory-programmable, it is possible to swap the table content dynamically; in such a scenario, the number of virus signatures supported is no longer bounded by PERG’s hardware. 
Swapping mechanism can be implemented at the cost of latency (buffering the input stream) , the cost of performance (multiple swaps for each input byte), or a combination of both. Since scanning with PERG is expected to be several magnitudes faster than pure software solution, this method should be sufficient within the life time of the host system. 7.1.4  Alternative Databases  In this thesis we were only able to map ClamAV database onto PERG. A database from a different antivirus vendor may exhibit different results. Unfortunately, other than ClamAV, antivirus databases generally are proprietary and unavailable to public in their raw forms. Another worthy note is the Snort NIDS database. Our analysis of previous work on both CAM and FPGA-based pattern matching engines reveals that there is no existing solution which provides support for the entire Snort database. To map Snort, however, the pattern compiler would require significant changes due to the vast difference in syntax. Compiler-specific parameters such as the Metadata encoding format would also likely need to be re-optimized.  66  7.1.5  Integration with Antivirus Software  PERG is not aimed to replace antivirus software but complement it as an accelerator. Since ClamAV is an open-source project, a port to support PERG can be implemented. Drivers and other low-level interfaces should be kept in separate layers to ensure future expandability to other antivirus software. 7.1.6  Eliminating Special Cases  In the thesis we mentioned that patterns with short fragments (<4 bytes) are to be handled by another pattern matching engine. This pattern-matching engine would be significantly smaller in resource usage than the main PERG due to the insignificant number of special cases. Such a pattern matching engine can be an FSM-based engine or a mini version of PERG whose Metadata Unit resides completely on-chip. Similarly, special cases of patterns with excess number of Or operators can also be solved by a small FSM-based engine working along with PERG. Or-expansion is expensive due to the number of output patterns resulting from each Or operator found in the input pattern. Finally, by dropping fragments intelligently and replacing them with wildcard operators, many special cases can be eliminated at the cost of higher false positive probability.  67  References [1] AV-Comparative, “Anti-Virus Comparative Performance Test,” http://avcomparatives.org/images/stories/test/performance/performance08a.pdf, Nov. 2008. [2] D. Uluski, M. Moffie, and D. Kaeli, “Characterizing antivirus workload execution,” SIGARCH Comp. Arch. News, ACM, vol.33, 2005, pp. 90-98. [3] N. Thinh, S. Kittitornkun, and S. Tomiyama, “Applying cuckoo hashing for FPGA-based pattern matching in NIDS/NIPS,” International Conference on Field-Programmable Technolog, 2007, pp. 121- 128. [4] J. van Lunteren, “High-performance pattern-matching for intrusion detection,” IEEE International Conference on Comp. Comm., 2006, pp. 1-13. [5] L. Tan and T. Sherwood, “A high throughput string matching architecture for intrusion detection and prevention,” International Symposium on Computer Architecture, 2005, pp. 112-122. [6] N. Tuck, T. Sherwood, B. Calder, and G. Varghese, “Deterministic memory-efficient string matching algorithms for intrusion detection,” IEEE Conference on Computer Communications, vol.4, 2004, pp. 2628-2639. [7] S. Dharmapurikar and J. 
Lockwood, “Fast and scalable pattern matching for network intrusion detection systems,” IEEE Journal on Selected Areas in Communications, vol.24, 2006, pp. 17811792. [8] S. Dharmapurikar, P. Krishnamurthy, T. Sproull, and J. Lockwood, “Deep packet inspection using parallel Bloom filters,” Symposium on High Performance Interconnects, 2003, pp. 44-51. [9] G. Papadopoulos and D. Pnevmatikatos, “Hashing + memory = low cost, exact pattern matching,” International Conference on Field Programmable Logic and Applications, 2005, pp. 39-44. [10] I. Sourdis, D. Pnevmatikatos, S. Wong, and S. Vassiliadis, “A reconfigurable perfect-hashing scheme for packet inspection,” International Conference on Field Programmable Logic and Applications, 2005, pp. 644-647. 68  [11] Snort, http://www.snort.org, accessed on August 2009. [12] Clam AntiVirus, http://www.clamav.net, accessed on August 2009. [13] B. Chazelle, J. Kilian, R. Rubinfeld, and A. Tal, “The Bloomier filter: an efficient data structure for static support lookup tables,” Symposium on Discrete Algorithms, 2004, pp. 30-39. [14] J. Hasan, S. Cadambi, V. Jakkula, and S. Chakradhar, “Chisel: A storage-efficient, collision-free hash-based network processing architecture,” International Symposium on Computer Architecture, 2006, pp. 203-215. [15] A. V. Aho and M. J. Corasick, “Efficient string matching: an aid to bibliographic search,” Communicatoins of the ACM, vol.18, 1975, pp. 333-340. [16] R.S. Boyer and J.S. Moore, “A Fast String Searching Algorithm,” Communications of the ACM, vol.20, 1977, pp. 762–772. [17] Donald Knuth, James H. Morris, and Vaughan Pratt, “Fast Pattern Matching in Strings,” SIAM Journal on Computing, vol.6, 1977, pp. 323–350. [18] Cavium Networks, OCTEON Multi-Core Processor Family. http://www.caviumnetworks.com/OCTEON_MIPS64.html, accessed on August 2009. [19] Lionic Corp., ePavis. http://www.lionic.com, accessed on August 2009. [20] B. H. Bloom, “Space/time trade-offs in hash coding with allowable errors,” Communications of the ACM, vol.13, 1970, pp. 422-426. [21] M. V. Ramakrishna and J. Zobel, “Performance in practice of string hashing functions,” International Conference on Database Systems for Advanced Applications, World Scientific Press, 1997, pp. 215224. [22] R. Pagh, F. F. Rodler, “Cuckoo hashing,” Journal of Algorithms, vol.51, 2004, pp. 122-144. [23] X. Zhou, B. Xu, Y. Qi, and J. Li, “MRSI: A fast pattern matching algorithm for anti-virus applications,” International Conference on Networking, 2008, pp. 256-261. 69  [24] Y. H. Cho and W. H. M-Smith, “Fast reconfiguring deep packet filter for 1+ gigabit network,” IEEE Symposium on Field-Programmable Custom Computing Machines, 2005, pp. 215–224. [25] Intel, Corp. Intel QuickAssit, http://developer.intel.com/technology/platforms/quickassist/index.htm, accessed on August 2009. [26] PCRE - Perl Compatible Regular Expressions, http://www.pcre.org, accessed on August 2009. [27] N. Yamagaki, R. Sidhu, S. Kamiya, “High-speed regular expression matching engine using multicharacter NFA,” International Conference on Field Programmable Logic and Applications, 2008, pp. 131-136. [28] Y. Yang, W. Jiang, V. Prasanna, “Compact architecture for high-throughput regular expression matching on FPGA,” ACM/IEEE Symposium on Architectures for Networking and Communications Systems, 2008, pp. 30-39. [29] J. Ho, G. Lemieux, “PERG: A Scalable Pattern-matching Accelerator,” CMC Microsystems and Nanoelectronics Research Conference, Ottawa, pp. 29-32, October 2008. [30] J. Ho, G. 
Lemieux, “PERG: A Scalable FPGA-based Pattern-matching Engine with Consolidated Bloomier Filters,” IEEE International Conference on Field-Programmable Technology, Taipei, Taiwan, December 2008, pp. 73-80. [31] J. Ho, G. Lemieux, “PERG-Rx: A Hardware Pattern-matching Engine Supporting Limited Regular Expressions,” ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, California, pp. 257-260, February 2009.  70  
