Hyperscan introduction

Hyperscan is a high-performance regular expression matching engine developed by Intel. It has been optimized and improved over many years and is very efficient. Although its regular expression support is not as complete as PCRE's, it is well suited to network devices: users can run hyperscan rule matching on the data plane of a network device to build high-performance DPI/IPS/IDS and similar applications.

Open source code: https://github.com/01org/hyperscan

Applications of hyperscan in IDS and IPS products

Environment requirements

GCC, v4.8.1 or higher

Clang, v3.4 or higher (with libstdc++ or libc++)

Intel C++ Compiler v15 or higher

Dependencies

Dependency	Version	Notes
CMake	>=2.8.11
Ragel	6.9
Python	2.7
Boost	>=1.57	Boost headers required
Pcap	>=0.8	Optional: needed for example code only
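As a quick sanity check before building, the minimum versions in the table can be compared against what is installed. The following is a minimal Python sketch, not part of hyperscan or its build system: the helper names parse_version and meets_minimum are illustrative, and the installed version strings are assumed to be passed in by hand (for example copied from rpm -qa output).

```python
import re

# Minimum versions taken from the dependency table above.
MINIMUMS = {
    "cmake": "2.8.11",
    "boost": "1.57",
    "pcap": "0.8",
}

def parse_version(text):
    """Turn '2.8.12.2-4.el6' into (2, 8, 12, 2), ignoring any suffix."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError("no leading version number in %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))

def meets_minimum(installed, minimum):
    """Compare versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

The comparison is done on integer tuples rather than strings because a plain string comparison would rank "2.8.9" above "2.8.11".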

The gcc version must be v4.8.1 or higher.

Download the source code package:

# wget ftp://gcc.gnu.org/pub/gcc/releases/gcc-4.8.2/gcc-4.8.2.tar.bz2

# tar -jxvf gcc-4.8.2.tar.bz2

# cd gcc-4.8.2

Download the prerequisites. During execution, three packages are downloaded: mpfr, gmp, and mpc.

# ./contrib/download_prerequisites

Compile and install gmp

# cd gmp && mkdir build && cd build/

# ../configure --prefix=/usr/local/gcc/gmp-4.3.2 && make && make install

Compile and install mpfr

# cd ../../mpfr && mkdir build && cd build/

# ../configure --prefix=/usr/local/gcc/mpfr-2.4.2 --with-gmp=/usr/local/gcc/gmp-4.3.2 && make && make install

Compile and install mpc

# cd ../../mpc && mkdir build && cd build

# ../configure --prefix=/usr/local/gcc/mpc-0.8.1 --with-mpfr=/usr/local/gcc/mpfr-2.4.2 --with-gmp=/usr/local/gcc/gmp-4.3.2 && make && make install

Add shared library path

# vim /etc/ld.so.conf

Add the following content:

/usr/local/gcc/gmp-4.3.2/lib

/usr/local/gcc/mpfr-2.4.2/lib

/usr/local/gcc/mpc-0.8.1/lib

Save and exit, then execute ldconfig

Compile gcc

# cd ../../

# mkdir build

# cd build

# ../configure --prefix=/usr/local/gcc --enable-threads=posix --disable-checking --enable-languages=c,c++ --disable-multilib

# make && make install

Remove the old version:

# yum remove gcc gcc-c++ && updatedb

Link to the new version:

# cd /usr/bin && ln -s /usr/local/gcc/bin/gcc gcc && ln -s /usr/local/gcc/bin/g++ g++

CMake installation

Check whether cmake is already installed:

# rpm -qa | grep cmake

cmake-2.8.12.2-4.el6.x86_64

ragel installation

Download and install ragel

# tar zxvf ragel-6.9.tar.gz && cd ragel-6.9 && ./configure && make && make install

Python upgrade (Python 2.7 is required to build Boost)

Upgrade Python to 2.7:

# wget http://www.python.org/ftp/python/2.7.2/Python-2.7.2.tgz

# tar zxvf Python-2.7.2.tgz && cd Python-2.7.2

Create the installation directory, then configure with that directory as the prefix, compile, and install:

# mkdir /usr/local/python-2.7.2

# ./configure --prefix=/usr/local/python-2.7.2

# make && make install

# mv /usr/bin/python /usr/bin/python_old

# ln -s /usr/local/python-2.7.2/bin/python2.7 /usr/bin/python

Verify that the version is now 2.7.2:

# python -V

pcap installation

Check if pcap is installed on the system

# rpm -qa | grep pcap

libpcap-1.0.0-6.20091201git117cb5.el6.x86_64

boost installation

Download the installation package, then unpack it:

# tar -xvf boost_1_60_0.tar && cd boost_1_60_0

Generate the bjam and b2 build tools:

# ./bootstrap.sh

# ./b2

# ./b2 install

hyperscan installation

# wget https://codeload.github.com/01org/hyperscan/tar.gz/v4.3.0

# tar xvzf v4.3.0 && cd hyperscan-4.3.0/ && mkdir hs_build && cd hs_build

# cmake ../../hyperscan-4.3.0

# cmake --build .

# make install

hyperscan verification

Run the unit tests to verify hyperscan. The first run reports a libstdc++ version problem:

# ./bin/unit-hyperscan

./bin/unit-hyperscan: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found (required by ./bin/unit-hyperscan)

Solution: replace the system libstdc++ with the one built together with the new gcc.

# find / -name libstdc++.so.6

/usr/local/gcc/lib64/libstdc++.so.6

/usr/lib64/libstdc++.so.6

# cd /usr/lib64/ && mv libstdc++.so.6 libstdc++.so.6_bak

# cp /usr/local/gcc/lib64/libstdc++.so.6.0.18 /usr/lib64

# ln libstdc++.so.6.0.18 libstdc++.so.6

Run the test program again; the unit tests should now pass:

# ./bin/unit-hyperscan

[Screenshot: unit-hyperscan output]

Design goals of hyperscan

High performance, both in typical application scenarios and in edge cases.

A small database (the data produced by compiling the regular expressions).

Small stream state when running in stream mode. In this mode, every stream has to maintain its own stream state.

In addition, there are some design requirements or limitations:

The runtime library must be implemented in C, because some data plane environments do not support C++.

Memory must not be allocated arbitrarily at runtime; the only memory used is the database, the scratch space (temporary matching data), and the stream state (in stream mode).

The database must have a flat memory layout so that it can be serialized and deserialized, or moved from one place in memory to another (which means it cannot contain pointers internally).
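The "no internal pointers" requirement can be illustrated with a toy example. This is a sketch of the general idea only and does not reflect hyperscan's actual database format: a set of byte patterns is packed into one contiguous buffer in which every reference is an offset from the start of the buffer, so the buffer can be copied, written to disk, or loaded somewhere else and still be read correctly. The names pack_patterns and read_pattern are hypothetical.

```python
import struct

def pack_patterns(patterns):
    """Pack byte strings into one flat buffer.

    Layout: [count:u32] [offset_i:u32, length_i:u32]... [pattern bytes...]
    Offsets are relative to the start of the buffer, never absolute addresses.
    """
    header_size = 4 + 8 * len(patterns)
    table = []
    body = b""
    for p in patterns:
        table.append((header_size + len(body), len(p)))
        body += p
    out = struct.pack("<I", len(patterns))
    for offset, length in table:
        out += struct.pack("<II", offset, length)
    return out + body

def read_pattern(buf, index):
    """Read pattern `index` using only offsets stored inside the buffer."""
    (count,) = struct.unpack_from("<I", buf, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    offset, length = struct.unpack_from("<II", buf, 4 + 8 * index)
    return bytes(buf[offset:offset + length])
```

Because nothing in the buffer depends on where it lives in memory, a byte-for-byte copy is immediately usable. In hyperscan itself, the corresponding operations are exposed through hs_serialize_database() and hs_deserialize_database().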

Important concepts

Compilation: multiple regular expressions are compiled into a hyperscan database. Flags and mode parameters passed to the compile call control the matching behavior and the operating mode. The main APIs are:

hs_compile()

hs_compile_multi()

hs_compile_ext_multi()

Matching: data is scanned against the compiled database to obtain the match results.

When hyperscan performs matching, it needs a block of temporary data (scratch). The scratch space must be allocated before the data plane starts running (it is not allocated and freed during matching, to preserve performance), and only one matching process may use a given scratch space at any one time.

If stream mode is used, stream state must also be pre-allocated for each stream.

There are mainly 3 operating modes: BLOCK, STREAM, and VECTORED.

BLOCK mode matches each data block independently;

STREAM mode treats a specific sequence of data blocks as a single stream, maintains state information for each stream, and can therefore match across data block boundaries;

VECTORED mode matches multiple data blocks in a single call. Databases compiled with different mode parameters cannot be mixed during matching.
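The difference between BLOCK and STREAM mode is easiest to see when a match straddles two data blocks. The sketch below does not use hyperscan; it is a toy literal matcher in Python (the function scan_block and the class ToyStream are made up for illustration) that keeps the tail of the previous block as its per-stream state, which is the role stream state plays in STREAM mode.

```python
def scan_block(literal, block):
    """BLOCK-style scan: the block is matched on its own. Returns end offsets."""
    ends = []
    start = block.find(literal)
    while start != -1:
        ends.append(start + len(literal))
        start = block.find(literal, start + 1)
    return ends

class ToyStream:
    """STREAM-style scan: carries state so matches can span block boundaries."""

    def __init__(self, literal):
        self.literal = literal
        self.tail = b""      # stream state: at most len(literal) - 1 trailing bytes
        self.consumed = 0    # total bytes seen so far in this stream

    def scan(self, block):
        window = self.tail + block
        base = self.consumed - len(self.tail)  # stream offset of window[0]
        ends = [base + e for e in scan_block(self.literal, window)]
        self.consumed += len(block)
        keep = len(self.literal) - 1
        self.tail = window[-keep:] if keep else b""
        return ends

blocks = [b"xxatt", b"ackyy"]
per_block = [scan_block(b"attack", b) for b in blocks]   # [[], []]
stream = ToyStream(b"attack")
per_stream = [stream.scan(b) for b in blocks]            # [[], [8]]
```

Scanned block by block, "attack" is never seen because it is split across the boundary. The stream variant finds it and reports the end offset relative to the start of the stream. The retained tail is shorter than the literal, so a match is never reported twice.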

The main matching APIs are:

hs_scan()

hs_scan_vector()

hs_scan_stream()


Configuration file

Example of the backend configuration file hyperscan.conf:

-desc this is a config file of pcre rule

-name pcre_rule

-offset 1

-depth 10

-min_payload 100

-relation 1

-action 13

-pattern ^01aa*
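A file in this shape is straightforward to load. The following is a minimal Python sketch, assuming one "-key value" pair per line exactly as in the example; the real format accepted by the product may differ (comments, several rules per file), and parse_rule is a hypothetical helper, not part of hyperscan. The integer-typed keys follow the parameter description in this section.

```python
# Keys whose values are integers according to the parameter description.
INT_KEYS = {"offset", "depth", "min_payload", "relation", "action"}

def parse_rule(text):
    """Parse '-key value' lines into a dict; blank lines are skipped."""
    rule = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not line.startswith("-"):
            raise ValueError("expected '-key value', got %r" % raw)
        # Split on the first space only, so values may contain spaces.
        key, _, value = line[1:].partition(" ")
        value = value.strip()
        rule[key] = int(value) if key in INT_KEYS else value
    return rule

EXAMPLE = """\
-desc this is a config file of pcre rule
-name pcre_rule
-offset 1
-depth 10
-min_payload 100
-relation 1
-action 13
-pattern ^01aa*
"""

rule = parse_rule(EXAMPLE)
```

Splitting on the first space only keeps free-text values such as desc intact.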

Parameter description

Parameter	Type	Description
desc	Valid string	Description of the regex protection rule
name	Valid string	Name of the regex protection rule
offset	Integer	Offset into the packet payload at which matching starts (counted from the end of the TCP header)
depth	Integer	Matching length, that is, the number of bytes to match starting from offset
min_payload	Integer	Minimum payload length
relation	Integer	Protection group ID (uncertain)
pattern	Valid regular expression string	The regular expression. Note that each protection group can be configured with at most 5 regex rules
action	Integer	Action to take once the required regular expressions have matched: SC_FW_DROP, SC_FW_ACCEPT, ACL_DROP_ADDBLACK, ACL_DROP_SENDRST
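How offset, depth, and min_payload might combine can be sketched as follows. The exact semantics are an assumption based on the descriptions above (skip payloads shorter than min_payload, then match only the depth bytes starting at offset). The helpers scan_window and rule_matches are hypothetical, and Python's re module stands in for hyperscan purely for illustration.

```python
import re

def scan_window(payload, offset, depth, min_payload):
    """Return the bytes a rule is matched against, or None to skip the packet."""
    if len(payload) < min_payload:
        return None
    return payload[offset:offset + depth]

def rule_matches(payload, rule):
    """Apply the rule's pattern to the selected window of the payload."""
    window = scan_window(payload, rule["offset"], rule["depth"],
                         rule["min_payload"])
    if window is None:
        return False
    return re.search(rule["pattern"], window) is not None

# Values taken from the example configuration; the pattern is given as bytes.
RULE = {"offset": 1, "depth": 10, "min_payload": 100, "pattern": rb"^01aa*"}

long_payload = b"x01aaa" + b"." * 100
```

With these values, the pattern is anchored to the start of the window, so it has to match at payload byte 1 rather than at byte 0.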