Classifying ethernet data packets based on raw bit patterns
Kenworthy, W.D. (2010) Classifying ethernet data packets based on raw bit patterns. In: 3rd International Conference on Knowledge Discovery and Data Mining, 9 - 10 January, Phuket.
Most operations on network data packets are currently governed by the applicable protocols, such as TCP/IP. However, there is scope to examine and classify the data without processing it through a protocol stack. One way to do this is to apply the sophisticated algorithms developed for the analysis of biological and genomic data, exploiting similarities between the way information is stored in biological structures and in network data traffic. Network data flows can be shown to share many structural characteristics with biological DNA: areas of conservation (an area of data with the same composition as an area in another packet will often have similar functionality), "motifs" with particular functions, and the equivalent of "junk DNA", that is, areas where seemingly random changes occur. This paper examines the novel application of algorithms designed to process DNA data to the analysis and classification of Ethernet data packets, based on the patterns discernible in the data rather than the more traditional method of matching fixed fields defined by protocol specifications. We show that these algorithms successfully and accurately classify packets into groups whose members have similar characteristics based on actual content rather than metadata. This provides a unique and useful method of grouping and classifying packets that could be applied in diverse areas, such as intrusion detection systems (IDS) and the search for, and identification of, specific types of data.
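The abstract does not name the specific algorithms used, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes Smith-Waterman local alignment (a standard DNA sequence-comparison technique) as a stand-in, treats each packet as a sequence of bytes, and groups packets greedily by similarity. The scoring values, the `threshold` parameter, and the function names are all hypothetical choices made for this example.

```python
from typing import List

# Illustrative scoring values (assumptions, not taken from the paper).
MATCH, MISMATCH, GAP = 2, -1, -2


def local_alignment_score(a: bytes, b: bytes) -> int:
    """Best Smith-Waterman local alignment score between two byte strings."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (MATCH if a[i - 1] == b[j - 1] else MISMATCH)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + GAP, curr[j - 1] + GAP)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(a: bytes, b: bytes) -> float:
    """Score normalised by the maximum achievable for the shorter packet."""
    shortest = min(len(a), len(b))
    if shortest == 0:
        return 0.0
    return local_alignment_score(a, b) / (MATCH * shortest)


def group_packets(packets: List[bytes], threshold: float = 0.6) -> List[List[int]]:
    """Greedy grouping: each packet joins the first group whose
    representative (first member) it resembles, else starts a new group."""
    groups: List[List[int]] = []
    for idx, pkt in enumerate(packets):
        for group in groups:
            if similarity(packets[group[0]], pkt) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


if __name__ == "__main__":
    sample = [
        b"GET /index.html HTTP/1.1",
        b"GET /about.html HTTP/1.1",
        b"\x00\x01\x02\x03\x04\x05\x06\x07",
        b"\x00\x01\x02\x03\x04\x05\x06\x09",
    ]
    print(group_packets(sample))
```

Note that the grouping here relies only on shared byte patterns (the "conserved" regions), with no knowledge of protocol fields. A real system would likely need a faster alignment method and a more careful clustering step than this greedy pass.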
Publication Type: Conference Paper
Murdoch Affiliation: School of Information Technology
Copyright: © 2010 IEEE
Notes: Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. This paper appears in: 3rd International Conference on Knowledge Discovery and Data Mining, WKDD 2010; Phuket; 9 January 2010 through 10 January 2010; Category number P3923; Code 80211