Date of Award

Fall 2014

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computational Analysis and Modeling

First Advisor

Vir Phoha

Abstract

In a local network or the Internet in general, data that is transmitted between two computers (also known as network traffic or simply, traffic) in that network is usually classified as being of a malicious or of a benign nature by a traffic authentication system employing databases of previously observed malicious or benign traffic signatures, i.e., blacklists or whitelists, respectively. These lists typically consist of either the destinations (i.e., IP addresses or domain names) to which traffic is being sent or the statistical properties of the traffic, e.g., packet size, rate of connection establishment, etc. The drawback with the list-based approach is its inability to offer a fully comprehensive solution since the population of the list is likely to go on indefinitely. This implies that at any given time, there is a likelihood of some traffic signatures not being present in the list, leading to false classification of traffic. From a security standpoint, whitelists are a safer bet than blacklists since their underlying philosophy is to block anything that is unknown hence in the worst case, are likely to result in high false rejects with no false accepts. On the other hand, blacklists block only what is known and therefore are likely to result in high false accepts since unknown malicious traffic will be accepted, e.g., in the case of zero-day attacks (i.e., new attacks whose signatures have not yet been analyzed by the security community).

Despite this knowledge, the most commonly used traffic authentication solutions, e.g., antivirus or antimalware solutions, have predominantly employed blacklists rather than whitelists in their solutions. This can perhaps be attributed to the fact that the population of a blacklist typically requires less user involvement than that of a whitelist. For instance, malicious traffic signatures (i.e., behavior or destinations) are usually the same across a population of users; hence, by observing malicious activity from a few users, a global blacklist that is applicable to all users can be created. Whitelist generation, on the other hand, tends to be more user-specific as what may be considered acceptable or benign traffic to one user may not be considered the same to a different user. As a result, users are likely to find whitelist-based solutions that require their participation to be both cumbersome and inconveniencing.

This dissertation offers a whitelist-based traffic authentication solution that reduces the active participation of users in whitelist population. By relying on activity that users regularly engage in while interacting with their computers (i.e., typing), we are able to identify legitimate destinations to which users direct their traffic and use these to populate the whitelist, without requiring the users to deviate from their normal behavior. Our solution requires users to type the destinations of their outgoing traffic requests only once, after which any subsequent requests to that destination are authenticated without the need for them to be typed again.

Empirical results from testing our solution in a real time traffic analysis scenario showed that relatively low false reject rates for legitimate traffic with no false accepts for illegitimate traffic are achievable. Additionally, an investigation into the level of inconvenience that the typing requirement imposes on the users revealed that, since users are likely to engage in this (typing) activity during the course of utilizing their computer's resources, this requirement did not pose a significant deterrent to them from using the system.

Share

COinS