Date of Award

Spring 5-2022

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Pradeep Chowriappa

Abstract

As is evident in areas such as privacy, security, and ethics, a major hindrance to research is the lack of validated real-world data. Researchers therefore resort to creating their own datasets and/or artificially increasing the size of existing ones. However, in areas such as phishing countermeasures, this is not only insufficient but can also introduce bias into the dataset.

To raise awareness of bias in Machine Learning (ML) and Artificial Intelligence (AI) and of its consequences, this work attempts to reliably gauge one form of it: the selection bias that arises when new samples are generated from existing samples in a dataset. However, there is currently no cross-disciplinary or cross-sector consensus on approaches to identifying or validating measurements, metrics, and key indicators of bias, or on how data should be measured or understood in context.

This thesis investigates the effects of selection bias on Adversary-Aware Online Support Vector Machines (AAOSVM), using support vectors to represent selection bias.
