Date of Award

Winter 2020

Document Type


Degree Name

Doctor of Philosophy (PhD)


Computational Analysis and Modeling

First Advisor

Sumeet Dua


Behavioral disorders are disabilities characterized by disturbances in an individual’s mood, thinking, and social interactions. The prevalence of behavioral disorders in the United States population has increased in the last few years, with an estimated 50% of all Americans diagnosed with a behavioral disorder at some point in their lifetime. Attention-Deficit/Hyperactivity Disorder (ADHD) is one such behavioral disorder and is a severe public health concern because of its high prevalence, incurable nature, and significant impact on domestic life and peer relationships. In theory, the symptoms of ADHD are inattention, hyperactivity, and impulsivity. Access to providers who can diagnose and treat the disorder varies by location.

The ever-increasing use of social media can be effectively employed in the diagnosis and treatment of the disorder. The study of behavior and, by extension, of individuals with behavioral disorders is made easier by the uninhibited setting in which posts are created on social media platforms.

Outside the United States, diagnosis rates of the disorder are low, as it is mainly considered to be an American disorder. This impression is reinforced by the perception that the disorder is caused by social and cultural factors common to American society. In reality, the disorder can just as readily affect people of different races and cultures worldwide, but recognition of the disorder in the medical community has been slow. This may be due to its adverse impact on individuals, their families, and society.

This dissertation focuses on providing clinicians with a clinical decision support system to overcome the societal stigma associated with the disorder and to ensure the accurate and efficient diagnosis of individuals with the disorder. The results provided in this dissertation assist in the diagnosis of individuals with Attention-Deficit/Hyperactivity Disorder. Data for individuals with the disorder are collected from posts of self-reported diagnoses on Twitter using the Twitter API. Previous research has shown that there are differences in behavior before and after the diagnosis of the disorder. To capitalize on this, symptomatic differences before and after diagnosis are discovered and evaluated. The symptoms of the disorder, namely inattention, hyperactivity, and impulsivity, are quantified using measures of sentiment and semantics. A separate group of users without the disorder, the control group, is collected for validation. The analysis poses a three-class classification problem, with the classes being the pre-diagnosed, post-diagnosed, and control groups. Decision trees are used to enumerate all possible outcomes of the semantic and sentiment differences among the three classes of users and so create a clear delineation. A clinician's diagnosis of a behavioral disorder is based on identifying whether a patient deviates from an identified norm. This is evaluated by answering a set list of questions that quantify behavior. Decision trees are chosen to achieve the same without manual intervention and for their ease of interpretability. Classification using a decision tree is performed at a tweet level and at a user level. Four cases are used in both analyses: pre-diagnosed vs. post-diagnosed group, pre-diagnosed vs. control group, post-diagnosed vs. control group, and pre-diagnosed vs. post-diagnosed vs. control group.
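The four cases and the two levels of analysis described above can be sketched as follows. This is a minimal illustration only, not the dissertation's implementation: the record layout (`user`, `group`, `features`) is hypothetical, and averaging tweet-level feature values is an assumed way of building user-level features, since the abstract does not say how they are aggregated.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# The three classes named in the abstract.
GROUPS = ("pre-diagnosed", "post-diagnosed", "control")


def classification_cases(groups=GROUPS):
    """Return the three pairwise cases followed by the three-class case."""
    pairwise = list(combinations(groups, 2))
    return pairwise + [tuple(groups)]


def user_level_features(tweets):
    """Collapse tweet-level feature vectors into one vector per user.

    ASSUMPTION: features are averaged per user; the abstract does not
    specify the aggregation that was actually used.
    """
    by_user = defaultdict(list)
    for tweet in tweets:
        by_user[(tweet["user"], tweet["group"])].append(tweet["features"])
    rows = []
    for (user, group), vectors in by_user.items():
        # zip(*vectors) walks the feature columns across a user's tweets.
        averaged = [mean(column) for column in zip(*vectors)]
        rows.append({"user": user, "group": group, "features": averaged})
    return rows


def select_case(rows, case):
    """Keep only the rows whose group takes part in the given case."""
    return [row for row in rows if row["group"] in case]


if __name__ == "__main__":
    # Made-up example values (sentiment score, semantic score) per tweet.
    tweets = [
        {"user": "u1", "group": "pre-diagnosed", "features": [0.5, 1.0]},
        {"user": "u1", "group": "pre-diagnosed", "features": [1.5, 3.0]},
        {"user": "u2", "group": "control", "features": [0.25, 0.75]},
    ]
    users = user_level_features(tweets)
    for case in classification_cases():
        print(" vs. ".join(case), "->", len(select_case(users, case)), "users")
```

Each case's rows would then be passed to a decision tree classifier, once with the raw tweet-level rows and once with the aggregated user-level rows.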

The user-level analysis provides a higher degree of accuracy, with 93% accuracy for the post-diagnosed vs. control group case. The accuracy of each case reflects the number of people who can be correctly classified into their respective groups. The low accuracy of the tweet-level results supports the view that the sparsity of information in tweet-level analysis is a disadvantage. This is overcome by analyzing at the user level. The accuracy of the classifier can be further improved by the addition of features such as age and gender. The addition of these features may also be useful in predicting time to remission and peak of the disorder in future studies.
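Accuracy in the sense used above, the share of users placed in their correct group, can be computed as in the short sketch below. The labels in the usage example are made up for illustration and are not results from the dissertation.

```python
def accuracy(true_groups, predicted_groups):
    """Fraction of users whose predicted group matches their true group."""
    if len(true_groups) != len(predicted_groups):
        raise ValueError("label lists must have the same length")
    if not true_groups:
        raise ValueError("at least one user is required")
    correct = sum(t == p for t, p in zip(true_groups, predicted_groups))
    return correct / len(true_groups)


if __name__ == "__main__":
    # Illustrative labels only: three of four users are classified correctly.
    truth = ["post-diagnosed", "control", "control", "post-diagnosed"]
    guess = ["post-diagnosed", "control", "post-diagnosed", "post-diagnosed"]
    print(accuracy(truth, guess))  # 0.75
```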