NuFit Media Physician Database Project
This project was created for NuFit Media. My team built a comprehensive database of all the physicians in America and their relevant associated data, including the medical school each attended, the hospital where each worked, and any malpractice issues. We used the Python Selenium package to download datasets automatically from several different sources and read them into a dictionary-based database in Python.
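As a rough illustration of the dictionary-based approach, the sketch below merges rows from several already-downloaded CSV sources into one dictionary keyed by physician. It is not the project's actual code: the `physician_id` key, the column names, and the sample rows are all hypothetical, and the Selenium download step is omitted.

```python
import csv
import io

def merge_sources(sources, key="physician_id"):
    """Merge rows from several CSV texts into one dict keyed by `key`.

    `key` is a hypothetical shared identifier column, assumed for this sketch.
    """
    database = {}
    for text in sources:
        for row in csv.DictReader(io.StringIO(text)):
            # Create the physician's record on first sight, then add fields.
            record = database.setdefault(row[key], {})
            for field, value in row.items():
                if value:  # skip empty cells so they do not overwrite data
                    record[field] = value
    return database

# Made-up sample data standing in for two downloaded datasets.
schools = "physician_id,medical_school\n1001,Example Medical College\n1002,Sample University\n"
hospitals = "physician_id,hospital\n1001,General Hospital\n"

db = merge_sources([schools, hospitals])
print(db["1001"])
```

Keying every record on one identifier means each new source only adds fields to an existing entry, so sources can be loaded in any order.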
Machine Learning + Water Pumps
The objective of this project was to use machine learning methods to classify water pumps in Tanzania as working, needing repair, or broken, based on data collected for each water pump in the country. The data were obtained from DrivenData.org, which obtained them from Taarifa. Our program processes the data and then runs multiple supervised, unsupervised, and ensemble learning techniques. Our goal was to find the method and parameters that minimized our error rate.
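The idea of searching for parameters that minimize the error rate can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a tiny k-nearest-neighbors classifier written with the standard library, synthetic two-feature data in place of the real pump records, and a single held-out validation split. The project itself ran a wider range of methods.

```python
import random
from collections import Counter

def knn_predict(train, point, k):
    """Return the majority label among the k training points nearest to `point`."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], point)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def error_rate(train, valid, k):
    """Fraction of validation points that are misclassified."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in valid)
    return wrong / len(valid)

def make_data(n_per_class, rng):
    """Synthetic stand-in data: one Gaussian cluster per pump status."""
    centers = {"working": (0.0, 0.0), "needs repair": (3.0, 0.0), "broken": (0.0, 3.0)}
    data = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    rng.shuffle(data)
    return data

rng = random.Random(0)
data = make_data(60, rng)
split = int(0.75 * len(data))
train, valid = data[:split], data[split:]

# Try several values of k and keep the one with the lowest validation error.
errors = {k: error_rate(train, valid, k) for k in (1, 3, 5, 9, 15)}
best_k = min(errors, key=errors.get)
print(errors, best_k)
```

The same loop extends to comparing different model families: compute each candidate's held-out error rate and keep the minimum.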
Disappointing Wins: an fMRI Study
This study investigated the neural correlates of ambivalence in a gambling paradigm (Larsen et al., 2004). Participants played a series of gambles while fMRI data were collected. These data were obtained from Dartmouth College. I worked on processing and analyzing the data using AFNI.
Eliminate the Negative, Accentuate the Positive: Emotion Regulation of Ambivalence in a Gambling Paradigm
This study investigated the effects of emotion regulation on ambivalence in a gambling paradigm (Larsen et al., 2004). Specifically, can emotion regulation reduce or eliminate ambivalence? Behavioral and ERP data were collected over the course of the session. Preliminary results suggest that emotion regulation, especially positive regulation, is effective in reducing ambivalence.
The objective of this project was to investigate whether simple machine-learning methods (e.g., Naive Bayes, Decision Trees, Random Forest, and Support Vector Machines) could be used to diagnose dementia in patients based on their responses to the Boston Cookie Theft Task. Transcripts were collected from the Pitt Corpus. Using alphanumeric, stemmed word features, we obtained F1-scores greater than 80% for several ML methods, indicating that our feature selection and methods were effective at diagnosing dementia for this particular task.
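To make the feature and metric concrete, here is a minimal sketch of alphanumeric, stemmed word features and an F1-score computation. It rests on several assumptions: the suffix-stripping `crude_stem` function is a simplistic stand-in for a real stemmer (the stemmer used in the project is not stated), and the transcript and labels are invented, not taken from the Pitt Corpus.

```python
import re
from collections import Counter

SUFFIXES = ("ing", "ed", "es", "s")

def crude_stem(word):
    """Strip one common suffix; a toy stand-in for a real stemming algorithm."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def word_features(transcript):
    """Count stemmed tokens, keeping only runs of letters and digits."""
    tokens = re.findall(r"[a-z0-9]+", transcript.lower())
    return Counter(crude_stem(token) for token in tokens)

def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented transcript fragment and labels, for illustration only.
features = word_features("The boy is reaching for the cookies, uh, reaching...")
y_true = ["dementia", "dementia", "dementia", "control", "control"]
y_pred = ["dementia", "dementia", "control", "dementia", "control"]
score = f1_score(y_true, y_pred, positive="dementia")
print(features["reach"], round(score, 3))
```

Feature counts like these would then be passed to a classifier such as Naive Bayes, with the F1-score computed on held-out transcripts.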