I presented two papers at the IEEE SoutheastCON 2016 conference in Norfolk, VA on April 1, 2016:
- April, 2016: An Empirical Analysis of Feature Engineering for Predictive Modeling [PDF][Slides]
- April, 2016: Comparing Dataset Characteristics that Favor the Apriori, Eclat or FP-Growth Frequent Itemset Mining Algorithms [PDF][Slides]
The first paper is closely related to my PhD dissertation topic. For it, I generated datasets for neural networks, support vector machines, random forests, and gradient boosting machines to see what types of equations they could learn and, more importantly, what types they could not. My dissertation is in the area of feature engineering, so I am very interested in which equation-based transformations of the original features can be added to a model's feature vector to enhance its predictive power. This conference paper grew out of research I did while exploring dissertation topics during my PhD.
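To make the idea concrete, here is a minimal sketch of this kind of experiment. It is not the code from the paper: the ratio target `y = x1 / x2`, the helper names, and the use of a one-input least-squares line as a stand-in for the model types listed above are all assumptions chosen so that the example runs with only the Python standard library.

```python
import math
import random


def make_ratio_dataset(n_rows, seed=0):
    """Generate rows (x1, x2, y) where the target is the ratio y = x1 / x2.

    The ratio is a hypothetical example of an engineered feature.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        x1 = rng.uniform(1.0, 10.0)
        x2 = rng.uniform(1.0, 10.0)  # bounded away from zero
        rows.append((x1, x2, x1 / x2))
    return rows


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def rmse(xs, ys, a, b):
    """Root mean squared error of the fitted line on (xs, ys)."""
    return math.sqrt(sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


rows = make_ratio_dataset(1000)
ys = [y for _, _, y in rows]

# Input 1: a raw feature only. Input 2: the engineered ratio feature.
raw = [x1 for x1, _, _ in rows]
engineered = [x1 / x2 for x1, x2, _ in rows]

rmse_raw = rmse(raw, ys, *fit_line(raw, ys))
rmse_engineered = rmse(engineered, ys, *fit_line(engineered, ys))
print(f"raw x1 only: {rmse_raw:.4f}")
print(f"with x1/x2:  {rmse_engineered:.4f}")
```

A straight line cannot express a ratio of two inputs, so its error drops to essentially zero only once the ratio is supplied as a feature. Swapping the line for a more flexible model shows how much of that gap the model can close by itself.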
The second paper is on frequent itemset mining. For it, I examined several common frequent itemset mining algorithms to see what effect the characteristics of the underlying dataset had on algorithm runtime. Frequent itemset mining is outside my main research area; this paper is based on one that I wrote near the beginning of my PhD program.
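For readers new to frequent itemset mining, the following is a minimal sketch of the Apriori approach, the simplest of the three algorithms named in the paper title. It is an illustration only, not the implementation benchmarked in the paper, and the function name, the toy transactions, and the absolute `min_support` count are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: union pairs of frequent (k-1)-itemsets into k-item candidates.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count how many transactions contain each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Toy data, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
result = apriori(baskets, min_support=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The candidate-generation loop is where dataset characteristics show up in the runtime: the more items and the longer the frequent itemsets, the more candidates each level has to generate and count.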
Both papers relied heavily on experimentation, and my code is available on my GitHub site.