Heaton Research

My Current Software Toolkit

In this post I will summarize some of the computer programming technologies that I make use of. I am a data scientist with a computer science and information technology (IT) background. I am the author of several books on artificial intelligence (AI).

Computers

Between my books and the class that I teach, I need to stay fluent with Windows, Macintosh, and UNIX. For servers I prefer UNIX, and I often use Amazon’s Linux distribution when I run tasks on AWS. I no longer own a PC tower; if I need to leave something running with considerable compute power, I use AWS. I have not switched to UNIX on my laptop yet, but I am considering it. I particularly like the Linux Mint operating system. My main laptop is a MacBook Pro, but I also make use of a Microsoft Surface Book.

Data Science

For data science projects, the primary programming language that I use is Python. I particularly like the Anaconda Python distribution. I also make use of the R programming language, because I find it necessary to stay proficient in both R and Python. My favorite Python libraries for data science are:

  • XGBoost - One of the most versatile machine learning models out there, though I am starting to experiment with LightGBM as well.
  • TensorFlow - Currently my primary deep learning framework.
  • Keras - A very nice high-level abstraction for deep learning. I make use of Keras in my deep learning course and my books.
  • Scikit-Learn - Great for using machine learning models outside of XGBoost and TensorFlow.
  • NumPy/SciPy - Great for numeric processing in Python.
  • DEAP - I have been interested in genetic programming ever since I made use of it in my dissertation.

Web Development

Web development is necessary for me to present my ideas to the world. Though I find myself increasingly using Markdown for just about everything, I still must deal with HTML and CSS. I currently make use of Bootstrap, but I am evaluating other options for when I might move to using ReactJS more. I’ve used several programming languages on the backend, such as Java, C#, and PHP. However, I am currently trying to standardize on JavaScript for the full stack. I create AI JavaScript examples for my books, so I have to know JavaScript. Because I have to use JavaScript on the front end, why not use Node.js on the backend? I currently use Hexo to generate my website. Whenever possible I try to keep the site static. I am evaluating Gatsby to allow easier integration of JSX/ReactJS. I also use MathJax for equation rendering.

Other Languages

I spent most of my IT career programming in C++, Java, and C#. These languages will always be near and dear to me, but mainstream data science projects in them are just not that common. The exception is C++, which is used to build the core processing capabilities of many machine learning projects. I continue to support the Encog project, where I implemented several machine learning algorithms in Java and C#. Encog is my main reason for firing up Visual Studio or IntelliJ these days.