DFW R Users Group

The DFW R Users Group holds a monthly Meetup. On Saturday, September 22, 2018, I gave an overview of machine learning with R based on my book.

Here is a link to the presentation. If you just see HTML, click the "open with chrome" link at the top.

Here is a link to the quick demo.

Download the demo and presentation files here.

I had a great time and I highly recommend this group to anyone interested in R.

Data Sets

One of the hardest parts of starting out with machine learning used to be finding good data. The UCI Machine Learning Repository was the most comprehensive site, and it is still a great resource. The site currently has over 440 data sets. Later, Kaggle and similar sites became popular. Google AI is becoming a comprehensive resource for both tools and data. As of this writing, Google has made available over 50 data sets for machine learning, and I'm sure more will come in the future.

It's an exciting time to get into machine learning!

Chess with Robots

I just found out about this ongoing project at UTD affiliated with our Robotics and Automation Society. The picture is of UTD's Chess Plaza on the central mall of our beautiful university. The goal of the project is to build robots that can autonomously navigate this chessboard. People can then download an app and play a chess game against the robots. How cool is that? I'll post again when the project is ready to play.

Can NLP Save the World?

I was reading over Maureen Dowd's article this morning about Twitter civility when my eye stopped at this statement by Farhad Manjoo, that Twitter had “tweaked its central feed to highlight virality, turning Twitter into a bruising barroom brawl featuring the most contentious political and cultural fights of the day.” Really? I've had a Twitter account for a few years and I never get any politically virulent tweets in my personal feed, but then I only follow NLP and ML researchers, people I have worked with, a couple of literary sites, etc., pretty nerdy stuff. I vote, I keep up with issues by reading national news, and I occasionally write my Senator and await his patronizing reply. I would call myself mildly politically active, but I have always thought that posting something political on the Internet was pointless. It seems that a lot of people feel otherwise. I've been missing out on the bruising barroom brawl. Thankfully.

The article went on to mention that Twitter was p…

Why R?

It's been interesting to watch the competition between Python and R over the past few years to be the numero uno language for machine learning. In my opinion, you should learn both. I use Python for NLP and R for machine learning, although sometimes I do a little NLP in R and a little machine learning in Python. It's basically a Toyota v. Honda debate. They're both great.  The latest (2017) survey from Kaggle shows where we are:
Python is the most used tool, but statisticians prefer R for their ML work. Titles vary by country for these professionals, but the most common is Data Scientist. The majority of survey participants have a Master's or PhD degree. The top four algorithms were logistic regression, decision trees, random forests, and neural networks.

Why do I teach machine learning with R?

The main thing I like about R for beginning machine learning aficionados is that it gets out of the way. The syntax is straightforward enough that you can focus on the machine lea…

Here we go . . .

Today I'm starting this blog to focus on my primary interests: machine learning and natural language processing. Initially the focus will be on my new book on machine learning. Code samples are available on my GitHub:

I accidentally started writing the book during Spring Break this year. I was thinking about the undergrad course I teach and how there was no book that fit what I needed to teach. My first thought was to organize my notes in LaTeX. This turned into a book. I'm almost finished, and I plan to have it ready to go by 8/1/2018. Stay tuned!