Kaggle Grocery Part 1: Pandas Sampling and Dask

by: Abbie Popa

17 Jan 2018

I recently predicted grocery sales for a Kaggle competition. In this competition, we used data from six tables to predict how many units of different items would sell on future dates. The competition presented several challenges, including merging multiple tables, working with a data frame that was larger than RAM, and working with categorical variables that had many classes. This is part 1, where I discuss how I dealt with the large data frame. I will discuss my handling of categorical variables with H2O in part 2. I will update this post with a link when part 2 is available.

Continue Reading

Deep Learning Crash Course

by: Abbie Popa

14 Jul 2017

If you have a Twitter feed like mine (i.e., nerdy) you can hardly go a day without seeing some mention of “deep learning.” In fact, a quick glance at Google Analytics shows that searches for deep learning have been rising over the past 5 years. I included “linear regression” to have a point of comparison. (You’ll note the famous “people search for this more when school is in session” trend associated with linear regression.)

Continue Reading

From CSVs to SQL

by: Ryan Phillips

03 May 2017

Everyone knows that MATLAB is terrible, and I never want to use it again once I get out of this rattrap. But in order to do some serious data work in the serious world, you need to use a combination of Python and SQL. On the third hand, I couldn’t just throw my grad school life away and break free (I tried that, and it didn’t work).

Continue Reading


Older Blog Posts

About us

We are a collection of Psychology and Neuroscience graduate students from UC Davis who are interested in data science, user experience, and local beer. Our shared goal is to help each other prepare for a life (i.e., a job) outside of academia or, perhaps, to take a more modern approach to a life inside it. You can read the latest blog post to the left, find older posts in the archive, or check out some of our projects.