A Few Cool Things I Found This Week

24 May 2015

by: Kyle Frankovich


This past week I came across a few resources I thought were really helpful. Typically I’d send a list of links like this to our internal Google Group, but I thought it would be a good idea to start collecting them somewhere that anyone who is interested can find them.

Hands-on with dplyr: This is a really great introduction by Dmitry Grapov to one of my favorite R packages, dplyr. If you use R and are unaware of dplyr, you owe it to yourself to check it out. It’s a package created by the wonderfully named Hadley Wickham that makes manipulating, cleaning, pre-processing, and analyzing data very intuitive and easy. As Dmitry mentions in his guide, I also find this infographic cheat sheet by RStudio very helpful when using the package. I have it printed out at my desk in the lab because I’m a nerd.

Speaking of Mr. Wickham, I also discovered his very own dplyr tutorial this week, which you can find here.

As Beth summarized in her earlier post, many of us have been trying to learn Python. While I’ve been forcing myself to use it more often in my work, I’ve also found myself still nursing my R addiction. So which one should I focus on as I approach data analysis problems? Thankfully, the good people at R-bloggers put together a really great infographic that breaks down the important differences between the two languages from a data science perspective.

Another helpful resource I was glad to find this week was this video tutorial on Python data analysis by Sarah Guido, a data scientist at Bitly. In it, she covers using IPython Notebooks, the pandas library (so far it seems to me similar to dplyr in R), plotting, and even some scikit-learn machine learning algorithms. As someone who is new to data analysis in Python, I highly recommend checking it out.
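To make the pandas/dplyr comparison a bit more concrete, here is a minimal sketch of a dplyr-style filter, group, and summarise chain written in pandas. It is not taken from Sarah's tutorial; the data frame and its column names (species, weight_kg, age) are made up purely for illustration, and it assumes a reasonably recent pandas version that supports named aggregation in agg.

```python
import pandas as pd

# Hypothetical toy data, invented only for this example.
df = pd.DataFrame({
    "species": ["cat", "dog", "cat", "dog", "cat"],
    "weight_kg": [4.0, 20.0, 5.0, 30.0, 3.0],
    "age": [2, 5, 7, 3, 1],
})

summary = (
    df[df["age"] >= 2]                            # roughly dplyr's filter(age >= 2)
    .groupby("species", as_index=False)           # roughly group_by(species)
    .agg(mean_weight=("weight_kg", "mean"))       # roughly summarise(mean_weight = mean(weight_kg))
    .sort_values("mean_weight", ascending=False)  # roughly arrange(desc(mean_weight))
    .reset_index(drop=True)
)

print(summary)
```

Chaining the method calls inside parentheses gives something that reads a lot like a dplyr pipeline, with each line doing one step, which is the main reason the two libraries feel similar to me.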

I’d also like to give a quick shoutout to PyCharm, the most recent Python IDE I’ve been testing out. Nothing else I’ve tried has stuck, and I kept finding myself longing for something as simple, elegant, and functional as RStudio. The other Python IDEs have frustrated me more than they’ve helped, and PyCharm is the only one that has impressed me. There’s still a lot I need to figure out with it, but so far it’s looking like a winner.

Finally, I thought this was a cool article about running Python in a browser. Trinket lets you write, run, and share code from any browser or device, including your phone. I’m not sure I’ll end up replacing the iPhone breaks I take throughout the day to read websites and whatnot with code-writing sessions, but I still like the fact that it’s possible.