datalab.cc Mission Statement

I just wrote my first ever (non-sarcastic) mission statement. The online Mission Statement Generator suggested the following:

  • Our business is to seamlessly operationalize world-class catalysts for change to stay relevant in tomorrow’s world.
  • We will work cooperatively to professionally coordinate next-generation content as well as continue to synergistically fashion ethical meta-services while encouraging personal employee growth.
  • We envision to conveniently orchestrate professional opportunities as well as continue to assertively optimize best practice deliverables.

Nice, but I’m pretty sure those have all been taken already. Instead, I wrote this:

  • At datalab.cc, we believe that data can change the world for the better. We give people around the globe the understanding and skills they need to make the most of data in their lives, their jobs, and their communities. We give people power through data.

And I think mine will pass the Turing Test.


Big changes coming for datalab.cc, or, “Power through data” on its way

I’m taking the entrepreneurial plunge. datalab.cc is changing from a blog to an e-commerce, video tutorial site. I’ve filed the papers of incorporation, I’ve got an address and a phone number (844.4.DATALAB) and I’m receiving applications for administrative assistants. Next, web designers, videographers and editors, audio engineers, Internet security, commerce applications, and more marketing and sales than I ever dreamed existed. (But not by me. I still remember the line at the Sea World sea lion show: “Why pay a professional to do it right, when you can do it yourself?” Professionals will be hired to do this right. Just you see.) My totally arbitrary deadline to have this up and running is November 1 of this year (2014). That’s 108 days from today, or 78 working days. Get a move on.
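
For anyone who wants to check the countdown, here is a minimal sketch of the arithmetic. It assumes the post date can be inferred by counting 108 days back from the November 1, 2014 deadline, and that "working days" means Monday through Friday with no holidays removed; neither assumption is stated in the post itself.

```python
from datetime import date, timedelta

# Assumption: the post date is inferred by counting 108 days back
# from the deadline; it is not stated anywhere in the post.
deadline = date(2014, 11, 1)
today = deadline - timedelta(days=108)

calendar_days = (deadline - today).days

# Count Monday-Friday dates from today up to (but not including) the
# deadline. Public holidays are not excluded (an assumption).
working_days = sum(
    1
    for offset in range(calendar_days)
    if (today + timedelta(days=offset)).weekday() < 5
)

print(today, calendar_days, working_days)
```

Under those assumptions the count includes the day of writing and stops the day before the deadline, which is the convention that reproduces the 78 working days quoted above.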

This exercise is focused narrowly, in that everything I do (or have done) for this website will be focused on working with data in one way or another. On the other hand, I hope to go as deep and as broad as possible within that rubric. That is, I want to have tutorials on everything from grade school common core data topics like drawing bar charts with crayons up to professional predictive analytics (and lots of AP stats, college intro stats, and office manager Excel courses, too). I hope to cover as many computer applications and platforms as possible. In addition, I have a global intent: I want to release the tutorials not only in English but also in Mandarin Chinese, Spanish, Hindi, and Arabic (which adds a few billion people to the potential market). This is my contribution towards the democratization of data science, or, as my new tagline reads, “power through data.” That’s power for everyone, everywhere.

[And I'm very glad that MOO.com waits a few hours before they process orders for business cards. I accidentally put "power though data" and only noticed it two hours later. PR disaster averted.]

But now I have to get back to work. I’m changing the world, you know.


R promo video, or, A fanboy rejoices

The above is a short commercial for the statistical programming language R put together by the fine people at Revolution Analytics. Get yourself fired up and get your analytics on!

 


I’m published! “R Succinctly” available now as a free download!

I’ve spent the last few months writing an introductory book on the statistical programming language R called R Succinctly. It’s a short (128 pages) introduction to R that goes from “Hello World!” and bar charts to cluster analysis and principal components in as concise a manner as possible. The book is free, courtesy of Syncfusion, a company that makes software developer tools. You need to register at the book’s page – go to http://t.co/HDPltCzfPg – and then you can download the book in PDF or Kindle .mobi format.

Curiously, in the author blurb I mention two of my projects – the Utah Data Dive and Dance && Code – but I fail to mention this blog, datalab.cc, the related Twitter account @datalabcc, or my YouTube tutorials at (http://youtube.com/bartonpoulson) (which now have over a million hits!). Nor do I mention my personal page, bartonpoulson.com. Go figure.

Please pass this along to anyone you think might be interested and thanks!


lynda.com courses for datalab participants, or, How to become a statistical know-it-all

I’m a huge fan of lynda.com’s online training library. I’ve made several courses for them and I also use their courses for my own learning on topics like MySQL, Photoshop, and Ableton Live. They have several courses – including five of mine – that would be directly relevant to anybody interested in the datalab or the Utah Data Dive. Be aware that lynda.com is a subscription site: basic access – which means videos but no exercise files – costs $25 per month and premium access – which gets the exercise files – is $37.50 per month. (Here’s the link for membership info.) I know that sounds like a lot when you compare it to, say, YouTube, but it’s a bargain when compared to, say, college tuition. In fact, a little while ago, Fast Company put together several guides to what they called “$10,000 degrees” and lynda.com was mentioned on nearly all of them (e.g., design, technology, and business). And, as a bonus, several college campuses and libraries – such as the Salt Lake City Public Library – offer free access to lynda.com courses, so check them out.

Also, lynda.com has many more courses on topics like Python programming and database management that could be useful to someone wanting to work with data, but I’m trying to keep this list a little closer to what I would suggest for my own students in the Behavioral Sciences (and any other non-tech person).

And so here is my list of recommended lynda.com courses, grouped approximately by function.

Gathering data

Spreadsheets

Statistics programs

Visualization & presentation

Hope that helps!


Python ain’t cuttin’ it, or, I’m going back to Excel

Despite our best efforts to work with Python in the UVU Data Lab this semester, it has become clear that this approach was not the best choice. The idea was to (a) learn Python in six weeks (riiiight..); (b) learn IPython in another 2 weeks (ha, ha, ha); (c) learn how to use SciPy, NumPy, pandas, and Matplotlib in maybe another few weeks (ho, ho, ho); (d) learn how to access Twitter data in a week (woo hoo); (e) learn how to do cluster analysis, predictive analytics, and so on in a few weeks (I’m blushing now); and (f) do some actual research.

Well, that was the plan. My students were troopers and I’m very proud of what they were able to accomplish, given they had collectively about zero experience with programming, but we didn’t get very far past item a. Sigh…. Nevertheless, it was an important experiment and learning experience and my students were able to make the most of it. They all participated in a presentation at the Rocky Mountain Psychological Association’s conference in Salt Lake City. You can see a PDF of their presentation entitled “Turning on the firehose: Mining Twitter for psychological data” by clicking here. It’s more an aspirational document than a summation. But it DOES give me an excellent idea of what I need to do over the next 12 months to get us ready for the Utah Data Dive. Chief among those will be to go to where the students are – do as much as possible in Excel or SPSS – rather than trying to recreate them as computer scientists. (Again, see John Foreman’s excellent book Data Smart: Using Data Science to Transform Information into Insight for a thorough example of what can be accomplished in Excel.)

And there you have it.


Extracting data from PDFs, or, Crowbars for tables

The always-fabulous School of Data just featured a review and walkthrough of Tabula, which helps prevent researchers from banging their heads against their computers by extracting data from tables in PDFs. School of Data writer Marco Túlio Pires has these words:

PDF files are pesky. If you copy and paste a table from a PDF into a new document, the result will be messy and ugly. You either have to type the data by hand into your spreadsheet processor or use an app to do that for you. Here we will walk through a tool that does it automatically: Tabula.

Tabula is awesome because it’s free and works on all major operating systems. All you have to do is download the zip file and extract a folder. You don’t even have to install anything, provided you’ve got Java on your machine, which is very likely. It works in your browser, and it’s all about visual controls. You use your mouse to select data you want to convert, and you download it converted to CSV. Simple as that.

See the rest of his review at “Tackling PDFs with Tabula.” And free your data!
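
Once Tabula has handed you a CSV, the next step is usually to read it into something you can compute with. The following is a minimal sketch using only Python's standard library. The sample text stands in for a Tabula export, and the column names, values, and the `load_table` helper are all made up for illustration; they do not come from Tabula or from the review.

```python
import csv
import io

# Hypothetical stand-in for a CSV file exported from Tabula; the column
# names and values are invented for illustration only.
SAMPLE_CSV = """region,year,count
North,2012,"1,204"
South,2012,"987"
North,2013,"1,310"
"""


def load_table(handle):
    """Read CSV rows into dicts, converting text such as "1,204" to int."""
    rows = []
    for row in csv.DictReader(handle):
        row["year"] = int(row["year"])
        # Numbers copied out of PDF tables often keep thousands separators.
        row["count"] = int(row["count"].replace(",", ""))
        rows.append(row)
    return rows


rows = load_table(io.StringIO(SAMPLE_CSV))
total = sum(row["count"] for row in rows)
print(len(rows), total)
```

To use it on a real export, pass an open file instead of the sample text, for example `open("tabula-export.csv", newline="")`, where the filename is whatever you saved from Tabula.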

