Now Available: “RAW: An Introduction”

We’re thrilled to announce our newest FREE online course! “RAW: An Introduction” is now available without charge to everyone. RAW is a free, online application by Density Design that allows you to create beautiful, creative, and insightful data visualizations in your web browser. RAW skips over graphics like bar charts and histograms that you could easily create in other applications. Instead, it builds on the power of D3.js to create unusual and sophisticated charts, like streamgraphs, alluvial charts, Voronoi tessellations, and circular dendrograms. In 22 videos and over 54 minutes, you'll learn how to import your data into RAW, quickly create custom charts, and export or embed your creations to share your insights with others.

“RAW: An Introduction” is available for FREE to all datalab.cc members. Just sign in or register for a free account, and get started exploring your data with RAW.

Now Available: “Typeform: An Introduction”

datalab.cc’s newest online course is now available! We’re proud to announce “Typeform: An Introduction.” This course gives a complete overview of the wonderful online form generator / survey creator Typeform (see Typeform.com). We love Typeform because it makes it so easy to create forms, the forms are beautiful, and the layout and process are uniquely friendly to respondents. In the 2 hours and 33 minutes of this course, spread across 26 streaming videos, you’ll learn about all of the functionality of Typeform’s free “Basic” plan. (Future courses will explore the additional capabilities of the paid “Pro” and “Pro+” accounts.)

“Typeform: An Introduction” is available for individual purchase or, if you have a subscription or comprehensive bundle to datalab.cc, it’s included in your current plan.

Join us and see how Typeform can make your online interactions beautiful and inspiring!

datalab.cc begins its open beta

We’re thrilled to say that today datalab.cc moves out of its private alpha and into its open beta! Now everybody can see how datalab.cc can help them make the most of their data without first having to register and receive an invitation to the private alpha.

The open beta continues the development of datalab.cc, as we keep creating new courses and adding resources like quizzes and PDFs to the existing ones. During this time, all of our courses and collections will be offered at a substantial discount. Members who purchase courses at these discounted prices will, of course, receive all of the enhanced materials the moment they are released. In addition, many of the course bundles will include future courses as part of their reduced price. It’s a great time to get a jump on your learning and save money in the process!

As a historical note, Wikipedia states that 17 November 1558 represents the beginning of England’s Elizabethan era – as Queen Mary I died and was succeeded by Elizabeth I – which, in turn, marks the beginning of England's “Golden Age” and the peak of the “English Renaissance.” That’s as good a reason as any to mark the day that – hopefully! – will begin datalab.cc’s golden age and the data renaissance for all of us!

Specialization is for insects; Data is for everyone

I recently came across a few discussion pages where people asked for recommendations on statistical software that would be good for non-statisticians. The responses generally fell into one or both of two categories:

  • Don’t use anything with a point-and-click interface (e.g., SPSS, which is what I, my students, and my colleagues generally use) because you’ll have no idea what you’re doing and you’ll mess everything up. Instead, only people who can use R should be allowed to do statistics.
  • People who do not have extensive, specialized, academic training in the mathematical bases of statistics should never do statistics because – again – they have no idea what they’re doing and everything will blow up.

I find these answers deeply disappointing. First, they ignore the original question, which was for easy-to-use statistical software for non-statisticians, even though the asker made it clear that they had reasonable statistical training for their field. This violates a basic principle of communication: if you’re asked a question, then you should answer the question.

Second, the responders seemed to believe that if you are not a full-time, rigorously trained specialist – which many of the responders apparently were (or acted as though they were) – then you should stay as far away from data as possible because you’ll break things. I certainly wouldn’t tell people that if they don’t have a PhD in psychology like I do – Social/Personality Psychology, City University of New York, 1999 – then they should never talk about people’s thoughts, feelings, and behaviors. That would be demeaning and amazingly restrictive.

Third (and related), they ignore the likely possibility that most analyses are not complicated and the questions are pretty basic: what’s the mean of this variable, do these two groups have different means, and so on. You don’t need a PhD in statistics to provide a workable answer to those questions. As Bob Dylan said, “you don’t need a weatherman to know which way the wind blows.”
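
To show how small these questions really are, here is a minimal Python sketch that uses only the standard library. It is an illustration, not something taken from the discussions above: the scores are made up, and the helper name welch_t is hypothetical. It computes each group's mean and a Welch's t statistic for the difference between the two means.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for the difference between two group means."""
    # Standard error of the difference, using each group's sample variance.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Made-up scores for two small groups (illustrative only).
group_a = [4.0, 5.0, 6.0, 5.0]
group_b = [6.0, 7.0, 8.0, 7.0]

mean_a = mean(group_a)  # "What's the mean of this variable?"
mean_b = mean(group_b)
t_stat = welch_t(group_a, group_b)  # "Do these two groups have different means?"

print(f"mean A = {mean_a}, mean B = {mean_b}, t = {t_stat:.2f}")
```

A point-and-click tool such as SPSS or a spreadsheet would give the same numbers; the sketch only makes the arithmetic visible. Turning the t statistic into a p-value takes a t distribution, which the Python standard library does not provide.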

Mostly, these responses strike me as unthoughtful, undemocratic, and unkind. They’re unthoughtful because they don’t seem to reflect on the reality of the asker’s needs and abilities. They’re undemocratic because they explicitly claim that only an elite group is qualified to work with data. And they’re unkind because they’re shutting somebody down for asking a sincere and important question.

Very sad.

Fortunately, the world is bigger than that and data is bigger than that. People are bigger than that, too. I’m reminded of a line by writer Robert A. Heinlein:

A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects.

I don’t necessarily think that means that everybody everywhere should be able to do all of those things expertly at the same time. Rather, I think what it means (or might mean) is that people have diverse abilities and potentialities. People are not restricted in their function in quite the same way that other organisms might be.

Data is everywhere. I believe that everybody would be best served by being able to work personally and fluently with data (even if they don’t use R). I’m a very strong supporter of the democratization of data and data science, and that’s why I created datalab.cc. Don’t get bullied; the data is waiting for you, so dive right in.

[By the way, my answer to the question of user-friendly software for non-statisticians with an adequate understanding of statistics is this: First, spreadsheets such as Google Sheets or Excel. Second, visualization with Tableau Public or Plotly. Third, analysis with SPSS. That should take care of 98% of most non-specialists’ needs. R and Python are nice – I use both – and the ability to manipulate data with Bash scripts or query relational databases with SQL is definitely helpful. But those are not especially user friendly (compared to the others) and they’re definitely oriented towards specialists and full-time statisticians and data scientists. And there you have it.]

"Data" is a plural noun; No, wait, not really...

This may seem trivial, but I have recently been set straight on this matter, so I thought I had better bring it up. The word "data" derives from the Latin dare, which means "to give." In Latin, though, "data" is a plural noun and "datum" is the singular. I have traditionally busted my students' chops on this and insisted that they treat "data" as a plural noun: "The data are all entered," "Data are exciting," etc.

However, I recently came across a rather lengthy discussion on this point by astronomer Norman Gray. In his article "Data is a singular noun," he argues that the word "data" is no longer used in English in the same way that it was in Latin and that it is functionally singular, in the same way that "stamina" and "agenda," two other Latin plural nouns, are used. He further makes the point that in no case does it make sense to give a number to the plural. That is, one would never say, "here are five data."

Gray has much more to his argument and he refers to others who posted on the same topic with largely the same conclusion. The result: In English, "data" is a singular noun.

I am sure that Star Trek's Data (above left and himself a singular noun) would agree. And that his evil twin Lore (above right and also technically a plural noun but a singular, fictional being) would not. Or maybe it's the other way around....