Data Scientist: A New Role For Librarians?

Amy Affelt

Amy Affelt, Director of Database Research Worldwide at Compass Lexecon and author of The Accidental Data Scientist, newly published by Information Today, discussed some characteristics of big data and the opportunities it offers information professionals. She noted that big data differs from other data; here are some of its sources.

Big Data Characteristics

Verifying big data and determining its value are opportunities for information professionals. Although the data is big, the insights gained from it are even bigger. Recently developed big data applications span healthcare, transportation, and entertainment, all of which involve enormous collections of data. For example, Stanford University researchers looked through 81,000 searches (far more than could be reviewed manually!) to find correlations between drugs and conditions.

Many big data applications are affected by the digital divide because they are delivered through smartphone apps, and many people either do not have a smartphone or do not know how to use one. For example, if you have received a parking ticket, you can take a photo of it with your smartphone and send it to an app called “Fixed”, which will survey thousands of previous tickets from the same municipality and estimate your chances of winning if you contest the ticket. Obviously, people without a smartphone cannot use this app.

Not all big data applications have been successful. For example, the data for Google’s Flu Trends came from people searching Google, not from healthcare professionals. In the manhunt for the Boston Marathon bombers, FBI agents collected 13,000 video feeds and still photos and assigned analysts to look for someone acting suspiciously, but the bombers were not apprehended as a result of that data. Bad advice from big data can come from reusing data (how do you ensure that recycled data is clean?) or from global data sharing (the big concern is “garbage in, garbage out”; how do you stop “garbage in”?). Here is a raw data quality checklist:

Raw Data Quality Checklist

Here are 6 data analysis tools that anyone can use:

Data Analysis Tools

You don’t need to know how to code to work with big data, but a little knowledge helps. Some roles for information professionals include Data Policy Expert, Data Release Expert, Exit Survey on Data Expert, and Algorithm Accountability Reviewer. They can also interview subjects for a survey.

Don Hawkins
Conference Blogger
