List of publications in data science


This is a list of publications in data science, generally organized by order of use in a data analysis workflow.
See the list of publications in statistics for more research-based and fundamental publications; this list is more applied, business-oriented, and cross-disciplinary.
General article inclusion criteria are:
Some reasons why a particular publication might be regarded as important:
Topic creator – A publication that created a new topic.
Breakthrough – A publication that changed scientific knowledge significantly.
Influence – A publication that has significantly influenced the world or has had a massive impact on the teaching of data science.
When possible, a reference is used to validate the inclusion of the publication in this list.

History

Statistical Modeling: The Two Cultures
Data Scientist: The Sexiest Job of the 21st Century
50 Years of Data Science
The Composable Data Management System Manifesto

Data collection and organization

Tidy Data
Data Organization in Spreadsheets

Data visualizations

Quantitative Graphics in Statistics: A Brief History

Tooling

Hidden Technical Debt in Machine Learning Systems
A Few Useful Things to Know About Machine Learning

Teaching data science

The Introductory Statistics Course: A Ptolemaic Curriculum