Data Quality Trade-offs

Xiao-Li Meng discusses the inherent trade-off between data quality and data quantity, highlighting how large datasets drawn from social media often suffer from bias. He emphasizes how hard it is to clean messy data, and warns that overly sanitized datasets can lead to misleading conclusions. The conversation also touches on the pitfalls of complete-case analysis (dropping every record that has a missing value), which can leave an unrepresentative sample and skewed insights.
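As a rough illustration of the complete-case pitfall, here is a minimal Python sketch. It is not taken from the episode: the records, the incomes, and the assumption that higher earners are the ones who leave income blank are all hypothetical, chosen only to show how dropping incomplete rows can shift an estimate.

```python
from statistics import mean

# Hypothetical survey records: (age, reported_income).
# None marks a respondent who did not report income.
records = [
    (25, 30000), (32, 42000), (41, 55000), (38, None),
    (50, None), (29, 35000), (60, None), (45, 58000),
]

# Hypothetical true incomes for the same eight respondents, in the same
# order. Assumption for illustration: the non-reporters are the high earners.
true_incomes = [30000, 42000, 55000, 90000, 120000, 35000, 150000, 58000]

# Complete-case analysis: keep only rows with no missing values.
complete_cases = [r for r in records if r[1] is not None]
complete_case_mean = mean(income for _, income in complete_cases)

# The mean we would have seen if everyone had reported.
full_mean = mean(true_incomes)

print(f"kept {len(complete_cases)} of {len(records)} records")
print(f"complete-case mean: {complete_case_mean}")
print(f"full-data mean:     {full_mean}")
```

In this made-up example the complete-case mean falls well below the full-data mean, because the rows that were dropped differ systematically from the rows that were kept. If values were missing completely at random, the two estimates would tend to agree on average.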

From this podcast
Super Data Science: ML & AI Podcast with Jon Krohn
SDS 581: Bayesian, Frequentist, and Fiducial Statistics in Data Science — with Xiao-Li Meng
Related Questions
- Is data quality overlooked in machine learning?
