SQL for Data Analysis
Advanced Techniques for Transforming Data into Insights
Paperback, English, 2021, 1st edition, ISBN 9781492088783
Summary
With the explosion of computing power, thanks to analytic databases and cloud data warehouses, SQL has become an even more robust and flexible tool for the savvy analyst or data scientist. This practical book reveals hidden ways to get the most out of your SQL workflow.
You'll learn how to use both common and exotic SQL features such as joins, window functions, subqueries, and regular expressions in new, innovative ways, as well as how to combine SQL techniques to accomplish your goals faster, with more understandable code. If you work with SQL databases, this is a must-have reference.
SQL for Data Analysis covers useful applications such as:
- Cohort analysis
- Text analysis
- Anomaly detection
- Time series analysis
- Experiment analysis
- Creating complex datasets for further exploration in statistical and visualization tools
- And more
Table of Contents
1.1 What is data analysis?
1.2 Why SQL
1.2.0 What is SQL?
1.2.1 Benefits of SQL
1.2.2 SQL vs. R or Python
1.2.3 SQL as part of the analysis workflow
1.3 Database Types and How to Work with Them
1.3.1 Row-store databases
1.3.2 Column-store databases
1.3.3 Other flavors of data infrastructure
1.4 Conclusion
2. Preparing Data for Analysis
2.0 Types of Data
2.0.0 Database data types
2.0.1 Structured vs. Unstructured
2.0.2 First-party, Third-party, and Cloud Vendor data
2.0.3 Sparse data
2.0.4 Quantitative vs. qualitative data
2.0.5 Categorical vs. continuous
2.1 Profiling: Distributions
2.1.1 Histograms and frequencies
2.1.2 Binning
2.1.3 N-tiles
2.2 Profiling: Data Quality
2.2.1 Detecting duplicates
2.2.2 Deduplication with GROUP BY and DISTINCT
2.2.3 Missing data
2.3 Data cleaning
2.3.1 CASE transformations
2.3.2 Dealing with nulls: COALESCE, NULLIF, NVL
2.3.3 Casting and type conversions
2.4 Shaping Data
2.4.1 For which output: BI, visualization, statistics, ML
2.4.2 Pivoting with CASE statements
2.4.3 Unpivot with UNION statements
2.4.4 PIVOT and UNPIVOT
2.5 Conclusion
3. Time Series Analysis
3.1 Date, datetime, and time manipulations
3.1.1 Time zone conversions
3.1.2 Date and timestamp format conversions
3.1.3 Date math
3.1.4 Time math
3.1.5 Joining data from different sources
3.2 The retail sales data set
3.3 Trending the data
3.3.1 Simple trends
3.3.2 Comparing components
3.3.3 Percent of total calculations
3.3.4 Indexing to see % change over time
3.4 Rolling time windows
3.4.1 Calculating rolling time windows
3.4.2 Rolling time windows with sparse data
3.4.4 Calculating cumulative values
3.6 Analyzing with seasonality
3.6.1 Period over period comparisons - YoY and MoM
3.6.2 Period over period comparisons - Same month vs. last year
3.6.3 Comparing to multiple prior periods
3.7 Conclusion
4. Cohort Analysis
4.1 Cohorts: a useful analysis framework
4.2 The legislators data set
4.3 Retention
4.3.1 SQL for a basic retention curve
4.3.2 Defining the cohort from the time series itself
4.3.3 SQL for time adjustments to increase accuracy
4.3.4 Defining the cohort from a separate table
4.3.5 Dealing with sparse cohorts
4.3.6 Defining cohorts from dates other than the first date
4.4 Related cohort analyses
4.4.1 Survivorship
4.4.2 Returnship / repeat purchase behavior
4.4.3 Cumulative calculations
4.5 Revisiting cross-section analysis, with a cohort lens
4.6 Conclusion
5. Text Analysis
5.1 Why text analysis with SQL
5.1.0 What is text analysis
5.1.1 Why SQL is a good choice for text analysis
5.1.2 When SQL is not a good choice
5.2 The UFO sightings data set
5.3 Text characteristics
5.4 Text Parsing
5.5 Text Transformations
5.6 Finding elements within larger blocks of text
5.6.1 Wildcard matches: LIKE, ILIKE
5.6.2 Exact matches: IN, NOT IN
5.6.3 Regular expressions
5.7 Constructing and reshaping text
5.7.1 Concatenation
5.7.2 Reshaping text
5.9 Conclusion