
Text Mining with R

A Tidy Approach

Paperback, English, 2017, ISBN 9781491981658
Sales rank: 5407
Expected delivery time: approximately 8 working days


Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you’ll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You’ll learn how tidytext and other tidy tools in R can make text analysis easier and more effective.

The authors demonstrate how treating text as data frames enables you to manipulate, summarize, and visualize characteristics of text. You’ll also learn how to integrate natural language processing (NLP) into effective workflows. Practical code examples and data explorations will help you generate real insights from literature, news, and social media.

- Learn how to apply the tidy text format to NLP
- Use sentiment analysis to mine the emotional content of text
- Identify a document’s most important terms with frequency measurements
- Explore relationships and connections between words with the ggraph and widyr packages
- Convert back and forth between R’s tidy and non-tidy text formats
- Use topic modeling to classify document collections into natural groups
- Examine case studies that compare Twitter archives, dig into NASA metadata, and analyze thousands of Usenet messages


Number of pages: 194
Main category: IT management / ICT



About Julia Silge

Julia Silge is a data scientist at Stack Overflow. She enjoys making beautiful charts, the statistical programming language R, black coffee, red wine, and the mountains of her adopted home in Utah. She has a PhD in astrophysics and an abiding love for Jane Austen. Her work involves analyzing and modeling complex data sets while communicating about technical topics with diverse audiences.

About David Robinson

David Robinson is a data scientist at Stack Overflow. He has a PhD in Quantitative and Computational Biology from Princeton University, where he worked with Professor John Storey on genomic analysis. He enjoys working on and blogging about statistics, R programming, and text mining, including a popular analysis of Donald Trump’s Twitter account (performed according to the tidy data principles described in this book).


1. The tidy text format
2. Sentiment analysis with tidy data
3. Analyzing word and document frequency: tf-idf
4. Relationships between words: n-grams and correlations
5. Converting to and from non-tidy formats
6. Topic modeling
7. Case study: comparing Twitter archives
8. Case study: mining NASA metadata
9. Case study: analyzing Usenet text
