# A guide to working with character data in R

[This article was first published on **Revolutions**, and kindly contributed to R-bloggers.]

R is primarily a language for working with numbers, but we often need to work with text as well. Whether it’s formatting text for reports or analyzing natural language data, R provides a number of facilities for working with character data. *Handling Strings with R*, a free (CC-BY-NC-SA) e-book by UC Berkeley’s Gaston Sanchez, provides an overview of the ways you can manipulate characters and strings with R.

The book has many useful sections. A few highlights:

- C-style formatting: very useful for preparing tabular data for reports
- String manipulation with the stringr package: which provides some welcome consistency in handling strings with R
- Regular expressions: the savior and/or curse for many data extraction problems
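To give a flavor of two of these topics, here is a minimal sketch in base R. It is not taken from the book; the data (`items`, `prices`, `notes`) are invented purely for illustration. It uses `sprintf()` for C-style formatting and `regexpr()` with `regmatches()` for a simple regular-expression extraction.

```r
# Hypothetical data, invented for illustration only
items  <- c("apple", "kiwi", "watermelon")
prices <- c(1.5, 0.25, 12)

# C-style formatting: %-12s left-justifies text in 12 characters,
# %8.2f right-justifies a number in 8 characters with 2 decimals
rows <- sprintf("%-12s%8.2f", items, prices)
cat(rows, sep = "\n")

# Regular expressions: extract the first run of digits from each string
notes <- c("order 66 shipped", "no number here", "room 101")
m <- regexpr("[0-9]+", notes)          # match position, or -1 if no match

# regmatches() returns only the matched pieces, so fill in NA elsewhere
digits <- rep(NA_character_, length(notes))
digits[m > 0] <- regmatches(notes, m)
digits
```

With the stringr package installed, `stringr::str_extract(notes, "[0-9]+")` should give the same result in a single call, returning `NA` where nothing matches, which is the sort of consistency the second bullet refers to.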

Note that the book does *not* cover analysis of natural language data, for which you might want to check out the CRAN Task View on Natural Language Processing or the book *Text Mining with R: A Tidy Approach*. It’s also sadly silent on character encoding in R, which often causes problems when dealing with text data, especially from international sources. Nonetheless, the book is a really useful overview of working with text in R, and has been updated extensively since it was last published in 2014. You can read *Handling Strings with R* at the link below.

Gaston Sanchez: Handling Strings with R
