Module 35 Sentiment analysis with survey data


1. Start a new script. Name it “sentiment.R”.

2. Set up your workspace by loading the ggplot2, dplyr, tidytext, gsheet, wordcloud2, sentimentr, and lubridate packages.

3. Read in your survey data by running the following:

4. Take a look at the first few rows of the data. What is the unit of observation?

5. Create a variable named date_time in your survey data. This should be based on the Timestamp variable. Use the mdy_hms function to create a “date-time” object.
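
A minimal sketch of how mdy_hms parses text into a date-time. The timestamps below are made up for illustration and only assume a month/day/year hour:minute:second layout; they are not the survey's actual Timestamp values.

```r
library(lubridate)

# Hypothetical timestamps in month/day/year hour:minute:second order
toy_timestamps <- c("1/15/2021 14:30:05", "2/3/2021 9:05:00")

# mdy_hms() converts the character strings to POSIXct date-times (UTC by default)
toy_date_time <- mdy_hms(toy_timestamps)

class(toy_date_time)
month(toy_date_time)
```

In the exercise, the same call would be applied to the Timestamp column, for example inside mutate().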

6. Create a visualization of the date_time variable.

7. Create an object called sentiments by running the following:

8. Explore the sentiments object. How many rows? How many columns? What is the unit of observation?

9. Create an object named words by running the following:

10. Explore words. What is the unit of observation?
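
To see how a one-row-per-person table can become a one-row-per-word table, here is a small sketch using unnest_tokens from tidytext. The toy data frame and its column names (person, feeling) are assumptions for illustration, and the code supplied for step 9 may build words differently.

```r
library(dplyr)
library(tidytext)

# Hypothetical survey: one row per person, free text in `feeling`
toy_survey <- tibble(
  person = c(1, 2),
  feeling = c("Happy and calm", "Tired but happy")
)

# unnest_tokens() splits the text into one lowercase word per row,
# keeping the other columns (here, person) alongside each word
toy_words <- toy_survey %>%
  unnest_tokens(word, feeling)

toy_words
```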

11. Look up the help documentation for the function wordcloud2. What does it expect as the first argument of the function?

12. Create a dataframe named word_freq. This should be a dataframe that conforms to what wordcloud2 expects as its first argument, showing how frequently each word appeared in our feelings.
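
One way to build a frequency table is to group by word and count rows. This sketch uses made-up words and assumes a two-column layout named word and freq; check the wordcloud2 help page from step 11 to confirm the layout it expects.

```r
library(dplyr)

# Hypothetical one-row-per-word data
toy_words <- data.frame(
  word = c("happy", "calm", "happy", "tired", "happy", "calm"),
  stringsAsFactors = FALSE
)

# Count how many times each word appears, most frequent first
toy_freq <- toy_words %>%
  group_by(word) %>%
  summarise(freq = n()) %>%
  arrange(desc(freq))

toy_freq
```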

13. Make a word cloud.

14. Run the below to create an object named sw.

15. What is the sw object all about? Explore it a bit.

16. Remove from word_freq any rows in which the word appears in sw.
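
Filtering out rows whose word appears in another table can be sketched as follows. The toy stop word table and its word column are assumptions; the real sw object may use a different column name.

```r
library(dplyr)

# Hypothetical frequency table and stop word list
toy_freq <- data.frame(
  word = c("the", "happy", "and", "calm"),
  freq = c(10, 3, 7, 2),
  stringsAsFactors = FALSE
)
toy_sw <- data.frame(
  word = c("the", "and", "of"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose word is NOT in the stop word list
toy_clean <- toy_freq %>%
  filter(!word %in% toy_sw$word)

toy_clean
```

An anti_join(toy_freq, toy_sw, by = "word") would give the same rows here.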

17. Make a new word cloud.

18. Make an object containing only the 10 most frequently used words. Name this object top10.

19. Create a bar chart showing the number of times the top10 words were used.
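
A sketch of the two steps above on made-up data: sort by frequency, keep the first 10 rows, then draw one bar per word. The words and counts are invented for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical frequency table with 12 words
toy_freq <- data.frame(
  word = c("happy", "calm", "tired", "anxious", "excited", "bored",
           "hopeful", "sad", "curious", "relaxed", "hungry", "sleepy"),
  freq = c(12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1),
  stringsAsFactors = FALSE
)

# Sort from most to least frequent and keep the first 10 rows
toy_top10 <- toy_freq %>%
  arrange(desc(freq)) %>%
  head(10)

# geom_col() draws bars whose heights come from the freq column;
# reorder() sorts the bars by frequency, coord_flip() makes the words readable
p <- ggplot(toy_top10, aes(x = reorder(word, freq), y = freq)) +
  geom_col() +
  coord_flip() +
  labs(x = "Word", y = "Times used")

p
```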

20. Run the below to join word_freq with sentiments.

21. Now explore the data. What is going on?

22. For the whole survey, were there more negative or positive sentiment words used?

23. Create an object with the number of negative and positive words used for each person.

24. In that object, create a new variable named sentimentality, which is the number of positive words minus the number of negative words.

25. Make a histogram of sentimentality.
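
The per-person counts and the difference between them can be sketched like this. The toy data assume one row per word with a person identifier and a sentiment column holding "positive" or "negative"; the real joined data may be shaped differently.

```r
library(dplyr)

# Hypothetical joined data: one row per sentiment word used by a person
toy_joined <- data.frame(
  person = c(1, 1, 1, 2, 2),
  sentiment = c("positive", "positive", "negative", "negative", "negative"),
  stringsAsFactors = FALSE
)

# Count positive and negative words per person, then take the difference
toy_scores <- toy_joined %>%
  group_by(person) %>%
  summarise(
    positive = sum(sentiment == "positive"),
    negative = sum(sentiment == "negative")
  ) %>%
  mutate(sentimentality = positive - negative)

toy_scores
```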

26. Make a barplot of sentimentality.

27. Create a wordcloud for the dream variable.

28. Create a barplot showing the top 16 words in our dreams.

29. Which word showed up most in people’s description of Joe?

30. Make a histogram of feeling_num.

31. Make a density chart of feeling_num.

32. Change the above plot to facet it by gender.
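
Faceting splits one plot into a panel per group. A minimal sketch with invented values, assuming columns named feeling_num and gender:

```r
library(ggplot2)

# Hypothetical data: a numeric rating and a grouping variable
toy_survey <- data.frame(
  feeling_num = c(2, 4, 5, 7, 8, 3, 6, 6, 9, 10),
  gender = rep(c("A", "B"), each = 5),
  stringsAsFactors = FALSE
)

# geom_density() draws a smoothed distribution;
# facet_wrap() repeats the plot once per value of gender
p <- ggplot(toy_survey, aes(x = feeling_num)) +
  geom_density() +
  facet_wrap(~gender)

p
```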

33. How many people mentioned Ryan Gosling in their description of Joe?
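
Counting responses that mention a phrase can be done with a case-insensitive pattern match. The column name joe and the responses below are assumptions for illustration.

```r
# Hypothetical free-text descriptions, including a missing response
toy_survey <- data.frame(
  joe = c("Looks like Ryan Gosling", "Tall and friendly",
          "ryan gosling vibes", NA),
  stringsAsFactors = FALSE
)

# grepl() returns TRUE for each response containing the phrase;
# ignore.case = TRUE matches any capitalization, and NA counts as FALSE
mentions <- grepl("ryan gosling", toy_survey$joe, ignore.case = TRUE)

# Summing a logical vector counts the TRUE values
n_mentions <- sum(mentions)
n_mentions
```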

34. Is there a correlation between the sentimentality of people’s feeling and their feeling_num?
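
A quick way to check for a linear relationship is the correlation coefficient from cor(), ideally alongside a scatterplot. The values below are invented and only show the mechanics.

```r
# Hypothetical per-person scores
toy_scores <- data.frame(
  sentimentality = c(-2, -1, 0, 1, 3),
  feeling_num = c(2, 3, 5, 6, 9)
)

# cor() returns the Pearson correlation, a value between -1 and 1
r <- cor(toy_scores$sentimentality, toy_scores$feeling_num)
r
```

If either column contains missing values, cor() returns NA unless use = "complete.obs" is supplied.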