Wordcloud from user comments within the past month.
By combining the Pushshift API with some spaghetti code, I was able to create "wordclouds" from the user comments of our subreddit. (Only the words from comments were used.)
Who is that pokemon?
- A total of 494,382 comments were used. Stopwords, emojis, links, and GIFs had to be removed to clean the data.
- The stopwords are the combined lists from the Python library "stop-words" and the Natural Language Toolkit (NLTK), which add up to ~1100 words.
- I also had to remove all of the moderator comments, because a good portion of them are identical.
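For anyone curious, the cleaning steps above can be sketched roughly like this. This is not my actual script, just a minimal illustration. The regular expressions, the function names, and the assumption that inline GIFs show up in comment bodies as `![gif](...)` are all illustrative; `combined_stopwords()` needs the `stop-words` and `nltk` packages installed, plus a one-time `nltk.download("stopwords")`.

```python
import re

# Illustrative patterns (assumptions, not the exact ones used for the post).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
GIF_RE = re.compile(r"!\[gif\]\([^)]*\)")  # assumed form of inline GIF markup
WORD_RE = re.compile(r"[a-z']+")           # letters only, so emojis are dropped


def combined_stopwords():
    """Union of the English lists from stop-words and NLTK (third-party)."""
    from stop_words import get_stop_words
    from nltk.corpus import stopwords  # requires nltk.download("stopwords")
    return set(get_stop_words("en")) | set(stopwords.words("english"))


def drop_authors(comments, excluded):
    """Skip comments whose 'author' field is in the excluded set (e.g. mods)."""
    return [c for c in comments if c["author"] not in excluded]


def clean_comment(text, stopwords):
    """Return the lowercase words of one comment, minus GIFs, links, stopwords."""
    text = GIF_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    words = (w.strip("'") for w in WORD_RE.findall(text.lower()))
    return [w for w in words if w and w not in stopwords]
```

Passing the stopword set in as an argument makes it easy to keep a word like "people" out of the list on purpose.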
It is also worth mentioning that, to get a good-looking "wordcloud", the maximum size a word can reach was heavily limited.
- The image below is a more accurate version, with a higher maximum size allowed.
the "more" accurate
- And this is as accurate as it gets before it breaks the whole image.
https://preview.redd.it/rd1fej9gvjv91.png?width=4322&format=png&auto=webp&s=9ff456fb763cc4f7568f83442761e24e2702c57c
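To show why limiting the maximum size makes the cloud less "accurate", here is a toy model. It is not the layout algorithm of any wordcloud library and not my actual code: it simply scales font size linearly with word count and then clamps it at a hypothetical `cap`. (If you use the `wordcloud` Python package, the related setting is `max_font_size`, though it does not work exactly like this clamp.)

```python
def font_sizes(counts, min_size=10, max_size=200, cap=None):
    """Toy model: font size grows linearly with count, optionally clamped.

    counts: dict mapping word -> number of occurrences.
    cap:    hypothetical upper limit on the font size (None = no limit).
    """
    top = max(counts.values())
    sizes = {}
    for word, count in counts.items():
        size = min_size + (max_size - min_size) * count / top
        if cap is not None:
            size = min(size, cap)  # words above the cap all look the same
        sizes[word] = round(size)
    return sizes


# Made-up counts, for illustration only.
example = {"people": 100, "moon": 50, "btc": 10}
uncapped = font_sizes(example)         # sizes follow the counts
capped = font_sizes(example, cap=60)   # the two biggest words become equal
```

With the cap, the most used words end up the same size even though one is used twice as often, which is the trade-off between a readable image and a faithful one.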
- For people who might want to see the raw data, here are the top 20 words used.
I didn't flag "people" as a stopword, because I found it interesting how much we talk about "people".
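Getting a top-20 list like this is straightforward once every comment has been reduced to a list of cleaned words. A minimal sketch, assuming that input shape (the function name `top_words` is just for illustration):

```python
from collections import Counter


def top_words(token_lists, n=20):
    """Count words across all comments and return the n most common.

    token_lists: iterable of word lists, one list per cleaned comment.
    """
    counts = Counter()
    for tokens in token_lists:
        counts.update(tokens)
    return counts.most_common(n)
```

The same counting approach works for the user cloud if you count comment authors instead of words.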
- And here are our fellow r/cc ~~shitposters~~ users. (Size is based on comment count, not karma.)
It takes a subreddit to raise a moon.
A fair reminder for this last part: (Rule 4.5.)
- The revealing of users personal information (‘doxxing’) or targeting of specific users, including public figures (‘witch-hunting’) is strongly prohibited and in many jurisdictions illegal. Users engaging or encouraging this activity will receive a lifetime ban. Official figures, moderators, project contributors, or community influencers encouraging this will result in the entire associated project being blacklisted for a minimum of three months.
I'm not a native English speaker, so feel free to correct any major mistakes.