Before we start gathering and visualizing the Twitter data, let’s take a peek at the packed barchart that the process will produce.
Before we can visualize any data, we’ll have to gather it. There are R Twitter packages out there, such as twitteR, but I prefer to use a custom function (shown below). If you’d like to use the custom function, you’ll first need to provide your API keys and secrets.
api_key = 'xxxxxxxxxxxxxxxxxxxxxxxxx'
api_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token = 'xxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
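If you would rather not keep the keys in the script itself, one option is to read them from environment variables. The sketch below is only an illustration: the helper read_secret and variable names such as TWITTER_API_KEY are hypothetical, not part of any package.

```r
# read_secret is a hypothetical helper: it returns the value of an
# environment variable and fails loudly if the variable is missing or empty
read_secret = function(name) {
  value = Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable ", name, " is not set")
  }
  value
}

# assumed variable names; set them in your shell or in ~/.Renviron first
# api_key             = read_secret("TWITTER_API_KEY")
# api_secret          = read_secret("TWITTER_API_SECRET")
# access_token        = read_secret("TWITTER_ACCESS_TOKEN")
# access_token_secret = read_secret("TWITTER_ACCESS_TOKEN_SECRET")
```

The assignments are commented out so the chunk runs even when the variables are not defined yet.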
#custom function to get tweets by a username
get_user_tweets = function(user, n, api_key, api_secret, access_token, access_token_secret) {
  #set up oauth
  auth = httr::oauth_app("twitter", key=api_key, secret=api_secret)
  sig = httr::sign_oauth1.0(auth, token=access_token, token_secret=access_token_secret)
  #loop through GETs getting max of 200 per iteration
  nLeft = n
  i = 0
  timeline = list()
  while (nLeft > 0) {
    nToGet = min(200, nLeft)
    i = i+1
    #build GET URL
    if (i == 1) {
      GETurl = paste0("https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name=",
                      user,"&count=", nToGet)
    } else {
      GETurl = paste0("https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name=",
                      user,"&count=", nToGet,"&max_id=", max_id)
    }
    #actual GET and content extract
    timelineRaw = httr::GET(GETurl, sig)
    timelineContent = httr::content(timelineRaw)
    #stop early if the API has no more tweets to return
    if (length(timelineContent) == 0) break
    #accumulate content
    timeline = c(timeline, timelineContent)
    #oldest id in this batch; max_id is inclusive, so the boundary tweet can repeat
    max_id = min(vapply(timelineContent, function(ls) ls$id, numeric(1)))
    nLeft = nLeft - nToGet
  }
  return(timeline)
}
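Because the max_id parameter is inclusive, the oldest tweet of one batch can come back again as the first tweet of the next. A minimal sketch of dropping such repeats is shown here. It assumes each tweet carries an id_str field, and both mock_timeline and drop_duplicate_tweets are hypothetical names used only for illustration.

```r
# hypothetical stand-in for the list returned by get_user_tweets()
mock_timeline = list(
  list(id_str = "103", text = "newest"),
  list(id_str = "102", text = "middle"),
  list(id_str = "102", text = "middle"),  # boundary tweet repeated by the next batch
  list(id_str = "101", text = "oldest")
)

# illustrative helper: keep the first occurrence of each tweet id
drop_duplicate_tweets = function(timeline) {
  ids = vapply(timeline, function(tweet) tweet$id_str, character(1))
  timeline[!duplicated(ids)]
}

deduped = drop_duplicate_tweets(mock_timeline)
length(deduped)
```

Comparing id_str as character avoids relying on the numeric id, which can lose precision when very large ids are stored as doubles.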
We now have a working function to get our tweet data. In the chunk below, we call the function and wrangle the data into a tidy data.table structure for plotting.
#call custom function to get tweets for a given user
my_tweets = get_user_tweets(user="ASpannbauer", n=1000,
                            api_key, api_secret, access_token, access_token_secret)
#parse out the information we want into a list of data.tables
tweet_dt_list = lapply(my_tweets, function(tweet) {
  data.table::data.table(time = tweet$created_at,
                         text = tweet$text,
                         user = tweet$user$screen_name,
                         fav_n = tweet$favorite_count,
                         rt_n = tweet$retweet_count)
})
#combine list into single data.table
tweet_dt = data.table::rbindlist(tweet_dt_list)
#remove retweets
tweet_dt = tweet_dt[!grepl("^RT", text), ]
#combine count of favorites and retweets into single count
tweet_dt[,total_fav_rt := fav_n + rt_n]
#truncate tweet text for a preview of the tweet in the viz
tweet_dt[,text_preview := paste0(substr(text, 1, 20), "...")]
#order by descending popularity
tweet_dt = tweet_dt[order(-total_fav_rt), ]
#inspect head of data
head(tweet_dt[, -c("time","text")])
##           user fav_n rt_n total_fav_rt            text_preview
## 1: ASpannbauer   295   56          351 Most NSFW minute in ...
## 2: ASpannbauer   160   20          180 Analyzing emotes in ...
## 3: ASpannbauer    55   13           68 I transformed my res...
## 4: ASpannbauer    55    5           60 Playing around with ...
## 5: ASpannbauer    18   14           32 Trump Doesnt like Mo...
## 6: ASpannbauer    21    9           30 New festive post on ...
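The time column above is kept as the raw string from the API. If you want real date-times, a sketch along the following lines may help. It assumes the created_at field looks like the sample string below, which is made up for illustration, and it temporarily switches the time locale to "C" so the English day and month abbreviations parse.

```r
# made-up example of a created_at value in the assumed format
created_at = "Wed Oct 10 20:19:24 +0000 2018"

# use the C locale while parsing so "Wed" and "Oct" are recognised
old_locale = Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))

parsed_time = as.POSIXct(created_at,
                         format = "%a %b %d %H:%M:%S %z %Y",
                         tz = "UTC")

# restore the previous locale
invisible(Sys.setlocale("LC_TIME", old_locale))

parsed_time
```

The same format string could be applied to the whole time column once it has been checked against your own data.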
Before plotting with the packed barchart, let’s take a peek at the distribution of the metric we’ll be plotting. As the plot below shows, this data is heavily skewed: a handful of tweets have far more interactions than the rest. This kind of distribution is a good fit for the packed barchart’s intended design.
plot(tweet_dt$total_fav_rt, type = 'l', ylab = "Fav|RT Count")
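To put a rough number on the skew, you can look at how much of the total each tweet contributes. The sketch below uses only the six total_fav_rt values printed in the head of the table earlier, so it illustrates the calculation rather than describing the full data set.

```r
# top six total_fav_rt values copied from the printed head of tweet_dt;
# the remaining rows are not shown, so this is a partial illustration
top_counts = c(351, 180, 68, 60, 32, 30)

# share of these interactions contributed by each tweet, largest first
share = sort(top_counts, decreasing = TRUE) / sum(top_counts)

# running total of the shares
cumulative_share = cumsum(share)

round(cumulative_share, 2)
```

With the real data you would replace top_counts with tweet_dt$total_fav_rt.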
At this point, we’re ready to use the packed barchart to see our Twitter data in a new light. To do this, we call the function rPackedBar::plotly_packed_bar and specify our options.
input_data - the name of the data.frame type object containing the data to plot
label_column - the column in the input_data containing labels for our plotted numeric data
value_column - the column in the input_data containing the numeric data to plot
number_rows - the number of rows our packed barchart will contain
plot_title - the main title to display over the chart
xaxis_label - the title to display for the x-axis
hover_label - title to appear in the hover information
min_label_width - parameter to prevent text labels spilling over the bounds of the bars
color_bar_color - color of the largest colored bars in the chart
label_color - color of the labels to appear over the colored bars in the chart
p = rPackedBar::plotly_packed_bar(input_data = tweet_dt[total_fav_rt > 0, ],
                                  label_column = "text_preview",
                                  value_column = "total_fav_rt",
                                  number_rows = 4,
                                  plot_title = "Tweet Interactions",
                                  xaxis_label = "Favorites & RTs",
                                  hover_label = "Favs & RTs",
                                  min_label_width = .1,
                                  color_bar_color = "#00aced",
                                  label_color = "white")
plotly::config(p, displayModeBar = FALSE)