Function to detect coordinated behaviour based on content groups. See details.
detect_coordinated_groups(x, time_window = 10, min_repetition = 2)
A data.table with the IDs of coordinated contents. Columns: object_id, id_user, id_user_y, content_id, content_id_y, timedelta. The id_user and content_id columns represent the "older" data points; id_user_y and content_id_y represent the "newer" data points. For example, if User A retweets User B, then User A's content is newer (i.e., id_user_y).
x: a data.table with the columns: object_id (uniquely identifies coordinated content), id_user (unique IDs for users), content_id (ID of user-generated content), timestamp_share (integer).
time_window: the number of seconds within which shared contents are considered coordinated (defaults to 10 seconds).
min_repetition: the minimum number of repeated coordinated actions a user has to perform (defaults to 2).
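As a rough illustration of the expected input, the sketch below builds a small table with the four required columns. All values are invented for illustration, and the commented-out call assumes that the package providing detect_coordinated_groups is already loaded.

```r
library(data.table)

# Toy input; every value here is made up for illustration.
x <- data.table(
  object_id       = c("url_1", "url_1", "url_1", "url_2", "url_2"),
  id_user         = c("user_a", "user_b", "user_c", "user_a", "user_b"),
  content_id      = c("c1", "c2", "c3", "c4", "c5"),
  timestamp_share = c(100L, 104L, 500L, 900L, 905L)
)

# Check that the required columns exist and that the timestamp is an integer.
required <- c("object_id", "id_user", "content_id", "timestamp_share")
stopifnot(all(required %in% names(x)), is.integer(x$timestamp_share))

# With the function available, a call would look like this:
# result <- detect_coordinated_groups(x, time_window = 10, min_repetition = 2)
```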
The function groups the data by object_id (uniquely identifies coordinated content) and calculates the time differences between all content_id (IDs of user-generated contents) within each group. It then filters out all pairs of content_id whose time difference exceeds the time_window (in seconds). It returns a data.table with all IDs of coordinated contents. The object_id can be, for example, a hashtag, the ID of a tweet being retweeted, or a URL being shared.
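The grouping and filtering steps can be sketched with plain data.table operations. This is a minimal illustration and not the function's actual implementation: the toy data, the helper column timestamp_share_y, and the choice to drop pairs of a content with itself are assumptions, and the min_repetition filter is left out.

```r
library(data.table)

# Toy input; every value here is made up for illustration.
x <- data.table(
  object_id       = c("url_1", "url_1", "url_1", "url_2", "url_2"),
  id_user         = c("user_a", "user_b", "user_c", "user_a", "user_b"),
  content_id      = c("c1", "c2", "c3", "c4", "c5"),
  timestamp_share = c(100L, 104L, 500L, 900L, 905L)
)
time_window <- 10L

# Copy of the input with "_y" column names, standing in for the "newer" side.
y <- copy(x)
setnames(
  y,
  c("id_user", "content_id", "timestamp_share"),
  c("id_user_y", "content_id_y", "timestamp_share_y")
)

# Pair every content with every other content sharing the same object_id.
pairs <- merge(x, y, by = "object_id", allow.cartesian = TRUE)

# Time difference in seconds between the "newer" and the "older" content.
pairs[, timedelta := timestamp_share_y - timestamp_share]

# Keep older -> newer pairs of distinct contents within the time window.
pairs <- pairs[
  content_id != content_id_y & timedelta >= 0L & timedelta <= time_window,
  .(object_id, id_user, id_user_y, content_id, content_id_y, timedelta)
]

print(pairs)
```

In this toy data only c1/c2 (4 seconds apart) and c4/c5 (5 seconds apart) fall within the 10-second window; c3 is shared 400 seconds after c1 and is dropped.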