Function to detect coordinated behaviour based on content groups. See details.
detect_coordinated_groups(x, time_window = 10, min_repetition = 2)
a data.table with ids of coordinated contents. Columns: object_id, id_user, id_user_y, content_id, content_id_y, timedelta. The id_user and content_id columns represent the "older" data points; id_user_y and content_id_y represent the "newer" data points. For example, if User A retweets User B, then User A's content is the newer one (i.e., User A appears in id_user_y).
x: a data.table with the columns: object_id (uniquely identifies coordinated content), id_user (unique ids for users), content_id (id of user generated content), timestamp_share (integer)
time_window: the number of seconds within which shared contents are considered coordinated (defaults to 10).
min_repetition: the minimum number of repeated coordinated actions required to define two users as coordinated (defaults to 2).
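As an illustration of the expected input, the following minimal sketch builds a toy data.table with the four required columns. The values (user names, URLs, timestamps) are invented for illustration only, and the final call is left commented out because it assumes the package providing detect_coordinated_groups is attached.

```r
library(data.table)

# Hypothetical toy input: two users sharing the same two URLs seconds apart.
x <- data.table(
  object_id       = c("url_1", "url_1", "url_2", "url_2"),
  id_user         = c("user_a", "user_b", "user_a", "user_b"),
  content_id      = c("c1", "c2", "c3", "c4"),
  timestamp_share = c(1000L, 1004L, 2000L, 2007L)
)

# Check that the required columns are present and the timestamp is an integer.
required <- c("object_id", "id_user", "content_id", "timestamp_share")
stopifnot(all(required %in% names(x)), is.integer(x$timestamp_share))

# With the package that provides the function attached, the call would be:
# result <- detect_coordinated_groups(x, time_window = 10, min_repetition = 2)
```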
The function groups the data by object_id (uniquely identifies coordinated content) and calculates the time differences between all content_id (ids of user generated contents) within each group. It then filters out all pairs of content_id whose time difference exceeds the time_window (in seconds). It returns a data.table with all IDs of coordinated contents. The object_id can be, for example, a hashtag, the ID of a tweet being retweeted, or a URL being shared.
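
The steps described above can be approximated with plain data.table operations. This is only a rough sketch of the idea, not the function's actual implementation: the toy data, the self-join strategy, the tie-breaking rule, and the helper columns timestamp_share_y and n_repetitions are all assumptions made for illustration. It also counts repetitions per ordered (older, newer) user pair, which is a simplification.

```r
library(data.table)

# Hypothetical toy data (invented values).
x <- data.table(
  object_id       = c("url_1", "url_1", "url_1", "url_2", "url_2"),
  id_user         = c("user_a", "user_b", "user_c", "user_a", "user_b"),
  content_id      = c("c1", "c2", "c3", "c4", "c5"),
  timestamp_share = c(1000L, 1004L, 1500L, 2000L, 2007L)
)
time_window    <- 10L
min_repetition <- 2L

# Copy of the input whose columns carry the "_y" (newer) suffix.
y <- copy(x)
setnames(y,
         c("id_user", "content_id", "timestamp_share"),
         c("id_user_y", "content_id_y", "timestamp_share_y"))

# All pairs of contents that share the same object_id.
pairs <- merge(x, y, by = "object_id", allow.cartesian = TRUE)

# Keep each pair once, older content first; ties are broken by content_id.
pairs <- pairs[timestamp_share < timestamp_share_y |
                 (timestamp_share == timestamp_share_y &
                    content_id < content_id_y)]

# Drop pairs in which a user is matched with themselves.
pairs <- pairs[id_user != id_user_y]

# Time difference in seconds; keep only pairs within the window.
pairs[, timedelta := timestamp_share_y - timestamp_share]
pairs <- pairs[timedelta <= time_window]

# Keep user pairs that co-shared at least min_repetition times.
pairs[, n_repetitions := .N, by = .(id_user, id_user_y)]
result <- pairs[n_repetitions >= min_repetition,
                .(object_id, id_user, id_user_y,
                  content_id, content_id_y, timedelta)]
```

In this toy example, user_c shares url_1 far outside the 10-second window and is dropped, while user_a and user_b remain because they shared two different objects within the window.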