Algorithms and approaches to understanding large-scale social data
With the widespread adoption of online social networking tools,
new online data resources at significant scale have become available over
the last decade. These resources typically offer lower fidelity than data
gathered with classical methods from social psychology, but at larger scale.
Understanding these new datasets comes with a new set of challenges.
Artifacts in the data, due to accident or malice, render naive
computations untrustworthy, so significant attention must be paid to data
cleaning. Once the data has been cleaned, new algorithms are required to
generate statistics efficiently, and new measures must be developed at both
micro and macro levels. Finally, modeling approaches must extend existing
techniques to cope with the common case in which observations across users
are tightly coupled and multiple independent samples of the evolution of
the entire social system are often unavailable. In this tutorial, we'll
cover a range of approaches to these challenges.