Andrew Tomkins, Google Research, USA

Algorithms and approaches to understanding large-scale social data
With the widespread adoption of online social networking tools, new online data resources at significant scale have become available over the last decade. These resources typically offer lower fidelity than data gathered with classical methods from social psychology, but at larger scale. Understanding these new datasets comes with a new set of challenges. Artifacts in the data, due to accident or malice, render naive computations untrustworthy, so significant attention must be paid to data cleaning. Once the data has been cleaned, new algorithms are required to generate statistics efficiently, and new measures must be developed at both micro and macro levels. Finally, modeling approaches must extend existing techniques to cope with the common case in which observations across users are tightly coupled and multiple independent samples of the evolution of the entire social system are often unavailable. In this tutorial, we'll cover a range of approaches to these challenges.