A team of German scientists has shown what online social networks may know about people who are friends of members but have no user profile of their own.
The findings, published in the journal PLoS ONE, show that network-analysis and machine-learning tools can combine the relationships between members with members' connection patterns to non-members in order to infer relationships among the non-members themselves.
The findings also show that, under certain conditions, simple contact data are enough to correctly predict that two non-members know each other with a probability of almost 40 percent.
For several years, scientists have been investigating what conclusions can be drawn from a computational analysis of input data by applying suitable learning and prediction algorithms. In a social network, information that a member has not disclosed, such as sexual orientation or political preferences, can be inferred with a very high degree of accuracy if enough of his or her friends have provided such information about themselves.
“Once confirmed friendships are known, predicting certain unknown properties is no longer that much of a challenge for machine learning,” explained study co-author Prof. Fred Hamprecht of the Heidelberg Collaboratory for Image Processing.
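To illustrate the general idea, here is a minimal Python sketch that guesses an undisclosed attribute by majority vote among the friends who did disclose it. This is a deliberately simple stand-in, not the method of any particular study; the function name `infer_attribute`, the `min_votes` threshold and all data are invented for illustration.

```python
from collections import Counter

def infer_attribute(user, friends, disclosed, min_votes=3):
    """Guess an undisclosed attribute of `user` from friends who disclosed theirs.

    Returns (label, share_of_votes), or None when fewer than `min_votes`
    friends have disclosed the attribute.
    """
    votes = Counter(disclosed[f] for f in friends.get(user, ()) if f in disclosed)
    total = sum(votes.values())
    if total < min_votes:
        return None
    label, count = votes.most_common(1)[0]
    return label, count / total

# Toy data, invented for illustration only.
friends = {"dana": ["ali", "bea", "cem", "eva"]}
disclosed = {"ali": "party A", "bea": "party A", "cem": "party B", "eva": "party A"}

guess = infer_attribute("dana", friends, disclosed)
print(guess)  # ('party A', 0.75)
```

The threshold reflects the condition in the text that "enough" friends must have disclosed the information: with too few votes, the sketch declines to guess.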
Until now, studies of this type were restricted to users of social networks – persons with a posted user profile who agreed to the given privacy terms.
“Non-members, however, have no such agreement,” said Prof. Katharina Zweig, a study co-author who until recently worked at the Interdisciplinary Center for Scientific Computing of Heidelberg University. “We therefore studied their vulnerability to the automatic generation of so-called shadow profiles.”
In an online social network, it is possible to infer information about non-members, for instance by using so-called friend-finder applications. When new Facebook members register, they are asked to share their full list of e-mail contacts, including people who are not Facebook members.
“This very basic knowledge of who is acquainted with whom in the social network can be tied to information about who users know outside the network. In turn, this association can be used to deduce a substantial portion of relationships between non-members,” explained Ágnes Horvát of Heidelberg University, a lead author of the study.
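A rough Python sketch of how uploaded address books could link non-members to one another: for each non-member it records which members list them, then scores every pair of non-members by how strongly those member sets overlap. The Jaccard overlap used here is a common link-prediction heuristic chosen for illustration, not the study's actual procedure, and all names and data are made up.

```python
from itertools import combinations

# Toy data, invented for illustration.
members = {"anna", "ben", "cleo"}
# Address books uploaded by members; they may list members and non-members.
address_books = {
    "anna": {"ben", "xaver", "yara"},
    "ben": {"anna", "xaver", "yara", "zoe"},
    "cleo": {"zoe"},
}

def non_member_contacts(members, address_books):
    """Map each non-member to the set of members whose address book lists them."""
    known_by = {}
    for member, contacts in address_books.items():
        for contact in contacts:
            if contact not in members:
                known_by.setdefault(contact, set()).add(member)
    return known_by

def score_pairs(known_by):
    """Score each non-member pair by the Jaccard overlap of the members who know them."""
    scores = {}
    for a, b in combinations(sorted(known_by), 2):
        shared = known_by[a] & known_by[b]
        union = known_by[a] | known_by[b]
        scores[(a, b)] = len(shared) / len(union)
    return scores

known_by = non_member_contacts(members, address_books)
scores = score_pairs(known_by)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

In this toy example, "xaver" and "yara" appear in exactly the same address books, so they receive the highest score, even though neither has a profile.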
To make their calculations, the researchers used a standard machine learning procedure based on structural properties of the network. As the data needed for the study was not freely obtainable, the researchers worked with anonymized real-world Facebook friendship networks as test data.
The partitioning between members and non-members was simulated using a broad range of possible models. These partitions were used to validate the study results. Using standard computers, the researchers were able to calculate in just a few days which non-members were most likely friends of each other.
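One way such a partition could be simulated is sketched below in Python. It takes a fully known friendship network, declares a random fraction of people to be members, lets each member upload an address book with a given probability, and separates the links a platform could observe from the non-member links a predictor would have to guess. Uniform random selection is only one conceivable partition model and is an assumption of this sketch, as are the function name and parameters.

```python
import random

def simulate_partition(friendships, member_fraction, upload_prob, seed=0):
    """Split a fully known friendship network into members and non-members.

    friendships: iterable of (u, v) pairs, assumed to contain no self-loops.
    Returns (members, observed, hidden), where `observed` holds the edges a
    platform could see and `hidden` the edges between two non-members.
    """
    rng = random.Random(seed)
    edges = {frozenset(e) for e in friendships}
    people = sorted({p for e in edges for p in e})
    members = set(rng.sample(people, round(member_fraction * len(people))))
    # Each member independently decides whether to upload an address book.
    uploaders = {m for m in sorted(members) if rng.random() < upload_prob}

    observed, hidden = set(), set()
    for edge in edges:
        inside = edge & members
        if len(inside) == 2:
            observed.add(edge)      # friendship between two members
        elif len(inside) == 1:
            if inside & uploaders:
                observed.add(edge)  # revealed through an uploaded address book
        else:
            hidden.add(edge)        # link between two non-members
    return members, observed, hidden

# Toy network (invented): a ring of 10 people plus three chords.
toy = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7), (3, 8)]
members, observed, hidden = simulate_partition(
    toy, member_fraction=0.6, upload_prob=0.5, seed=1
)
print(len(members), len(observed), len(hidden))
```

Because the full network is known in advance, the `hidden` set serves as ground truth: predictions made from `observed` alone can be checked against it.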
The scientists were astonished that all the simulation methods produced the same qualitative result.
“Based on realistic assumptions about the percentage of a population that are members of a social network and the probability with which they will upload their e-mail address books, the calculations enabled us to accurately predict 40 percent of the relationships between non-members,” concluded study co-author Dr. Michael Hanselmann of Heidelberg University.