Pattern Searching in a Social Network
Department of Computer Science Education, Korea University
Department of Computer Science Education, Korea University
Abstract: In this paper, we propose a system that allows a user to find a pattern that has a certain property in a social network. It exploits structural properties of a social network using geometric hashing that has been used in model-based recognition. Since it requires computationally intensive work, we construct a small-scale Grid environment based on JXTA technologies in order to preprocess and recognize patterns in different networks. We experimented the system with four types of data; blog data, web pages, OWL documents, and sociemetric data. In all cases, the result indicated that our system could find the nodes that match patterns correctly.
Keywords: Social network, geometric hashing.
CITE THIS PAPER AS:
Hyosook JUNG, Seongbin PARK, Pattern Searching in a Social Network, Studies in Informatics and Control, ISSN 1220-1766, vol. 19 (2), pp. 125-134, 2010.
Social networks support formation of online communities by combining blogs, wikis, tagging, bookmarking services, and social networking software . Several research works have analyzed the relationships in networks such as the link structure of web pages. The relationships show unique features of each node in the network. For example, a node that serves as an index node or a menu node may have many outgoing links to other nodes. It is also possible that an interesting node has many incoming links from other nodes. These features can be useful for users to find special nodes suited to their interests or needs.
In this paper, we present a system that allows a user to find a pattern which has a certain property in a social network. Using the system, a user can find nodes that satisfy structural constraints specified in a query from a social network. For example, if a user wants to find a node which has two outgoing links, one incoming link and one pair link in network n, the user specifies the constraints as query pattern p and our system finds patterns matched with query pattern p in network n. (Figure 1)
We model a social network as a directed graph and use geometric hashing  that has been used in model-based recognition to search patterns that satisfy constraints. The algorithm proceeds in two steps which are preprocessing step and recognition step. In the preprocessing step, the link structure of a node is represented as a 3D geometric pattern and its invariant properties are stored into hash tables, where the invariant properties of a node are found by checking outgoing links, incoming links, and pair links of the node. In the recognition step, patterns that match against user’s query are found.
The proposed system was tested experimentally under four types of social networks. First, it was applied to search communities that have unique relationship patterns on the Web. For example, some communities are linked by other communities but have a few links to other communities while others are linked by only a few other communities but have many links to other communities. Secondly, we applied it to the analysis of the link structure among web pages so that we can understand the role of each web page in the set of web pages. For example, the index page has many outgoing links to other pages. The menu or table of contents has a set of links that lead to various sections of the web site. It can help users find web pages suited to their needs. Thirdly, we applied it to the set of OWL documents which describe the classes and relations between them . Fourthly, we experimented the proposed system against socio-metric data. In all cases, experimental results indicate that the proposed system finds nodes that satisfy different types of constraints specified in queries correctly.
This paper is structured as follows. Section 2 describes related works and section 3 explains the proposed approach in detail. The structure of the system is described in section 4 and illustrative examples are given in section 5. Section 6 describes experimental results and section 7 concludes the paper.
- GOTTA, M., O’KELLY, P., Trends in Social Software, Educause Center for Applied Research, 1993.
- KLEINBERG, J. M., Authoritative Sources in a Hyperlinked Environment, Journal of the ACM, 1999, Vol. 46, No. 5, pp. 604-632.
- LIU, B., Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Data-Centric Systems and Applications, 2007.
- NAKAKUBO, H., S. NAKAJIMA, K. HATANO, J. MIYAZAKI, S. UEMURA, Web Page Scoring Based on Link Analysis of Web Page Sets, Proceedings of the 18th International Workshop on Database and Expert Systems Applications, 2007, pp. 269-273.
- NHN Crop, http://section.blog.naver.com, 2009.
- Sun Microsystems Inc., http://java.sun.com/docs/books/tutorial/java/javaoo/index.html, 2008.
- WOLFSON, H. J., Geometric Hashing: An Overview, IEEE Computational Science & Engineering, 1997, Vol. 4, pp. 10-21.
- ZHANG, Y., H. ZHU, S. GREENWOOD, Web Site Complexity Metrics for Measuring Navigability, Proceedings of the 4th International Conference on Quality Software, 2004, pp. 172-179.
- Protégé-OWL API, http://protege.stanford.edu/plugins/owl/api, Stanford Center for Biomedical Informatics Research, 2010.
- OWL, http://www.w3.org/2004/OWL, W3C, 2007.
- SCRIPPS, J., P. N. TAN, A. H. ESFAHANIAN, Node Roles and Community Structure in Networks, Proceedings of the 9th WEBKDD and 1st SNA-KDD Workshop ’07, ACM, 2007, pp. 26-35.
- JUNG, H., AHN, J., S. Park, A JXTA-based System for Protein Structure Comparison, The Journal of Korean Association of Computer Education, Vol. 12, No. 4, 2009, pp. 57-64.
- JUNG, H., S. Park, A JXTA-based Music Information Retrieval System, Proceedings of the 5th International Conference on Information Systems Technology and its Applications, Vol. 84, 2006, pp. 107-117.
- ARASU, A., J. NOVAK, A. TOMKINS, J. TOMLIN, PageRank Computation and the Structure of the Web: Experiments and Algorithms, Proceedings of the Eleventh International World Wide Web Conference, Poster Track, 2002, pp. 107-117.
- GONG, L., Project JXTA: A Technology Overview, Sun Microsystems, Inc., 2001.
- WACHS, J. P., Gaze, Posture and Gesture Recognition to Minimize Focus Shifts for Intelligent Operating Rooms in a Collaborative Support System, International Journal of Computers, Communications & Control, Vol. V, Number 1, 2010, pp. 106-124.
- BANCIU, D., e-Romania – A Citizens’ Gateway towards Public Information, Studies In Informatics and Control, Vol. 18, Number 3, 2009, pp. 205-210.