Our work has received several awards, including best paper awards from the IEEE and the ACM societies: the best student paper award at JCDL 2007, the best demo award at ACM Multimedia 2006, the best student paper award at ACM Multimedia 2002, best paper runner-up at ACM Multimedia 2007, and best student paper runner-up at ICASSP 2006. We also received a best paper award on video retrieval from IEEE Trans. on Circuits and Systems for Video Technology, 2000. In 2002, Prof. Sundaram received the Eliahu I. Jury Award for best Ph.D. dissertation.
This paper proposes a generative model to identify roles in Community Question Answering (CQA) communities and shows how differences in role composition influence community health. CQA platforms are useful for people to share knowledge and to form ad-hoc social networks around specific topics. User experience across communities varies: whether a question will be answered, the extent of the delay in responding to a question, etc. While past research shows that participants play different roles in online communities, we investigate a complementary question: how does the distribution of roles differ across communities, and do these differences help explain the differences in user experience? We propose the use of a generative model for inferring action-based roles for users both at the level of an individual browsing session and at the broader community level. Our model is specifically designed to produce descriptions of user behavior roles in the form of interpretable probability distributions over the atomic actions a user may take within a community, while also modeling the composition of those roles inside individual communities to facilitate cross-community analysis. A comprehensive experiment on all 161 non-meta communities on the StackExchange CQA platform reveals three empirical insights. First, we show interesting distinctions within CQA communities in question-asking behavior (where two distinct types of askers can be identified) and answering behavior (where two distinct roles surrounding answers emerge). Second, clustering communities with similar role compositions reveals that these clusters have interesting topical differences as well as statistically significant differences in mean health for different health metrics. Third, we show that each discovered cluster corresponds to a distinct evolution of role composition that suggests that users engaging in discussion on answers will eventually become the dominant role in most communities and remain that way.
In this paper, we interpret the community question answering websites on the StackExchange platform as knowledge markets, and analyze how and why these markets can fail at scale. A knowledge market framing allows site operators to reason about market failures, and to design policies to prevent them. Our goal is to provide insights on large-scale knowledge market failures through an interpretable model. We explore a set of interpretable economic production models on a large empirical dataset to analyze the dynamics of content generation in knowledge markets. Amongst these, the Cobb-Douglas model best explains empirical data and provides an intuitive explanation for content generation through the concepts of elasticity and diminishing returns. Content generation depends on user participation and also on how specific types of content (e.g. answers) depend on other types (e.g. questions). We show that these factors of content generation have constant elasticity: a percentage increase in any of the inputs leads to a constant percentage increase in the output. Furthermore, markets exhibit diminishing returns: the marginal output decreases as the input is incrementally increased. Knowledge markets also vary in their returns to scale, that is, the increase in output resulting from a proportionate increase in all inputs. Importantly, many knowledge markets exhibit diseconomies of scale: measures of market health (e.g., the percentage of questions with an accepted answer) decrease as a function of the number of participants. The implications of our work are two-fold: site operators ought to design incentives as a function of system size (number of participants); the market lens should shed insight into complex dependencies amongst different content types and participant actions in general social networks.
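The three properties named in the abstract above (constant elasticity, diminishing returns, and returns to scale) can be checked numerically on a toy Cobb-Douglas function. The sketch below is illustrative only: the function name `cobb_douglas`, the two inputs (answerers and questions), and the exponent values are assumptions chosen for the example, not parameters fitted in the paper.

```python
import math

def cobb_douglas(scale, inputs, exponents):
    """Cobb-Douglas output: scale * product of (input ** exponent)."""
    output = scale
    for x, a in zip(inputs, exponents):
        output *= x ** a
    return output

# Hypothetical market: answers produced from (answerers, questions).
# All numbers here are illustrative assumptions, not values from the paper.
SCALE = 2.0
EXPONENTS = (0.6, 0.3)
base_inputs = (1000.0, 400.0)
base_output = cobb_douglas(SCALE, base_inputs, EXPONENTS)

# Constant elasticity: a 1% rise in the first input raises output by
# about EXPONENTS[0] percent, whatever the starting level.
bumped = cobb_douglas(SCALE, (base_inputs[0] * 1.01, base_inputs[1]), EXPONENTS)
elasticity = math.log(bumped / base_output) / math.log(1.01)

def marginal_output(first_input):
    """Extra output from one more unit of the first input."""
    more = cobb_douglas(SCALE, (first_input + 1, base_inputs[1]), EXPONENTS)
    less = cobb_douglas(SCALE, (first_input, base_inputs[1]), EXPONENTS)
    return more - less

# Returns to scale: doubling every input multiplies output by
# 2 ** sum(EXPONENTS); a sum below 1 means output less than doubles.
doubled = cobb_douglas(SCALE, tuple(2 * x for x in base_inputs), EXPONENTS)
scale_factor = doubled / base_output
```

With exponents between 0 and 1, `marginal_output` shrinks as the input grows (diminishing returns), and because the exponents here sum to 0.9, `scale_factor` stays below 2 (decreasing returns to scale).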
This paper addresses the question of identifying a concept dependency graph for a MOOC through unsupervised analysis of lecture transcripts. The problem is important: extracting a concept graph is the first step in helping students with varying preparation to understand course material. The problem is challenging: instructors are unaware of the student preparation diversity and may be unable to identify the right resolution of the concepts, necessitating costly updates; inferring concepts from groups suffers from polysemy; the temporal order of concepts depends on the concepts in question. We propose innovative unsupervised methods to discover a directed concept dependency within and between lectures. Our main technical innovation lies in exploiting the temporal ordering amongst concepts to discover the graph. We propose two measures—the Bridge Ensemble Measure and the Global Direction Measure—to infer the existence and the direction of the dependency relations between concepts. The bridge ensemble measure identifies concept overlap between lectures, determines concept co-occurrence within short windows, and the lecture where concepts occur first. The global direction measure incorporates time directly by analyzing the concept time ordering both globally and within lectures. Experiments over real-world MOOC data show that our method outperforms the baseline in both AUC and precision/recall curves.
The rise of the "big data" era has created a pressing demand for educating many data scientists and engineers quickly at low cost. It is essential they learn by working on assignments that involve real world data sets to develop the skills needed to be successful in the workplace. However, enabling instructors to flexibly deliver all kinds of data science assignments using real world data sets to large numbers of learners (both on-campus and off-campus) at low cost is a significant open challenge. To address this emerging challenge generally, we develop and deploy a novel Cloud-based Lab for Data Science (CLaDS) to enable many learners around the world to work on real-world data science problems without having to move or otherwise distribute prohibitively large data sets. Leveraging version control and continuous integration, CLaDS provides a general infrastructure to enable any instructor to conveniently deliver any hands-on data science assignment that uses large real world data sets to as many learners as our cloud-computing infrastructure allows at very low cost. In this paper, we present the design and implementation of CLaDS and discuss our experience with using CLaDS to deploy seven major text data assignments for students in both an on-campus course and an online course to work on for learning about text data retrieval and mining techniques; this shows that CLaDS is a very promising novel general infrastructure for efficiently delivering a wide range of hands-on data science assignments to a large number of learners at very low cost.
Imagine a movie-trailer voice intoning, "In a world where AI has learned to partner with humans to peacefully advance society..." And then forget it! That movie, you see, is never getting made—no explosions, no antagonism, no killer robots. However, outside Hollywood, the co-adaptive development of humans and artificial intelligence (AI) may be worth a bit more consideration. There is little doubt at this point that the growth and maturation of AI will be a major influence on our economy and society overall. Significant work is underway both on advancing AI and on combining human and artificial intelligence to improve the functionality and user experience of AI-based methods, tools, and services. Advanced AI is successfully reshaping many transactional contexts such as image search and purchase recommendations, as well as contexts that involve repetitive activity, such as manufacturing. However, AI is progressing much more slowly in contexts that involve rich experiences aimed at advancing human intelligence and the overall human condition—for example, in education. A potentially unintended consequence of this is increased emphasis on the lower-hanging fruit of transactional and repetitive contexts, and less emphasis on the more complex human-development contexts that are critical for a healthy society. This article proposes a design approach for tackling the integration of AI into human-development contexts while promoting the development of new forms of cyber-human intelligence.
This paper proposes an approach to learn robust behavior representations in online platforms by addressing the challenges of user behavior skew and sparse participation. Latent behavior models are important in a wide variety of applications: recommender systems; prediction; user profiling; community characterization. Our framework is the first to jointly address skew and sparsity across graphical behavior models. We propose a generalizable Bayesian approach to partition users in the presence of skew while simultaneously learning latent behavior profiles over these partitions to address user-level sparsity. Our behavior profiles incorporate the temporal activity and links between participants, although the proposed framework is flexible enough to accommodate other definitions of participant behavior. Our approach explicitly discounts frequent behaviors and learns variable size partitions capturing diverse behavior trends. The partitioning approach is data-driven with no rigid assumptions, adapting to varying degrees of skew and sparsity. A qualitative analysis indicates our ability to discover niche and informative user groups on large online platforms. Results on User Characterization (+6-22% AUC), Content Recommendation (+6-43% AUC) and Future Activity Prediction (+12-25% RMSE) indicate significant gains over state-of-the-art baselines. Furthermore, user cluster quality is validated with magnified gains in the characterization of users with sparse activity.
In recent times, deep neural networks have found success in Collaborative Filtering (CF) based recommendation tasks. By parametrizing latent factor interactions of users and items with neural architectures, they achieve significant gains in scalability and performance over matrix factorization. However, the long-tail phenomenon in recommender performance persists on the massive inventories of online media or retail platforms. Given the diversity of neural architectures and applications, there is a need to develop a generalizable and principled strategy to enhance long-tail item coverage. In this paper, we propose a novel adversarial training strategy to enhance long-tail recommendations for users with Neural CF (NCF) models. The adversary network learns the implicit association structure of entities in the feedback data while the NCF model is simultaneously trained to reproduce these associations and avoid the adversarial penalty, resulting in enhanced long-tail performance. Experimental results show that even without auxiliary data, adversarial training can boost long-tail recall of state-of-the-art NCF models by up to 25%, without trading off overall performance. We evaluate our approach on two diverse platforms: content tag recommendation in Q&A forums and movie recommendation.
Modern social platforms are characterized by the presence of rich user-behavior data associated with the publication, sharing and consumption of textual content. Users interact with content and with each other in a complex and dynamic social environment while simultaneously evolving over time. In order to effectively characterize users and predict their future behavior in such a setting, it is necessary to overcome several challenges. Content heterogeneity and temporal inconsistency of behavior data result in severe sparsity at the user level. In this paper, we propose a novel mutual-enhancement framework to simultaneously partition and learn latent activity profiles of users. We propose a flexible user partitioning approach to effectively discover rare behaviors and tackle user-level sparsity. We extensively evaluate the proposed framework on massive datasets from real-world platforms including Q&A networks and interactive online courses (MOOCs). Our results indicate significant gains over state-of-the-art behavior models (15% on average) in a varied range of tasks, and our gains are further magnified for users with limited interaction data. The proposed algorithms are amenable to parallelization, scale linearly in the size of datasets, and provide flexibility to model diverse facets of user behavior.
We propose a resource-constrained network growth model that explains the emergence of key structural properties of real-world directed networks: heavy-tailed in-degree distribution, high local clustering and degree-clustering relationship. In real-world networks, individuals form edges under constraints of limited network access and partial information. However, well-known growth models that preserve multiple structural properties do not incorporate these resource constraints. Conversely, existing resource-constrained models do not jointly preserve multiple structural properties of real-world networks. We propose a random walk growth model that explains how real-world network properties can jointly arise from edge formation under resource constraints. In our model, each node that joins the network selects a seed node from which it initiates a random walk. At each step of the walk, the new node either jumps back to the seed node or chooses an outgoing or incoming edge to visit another node. It links to each visited node with some probability and stops after forming a few edges. Our experimental results against four well-known growth models indicate improvement in accurately preserving structural properties of five citation networks. Our model also preserves two structural properties that most growth models cannot: skewed local clustering distribution and bivariate in-degree-clustering relationship.
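The growth process described in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `grow_network`, every parameter name and default value, the rule that a new node always links to its seed, and the step cap are all hypothetical choices made so the example is self-contained.

```python
import random

def grow_network(num_nodes, max_links=3, p_link=0.5, p_jump=0.2,
                 p_out=0.7, max_steps=200, rng_seed=0):
    """Grow a directed network one node at a time via seeded random walks.

    Returns (out_edges, in_edges): dicts mapping node -> set of nodes.
    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(rng_seed)
    out_edges = {0: set()}
    in_edges = {0: set()}
    for new in range(1, num_nodes):
        seed_node = rng.randrange(new)   # pick a seed among existing nodes
        linked = {seed_node}             # assumption: always link to the seed
        current = seed_node
        steps = 0
        # Walk until a few edges are formed or the step cap is reached.
        while len(linked) < max_links and steps < max_steps:
            steps += 1
            if rng.random() < p_jump:
                current = seed_node      # jump back to the seed
                continue
            # Follow an outgoing edge with probability p_out, else incoming;
            # fall back to the other direction if that side is empty.
            if rng.random() < p_out:
                neighbors = out_edges[current] or in_edges[current]
            else:
                neighbors = in_edges[current] or out_edges[current]
            if not neighbors:
                current = seed_node
                continue
            current = rng.choice(sorted(neighbors))
            # Link to the visited node with probability p_link.
            if current not in linked and rng.random() < p_link:
                linked.add(current)
        out_edges[new] = set(linked)
        in_edges[new] = set()
        for target in linked:
            in_edges[target].add(new)
    return out_edges, in_edges
```

Calling `grow_network(200)` yields a 200-node directed graph in which each new node only ever learns about nodes reachable from its seed, which is the limited-access, partial-information constraint the abstract refers to.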
In this paper, we model the community question answering (CQA) websites on the Stack Exchange platform as knowledge markets, and analyze how and why these markets can fail at scale. Analyzing CQA websites as markets allows site operators to reason about the failures in knowledge markets, and design policies to prevent these failures. Our main contribution is to provide insight on knowledge market failures. We explore a set of interpretable economic production models to capture content generation dynamics in knowledge markets. The best performing of these, well known in the economic literature as the Cobb-Douglas equation, provides an intuitive explanation for content generation in the knowledge markets. Specifically, it shows that (1) factors of content generation such as user participation and content dependency have constant elasticity (a percentage increase in any of the inputs leads to a constant percentage increase in the output), (2) in many markets, factors exhibit diminishing returns (the marginal output decreases as the input is incrementally increased), (3) markets vary according to their returns to scale (the increase in output resulting from a proportionate increase in all inputs), and finally (4) many markets exhibit diseconomies of scale (measures of market health decrease as a function of overall system size, i.e., the number of participants).
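For reference, the first three properties listed above follow directly from the general form of a Cobb-Douglas function. The notation below ($Y$ for output, $X_i$ for inputs, $A$ and $\alpha_i$ for parameters) is the standard textbook form and is an assumption here; the abstract does not state the paper's own symbols or the specific inputs used.

```latex
% General Cobb-Douglas form with inputs X_1, ..., X_n (symbols assumed):
Y = A \prod_{i=1}^{n} X_i^{\alpha_i}, \qquad A > 0 .

% (1) Constant elasticity: the log-derivative with respect to each input
%     is the constant exponent.
\frac{\partial \ln Y}{\partial \ln X_i} = \alpha_i .

% (2) Diminishing returns: the marginal output falls as X_i grows
%     whenever 0 < \alpha_i < 1.
\frac{\partial Y}{\partial X_i} = \alpha_i \frac{Y}{X_i}, \qquad
\frac{\partial^2 Y}{\partial X_i^2} = \alpha_i (\alpha_i - 1) \frac{Y}{X_i^2} < 0 .

% (3) Returns to scale: scaling every input by c > 0 scales output by
%     c raised to the sum of the exponents.
Y(cX_1, \dots, cX_n) = c^{\sum_i \alpha_i} \, Y(X_1, \dots, X_n) .
```

Under this form, the sum of the exponents determines whether a market shows increasing (sum above 1), constant (sum equal to 1), or decreasing (sum below 1) returns to scale.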
We propose a probabilistic packet reception model for Bluetooth Low Energy (BLE) packets in indoor spaces and we validate the model by using it for indoor localization. We expect indoor localization to play an important role in indoor public spaces in the future. We model the probability of reception of a packet as a generalized quadratic function of distance, beacon power and advertising frequency. Then, we use a Bayesian formulation to determine the coefficients of the packet loss model using empirical observations from our testbed. We develop a new sequential Monte-Carlo algorithm that uses our packet count model. The algorithm is general enough to accommodate different spatial configurations. In our indoor localization experiments, our approach has an average error of about 1.2 m, 53% lower than the baseline range-free Monte-Carlo localization algorithm.
This paper introduces new techniques for sampling attributed networks to support standard data mining tasks. The problem is important for two reasons. First, it is commonplace to perform data mining tasks such as clustering and classification of network attributes (attributes of the nodes, including social media posts) on sampled graphs, since real-world networks can be very large. Second, the early work on network samplers (e.g. ForestFire, Re-weighted Random Walk, Metropolis-Hastings Random Walk) focused on preserving structural properties of the network (e.g. degree distribution, diameter) in the sample. However, it is unclear if these data agnostic samplers tuned to preserve network structural properties would preserve salient characteristics of network content; preserving salient data characteristics is critical for clustering and classification tasks. There are three contributions of this paper. First, we introduce several data aware samplers based on information-theoretic principles. Second, we carefully compare data aware samplers with state of the art data agnostic samplers (which use only network structure to sample) for three different data mining tasks: data characterization, clustering and classification. Finally, our experimental results over large real-world datasets and synthetic benchmarks suggest a surprising result: there is no single sampler that is consistently the best across all tasks. We show that data aware samplers perform significantly better (p < 0.05) than data agnostic samplers on data coverage, clustering, and classification tasks.
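As background for the data agnostic samplers named above, the following is a minimal sketch of a Metropolis-Hastings Random Walk on an undirected graph. It illustrates only the standard degree-correcting acceptance rule; it is not one of the paper's data aware samplers, and the function name `mhrw_sample` and the toy graph `STAR` are hypothetical.

```python
import random

def mhrw_sample(adjacency, start, num_steps, rng_seed=0):
    """Metropolis-Hastings random walk over an undirected graph.

    adjacency maps each node to a non-empty list of neighbors.
    Returns the list of nodes occupied at each step (start included).
    """
    rng = random.Random(rng_seed)
    current = start
    visited = [current]
    for _ in range(num_steps):
        candidate = rng.choice(adjacency[current])
        # Accept with probability min(1, deg(current) / deg(candidate)).
        # This offsets a plain random walk's bias toward high-degree nodes,
        # so the walk's stationary distribution is uniform over nodes.
        ratio = len(adjacency[current]) / len(adjacency[candidate])
        if rng.random() < min(1.0, ratio):
            current = candidate
        visited.append(current)
    return visited

# Toy star graph (illustrative): hub 0 connected to four leaves.
STAR = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
walk = mhrw_sample(STAR, start=0, num_steps=20000)
hub_fraction = walk.count(0) / len(walk)
```

A plain random walk on `STAR` would spend half its time at the hub; with the acceptance rule, `hub_fraction` settles near one fifth, the uniform share for a five-node graph. The sampler looks only at degrees, which is why it is "data agnostic": node attributes never influence which nodes are kept.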
In traditional public good experiments participants receive an endowment from the experimenter that can be invested in a public good or kept in a private account. In this paper we present an experimental environment where participants can invest time during five days to contribute to a public good. Participants can make contributions to a linear public good by logging into a web application and performing virtual actions. We compared four treatments, with different group sizes and different information about the (relative) performance of other groups. We find that information feedback about performance of other groups has a small positive effect if we control for various attributes of the groups. Moreover, we find a significant effect of the contributions of others in the group in the previous day on the number of points earned in the current day. Our results confirm that people participate more when participants in their group participate more, and are influenced by information about the relative performance of other groups.
We identify influential early adopters in a social network, where individuals are resource constrained, to maximize the spread of multiple, costly behaviors. A solution to this problem is especially important for viral marketing. The problem of maximizing influence in a social network is challenging since it is computationally intractable. We make three contributions. First, we propose a new model of collective behavior that incorporates individual intent, knowledge of neighbors' actions and resource constraints. Second, we show that the multiple-behavior influence maximization problem is NP-hard. Furthermore, we show that the problem is submodular, implying the existence of a greedy solution that approximates the optimal solution to within a constant. However, since the greedy algorithm is expensive for large networks, we propose efficient heuristics to identify the influential individuals, including heuristics to assign behaviors to the different early adopters. We test our approach on synthetic and real-world topologies and evaluate its effectiveness under three metrics: unique number of participants, total number of active behaviors and network resource utilization. Our heuristics produce a 15-51% increase in expected resource utilization over the naïve approach.
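The greedy strategy that submodularity makes possible can be sketched generically. The example below is an assumption-laden illustration: it uses a simple set-coverage objective as a stand-in for expected influence spread (coverage is a textbook monotone submodular function), and the names `greedy_select`, `coverage`, and `REACH` are hypothetical. It is not the paper's behavior model or its heuristics.

```python
def greedy_select(candidates, objective, budget):
    """Pick up to `budget` candidates, each time adding the one with the
    largest marginal gain in `objective` (a function of a list of picks)."""
    selected = []
    for _ in range(budget):
        best, best_gain = None, 0
        current_value = objective(selected)
        for candidate in candidates:
            if candidate in selected:
                continue
            gain = objective(selected + [candidate]) - current_value
            if gain > best_gain:
                best, best_gain = candidate, gain
        if best is None:      # no candidate adds anything further
            break
        selected.append(best)
    return selected

# Stand-in objective (assumption): each seed individual "reaches" a fixed
# set of people, and the value of a seed set is how many people it covers.
REACH = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 2},
}

def coverage(seeds):
    covered = set()
    for seed in seeds:
        covered |= REACH[seed]
    return len(covered)

chosen = greedy_select(sorted(REACH), coverage, budget=2)
# chosen is ["a", "c"], covering all six people
```

For monotone submodular objectives under a cardinality budget, this greedy rule is classically known to achieve at least a (1 - 1/e) fraction of the optimum; the abstract does not state which constant applies to its setting. Each round re-evaluates the objective for every remaining candidate, which is the cost that motivates cheaper heuristics on large networks.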
This paper presents Lamina, a system for providing security and privacy to users in a public IoT space. Public IoT spaces, such as an IoT-enabled retail store, have the potential to provide rich, targeted information and services to users within the environment. However, to fully realize such potential, users must be willing to share certain data regarding their habits and preferences with the public IoT spaces. To encourage users to share such information, we present Lamina, a system that ensures the user's data will not be leaked to third parties. Lamina uses CryptoCoP-based encryption and a unique MAC address rotation mechanism to ensure that a user's privacy is maintained and their data is protected while still allowing the public IoT space to collect sufficient information to effectively provide targeted services.
We study the problem of organizing a collection of objects (images, videos) into clusters, using crowdsourcing. This problem is notoriously hard for computers to do automatically, and even with crowd workers, is challenging to orchestrate: (a) workers may cluster based on different latent hierarchies or perspectives; (b) workers may cluster at different granularities even when clustering using the same perspective; and (c) workers may only see a small portion of the objects when deciding how to cluster them (and therefore have limited understanding of the “big picture”). We develop cost-efficient, accurate algorithms for identifying the consensus organization (i.e., the organizing perspective most workers prefer to employ), and incorporate these algorithms into a cost-effective workflow for organizing a collection of objects, termed ORCHESTRA. We compare our algorithms with other algorithms for clustering, on a variety of real-world datasets, and demonstrate that ORCHESTRA organizes items better and at significantly lower costs.
We have struck a Faustian bargain with major corporations: free information services in exchange for our web surfing behavioral data. Unfortunately, we have little control over not only what data is gathered about us, but also how long the data is stored and how it is used. Indeed, with the web, it is hard, if not impossible, to be forgotten. As users interact with an Internet of Things (IoT) ecosystem, they leave behind traces of information about their presence, preferences and behavior. While the ecosystem can track individuals' movements to provide enhanced recommendations, individuals, as with entities that track their web behavior, have little control over how this information is being used or distributed. Must the bargain between individuals and entities interested in tracking them in IoT environments be asymmetric?
In response, we present Incognito, a secure and privacy preserving IoT framework where user information exposure is driven by the concept of identity. In particular, we advocate user-managed identities, leaving the control of the choice of identity in a given context, as well as the level of exposure, in the hands of the user. Using Incognito, users can create identities that work only within certain contexts and are meaningless outside of these contexts. Furthermore, Incognito allows for simple management of information exposure through contextual-policies for sharing as well as querying of an IoT ecosystem. By giving individuals full control over the information traces that they leave behind in an IoT infrastructure, Incognito, in essence, puts individuals on equal footing with the entities that want to track their behavioral data. Incognito fosters a symbiotic relationship; users will need to expose information in exchange for personalized recommendations and IoT organizations who provide sophisticated user experiences will see enhanced user engagement.
Real-world networks are often complex and large with millions of nodes, posing a great challenge for analysts to quickly see the big picture for more productive subsequent analysis. We aim at facilitating exploration of node-attributed networks by creating representations with conciseness, expressiveness, interpretability, and multi-resolution views. We develop such a representation as a map, among the first to explore principled network cartography for general networks. In parallel with common maps, ours has landmarks, which aggregate nodes homogeneous in their traits and interactions with nodes elsewhere, and roads, which represent the interactions between the landmarks. We capture such homogeneity by the similar roles the nodes play. Next, to concretely model the landmarks, we propose a probabilistic generative model of networks with roles as latent factors. Furthermore, to enable interactive zooming, we formulate novel model-based constrained optimization. Then, we design efficient linear-time algorithms for the optimizations. Experiments using real-world and synthetic networks show that our method produces more expressive maps than existing methods, with up to 10 times improvement in network reconstruction quality. We also show that our method extracts landmarks with more homogeneous nodes, with up to 90% improvement in the average attribute/link entropy among the nodes over each landmark. Sense-making of a real-world network using a map computed by our method qualitatively verifies the effectiveness of our method.
As users interact with an Internet of Things (IoT) ecosystem, they leave behind traces of information about their presence, preferences and behavior. While the ecosystem can track individuals' movements to provide enhanced recommendations, individuals have little control over how this information is being used or distributed. Such tracking has led to increasing privacy concerns over the use of IoT. While it is possible to develop systems to enable anonymous interaction with IoT, anonymity results in limited benefits to both individuals and IoT ecosystems. In response, we present Incognito, a secure and privacy preserving IoT framework where user information exposure is driven by the concept of identity. In particular, we advocate user-managed identities, leaving the control of the choice of identity in a given context, as well as the level of exposure, in the hands of the user. Using Incognito, users can create identities that work only within certain contexts and are meaningless outside of these contexts. Furthermore, Incognito allows for simple management of information exposure through contextual-policies for sharing as well as querying of an IoT ecosystem. By giving individuals full control over the information traces that they leave behind in an IoT infrastructure, Incognito, in essence, puts individuals on equal footing with the entities that want to track their behavioral data. Incognito fosters a symbiotic relationship; users will need to expose information in exchange for personalized recommendations and IoT organizations who provide sophisticated user experiences will see enhanced user engagement.
The rise of social media provides a great opportunity for people to reach out to their social connections to satisfy their information needs. However, generic social media platforms are not explicitly designed to assist information seeking of users. In this paper, we propose a novel framework to identify the social connections of a user able to satisfy his information needs. The information need of a social media user is subjective and personal, and we investigate the utility of his social context to identify people able to satisfy it. We present questions users post on Twitter as instances of information seeking activities in social media. We infer soft community memberships of the asker and his social connections by integrating network and content information. Drawing concepts from the social foci theory, we identify answerers who share communities with the asker w.r.t. the question. Our experiments demonstrate that the framework is effective in identifying answerers to social media questions.
This paper discusses the role of computing in engendering cooperation in social dilemmas such as sustainability and public health. These cooperative dilemmas exist at a large scale, within heterogeneous populations. Motivated by analysis of cooperation from empirical field studies, we argue that an integrative computational framework that analyzes social signals and verifies behaviors through smartphone sensors can shape and mold individual decisions to cooperate. We discuss four interconnected technical challenges and example solutions. The challenges include community discovery algorithms for construction of small homogeneous groups, persuasion of individuals in resource constrained networks, activity monitoring in the wild and detection of large scale social coordination. We briefly discuss new applications that arise from a computational infrastructure for cooperation, including fighting childhood obesity, cybersecurity and improving public safety.
This article presents a personalized narrative on the early discussions within the Multimedia community and the subsequent research on experiential media systems. I discuss two different research initiatives—design of real-time, immersive multimedia feedback environments for stroke rehabilitation; exploratory environments for events that exploited the user's ability to make connections. I discuss the issue of foundations: the question of multisensory integration and superadditivity; the need for identification of “first-class” Multimedia problems; expanding the scope of Multimedia research.
Personal digital photo libraries embody a large amount of information useful for research into photo organization, photo layout, and development of novel photo browser features. Even when anonymity can be ensured, amassing a sizable dataset from these libraries is still difficult due to the visibility and cost that would be required from such a study. We explore using the Mac App Store to reach more users to collect data from such personal digital photo libraries. More specifically, we compare and discuss how it differs from common data collection methods, e.g. Amazon Mechanical Turk, in terms of time, cost, quantity, and design of the data collection application. We have collected a large, openly available photo feature dataset in this manner. We illustrate the types of data that can be collected. In 60 days, we collected data from 20,778 photo sets (473,772 photos). Our study with the Mac App Store suggests that popular application distribution channels are a viable means for researchers to acquire massive data collections.
Taskville is an interactive visualization that aims to increase awareness of tasks that occur in the workplace. It utilizes gameplay elements and playful interaction to motivate continued use. A preliminary study with 37 participants shows that Taskville succeeds at being a fun and enjoyable experience while also increasing awareness. A strong correlation was also found between two major study groups demonstrating its potential to increase awareness and stimulate task-based activity across work groups.
This paper reviews the state of the art and some emerging issues in research areas related to pattern analysis and monitoring of web-based social communities. This research area is important for several reasons. The presence of near ubiquitous low-cost computing and communication technologies has enabled people to access and share information at unprecedented scale, which necessitates new research for making sense of such content. Furthermore, popular websites with sophisticated media sharing and notification features allow users to stay in touch with friends and loved ones, and also to help form explicit and implicit groups. These social structures are an important source of information for better organizing and managing multimedia. In this article, we study how media-rich social networks provide additional insight into familiar multimedia research problems, including tagging and video ranking. In particular, we advance the idea that the contextual and social aspects of media semantics are as important for successful multimedia applications as the media content itself. We examine the inter-relationship between content and social context through the prism of three key questions. First, how do we extract the context in which social interactions occur? Second, does social interaction provide value to the media object? And finally, how does social media facilitate the repurposing of shared content, and engender cultural memes? We present three case studies to examine these questions in detail. In the first case study, we show how to discover structure latent in the data, and use the structure to organize Flickr photo streams. In the second case study, we discuss how to determine the interestingness of conversations, and of participants, around videos uploaded to YouTube. Finally, we show how analysis of visual content (tracing content remixes, in particular) can help us understand the relationship amongst YouTube participants. For each case, we present an overview of recent work and review the state of the art. We also discuss two emerging issues related to the analysis of social networks: robust data sampling and scalable data analysis.
Presentation support tools, such as Microsoft PowerPoint, pose challenges both in terms of creating linear presentations from complex data and fluidly navigating such linear structures when presenting to diverse audiences. NextSlidePlease is a slideware application that addresses these challenges using a directed graph structure approach for authoring and delivering multimedia presentations. The application combines novel approaches for searching and analyzing presentation datasets, composing meaningfully structured presentations and efficiently delivering material under a variety of time constraints. We introduce and evaluate a presentation analysis algorithm intended to simplify the process of authoring dynamic presentations, and a time management and path selection algorithm that assists users in prioritizing content during the presentation process. Results from two comparative user studies indicate that the directed graph approach promotes the creation of hyperlinks, the consideration of connections between content items and a richer understanding of the time management consequences of including and selecting presentation material.
This paper focuses on detecting social, physical-world events from photos posted on social media sites. The problem is important: cheap media capture devices have significantly increased the number of photos shared on these sites. The main contribution of this paper is to incorporate online social interaction features in the detection of physical events. We believe that online social interactions reflect important signals among the participants about the “social affinity” of two photos, thereby helping event detection. We compute social affinity via a random walk on a social interaction graph to determine the similarity between two photos on the graph. We train a support vector machine classifier to combine the social affinity between photos with photo-centric metadata including time, location, tags and description. Incremental clustering is then used to group photos into event clusters. We evaluate on two large-scale real-world datasets, Upcoming and MediaEval, and show an improvement of 0.06–0.10 in F1 on these datasets.
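To illustrate the random-walk affinity mentioned in the abstract above, here is a minimal sketch of a random walk with restart on a toy interaction graph. The graph layout (photos linked to the users who interacted with them), the restart probability, and the function name `random_walk_affinity` are all assumptions made for illustration; the paper's actual graph construction and walk parameters are not given in the abstract.

```python
def random_walk_affinity(graph, source, restart=0.15, iterations=50):
    """Approximate random-walk-with-restart scores from `source` to every node.

    `graph` maps each node to a list of neighbours (hypothetical toy format).
    """
    scores = {node: 0.0 for node in graph}
    scores[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            neighbours = graph[node]
            if not neighbours:
                # Dangling node: send all of its mass back to the source.
                nxt[source] += mass
                continue
            # With probability `restart` the walker jumps back to the source...
            nxt[source] += restart * mass
            # ...otherwise it moves to a uniformly chosen neighbour.
            share = (1.0 - restart) * mass / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        scores = nxt
    return scores


# Toy graph (assumed): photos are linked to users who commented on them.
graph = {
    "photo_a": ["alice", "bob"],
    "photo_b": ["alice", "bob"],
    "photo_c": ["carol"],
    "alice": ["photo_a", "photo_b"],
    "bob": ["photo_a", "photo_b"],
    "carol": ["photo_c"],
}
scores = random_walk_affinity(graph, "photo_a")
print(scores["photo_b"], scores["photo_c"])
```

In this toy graph, photo_b shares commenters with photo_a and so receives a positive score, while photo_c is disconnected from photo_a and stays at zero. In the pipeline the abstract describes, a score of this kind would be one input feature to the classifier alongside time, location, tags and description.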
A photo stream is a chronological sequence of photos. Most existing photo stream segmentation methods assume that a photo stream comprises photos from multiple events, and their goal is to produce groups of photos, each corresponding to an event, i.e. they perform automatic albuming. Even if these photos are grouped by event, sifting through the abundance of photos in each event is cumbersome. To help make photos of each event more manageable, we propose a photo stream segmentation method for an event photo stream (the chronological sequence of photos of a single event) to produce groups of photos, each corresponding to a photo-worthy moment in the event. Our method is based on a hidden Markov model with parameters learned from time, EXIF metadata, and visual information from 1) training data of unlabelled, unsegmented event photo streams and 2) the event photo stream we want to segment. In an experiment with over 5000 photos from 28 personal photo sets, our method outperformed all six baselines with statistical significance (p<0.10 with the best baseline and p<0.005 with the others).
This paper explores photo organization within an event photo stream, i.e. the chronological sequence of photos from a single event. In our previous work, we proposed a method to segment an event photo stream to produce groups of photos, each corresponding to a photo-worthy moment in the event. Building upon this work, we have developed a photo browser that uses our method to automatically group photos from a single event into smaller groups of photos we call chapters. The photo browser also provides users with a drag-and-drop interface to refine the chapter groupings. With the photo browser, we conducted an exploratory study of 23 college students with their 8096 personal photos from 92 events. In this paper, we report novel insights on how the subjects organized photos in each event into smaller groups and contrast our observations with existing literature on photo organization. We also explore how chapter-based photo organization affects photo-related tasks such as storytelling, searching and interpretation, through key aspects of the photo layouts. We found that subjects value the chronological order of the chapters more than maximizing screen space usage and that they value chapter consistency more than the chronological order of the photos. For automatic chapter groupings, having low chapter boundary misses is more important than having low chapter boundary false alarms; the choice of chapter criteria and granularity for chapter groupings is very subjective; and subjects found that chapter-based photo organization helps in all three tasks of the user study.
This is a position paper on the role of content analysis in media-rich online communities. We highlight changes in the multimedia generation and consumption process that have occurred over the past decade, and identify the new angles these have brought to multimedia analysis research. We first examine the content production, dissemination and consumption patterns in the recent social media studies literature. We derive an updated conceptual summary of the media lifecycle. We present an updated list of impact criteria and challenge areas for multimedia content analysis. Among the three criteria, two are existing criteria with new problems and solutions, and one is new as a result of the community-driven content lifecycle. We present three case studies that address the impact criteria, and conclude with an outlook for emerging problems. This work uses the general methodology of a previous research column, while the observations and conclusions are new.
In the 21st century knowledge economy there is a growing need for the types of creative thinkers who can bridge the engineering mindset with the creative mindset, combining multiple types of skills. New economies will need workers who have "diagonal" skill sets, who can develop systems and content as an integrative process. This requires a new type of training and curriculum. In the newly formed "Digital Culture" undergraduate program at ASU, we attempt to support new types of curricula by structuring differently the way students move through courses. With a constantly shifting and changing curriculum, structuring course enrollment using class prerequisites leads to fixed and rigid pathways through the curriculum. Instead, Digital Culture structures course sequences based on the students' accumulation of abstract "Proficiencies", which are collected by students as they complete courses and which act as keys to unlock access to higher-level courses. As a student accumulates more and more of these proficiencies, they are increasingly able to unlock new courses. This system leads to more flexible and adaptive pathways through courses while ensuring that students are prepared for entrance into more advanced classes. It is, however, more complicated and requires that students strategically plan their route through the curriculum. In order to support this kind of strategic planning we have designed and deployed a course planning system where students can simulate various possible paths through the curriculum. In this paper, we show our design process in coming up with our "Digital Culture Visual Planner". This design process starts with a network analysis of how all the Digital Culture courses are interrelated, by visualizing the relationships between proficiencies and courses. A number of possible design directions result from this analysis. Finally, we select a single design and refine it to be understandable, useful and usable by new undergraduate Digital Culture majors.
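The proficiency mechanism described in the abstract above can be sketched in a few lines. The catalog below, including the course names, the proficiency names, and the `requires`/`grants` fields, is entirely hypothetical and only illustrates the idea that completed courses grant proficiencies which in turn unlock other courses; it is not the data model of the Digital Culture Visual Planner.

```python
def unlocked_courses(catalog, completed):
    """Return the courses a student can now take, given completed courses."""
    # Collect every proficiency granted by the completed courses.
    earned = set()
    for course in completed:
        earned |= catalog[course]["grants"]
    # A course is unlocked when all required proficiencies have been earned.
    return sorted(
        name
        for name, info in catalog.items()
        if name not in completed and info["requires"] <= earned
    )


# Hypothetical catalog: names and proficiencies are made up for illustration.
catalog = {
    "Intro Studio": {"requires": set(), "grants": {"media basics"}},
    "Interactive Systems": {
        "requires": {"media basics"},
        "grants": {"programming"},
    },
    "Capstone": {
        "requires": {"media basics", "programming"},
        "grants": {"integration"},
    },
}

print(unlocked_courses(catalog, []))
print(unlocked_courses(catalog, ["Intro Studio"]))
print(unlocked_courses(catalog, ["Intro Studio", "Interactive Systems"]))
```

Because unlocking depends on accumulated proficiencies and not on specific prerequisite courses, a new course that grants "programming" could be added to the catalog and would open the same downstream courses without any other entry changing.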
In this paper, we present Wind Runners, which is a game designed for children with asthma. The goal of Wind Runners is to increase the likelihood of asthmatic children adhering to the NIH’s recommendation of measuring their peak expiratory flow (PEF) on a daily basis. We aim to accomplish this by incorporating both social gaming features and the actual medical regimen of measuring PEF into a mobile game.
Social network systems are significant scaffolds for political, economic and socio-cultural change. This is in part due to the widespread availability of sophisticated network technologies and the concurrent emergence of rich media websites. Social network sites provide new opportunities for social-technological research. Since we can inexpensively collect electronic records of social data over extended periods, spanning diverse populations, it is now possible to study social processes on a scale of tens of millions of individuals. To understand the large-scale dynamics of interpersonal interaction and its outcome, this article links the perspectives in the humanities for analysis of social networks to recent developments in data-intensive computational approaches. With special emphasis on social communities mediated by network technologies, we review the historical research arc of community analysis, as well as methods applicable to community discovery in social media.
In this article, we present a novel algorithm to discover multi-relational structures from social media streams. A media item such as a photograph exists as part of a meaningful inter-relationship amongst several attributes, including time, visual content, users, and actions. Discovery of such relational structures enables us to understand the semantics of human activity and has applications in content organization, recommendation algorithms, and exploratory social network analysis. We propose a novel non-negative matrix factorization framework to characterize relational structures of group photo streams. The factorization incorporates image content features and contextual information. The idea is to consider a cluster as having similar relational patterns; each cluster consists of photos relating to similar content or context. Relations represent different aspects of the photo stream data, including visual content, associated tags, photo owners, and post times. The extracted structures minimize the mutual information of the predicted joint distribution. We also introduce a relational modularity function to determine the structure cost penalty, and hence determine the number of clusters. Extensive experiments on a large Flickr dataset suggest that our approach is able to extract meaningful relational patterns from group photo streams. We evaluate the utility of the discovered structures through a tag prediction task and through a user study. Our results show that our method, based on relational structures, outperforms baseline methods, including feature and tag frequency based techniques, by 35%–420%. We have conducted a qualitative user study to evaluate the benefits of our framework in exploring group photo streams.
The study indicates that users found the extracted clustering results clearly represent major themes in a group; the clustering results not only reflect how users describe the group data but often lead the users to discover the evolution of the group activity.
We propose SCENT, an innovative, scalable spectral analysis framework for internet-scale monitoring of multi-relational social media data, encoded in the form of tensor streams. In particular, a significant challenge is to detect key changes in the social media data, which could reflect important events in the real world, sufficiently quickly. Social media data have three challenging characteristics. First, data sizes are enormous: recent technological advances allow hundreds of millions of users to create and share content within online social networks. Second, social data are often multi-faceted (i.e., have many dimensions of potential interest, from the textual content to user metadata). Finally, the data are dynamic: structural changes can occur at multiple time scales and be localized to a subset of users. Consequently, a framework for extracting useful information from social media data needs to scale with data volume, and also with the number and diversity of the facets of the data. In SCENT, we focus on the computational cost of structural change detection in tensor streams. We extend compressed sensing (CS) to tensor data. We show that, through the use of randomized tensor ensembles, SCENT is able to encode the observed tensor streams in the form of compact descriptors. We show that the descriptors allow very fast detection of significant spectral changes in the tensor stream and also reduce data collection, storage, and processing costs. Experiments over synthetic and real data show that SCENT is faster (17.7x–159x for change detection) and more accurate (above 0.9 F-score) than baseline methods.
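As a rough illustration of encoding tensors with a random ensemble, the sketch below projects a small three-mode tensor onto a handful of random Gaussian tensors to get a compact descriptor, then compares descriptors from consecutive time steps. This is a simplification under stated assumptions: the function names, the Gaussian ensemble, and the Euclidean change score are illustrative choices, and SCENT itself detects spectral changes, which this sketch does not attempt.

```python
import math
import random


def flatten(tensor):
    """Flatten a 3-mode tensor given as nested lists into one list."""
    return [value for matrix in tensor for row in matrix for value in row]


def make_ensemble(size, k, seed=0):
    """Draw k random Gaussian vectors of length `size` (assumed ensemble)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(k)]


def descriptor(tensor, ensemble):
    """Compact descriptor: inner product of the tensor with each random vector."""
    flat = flatten(tensor)
    return [sum(r * v for r, v in zip(row, flat)) for row in ensemble]


def change_score(desc_a, desc_b):
    """Euclidean distance between two descriptors (illustrative score only)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(desc_a, desc_b)))


# Toy 2 x 3 x 4 tensor (24 entries) compressed to a 6-number descriptor.
shape = (2, 3, 4)
before = [[[1.0] * shape[2] for _ in range(shape[1])] for _ in range(shape[0])]
after = [[row[:] for row in matrix] for matrix in before]
after[1][2][3] = 6.0  # a localized structural change

ensemble = make_ensemble(shape[0] * shape[1] * shape[2], k=6)
unchanged = change_score(descriptor(before, ensemble), descriptor(before, ensemble))
changed = change_score(descriptor(before, ensemble), descriptor(after, ensemble))
print(unchanged, changed)
```

The same ensemble must be reused across time steps so that descriptors are comparable; only the short descriptors need to be stored and compared, which is the source of the storage and processing savings the abstract refers to.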
This article presents the principles of an adaptive mixed reality rehabilitation (AMRR) system, as well as the training process and results from 2 stroke survivors who received AMRR therapy, to illustrate how the system can be used in the clinic. The AMRR system integrates traditional rehabilitation practices with state-of-the-art computational and motion capture technologies to create an engaging environment to train reaching movements. The system provides real-time, intuitive, and integrated audio and visual feedback (based on detailed kinematic data) representative of goal accomplishment, activity performance, and body function during a reaching task. The AMRR system also provides a quantitative kinematic evaluation that measures the deviation of the stroke survivor’s movement from an idealized, unimpaired movement. The therapist, using the quantitative measure and knowledge and observations, can adapt the feedback and physical environment of the AMRR system throughout therapy to address each participant’s individual impairments and progress. Individualized training plans, kinematic improvements measured over the entire therapy period, and the changes in relevant clinical scales and kinematic movement attributes before and after the month-long therapy are presented for 2 participants. The substantial improvements made by both participants after AMRR therapy demonstrate that this system has the potential to considerably enhance the recovery of stroke survivors with varying impairments for both kinematic improvements and functional ability.
This work aims at discovering community structure in rich media social networks through analysis of time-varying, multi-relational data. Community structure represents the latent social context of user actions. It has important applications such as search and recommendation. The problem is particularly relevant in the enterprise domain, where extracting emergent community structure on enterprise social media can help in forming new collaborative teams, in expertise discovery, and in the long-term reorganization of enterprises based on collaboration patterns. There are several unique challenges: (a) In social media, the context of user actions is constantly changing and co-evolving; hence the social context contains time-evolving multi-dimensional relations. (b) The social context is determined by the available system features and is unique in each social media platform; hence the analysis of such data needs to flexibly incorporate various system features. In this article we propose MetaFac (MetaGraph Factorization), a framework that extracts community structures from dynamic, multi-dimensional social contexts and interactions. Our work has three key contributions: (1) metagraph, a novel relational hypergraph representation for modeling multi-relational and multi-dimensional social data; (2) an efficient multi-relational factorization method for community extraction on a given metagraph; (3) an on-line method to handle time-varying relations through incremental metagraph factorization. Extensive experiments on real-world social data collected from an enterprise and the public Digg social media website suggest that our technique is scalable and is able to extract meaningful communities per social media context.
We illustrate the usefulness of our framework through two prediction tasks: (1) in the enterprise dataset, the task is to predict users’ future interests on tag usage, and (2) in the Digg dataset, the task is to predict users’ future interests on voting and commenting Digg stories. Our prediction significantly outperforms baseline methods (including aspect model and tensor analysis), indicating the promising direction of using metagraphs for handling time-varying social relational contexts.
We are motivated in our work by the following question: what factors influence individual participation in social media conversations? Conversations around user-posted content are central to the user experience in social media sites, including Facebook, YouTube and Flickr. Therefore, understanding why people participate can have significant bearing on fundamental research questions in social network and media analysis, such as network evolution and information diffusion.
Our approach is as follows. We first identify several key aspects of social media conversations, distinct from both online forum discussions and other social networks. These aspects include intrinsic and extrinsic network factors. There are three factors intrinsic to the network: social awareness, community characteristics and creator reputation. The factors extrinsic to the network include media context and conversational interestingness. Thereafter we test the effectiveness of each factor type in accounting for the observed participation of individuals using a Support Vector Regression based prediction framework. Our findings indicate that the factors that influence participation depend on the media type: participation on YouTube differs from participation on a weblog such as Engadget. We further show that an optimal factor combination improves prediction accuracy of observed participation by ~9–13% and ~8–11% over using just the best hypothesis and all hypotheses, respectively. Implications of this work for understanding individual contributions in social media conversations, and in turn for the design of social sites, are discussed.
This paper presents a novel, low-cost, real-time adaptive multimedia environment for home-based upper extremity rehabilitation of stroke survivors. The primary goal of this system is to provide an interactive tool with which the stroke survivor can sustain gains achieved within the clinical phase of therapy and increase the opportunity for functional recovery. This home-based mediated system has low-cost sensing, off-the-shelf components for the auditory and visual feedback, and remote monitoring capability. The system is designed to continue active learning by reducing dependency on real-time feedback and focusing on summary feedback after a single task and sequences of tasks. To increase system effectiveness through customization, we use data from the training strategy developed by the therapist at the clinic for each stroke survivor to drive automated system adaptation at the home. The adaptation includes changing training focus, selecting proper feedback coupling both in real-time and in summary, and constructing appropriate dialogues with the stroke survivor to promote more efficient use of the system. This system also allows the therapist to review the participant’s progress and adjust the training strategy weekly.
New motion capture technologies are allowing detailed, precise and complete monitoring of movement through real-time kinematic analysis. However, a clinically relevant understanding of movement impairment through kinematic analysis requires the development of computational models that integrate clinical expertise in the weighing of the kinematic parameters. The resulting kinematics-based measures of movement impairment would further need to be integrated with existing clinical measures of activity disability. This is a challenging process requiring computational solutions that can extract correlations within and between three diverse data sets: human-driven assessment of body function, kinematics-based assessment of movement impairment and human-driven assessment of activity. We propose to identify and characterize different sensorimotor control strategies used by normal individuals and by hemiparetic stroke survivors acquiring a skilled motor task. We will use novel quantitative approaches to further our understanding of how human motor function is coupled to multiple and simultaneous modes of feedback. The experiments rely on a novel interactive task environment developed by our team in which subjects are provided with rich auditory and visual feedback of movement variables to drive motor learning. Our proposed research will result in a computational framework for applying virtual information to assist motor learning for complex tasks that require coupling of proprioception, vision, audio and haptic cues. We shall use the framework to devise a computational tool to assist with therapy of stroke survivors. This tool will utilize extracted relationships in a pre-clinical setting to generate effective and customized rehabilitation strategies.
This paper presents a novel generalized computational framework for quantitative kinematic evaluation of movement in a rehabilitation clinic setting. The framework integrates clinical knowledge and computational data-driven analysis together in a systematic manner. The framework provides three key benefits to rehabilitation: (a) the resulting continuous normalized measure allows the clinician to monitor movement quality on a fine scale and easily compare impairments across participants, (b) the framework reveals the effect of individual movement components on the composite movement performance, helping the clinician decide the training foci, and (c) the evaluation runs in real-time, which allows the clinician to constantly track a patient's progress and make appropriate adaptations to the therapy protocol. The creation of such an evaluation is difficult because of the sparse amount of recorded clinical observations, the high dimensionality of movement and high variations in subjects' performance. We address these issues by modeling the evaluation function as a linear combination of multiple normalized kinematic attributes, y = Σ_i w_i φ_i(x_i), and estimating the attribute normalization function φ_i(·) by integrating distributions of idealized movement and deviated movement. The weights w_i are derived from a therapist's pair-wise comparison using a modified RankSVM algorithm. We have applied this framework to evaluate upper limb movement for stroke survivors with excellent results: the evaluation results are highly correlated to the therapist's observations.
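The weighted-sum form of the evaluation function in the abstract above, y = Σ w_i φ_i(x_i), can be sketched as follows. Everything specific here is an assumption for illustration: the attribute names, the ideal and worst values, the weights, and in particular the clipped linear `normalize` function, which stands in for the paper's distribution-based φ_i. In the paper the weights come from a modified RankSVM trained on therapist comparisons, which is not reproduced here.

```python
def normalize(value, ideal, worst):
    """Placeholder for phi_i: 0.0 at the ideal value, 1.0 at or beyond worst.

    A clipped linear scaling, assumed here in place of the paper's
    distribution-based normalization.
    """
    deviation = abs(value - ideal) / abs(worst - ideal)
    return min(1.0, deviation)


def evaluate(attributes, specs):
    """Compute y = sum_i w_i * phi_i(x_i) over the kinematic attributes."""
    return sum(
        spec["weight"] * normalize(attributes[name], spec["ideal"], spec["worst"])
        for name, spec in specs.items()
    )


# Hypothetical attribute specifications; weights sum to 1 so y lies in [0, 1].
specs = {
    "peak_speed": {"weight": 0.5, "ideal": 0.6, "worst": 0.2},
    "trajectory_error": {"weight": 0.3, "ideal": 0.0, "worst": 0.1},
    "torso_compensation": {"weight": 0.2, "ideal": 0.0, "worst": 0.08},
}

ideal_reach = {"peak_speed": 0.6, "trajectory_error": 0.0, "torso_compensation": 0.0}
impaired_reach = {"peak_speed": 0.2, "trajectory_error": 0.1, "torso_compensation": 0.08}

print(evaluate(ideal_reach, specs))
print(evaluate(impaired_reach, specs))
```

Because each term w_i φ_i(x_i) is computed separately, the per-attribute contributions can be inspected individually, which corresponds to benefit (b) in the abstract: seeing which movement component drives the composite score.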
Raising awareness and motivating workers in a large collaborative enterprise is a challenging endeavor. In this paper we briefly describe Taskville, a distributed social media workplace game played by teams on large, public displays. Taskville gamifies the process of routine task management, introducing light competitive play within and between teams. We present the design and implementation of the Taskville game and offer insights and recommendations gained from two pilot studies.
This article analyzes communication within a set of individuals to extract the representative prototypical groups and provides a novel framework to establish the utility of such groups. Corporations may want to identify representative groups (which are indicative of the overall communication set) because it is easier to track the prototypical groups rather than the entire set. This can be useful for advertising, identifying “hot” spots of resource consumption as well as in mining representative moods or temperature of a community. Our framework has three parts: extraction, characterization, and utility of prototypical groups. First, we extract groups by developing features representing communication dynamics of the individuals. Second, to characterize the overall communication set, we identify a subset of groups within the community as the prototypical groups. Third, we justify the utility of these prototypical groups by using them as predictors of related external phenomena; specifically, stock market movement of technology companies and political polls of Presidential candidates in the 2008 U.S. elections. We have conducted extensive experiments on two popular blogs, Engadget and Huffington Post. We observe that the prototypical groups can predict stock market movement/political polls satisfactorily with mean error rate of 20.32%. Further, our method outperforms baseline methods based on alternative group extraction and prototypical group identification methods. We evaluate the quality of the extracted groups based on their conductance and coverage measures and develop metrics: predictivity and resilience to evaluate their ability to predict a related external time-series variable (stock market movement/political polls). This implies that communication dynamics of individuals are essential in extracting groups in a community, and the prototypical groups extracted by our method are meaningful in characterizing the overall communication sets.
This paper presents a novel mixed reality rehabilitation system used to help improve the reaching movements of people who have hemiparesis from stroke. The system provides real-time, multimodal, customizable, and adaptive feedback generated from the movement patterns of the subject's affected arm and torso during reaching to grasp. The feedback is provided via innovative visual and musical forms that present a stimulating, enriched environment in which to train the subjects and promote multimodal sensory-motor integration. A pilot study was conducted to test the system function, adaptation protocol and its feasibility for stroke rehabilitation. Three chronic stroke survivors underwent training using our system for six 75-min sessions over two weeks. After this relatively short time, all three subjects showed significant improvements in the movement parameters that were targeted during training. Improvements included faster and smoother reaches, increased joint coordination and reduced compensatory use of the torso and shoulder. The system was accepted by the subjects and shows promise as a useful tool for physical and occupational therapists to enhance stroke rehabilitation.
This chapter deals with the analysis of interpersonal communication dynamics in online social networks and social media. Communication is central to the evolution of social systems. Today, the different online social sites feature variegated interactional affordances, ranging from blogging, micro-blogging, and sharing media elements (i.e., image, video) to a rich set of social actions such as tagging, voting, commenting and so on. Consequently, these communication tools have begun to redefine the ways in which we exchange information or concepts, and how the media channels impact our online interactional behavior. Our central hypothesis is that such communication dynamics between individuals manifest themselves via two key aspects: the information or concept that is the content of communication, and the channel, i.e., the media via which communication takes place. We present computational models and discuss large-scale quantitative observational studies for both these organizing ideas. First, we develop a computational framework to determine the “interestingness” property of conversations centered around rich media. Second, we present user models of diffusion of social actions and study the impact of homophily on the diffusion process. The outcome of this research is twofold. First, extensive empirical studies on datasets from YouTube have indicated that on rich media sites, the conversations that are deemed “interesting” appear to have consequential impact on the properties of the social network they are associated with: in terms of degree of participation of the individuals in future conversations, thematic diffusion as well as emergent cohesiveness in activity among the concerned participants in the network. Second, observational and computational studies on large social media datasets such as Twitter have indicated that diffusion of social actions in a network can be indicative of future information cascades.
Moreover, given a topic, these cascades are often a function of the attribute homophily existing among the participants. We believe that this chapter can make a significant contribution to a better understanding of how we communicate online and how this is redefining our collective sociological behavior.
The emergence of the mediated social web, a distributed network of participants creating rich media content and engaging in interactive conversations through Internet-based communication technologies, has contributed to the evolution of powerful social, economic and cultural change. Online social network sites and blogs, such as Facebook, Twitter, Flickr and LiveJournal, thrive due to their fundamental sense of “community”. The growth of online communities offers both opportunities and challenges for researchers and practitioners. Participation in online communities has been observed to influence people’s behavior in diverse ways ranging from financial decision-making to political choices, suggesting the rich potential for diverse applications. However, although studies on the social web have been extensive, discovering communities from online social media remains challenging, due to the interdisciplinary nature of this subject. In this article, we present our recent work on characterization of communities in online social media using computational approaches grounded in observations from social science.
This paper presents results from a clinical study of stroke survivors using an adaptive, mixed-reality rehabilitation (AMRR) system for reach and grasp therapy. The AMRR therapy provides audio and visual feedback on the therapy task, based on detailed motion capture, that places the movement in an abstract, artistic context. This type of environment promotes the generalizability of movement strategies, which is shown through kinematic improvements on an untrained reaching task and higher clinical scale scores, in addition to kinematic improvements in the trained task.
Platforms such as Twitter have provided researchers with ample opportunities to analytically study social phenomena. There are, however, significant computational challenges due to the enormous rate of production of new information: researchers are therefore often forced to analyze a judiciously selected “sample” of the data. Like other social media phenomena, information diffusion is a social process: it is affected by user context and topic, in addition to the graph topology. This paper studies the impact of different attribute and topology based sampling strategies on the discovery of an important social media phenomenon: information diffusion. We examine several widely-adopted sampling methods that select nodes based on attribute (random, location, and activity) and topology (forest fire), and we study the impact of attribute based seed selection on topology based sampling. Then we develop a series of metrics for evaluating the quality of the sample, based on user activity (e.g. volume, number of seeds), topological (e.g. reach, spread) and temporal characteristics (e.g. rate). We additionally correlate the diffusion volume metric with two external variables: search and news trends. Our experiments reveal that for small sample sizes (30%), a sample that incorporates both topology and user context (e.g. location, activity) can improve on naïve methods by a significant margin of ~15–20%.
This paper presents a novel system architecture and evaluation metrics for an Adaptive Mixed Reality Rehabilitation (AMRR) system for stroke patients. This system provides a purposeful, engaging, hybrid (visual, auditory and physical) scene that encourages patients to improve their performance of a reaching and grasping task and promotes learning of generalizable movement strategies. This system is adaptive in that it provides assistive adaptation tools to help the rehabilitation team customize the training strategy. Our key insight is to combine the patients, rehabilitation team, multimodal hybrid environments and adaptation tools together as an adaptive experiential mixed reality system. There are three major contributions in this paper: (a) developing a computational deficit index for evaluating the patient's kinematic performance and a deficit-training-improvement (DTI) correlation for evaluating the adaptive training strategy, (b) integrating assistive adaptation tools that help the rehabilitation team understand the relationship between the patient's performance and training and customize the training strategy, and (c) combining the interactive multimedia environment and physical environment together to encourage patients to transfer movement knowledge from media space to physical space. Our system has been used by two stroke patients for one month of mediated therapy. They showed significant improvement in their reaching and grasping performance (+48.84% and +39.29%) compared to two other stroke patients who experienced traditional therapy (-18.31% and -8.06%).
Wearable, mobile computing platforms are envisioned to be used in out-patient monitoring and care. These systems continuously perform signal filtering, transformations, and classification, which are quite compute intensive and quickly drain the system energy. The design space of these human activity sensors is large and includes the choice of sampling frequency, feature detection algorithm, length of the window of transition detection, etc., and all these choices fundamentally trade off power/performance for accuracy of detection. In this work, we explore this design space and draw several interesting conclusions that can be used as rules of thumb for quick, yet power-efficient designs of such systems. For instance, we find that the x-axis of our signal, which was oriented to be parallel to the forearm, is the most important signal to be monitored for our set of hand activities. Our experimental results show that by carefully choosing system design parameters, there is considerable (5X) scope for improving the performance/power of the system, for minimal (5%) loss in accuracy.
We discover communities from social network data and analyze the community evolution. These communities are inherent characteristics of human interaction in online social networks, as well as paper citation networks. Also, communities may evolve over time, due to changes to individuals’ roles and social status in the network as well as changes to individuals’ research interests. We present an innovative algorithm that deviates from the traditional two-step approach to analyze community evolutions. In the traditional approach, communities are first detected for each time slice, and then compared to determine correspondences. We argue that this approach is inappropriate in applications with noisy data. In this paper, we propose FacetNet for analyzing communities and their evolutions through a robust unified process. This novel framework will discover communities and capture their evolution with temporal smoothness given by historic community structures. Our approach relies on formulating the problem in terms of maximum a posteriori (MAP) estimation, where the community structure is estimated both by the observed networked data and by the prior distribution given by historic community structures. Then we develop an iterative algorithm, with proven low time complexity, which is guaranteed to converge to an optimal solution. We perform extensive experimental studies, on both synthetic datasets and real datasets, to demonstrate that our method discovers meaningful communities and provides additional insights not directly obtainable from traditional methods.
Transdisciplinary collaborations call for dynamic, responsive slide-ware presentations beyond the linear structure afforded by traditional tools. The NextSlidePlease application addresses this through a novel authoring and presentation interface. The application also features an innovative algorithm to enhance presentation time management. The cross-platform Java application is currently being evaluated in a variety of real-world presentation contexts.
In this position paper, we propose the idea that emergent and evolutionary aspects of semantics, which are complementary to the problem of semantic detection, are foundational to multimedia computing. We show that media rich social networks reveal certain implicit assumptions in concept learning about semantics, including semantic stability, emergence, and stability of context. We study the problem of semantic evolution in the context of media rich networks — (a) since meaning is an emergent artifact of human activity, it is crucial to study how human beings interact with, consume and share media data. (b) The ready availability of large scale social interaction datasets of blogs including sites such as Flickr and YouTube, allows us to instrument the relationship between media and human activity at a scale not available to earlier researchers. We have identified three initial problem areas critical to evolutionary aspects of semantics — community discovery, information flow and semantic diversity. We shall present examples of research problems addressed in each of the three areas.
Minimizing the number of computations a low-power device makes is important to achieve long battery life. In this paper we present a framework for a low-power device to minimize the number of calculations needed to detect and classify simple activities of daily living such as sitting, standing, walking, reaching, and eating. This technique uses wavelet analysis as part of the feature set extracted from accelerometer data. A log-likelihood ratio test and Hidden Markov Models (HMM) are used to detect transitions and classify different activities. A tradeoff is made between power and accuracy.
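As an illustration of how a log-likelihood ratio test can flag a transition between activities, the sketch below compares two hypotheses for a window of accelerometer samples: one Gaussian for the whole window versus separate Gaussians for its two halves. The abstract does not give the exact test statistic or features, so the Gaussian model, the midpoint split, and the names `transition_llr` and `_gaussian_loglik` are assumptions.

```python
import math
from typing import Sequence


def _gaussian_loglik(samples: Sequence[float], var_floor: float = 1e-6) -> float:
    """Log-likelihood of samples under a Gaussian fitted to them.

    A small variance floor (an assumption) keeps the logarithm finite
    for perfectly constant signals.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n + var_floor
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def transition_llr(window: Sequence[float]) -> float:
    """Log-likelihood ratio: two-segment model versus one-segment model.

    Larger values suggest that the signal statistics change mid-window.
    """
    half = len(window) // 2
    if half < 2:
        raise ValueError("window needs at least 4 samples")
    split = _gaussian_loglik(window[:half]) + _gaussian_loglik(window[half:])
    single = _gaussian_loglik(window)
    return split - single


# Hypothetical single-axis accelerometer windows.
steady = [0.0, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.0]
step = [0.0, 0.1, -0.1, 0.05, 1.0, 1.1, 0.9, 1.05]
llr_steady = transition_llr(steady)
llr_step = transition_llr(step)
```

Thresholding this statistic lets a device run the more expensive classifier only when a transition is likely, which is one way to realize the power and accuracy tradeoff described above.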
Online social networking sites such as Flickr and Facebook provide a diverse range of functionalities that foster online communities to create and share media content. In particular, Flickr groups are increasingly used to aggregate and share photos about a wide array of topics or themes. Unlike photo repositories where images are typically organized with respect to static topics, the photo sharing process as in Flickr often results in complex time-evolving social and visual patterns. Characterizing such time-evolving patterns can enrich the media exploring experience in a social media repository. In this paper, we propose a novel framework that characterizes distinct time-evolving patterns of group photo streams. We use a nonnegative joint matrix factorization approach to incorporate image content features and contextual information, including associated tags, photo owners and post times. In our framework, we consider a group as a mixture of themes, where each theme exhibits similar patterns of image content and context. The theme extraction is to best explain the observed image content features and associations with tags, users and times. Extensive experiments on a Flickr dataset suggest that our approach is able to extract meaningful evolutionary patterns from group photo streams. We evaluate our method through a tag prediction task. Our prediction results outperform baseline methods, which indicates the utility of our theme based joint analysis.
This paper presents a novel social media summarization framework. Summarizing media created and shared in large scale online social networks unfolds challenging research problems. The networks exhibit heterogeneous social interactions and temporal dynamics. Our proposed framework relies on the co-presence of multiple important facets: who (users), what (concepts and media), how (actions) and when (time). First, we impose a syntactic structure of the social activity (relating users, media and concepts via specific actions) in our temporal multi-graph mining algorithm. Second, important activities along each facet are extracted as activity themes over time. Experiments on Flickr datasets demonstrate that our technique captures nontrivial evolution of media use in social networks.
We propose a computational framework to predict synchrony of action in online social media. Synchrony is a temporal social network phenomenon in which a large number of users are observed to mimic a certain action over a period of time with sustained participation from early users. Understanding social synchrony can be helpful in identifying suitable time periods for viral marketing. Our method consists of two parts: the learning framework and the evolution framework. In the learning framework, we develop a DBN based representation that includes an understanding of user context to predict the probability of user actions over a set of time slices into the future. In the evolution framework, we evolve the social network and the user models over a set of future time slices to predict social synchrony. Extensive experiments on a large dataset crawled from the popular social media site Digg (comprising ~7M diggs) show that our model yields low error (15.2±4.3%) in predicting user actions during periods with and without synchrony. Comparison with baseline methods indicates that our method shows significant improvement in predicting user actions.
This paper aims at discovering community structure in rich media social networks, through analysis of time-varying, multi-relational data. Community structure represents the latent social context of user actions. It has important applications in information tasks such as search and recommendation. Social media has several unique challenges. (a) In social media, the context of user actions is constantly changing and co-evolving; hence the social context contains time-evolving multi-dimensional relations. (b) The social context is determined by the available system features and is unique in each social media website. In this paper we propose MetaFac (MetaGraph Factorization), a framework that extracts community structures from various social contexts and interactions. Our work has three key contributions: (1) metagraph, a novel relational hypergraph representation for modeling multi-relational and multi-dimensional social data; (2) an efficient factorization method for community extraction on a given metagraph; (3) an on-line method to handle time-varying relations through incremental metagraph factorization. Extensive experiments on real-world social data collected from the Digg social media website suggest that our technique is scalable and is able to extract meaningful communities based on the social media contexts. We illustrate the usefulness of our framework through prediction tasks. We outperform baseline methods (including aspect model and tensor analysis) by an order of magnitude.
In this paper we develop a recommendation framework to connect image content with communities in online social media. The problem is important because users are looking for useful feedback on their uploaded content, but finding the right community for feedback is challenging for the end user. Social media are characterized by both content and community. Hence, in our approach, we characterize images through three types of features: visual features, user generated text tags, and social interaction (user communication history in the form of comments). A recommendation framework based on learning a latent space representation of the groups is developed to recommend the most likely groups for a given image. The model was tested on a large corpus of 15,689 Flickr images. Our method outperforms the baseline method, with a mean precision of 0.62 and a mean recall of 0.69. Importantly, we show that fusing image content and text tags with social interaction features outperforms the case of only using image content or tags.
This paper presents JAM (Joint Action Matrix Factorization), a novel framework to summarize social activity from rich media social networks. Summarizing social network activities requires an understanding of the relationships among concepts, users, and the context in which the concepts are used. Our work has three contributions: First, we propose a novel summarization method which extracts the co-evolution on multiple facets of social activity – who (users), what (concepts), how (actions) and when (time), and constructs a context rich summary called "activity theme". Second, we provide an efficient algorithm for mining activity themes over time. The algorithm extracts representative elements in each facet based on their co-occurrences with other facets through specific actions. Third, we propose new metrics for evaluating the summarization results based on the temporal and topological relationship among activity themes. Extensive experiments on real-world Flickr datasets demonstrate that our technique significantly outperforms several baseline algorithms. The results explore nontrivial evolution in Flickr photo-sharing communities.
The tasks in physical environments are mainly information-centric processes, such as search and exploration of physical objects. We have developed an informational environment, AURA, that supports object searches in the physical world. The goal of AURA is to enable individuals to use the environment in which they function as a living (short-term) memory of their activities and of the objects with which they interact in this environment. To support physical searches, the environment that the user is occupying must be transparently embedded with relevant information and made accessible by in-situ search mechanisms. We achieve this through innovative algorithms that re-imagine a collection of environmentally distributed RFID tags to act as a distributed storage cloud that encodes the required information for attribute-based object search. Since RFID tags lack radio transmitters and, thus, cannot communicate among each other, auraProp and auraSearch leverage the movements of the humans in the environment to propagate information: as they move in the environment, users not only leave traces (or auras) of their own activities, but also help further disseminate auras of prior activities in the same space. This scheme creates an information gradient in the physical environment which AURA then leverages to direct the user toward the object of interest. auraSearch significantly reduces the number of steps that the user has to walk while searching for a given object.
Rich media social networks promote not only creation and consumption of media, but also communication about the posted media item. What causes a conversation to be interesting, such that it prompts a user to participate in the discussion on a posted video? We conjecture that people participate in conversations when they find the conversation theme interesting, see comments by people whom they are familiar with, or observe an engaging dialogue between two or more people (an absorbing back and forth exchange of comments). Importantly, a conversation that is interesting must be consequential, i.e., it must impact the social network itself. Our framework has three parts: characterizing themes, characterizing participants for determining interestingness, and measuring the consequences of a conversation deemed to be interesting. First, we detect conversational themes using a mixture model approach. Second, we determine interestingness of participants and interestingness of conversations based on a random walk model. Third, we measure the consequence of a conversation by measuring how interestingness affects the following three variables: participation in related themes, participant cohesiveness and theme diffusion. We have conducted extensive experiments using a dataset from the popular video sharing site, YouTube. Our results show that our method of interestingness maximizes the mutual information, and is significantly better (twice as large) than three other baseline methods (number of comments, number of new participants and PageRank based assessment).
Social media websites promote diverse user interaction on media objects as well as user actions with respect to other users. The goal of this work is to discover community structure in rich media social networks, and observe how it evolves over time, through analysis of multi-relational data. The problem is important in the enterprise domain, where extracting emergent community structure on enterprise social media can help in forming new collaborative teams, aid in expertise discovery, and guide long term enterprise reorganization. Our approach consists of three main parts: (1) a relational hypergraph model for modeling various social contexts and interactions; (2) a novel hypergraph factorization method for community extraction on multi-relational social data; (3) an on-line method to handle temporal evolution through incremental hypergraph factorization. Extensive experiments on real-world enterprise data suggest that our technique is scalable and can extract meaningful communities. To evaluate the quality of our mining results, we use our method to predict users' future interests. Our prediction outperforms baseline methods (frequency counts, pLSA) by 36-250% on average, indicating the utility of leveraging multi-relational social context by using our method.
In this paper, we introduce AURA, a novel framework for enriching the physical environment with information about objects and activities in order to support searches in the physical world. The goal is to enable individuals to use the environment in which they function as a living (short-term) memory of their activities and of the objects with which they interact in this environment. In order to act as a memory, the physical environment must be transparently embedded with relevant information and made accessible by in-situ search mechanisms. We achieve this embedding through innovative algorithms that leverage a collection of parasitic RFID tags distributed in the environment to act as a distributed storage cloud. Information about the activities of the users and the objects with which they interact is encoded and stored, in a decentralized way, on these RFID tags to support attribute-based search. A novel auraProp algorithm disseminates information in the environment and a complementary auraSearch algorithm implements spatial searches for physical objects in the environment. Parasitic RFID tags are not self-powered and thus cannot communicate among each other. AURA leverages human movement in the environment to propagate information: as they move in the environment, users not only leave traces (or auras) of their own activities, but also help further disseminate auras of prior activities in the same space. AURA relies on a novel signature based information dissemination mechanism and a randomized information erasure scheme to ensure that the extremely limited storage spaces available on the RFID tags are used effectively. The erasure scheme also helps create an information gradient in the physical environment, which the auraSearch algorithm uses to direct the user towards the object of interest.
Experiential media systems refer to real time, physically grounded multimedia systems in which the user is both the producer and consumer of meaning. These systems require embodied interaction on part of the user to gain new knowledge. In this chapter we have presented our efforts to develop a real-time, multimodal biofeedback system for stroke patients. It is a highly specialized experiential media system where the knowledge that is imparted refers to a functional task – the ability to reach and grasp an object. There are several key ideas in this chapter: we show how to derive critical motion features using a biomechanical model for the reaching functional task. Then we determine the formal progression of the feedback and its relationship to action. We show how to map movement parameters into auditory and visual parameters in real-time. We develop novel validation metrics for spatial accuracy, opening, flow and consistency. Our real-world experiments with unimpaired subjects show that we are able to communicate key aspects of motion through feedback. Importantly they demonstrate the messages encoded in the feedback can be parsed by the unimpaired subjects.
In this article, we present a media adaptation framework for an immersive biofeedback system for stroke patient rehabilitation. In our biofeedback system, media adaptation refers to changes in audio/visual feedback as well as changes in physical environment. Effective media adaptation frameworks help patients recover generative plans for arm movement with potential for significantly shortened therapeutic time. The media adaptation problem has significant challenges—(a) high dimensionality of adaptation parameter space; (b) variability in the patient performance across and within sessions; (c) the actual rehabilitation plan is typically a non-first-order Markov process, making the learning task hard. Our key insight is to understand media adaptation as a real-time feedback control problem. We use a mixture-of-experts based Dynamic Decision Network (DDN) for online media adaptation. We train DDN mixtures per patient, per session. The mixture models address two basic questions—(a) given a specific adaptation suggested by the domain experts, predict the patient performance, and (b) given the expected performance, determine the optimal adaptation decision. The questions are answered through an optimality criterion based search on DDN models trained in previous sessions. We have also developed new validation metrics and have very good results for both questions on actual stroke rehabilitation data.
We have developed a computational framework to characterize social network dynamics in the blogosphere at individual, group and community levels. Such characterization could be used by corporations to help drive targeted advertising and to track the moods and sentiments of consumers. We tested our model on a widely read technology blog called Engadget. Our results show that communities transition between states of high and low entropy, depending on sentiments (positive / negative) about external happenings. We also propose an innovative method to establish the utility of the extracted knowledge, by correlating the mined knowledge with external time series data (the stock market). Our validation results show that the characterized groups exhibit high stock market movement predictability (89%) and that removal of 'impactful' groups makes the community less resilient by lowering predictability (26%) and affecting the composition of the groups in the rest of the community.
We present a framework for automatically summarizing social group activity over time. The problem is important in understanding large scale online social networks, which have diverse social interactions and exhibit temporal dynamics. In this work we construct the summary by extracting activity themes. We propose a novel unified temporal multi-graph framework for extracting activity themes over time. We use a non-negative matrix factorization (NMF) approach to derive two interrelated latent spaces for users and concepts. Activity themes are extracted from the derived latent spaces to construct a group activity summary. Experiments on real-world Flickr datasets demonstrate that our technique outperforms baseline algorithms such as LSI, and is additionally able to extract temporally representative activities to construct a meaningful group activity summary.
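For readers unfamiliar with NMF, the sketch below factors a small user-by-concept count matrix V into non-negative factors W (users by themes) and H (themes by concepts) using the standard multiplicative updates for squared error. It illustrates plain NMF only; the temporal multi-graph formulation of the paper is not reproduced, and the function names and the toy matrix are assumptions.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Dense matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a: Matrix) -> Matrix:
    return [list(row) for row in zip(*a)]


def frobenius_error(v: Matrix, w: Matrix, h: Matrix) -> float:
    """Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(
    v: Matrix, rank: int, iters: int = 200, seed: int = 0, eps: float = 1e-9
) -> Tuple[Matrix, Matrix]:
    """Multiplicative-update NMF: V is approximated by W H with W, H >= 0."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical counts: rows are users, columns are concepts (tags).
V = [
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 1.0],
    [0.0, 0.0, 3.0, 4.0],
    [0.0, 1.0, 4.0, 3.0],
]
W0, H0 = nmf(V, rank=2, iters=0)  # random initialization only
W, H = nmf(V, rank=2)
err_initial = frobenius_error(V, W0, H0)
err_final = frobenius_error(V, W, H)
```

Each row of H can be read as a theme expressed over concepts, and each row of W as a user's involvement in those themes, which is the sense in which the two latent spaces are interrelated.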
In this paper, we develop a temporally evolving representation framework for context that can efficiently predict communication flow in social networks between a given pair of individuals. The problem is important because it facilitates determining social and market trends as well as efficient information paths among people. We describe communication flow by two parameters: the intent to communicate and communication delay. To estimate these parameters, we design features to characterize communication and social context. Communication context refers to the attributes of current communication. Social context refers to the patterns of participation in communication (information roles) and the degree of overlap of friends between two people (strength of ties). A subset of optimal features of the communication and social context is chosen at a given time instant using five different feature selection strategies. The features are thereafter used in a Support Vector Regression framework to predict the intent to communicate and the delay between a pair of individuals. We have excellent results on a real world dataset from the most popular social networking site, www.myspace.com. We observe interestingly that while context can reasonably predict intent, delay seems to be more dependent on the personal contextual changes and other latent factors characterizing communication, e.g. 'age' of information transmitted and presence of cliques among people.
In this paper, we develop a simple model to study and analyze communication dynamics in the blogosphere and use these dynamics to determine interesting correlations with stock market movement. This work can drive targeted advertising on the web as well as facilitate understanding community evolution in the blogosphere. We describe the communication dynamics by several simple contextual properties of communication, e.g. the number of posts, the number of comments, the length and response time of comments, strength of comments and the different information roles that can be acquired by people (early responders / late trailers, loyals / outliers). We study a "technology-savvy" community called Engadget (http://www.engadget.com). There are two key contributions in this paper: (a) we identify information roles and the contextual properties for four technology companies, and (b) we model them as a regression problem in a Support Vector Machine framework and train the model with stock movements of the companies. It is interestingly observed that the communication activity on the blogosphere has considerable correlations with stock market movement. These correlation measures are further cross-validated against two baseline methods. Our results are promising yielding about 78% accuracy in predicting the magnitude of movement and 87% for the direction of movement.
We discover communities from social network data, and analyze the community evolution. These communities are inherent characteristics of human interaction in online social networks, as well as paper citation networks. Also, communities may evolve over time, due to changes to individuals' roles and social status in the network as well as changes to individuals' research interests. We present an innovative algorithm that deviates from the traditional two-step approach to analyze community evolutions. In the traditional approach, communities are first detected for each time slice, and then compared to determine correspondences. We argue that this approach is inappropriate in applications with noisy data. In this paper, we propose FacetNet for analyzing communities and their evolutions through a robust unified process. In this novel framework, communities not only generate evolutions, they also are regularized by the temporal smoothness of evolutions. As a result, this framework will discover communities that jointly maximize the fit to the observed data and the temporal evolution. Our approach relies on formulating the problem in terms of non-negative matrix factorization, where communities and their evolutions are factorized in a unified way. Then we develop an iterative algorithm, with proven low time complexity, which is guaranteed to converge to an optimal solution. We perform extensive experimental studies, on both synthetic datasets and real datasets, to demonstrate that our method discovers meaningful communities and provides additional insights not directly obtainable from traditional methods.
This paper aims to develop a generalized framework to systematically trade off computational complexity with output distortion in linear transforms such as the DCT, in an optimal manner. The problem is important in real-time systems where the computational resources available are time-dependent. Our approach is generic and applies to any linear transform and we use the DCT as a specific example. There are three key ideas: (a) a joint transform pruning and Haar basis projection-based approximation technique. The idea is to save computations by factoring the DCT transform into signal-independent and signal-dependent parts. The signal-dependent calculation is done in real-time and combined with the stored signal-independent part, saving calculations. (b) We propose the idea of the complexity-distortion framework and present an algorithm to efficiently estimate the complexity distortion function and search for optimal transform approximation using several approximation candidate sets. We also propose a measure to select the optimal approximation candidate set, and (c) an adaptive approximation framework in which the operating points on the C-D curve are embedded in the metadata. We also present a framework to perform adaptive approximation in real time for changing computational resources by using the embedded metadata. Our results validate our theoretical approach by showing that we can reduce transform computational complexity significantly while minimizing distortion.
Events are real-world occurrences that unfold over space and time. Event mining from multimedia streams improves the access and reuse of large media collections, and it has been an active area of research with notable progress. This paper contains a survey on the problems and solutions in event mining, approached from three aspects: event description, event-modeling components, and current event mining systems. We present a general characterization of multimedia events, motivated by the maxim of five "W's" and one "H" for reporting real-world events in journalism: when, where, who, what, why, and how. We discuss the causes for semantic variability in real-world descriptions, including multilevel event semantics, implicit semantics facets, and the influence of context. We discuss five main aspects of an event detection system. These aspects are: the variants of tasks and event definitions that constrain system design, the media capture setup that collectively define the available data and necessary domain assumptions, the feature extraction step that converts the captured data into perceptually significant numeric or symbolic forms, statistical models that map the feature representations to richer semantic descriptions, and applications that use event metadata to help in different information-seeking tasks. We review current event-mining systems in detail, grouping them by the problem formulations and approaches. The review includes detection of events and actions in one or more continuous sequences, events in edited video streams, unsupervised event discovery, events in a collection of media objects, and a discussion on ongoing benchmark activities. These problems span a wide range of multimedia domains such as surveillance, meetings, broadcast news, sports, documentary, and films, as well as personal and online media collections. We conclude this survey with a brief outlook on open research directions.
This article addresses the problem of spam blog (splog) detection using temporal and structural regularity of content, post time and links. Splogs are undesirable blogs meant to attract search engine traffic, used solely for promoting affiliate sites. Blogs represent popular online media, and splogs not only degrade the quality of search engine results, but also waste network resources. The splog detection problem is made difficult due to the lack of stable content descriptors. We have developed a new technique for detecting splogs, based on the observation that a blog is a dynamic, growing sequence of entries (or posts) rather than a collection of individual pages. In our approach, splogs are recognized by their temporal characteristics and content. There are three key ideas in our splog detection framework. (a) We represent the blog temporal dynamics using self-similarity matrices defined on the histogram intersection similarity measure of the time, content, and link attributes of posts, to investigate the temporal changes of the post sequence. (b) We study the blog temporal characteristics using a visual representation derived from the self-similarity measures. The visual signature reveals correlation between attributes and posts, depending on the type of blogs (normal blogs and splogs). (c) We propose two types of novel temporal features to capture the splog temporal characteristics. In our splog detector, these novel features are combined with content based features. We extract a content based feature vector from blog home pages as well as from different parts of the blog. The dimensionality of the feature vector is reduced by Fisher linear discriminant analysis. We have tested an SVM-based splog detector using proposed features on real world datasets, with appreciable results (90% accuracy).
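A minimal sketch of a self-similarity matrix built on the histogram intersection measure, assuming each post is reduced to a bag of tokens for a single attribute (content words here). The function names and the toy posts are hypothetical; the abstract does not specify how the histograms are constructed.

```python
from collections import Counter


def normalized_histogram(tokens):
    """Map a list of tokens to relative frequencies that sum to 1."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {token: count / total for token, count in counts.items()}


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; 1.0 for identical normalized histograms."""
    return sum(min(value, h2.get(key, 0.0)) for key, value in h1.items())


def self_similarity_matrix(posts):
    """Entry [i][j] compares post i with post j in posting order."""
    hists = [normalized_histogram(p) for p in posts]
    return [[histogram_intersection(a, b) for b in hists] for a in hists]


if __name__ == "__main__":
    posts = [
        ["buy", "cheap", "pills"],
        ["buy", "cheap", "pills"],
        ["my", "cat", "sleeps"],
    ]
    for row in self_similarity_matrix(posts):
        print([round(v, 2) for v in row])
```

In this toy input the first two posts are identical, so their off-diagonal entry is as high as the diagonal, which is the kind of regular structure a repetitive post sequence would show.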
The paper develops a novel computational framework for predicting communication flow in social networks based on several contextual features. The problem is important because prediction of communication flow can impact timely sharing of specific information across a wide array of communities. We determine the intent to communicate and the communication delay between users based on several contextual features in a social network corresponding to (a) neighborhood context, (b) topic context and (c) recipient context. The intent to communicate and the communication delay are modeled as regression problems, which are efficiently estimated using Support Vector Regression. We predict the intent and the delay over an interval of time using past communication data. We have excellent prediction results on a real-world dataset from MySpace.com with an accuracy of 13-16%. We show that the intent to communicate is more significantly influenced by contextual factors than the delay is.
There are information needs involving costly decisions that cannot be efficiently satisfied through conventional Web search engines. Alternatively, community-centric search can provide multiple viewpoints to facilitate decision making. We propose to discover and model the temporal dynamics of thematic communities based on mutual awareness, where the awareness arises due to observable blogger actions and the expansion of mutual awareness leads to community formation. Given a query, we construct a directed action graph that is time-dependent and weighted with respect to the query. We model the process of mutual awareness expansion using a random walk process and extract communities based on the model. We propose an interaction space based representation to quantify community dynamics. Each community is represented as a vector in the interaction space and its evolution is determined by a novel interaction correlation method. We have conducted experiments with a real-world blog dataset and have promising results for detection as well as insightful results for community evolution.
In this paper, we present a media adaptation framework for an immersive biofeedback system for stroke patient rehabilitation. In our biofeedback system, media adaptation refers to changes in audio/visual feedback as well as changes in the physical environment. Effective media adaptation frameworks help patients recover generative plans for arm movement, with potential for significantly shortened therapeutic time. The media adaptation problem has significant challenges: (a) high dimensionality of the adaptation parameter space, (b) variability in patient performance across and within sessions, and (c) the actual rehabilitation plan is typically a non-first-order Markov process, making the learning task hard. Our key insight is to understand media adaptation as a real-time feedback control problem. We use a mixture-of-experts based Dynamic Decision Network (DDN) for online media adaptation. We train DDN mixtures per patient, per session. The mixture models address two basic questions: (a) given a specific adaptation suggested by the domain expert, predict patient performance, and (b) given an expected performance, determine the optimal adaptation decision. The questions are answered through an optimality criterion based search on DDN models trained in previous sessions. We have also developed new validation metrics and have very good results for both questions on actual stroke rehabilitation data.
In this paper, we present a novel visual design for information-dense summaries of patient data with applications in biofeedback rehabilitation. The problem is important in the review of large medical datasets, where clinicians require that both the summary and all the performance details be shown at the same time. There are two main ideas: (a) summarizing data along the conceptual facets (accuracy / flow / openness) and the temporal facets (session / set / trial) of the biofeedback therapy, where the conceptual facets represent key information needed by the experts to review patient performance; and (b) effectively presenting the data trends and the details in the context of the entire performance. The summary incorporates ideas from graphic design and reveals the performance data at two time scales.
This paper focuses on the development of an event-driven media sharing repository to facilitate community awareness. In this paper, an event refers to a real-world occurrence that unfolds over space and time. Our event model implementation supports creation of events using the standard facets of who, where, when and what. A key novelty in this research lies in the support of arbitrary event-event semantic relationships. We facilitate global as well as personalized event relationships. Each relationship can be unary or binary and can be at multiple granularities. The relationships can exist between events, between media, and between media and events. We have implemented a web-based media archive system that allows people to create, explore and manage events. We have implemented an RSS-based notification system that promotes awareness of actions. The initial user feedback has been positive and we are in the process of conducting a longitudinal study.
This work deals with the problem of event annotation in social networks. The problem is difficult because of variability in semantics and scarcity of labeled data. Events refer to real-world phenomena that occur at a specific time and place, and media and text tags are treated as facets of the event metadata. We propose a novel mechanism for event annotation by leveraging related sources (other annotators) in a social network. Our approach exploits event concept similarity, concept co-occurrence and annotator trust. We compute concept similarity measures across all facets. These measures are then used to compute event-event and user-user activity correlation. We compute inter-facet concept co-occurrence statistics from the annotations by each user. The annotator trust is determined by first requesting the trusted annotators (seeds) from each user and then propagating the trust through the social network using the biased PageRank algorithm. For a specific media instance to be annotated, we start the process from an initial query vector, and the optimal recommendations are determined by using a coupling strategy between the global similarity matrix and the trust-weighted global co-occurrence matrix. The coupling links the common shared knowledge (similarity between concepts) that exists within the social network with trusted and personalized observations (concept co-occurrences). Our initial experiments on annotated everyday events are promising and show substantial gains against traditional SVM-based techniques.
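The trust propagation step can be sketched as a biased (personalized) PageRank computed by power iteration, where the random restart always returns to the seed annotators. This is a generic formulation under assumed parameters, not the paper's implementation: the function name, the damping factor of 0.85, the iteration count, and the toy graph are all hypothetical.

```python
def biased_pagerank(out_links, seeds, damping=0.85, iterations=100):
    """Propagate trust from seed nodes over a directed graph.

    out_links maps each node to the list of nodes it points to.
    seeds is a non-empty collection of trusted nodes; all restart
    probability is spread uniformly over them.
    """
    seeds = set(seeds)
    nodes = set(out_links) | seeds
    for targets in out_links.values():
        nodes.update(targets)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            targets = out_links.get(n, [])
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for s in seeds:
                    nxt[s] += damping * rank[n] * restart[s]
        rank = nxt
    return rank


if __name__ == "__main__":
    graph = {
        "alice": ["bob"],
        "bob": ["carol"],
        "carol": ["alice"],
        "dave": ["alice"],
    }
    trust = biased_pagerank(graph, seeds={"alice"})
    for user, score in sorted(trust.items()):
        print(user, round(score, 4))
```

Nodes that no trusted path reaches, such as `dave` in the toy graph, end with a score of zero, so their annotations would carry no weight in a trust-weighted co-occurrence matrix.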
This paper focuses on spam blog (splog) detection. Blogs are highly popular, new media social communication mechanisms. The presence of splogs degrades blog search results as well as wastes network resources. In our approach we exploit unique blog temporal dynamics to detect splogs. There are three key ideas in our splog detection framework. We first represent the blog temporal dynamics using self-similarity matrices defined on the histogram intersection similarity measure of the time, content, and link attributes of posts. Second, we show via a novel visualization that the blog temporal characteristics reveal attribute correlation, depending on type of the blog (normal blogs and splogs). Third, we propose the use of temporal structural properties computed from self-similarity matrices across different attributes. In a splog detector, these novel features are combined with content based features. We extract a content based feature vector from different parts of the blog -- URLs, post content, etc. The dimensionality of the feature vector is reduced by Fisher linear discriminant analysis. We have tested an SVM based splog detector using proposed features on real world datasets, with excellent results (90% accuracy).
This paper develops a novel framework to systematically trade off computational complexity against output distortion in linear multimedia transforms, in an optimal manner. The problem is important in real-time systems where the available computational resources are time-dependent. We solve the real-time adaptation problem by developing an approximate transform framework. There are three key contributions of this paper: (a) a fast basis projection approximation framework that allows us to store signal-independent partial transform results to be used in real time, (b) estimating the complexity-distortion curve for the linear transform approximation using a given basis projection approximation set and searching for the optimal transform approximation that satisfies the complexity constraint with minimum distortion, and (c) determining optimal operating points on the complexity-distortion function and a metadata embedding algorithm for images that allows for real-time adaptation. We have applied this approach to FFT approximation for images with excellent results.
This paper describes a framework to annotate images using personal and social network contexts. The problem is important as the correct context reduces the number of image annotation choices. Social network context is useful as real-world activities of members of the social network are often correlated within a specific context. The correlation can serve as a powerful resource to effectively increase the ground truth available for annotation. There are three main contributions of this paper: (a) development of an event context framework and definition of quantitative measures for contextual correlations based on concept similarity in each facet of event context; (b) recommendation algorithms based on spreading activations that exploit personal context as well as social network context; (c) experiments on real-world, everyday images that verified both the existence of inter-user semantic disagreement and the improvement in annotation when incorporating both the user and social network context. We have conducted two user studies, and our quantitative and qualitative results indicate that context (both personal and social) facilitates effective image annotation.
This paper focuses on spam blog (splog) detection. Blogs are highly popular, new media social communication mechanisms and splogs corrupt blog search results as well as waste network resources. In our approach we exploit unique blog temporal dynamics to detect splogs. The key idea is that splogs exhibit high temporal regularity in content and post time, as well as consistent linking patterns. Temporal content regularity is detected using a novel autocorrelation of post content. Temporal structural regularity is determined using the entropy of the post time difference distribution, while the link regularity is computed using a HITS based hub score measure. Experiments based on the annotated ground truth on real world dataset show excellent results on splog detection tasks with 90% accuracy.
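The post-time regularity measure can be illustrated with the Shannon entropy of binned gaps between consecutive posts. The sketch below assumes timestamps in seconds and a fixed bin width chosen by the caller; the abstract does not give the binning scheme, so both are assumptions, as is the function name.

```python
import math
from collections import Counter


def post_interval_entropy(post_times, bin_width):
    """Shannon entropy (bits) of the binned gaps between consecutive posts.

    Low entropy means posts arrive at nearly constant intervals.
    """
    times = sorted(post_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if not gaps:
        return 0.0
    bins = Counter(int(gap // bin_width) for gap in gaps)
    total = len(gaps)
    return sum((count / total) * math.log2(total / count)
               for count in bins.values())


if __name__ == "__main__":
    hourly = [0, 3600, 7200, 10800, 14400]      # machine-like schedule
    bursty = [0, 600, 9000, 9300, 40000]        # irregular posting
    print(post_interval_entropy(hourly, bin_width=3600))
    print(post_interval_entropy(bursty, bin_width=3600))
```

A perfectly periodic schedule puts every gap in one bin and scores zero, while the irregular schedule spreads its gaps over several bins and scores higher.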
In this paper, we present a framework to analyze and summarize the temporal dynamics within personal blogs. Blog temporal dynamics are difficult to capture using a few class descriptors. Our approach comprises (1) a representation of blog dynamics using self-similarity matrices, (2) theme extraction using non-negative self-similarity matrix factorization, and (3) a visualization representing blog theme evolution. Summaries based on large real-world blog datasets reveal interesting temporal characteristics for four blog types: personal blogs, cooperative blogs, power blogs and spam blogs.
This paper describes our framework to annotate events using personal and social network contexts. The problem is important as the correct context is critical to effective annotation. Social network context is useful as real-world activities of members of the social network are often correlated within a specific context. There are two main contributions of this paper: (a) development of an event context framework and definition of quantitative measures for contextual correlations based on concept similarity; (b) recommendation algorithms based on spreading activations that exploit personal context as well as social network context. We have very good experimental results. Our user study with real-world personal images indicates that context (both personal and social) facilitates effective image annotation.
In this paper, we develop a theoretical understanding of multi-sensory knowledge and user context and their inter-relationships. This is used to develop a generic representation framework for multi-sensory knowledge and context. A representation framework for context can have a significant impact on media applications that dynamically adapt to user needs. There are three key contributions of this work: (a) theoretical analysis, (b) representation framework and (c) experimental validation. Knowledge is understood to be a dynamic set of multi-sensory facts with three key properties: multi-sensory, emergent and dynamic. Context is the dynamic subset of knowledge that affects the communication between entities. We develop a graph-based, multi-relational representation framework for knowledge, and model its temporal dynamics using a linear dynamical system. Our approach results in a stable and convergent system. We applied our representation framework to an image retrieval system with a large collection of photographs from everyday events. Our experimental validation, with retrieval evaluated against two reference algorithms, indicates that our context-based approach provides significant gains in real-world usage scenarios.
This paper describes a novel and functional application of data sonification as an element in an immersive stroke rehabilitation system. For two years, we have been developing a task-based experiential media biofeedback system that incorporates musical feedback as a means to maintain patient interest and impart movement information to the patient. This paper delivers project background, system goals, a description of our system including an in-depth look at our audio engine, and lastly an overview of proof of concept experiments with both unimpaired subjects and actual stroke patients suffering from right-arm impairment.