Dual Degree Ph.D. Student in Computer Science

Miguel Araújo Wins Best Student Paper Runner-Up Award at an International Conference 

Miguel Araújo, dual degree Ph.D. student in Computer Science at Faculdade de Ciências of the Universidade do Porto (FCUP) and Carnegie Mellon University (CMU), received the Best Student Paper Runner-Up Award at the 2014 Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) for the paper he co-authored, titled “Com2: Fast Automatic Discovery of Temporal (’Comet’) Communities.”
“Winning an award is the best way to keep you motivated. The committee's recognition validates the work you are doing and it is a good way to see it promoted in the community,” says Miguel Araújo, who co-authored this paper with his advisor at CMU, Christos Faloutsos, and with Spiros Papadimitriou (Rutgers University), Stephan Guennemann (CMU), Prithwish Basu (BBN Technologies), Ananthram Swami (Army Research Laboratory), Evangelos E. Papalexakis (CMU), and Danai Koutra (CMU).

Miguel Araújo, who is currently in the third year of his Ph.D., has published three papers so far. “I am very happy as I feel that the quality of my work has been constantly improving, and I feel that each submission is better than the last,” says Miguel Araújo, explaining that “it is very difficult to keep the pace for a very long time as new ideas and insights need time to mature.”

Miguel Araújo is co-advised by Christos Faloutsos, faculty member at CMU, and by Pedro Ribeiro, from FCUP, in the MAP-I program (a doctoral program shared by three Portuguese universities: Aveiro, Minho and Porto).

CMU Portugal: What are the main findings of your paper titled “Com2: Fast Automatic Discovery of Temporal (’Comet’) Communities”?
Miguel Araújo (MA): We developed a very efficient tool to automatically identify communities in time-evolving networks. Our method finds groups of users that are very well connected, either persistently or at some points in time. We analyzed phone call and computer networks and, for example, our method makes it easy to distinguish personal and work-related communications.

CMU Portugal: Is the topic of your paper related to your dual degree Ph.D. work?
MA: My Ph.D. work is focused on detecting patterns and anomalies in structured graph data, and communities are a very good example of the type of patterns we can find. Understanding these patterns is essential for recommendation tasks in social networks, but there are also many applications in biology; for instance, the study of protein-protein interaction networks is important to gain insights into biochemical processes and can lead to the development of new drugs. Anomalies, on the other hand, can simply be described as events that break the pattern. There are clear applications in intrusion detection, credit card fraud detection, spam detection, etc.

CMU Portugal: Since the beginning of your studies as a dual degree Ph.D. student, you have published three papers in leading conferences and journals, one of which received an award. How would you describe these three years?
MA: I am very happy as I feel that the quality of my work has been constantly improving and I feel that each submission is better than the last. However, it is very difficult to keep the pace for a very long time as new ideas and insights need time to mature. After each paper is submitted, we often work in several directions simultaneously and only then decide which one is worth exploring. The constant juggle between breadth and depth is very interesting.

CMU Portugal: How would you describe your experience as a third-year dual degree Ph.D. student in CS?
MA: We get to experience the best of both worlds - we get twice the feedback and it's a lot easier to get the ball rolling! Then there are significant differences in research methodology, and the need to be on both sides of the ocean at the same time is challenging. Overall, the exposure to different environments is key and helps us be better researchers.

__________

Title: "Com2: Fast Automatic Discovery of Temporal (’Comet’) Communities"

Authors: Miguel Araújo (Universidade do Porto and CMU), Spiros Papadimitriou (Rutgers University), Stephan Guennemann (CMU), Christos Faloutsos (CMU), Prithwish Basu (BBN Technologies), Ananthram Swami (Army Research Laboratory), Evangelos E. Papalexakis (CMU), Danai Koutra (CMU).

Abstract: 
Given a large who-calls-whom network, changing over time, how can we find patterns and anomalies? We propose Com2, which operates on a network of 4 million mobile users, with 51 million edges (phone calls), over 14 days. Com2 is able to spot temporal communities (comet communities), in a scalable and parameter-free way.
The idea behind our method is to use a novel and fast, incremental tensor analysis approach, coupled with minimum description length to discover both transient and periodic/repeating communities without the need for user-defined parameters. We report our findings, which include large ’star’-like patterns, near-bipartite-cores, as well as tiny groups (5 users), calling each other hundreds of times within a few days.

___________

PAKDD is a leading international conference in the areas of data mining and knowledge discovery. It provides an international forum for researchers and industry practitioners to share their latest developments, new ideas, original research results and practical development experiences from all knowledge discovery in databases (KDD) related areas.

March 2015