I found something odd.
I like graphs and network analysis. I have taken a few classes, and have even written about applying social network analysis to Enterprise Architecture (Data Structure Graphs).
In my toolbox I use iGraph in R, NetworkX in Python, and Gephi for general analysis, visualization, and network study.
I stumbled onto something odd today.
In R, I loaded a graph originally generated in Gephi, ran a few calculations on it, and converted the output to a data frame.
One of the metrics did not match between the two.
Eigenvector Centrality.
Thinking this only a little odd, I wrote some simple Python NetworkX code to assist me with determining which of the two answers was correct.
Guess what?
I got a third answer.
I did some research on the mechanism used by each for calculating the Eigenvector Centrality measure.
All of them appear to be valid.
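One thing worth ruling out before comparing numbers: an eigenvector is only defined up to a constant factor, so each tool has to choose a scaling. As far as I understand the documentation, NetworkX scales the result to unit Euclidean length, while igraph (with its scale option on) scales so the largest score is 1. The sketch below uses made-up raw scores, not output from any of the three tools, to show how the same vector looks under both conventions.

```python
import math


def scale_to_unit_length(scores):
    """Rescale so the sum of squared scores is 1 (Euclidean norm 1)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {node: v / norm for node, v in scores.items()}


def scale_to_max_one(scores):
    """Rescale so the largest score is exactly 1."""
    top = max(scores.values())
    return {node: v / top for node, v in scores.items()}


# Hypothetical raw scores, invented purely for illustration.
raw = {"A": 2.0, "B": 4.0, "C": 3.0, "D": 2.0}

unit = scale_to_unit_length(raw)
max_one = scale_to_max_one(raw)

for node in sorted(raw):
    print(node, round(unit[node], 4), round(max_one[node], 4))
```

If two tools differ only by scaling, the ratios between nodes match even though the raw values do not, so comparing ratios or rankings is a fairer test than comparing the numbers directly.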
So I created a test case, using our friendly example of the Königsberg bridges. The first image link that shows the bridges in today's Google search is here:
I created a simple input file to use in all three tools:
Source  Target 
A  C 
A  B 
B  A 
B  C 
B  D 
D  B 
D  C 
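The same seven lines can also be read as more than one graph, which may matter here. The rough sketch below is plain Python, not the code of any of the three tools: it reads the edge list as an undirected simple graph, an undirected multigraph (A B and B A counted as two bridges), and a directed graph, then runs a basic power iteration on each. The function names, the iteration count, and the choice to scale the largest score to 1 are my own assumptions for illustration.

```python
# The seven edges from the input file above.
EDGES = [("A", "C"), ("A", "B"), ("B", "A"), ("B", "C"),
         ("B", "D"), ("D", "B"), ("D", "C")]


def build_weights(edges, directed, merge_parallel):
    """Return {(source, target): weight} under one reading of the edge list."""
    weights = {}
    for s, t in edges:
        # An undirected edge is stored once in each direction.
        pairs = [(s, t)] if directed else [(s, t), (t, s)]
        for pair in pairs:
            weights[pair] = weights.get(pair, 0) + 1
    if merge_parallel:
        # Collapse repeated edges between the same pair into a single edge.
        weights = {pair: 1 for pair in weights}
    return weights


def eigenvector_centrality(weights, iterations=200):
    """Basic power iteration; the largest score is scaled to 1."""
    nodes = sorted({n for pair in weights for n in pair})
    x = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Each node keeps its own score (a shift that helps convergence)
        # and adds the weighted scores of the nodes that point at it.
        nxt = dict(x)
        for (s, t), w in weights.items():
            nxt[t] += w * x[s]
        top = max(nxt.values())
        x = {n: v / top for n, v in nxt.items()}
    return x


simple = eigenvector_centrality(build_weights(EDGES, False, True))
multi = eigenvector_centrality(build_weights(EDGES, False, False))
directed = eigenvector_centrality(build_weights(EDGES, True, False))

for name, result in [("simple", simple), ("multi", multi), ("directed", directed)]:
    print(name, {n: round(v, 4) for n, v in result.items()})
```

The three readings do not have to agree on the ranking of the nodes, let alone on the values, so it seems worth confirming how each tool interprets the file (directed or not, parallel edges kept or merged) before comparing results.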
Here are the results for Eigenvector Centrality using the tools mentioned above:

I realize that all of these tools use slightly different algorithms, all written by different authors.
However, I would not expect the answers to be quite so divergent for such a reference problem as the Königsberg bridges.
I wrote this to ask a few questions:
 Has anyone else noticed this?
 Is there a set of options to pass to iGraph, Gephi, or NetworkX to make their calculations agree?
 What should be considered the proper Eigenvector centralities for the seven bridges of Königsberg?
I put all of my test code, along with the test data, on my GitHub here: Eigenvector Questions
If anyone could point out a good way to get the same answer using multiple tools like this, I would appreciate it.
Thank you.