NETWORK > CORE/PERIPHERY > CONTINUOUS

PURPOSE Fit a continuous (ratio-level) core/periphery model to a data network, and estimate the coreness of each actor.

DESCRIPTION Simultaneously fits a core/periphery model to the data network and estimates the degree of coreness, or closeness to the core, of each actor. This is done by finding a vector C such that the product of C and C transpose is as close as possible to the original data matrix.
In addition, a number of measures are calculated that assess the degree to which the network falls into a core/periphery structure for different sizes of core. Each measure starts by placing the actor with the highest coreness score in the core and all other actors in the periphery. The core is then successively enlarged by moving the actor with the highest coreness score from the periphery into the core. This continues until the periphery consists of a single actor.
nDiff is a generalization of centralization. It sums the differences between the lowest coreness score in the core and each score in the periphery, and adds to this the sum of the differences between the highest score in the periphery and each score in the core. This value is then normalized. Diff is similar but places a weighting on the size of the core; this weighting is equal to the square root of the core size, and so the measure gives greater value to smaller cores. The correlation measure correlates the given coreness scores with ideal scores of one for every core member and zero for every actor in the periphery. Finally, Ident is the same as the correlation measure but uses Euclidean distance in place of correlation.
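The way the core is grown one actor at a time, and the way the correlation and Ident measures compare the coreness scores with ideal 0/1 scores, can be sketched as follows. This is a minimal illustration, not the program's own code: the function name concentration_table and the example scores are hypothetical, and the distance is left unscaled because the exact normalization used for Ident is not stated here.

```python
import numpy as np

def concentration_table(coreness):
    """Return (core_size, correlation, distance) for core sizes 1..n-1."""
    c = np.asarray(coreness, dtype=float)
    n = len(c)
    order = np.argsort(-c, kind="stable")   # actors from highest to lowest coreness
    rows = []
    for k in range(1, n):                   # stop when one actor is left in the periphery
        ideal = np.zeros(n)
        ideal[order[:k]] = 1.0              # 1 for core members, 0 for the periphery
        corr = np.corrcoef(c, ideal)[0, 1]  # correlation measure
        dist = np.linalg.norm(c - ideal)    # unscaled Euclidean distance (assumed Ident analogue)
        rows.append((k, corr, dist))
    return rows

# Made-up coreness scores for six actors, for illustration only.
scores = [0.62, 0.58, 0.45, 0.20, 0.15, 0.10]
table = concentration_table(scores)
best_size = max(table, key=lambda row: row[1])[0]  # core size with the highest correlation
```

In this toy example the scores fall into two visibly separate groups, so the correlation column peaks at the size of the upper group.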

PARAMETERS Input dataset:
Name of file containing network to be analyzed. Data type: Valued Digraph.

Data are Pos or Neg: (Default = POSITIVE)
Use positive to indicate that larger values imply a stronger relationship. Use negative to indicate that larger values in the data imply a more distant relationship.

Use Corr or Distance: (Default = CORR)
Which measure of fit to use. Corr measures the correlation between the data matrix and the product of C and C transpose. Distance uses Euclidean distance in place of correlation; in this case C is simply the principal eigenvector. Minres is factor analysis without diagonals.
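As a rough sketch of the Distance case and of the correlation fit, the example below takes the principal eigenvector of a small symmetric matrix and then correlates the data with the product of C and C transpose. It assumes symmetric data (the input is symmetrized first) and uses the full matrix, diagonal included, for the eigenvector. The function names eigen_coreness and fit_correlation and the example matrix are hypothetical, not part of the program.

```python
import numpy as np

def eigen_coreness(data):
    """Principal eigenvector of the (symmetrized) data matrix, unit length."""
    a = np.asarray(data, dtype=float)
    sym = (a + a.T) / 2.0            # assumption: treat the network as symmetric
    _, vecs = np.linalg.eigh(sym)    # eigenvalues are returned in ascending order
    c = vecs[:, -1]                  # eigenvector of the largest eigenvalue
    return -c if c.sum() < 0 else c  # eigenvector sign is arbitrary; make it positive

def fit_correlation(data, c, use_diagonal=False):
    """Correlation between the data cells and the matching cells of C C'."""
    a = np.asarray(data, dtype=float)
    model = np.outer(c, c)
    mask = np.ones(a.shape, dtype=bool)
    if not use_diagonal:
        np.fill_diagonal(mask, False)  # ignore diagonal cells
    return np.corrcoef(a[mask], model[mask])[0, 1]

# Made-up valued network: actors 0-2 are strongly tied, actors 3-4 are not.
data = np.array([[0, 5, 4, 1, 1],
                 [5, 0, 4, 1, 0],
                 [4, 4, 0, 0, 1],
                 [1, 1, 0, 0, 0],
                 [1, 0, 1, 0, 0]], dtype=float)
c = eigen_coreness(data)
fit = fit_correlation(data, c)
```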

Prevent Negatives:
The best-fitting C may contain negative values; choosing YES prevents this.

Max # of iterations: (Default = 1000)
The maximum number of iterations used in the optimization procedure.

Diagonal values valid: (Default = NO)
If NO, diagonal values are ignored.

Output dataset: (Default = 'Coreness')
Name of file containing coreness values.

LOG FILE The correlation or Euclidean distance between the model and the data at the start and end of the optimization procedure, together with the number of iterations required. The Minres option gives only the final correlation.
The coreness of each actor, normalized so that the sum of squares is one, followed by some descriptive statistics including Gini coefficients and a heterogeneity measure. The Gini coefficient measures the amount of inequality in how the scores are distributed over the population. If every actor had the same score it would be zero; if a single actor had a score of 1 and every other actor a score of zero it would be 1. The composite score is an adjusted measure that takes account of the fact that we are looking for core/periphery structures. The heterogeneity measure is based on a simple summing of proportions and measures the extent to which the scores are evenly distributed.
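The two boundary cases given for the Gini coefficient can be checked with a short sketch. The exact formula used by the program is not stated here, so the version below is an assumption: it is the mean absolute difference form with an n/(n-1) correction, chosen because it gives exactly 0 for equal scores and exactly 1 when one actor holds everything. The function name gini is hypothetical.

```python
import numpy as np

def gini(scores):
    """Gini coefficient with the n/(n-1) correction (assumed form)."""
    x = np.asarray(scores, dtype=float)
    n = len(x)
    total_diff = np.abs(x[:, None] - x[None, :]).sum()  # sum of |x_i - x_j| over all ordered pairs
    return total_diff / (2.0 * n * (n - 1) * x.mean())

equal_case = gini([2.0, 2.0, 2.0])           # every actor has the same score
single_case = gini([1.0, 0.0, 0.0, 0.0])     # one actor has 1, all others 0
```

Without the correction the second case would give (n-1)/n rather than 1, which is why the corrected form is assumed here.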
This is followed by a table of the four concentration measures, which assess the extent to which the data fit a core/periphery structure. Each column gives a different measure; the value in row i is obtained by placing the i actors with the highest coreness in the core and the remainder in the periphery.
This is followed by a recommended core size based on the correlation measure. See the comments below.
Finally, the expected values are given: C times C transpose, rescaled so that it has the same mean and standard deviation as the data.
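The rescaling of the expected values can be sketched as a standardize-then-rescale step. This is an illustration under assumptions: the mean and standard deviation are matched over off-diagonal cells only (as when diagonal values are ignored), and the function name expected_values, the data matrix, and the vector c are made up for the example.

```python
import numpy as np

def expected_values(data, c, use_diagonal=False):
    """Rescale C C' to share the mean and standard deviation of the data."""
    a = np.asarray(data, dtype=float)
    model = np.outer(c, c)
    mask = np.ones(a.shape, dtype=bool)
    if not use_diagonal:
        np.fill_diagonal(mask, False)  # match moments on off-diagonal cells only
    z = (model - model[mask].mean()) / model[mask].std()  # standardize the model
    return z * a[mask].std() + a[mask].mean()             # give it the data's scale

# Made-up data matrix and coreness vector, for illustration only.
data = np.array([[0, 4, 3, 1],
                 [4, 0, 2, 1],
                 [3, 2, 0, 0],
                 [1, 1, 0, 0]], dtype=float)
c = np.array([0.70, 0.55, 0.40, 0.20])
expected = expected_values(data, c)
off_diagonal = ~np.eye(4, dtype=bool)
```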

TIMING O(N^3)

COMMENTS The concentration measures need careful interpretation. If nDiff has a clear maximum that is not at 1 or n-1, this indicates a solid core/periphery structure. Often nDiff has a number of maxima, indicating that there is a group of actors situated between the core and the periphery. If the user still wishes to specify a core then the other measures can be used. Diff is a biased measure that gives more weight to smaller cores; again, a clear maximum can indicate a core. If this does not yield conclusive results, or there is no requirement to favor smaller cores, it is recommended that the correlation measure be used together with nDiff or Diff. The correlation measure can indicate an area in which to focus, and the other measures can then be used to fine-tune the choice of core size. Ident should be used in the same way as correlation, but it places more weight on the absolute scores.

REFERENCES Borgatti SP and Everett MG (1999) Models of core/periphery structures. Social Networks 21, 375-395.
Comrey AL (1962) The minimum residual method for factor analysis. Psychological Reports 11, 15-18.