Introduction

Many network interactions include both positive and negative edges to account for ‘against’ or ‘for’ types of relationships that cannot be modeled using regular network algorithms. Structural balance theory is attributed to Heider (1946) and Cartwright and Harary (1956). The theory briefly describes four types balanced triads that is:

Enemy of an enemy is a friend
Friend of a friend is a friend
Enemy of a friend is an enemy
Friend of an enemy is an enemy

These relationship can be observed in Figure 1.

Alt Text — Figure 1: Balanced network example. From Samin Aref’s slides.

We can define an undirected signed network as \(G = (V, E, \sigma)\) where \(\sigma\) is the sign function that maps edges to \(\{-1, 1\}\).

The network is structually balanced if it can be partitioned into two sets \(V_+\) and \(V_-\) such that all edges between \(V_+\) and \(V_-\) are negative and all edges within \(V_+\) and \(V_-\) are positive.

The network is k-balanced if it can be partitioned into \(k\) sets \(V_1, V_2, ..., V_k\) such that all edges between \(V_i\) and \(V_j\) are negative for \(i \neq j\) and all edges within \(V_i\) are positive.

Edges are said to be frustrated if positive edges have different endpoint colors or negative edges have the same endpoint colors.

It is reasonable to develop clustering models that minimize the number of frustrated edges. However, this is an NP-hard problem. Aref and Neal (2021) have developed an integer linear programming model that minimizes the number of frustrated edges. The model is as follows:

\[\begin{align*} \min_{f_{ij}} & \sum_{(i,j) \in E} f_{ij} \\ \text{s.t.} & \sum_{c \in C} x_{ic} = 1 & \forall i \in V \\ & f_{ij} \ge x_{ic} - x_{jc} & \forall (i,j) \in E^+, \forall c \in C \\ & f_{ij} \ge x_{ic} + x_{jc} - 1 & \forall (i,j) \in E^-, \forall c \in C \\ & x_{ic} \in \{0, 1\} & \forall i \in V, \forall c \in C \\ & f_{ij} \in \{0,1\} & \forall (i,j) \in E \end{align*}\]

where:

\(E\) is the set of edges in the network
\(V\) is the set of nodes in the network
\(C\) is the set of communities in the network
\(f_{ij}\) is a binary variable indicating whether edge \((i,j)\) is frustrated
\(x_{ic}\) is a binary variable indicating whether node \(i\) belongs to community \(c\)

Here is an application of frustration minimization that particularly clusters the network into two communities based on the AND model developed by Samin Aref et al. (2020)

Some codes to read or generate signed networks are provided below followed by the application of the AND model.

Code

import networkx as nx
import numpy as np

def graph_from_adjacency_matrix_signed(A, mode='undirected'):
    unique_elements = np.unique(A)
    for element in unique_elements:
        if element not in (-1, 0, 1):
            raise ValueError("A should only have entries -1, 0, 1")

    if mode == 'undirected':
        G = nx.from_numpy_matrix(A, create_using=nx.Graph())
    else:
        G = nx.from_numpy_matrix(A, create_using=nx.DiGraph())

    for (u, v, wt) in G.edges.data('weight'):
        if wt < 0:
            G.edges[u, v]['weight'] = -1
        elif wt > 0:
            G.edges[u, v]['weight'] = 1
        else:
            G.edges[u, v]['weight'] = 0

    return G

Code

import networkx as nx
import numpy as np

def graph_from_edgelist_signed(el, signs, directed=False):
    if len(el[0]) != 2:
        raise ValueError("graph_from_edgelist_signed expects a numpy array with two columns")
    if len(signs) != len(el):
        raise ValueError("signs and el must have the same length")
    if not all(sign in [-1, 1] for sign in signs):
        raise ValueError("signs should only have entries -1 or 1")
    if directed:
        G = nx.DiGraph()
    else:
        G = nx.Graph()
    for i in range(len(el)):
        G.add_edge(el[i][0], el[i][1], sign=signs[i])
    return G

Code

import networkx as nx
import numpy as np
def sample_gnp_signed(n, p, p_neg, directed=False, loops=False):
    if p_neg is None:
        raise ValueError("p_neg is missing with no default")
    # Create a graph based on the Erdos-Renyi model
    if directed:
        G = nx.fast_gnp_random_graph(n, p, directed=True)
    else:
        G = nx.fast_gnp_random_graph(n, p)
    # Assign a sign to each edge with a given probability p_neg
    for edge in G.edges():
        G.edges[edge]['sign'] = np.random.choice([-1, 1], p=[p_neg, 1-p_neg])
    return G

Code

nsign = sample_gnp_signed(15, 0.4, 0.5)

Code

import networkx as nx
#!pip install pulp
from pulp import LpVariable, LpProblem, LpMinimize, lpSum

def frustration_exact(g):
    #if 'sign' not in g.edges.data():
        #raise ValueError("Network does not have a sign edge attribute")

    A = nx.to_numpy_array(g, weight='sign', dtype=int)
    #d = A.sum(axis=1)
    n = len(g)

    x = [[LpVariable(f'x_{i}_{j}', 0, 1, 'Binary') for j in range(n)] for i in range(n)]
    y = [LpVariable(f'y_{i}', 0, 1, 'Binary') for i in range(n)]

    prob = LpProblem("FrustrationExact", LpMinimize)

    objective = (
        lpSum(y[i] + y[j] - 2 * x[i][j] for i in range(n) for j in range(i + 1, n) if A[i, j] == 1) +
        lpSum(1 - (y[i] + y[j] - 2 * x[i][j]) for i in range(n) for j in range(i + 1, n) if A[i, j] == -1)
    )
    prob += objective

    for i in range(n):
        for j in range(i + 1, n):
            if A[i, j] == 1:
                prob += x[i][j] <= y[i]
                prob += x[i][j] <= y[j]
            elif A[i, j] == -1:
                prob += x[i][j] >= y[i] + y[j] - 1

    prob.solve()

    partition = [val.varValue for val in y]
    frustration = prob.objective.value()

    return {'frustration': frustration, 'partition': partition}

Code

#frustration_exact(nsign)

References

1: Aref, Samin, and Zachary P. Neal. “Identifying hidden coalitions in the US House of Representatives by optimally partitioning signed networks based on generalized balance.” Scientific reports 11.1 (2021): 19939.

2: Aref, Samin, Andrew J. Mason, and Mark C. Wilson. “A modeling and computational study of the frustration index in signed networks.” Networks 75.1 (2020): 95-110.