Igraph

From Letopisi.Ru, "Time to Return Home"

A package for working with data on, and representations of, network relations. Many of its tasks belong to SNA (social network analysis).


Detailed documentation:

R and Networks
http://jfaganuk.github.io/2014/12/29/r-and-networks/
http://jfaganuk.github.io/2015/01/02/analyzing-a-basic-network/

http://igraph.org/r/doc/plot.common.html - description of the main parameters that control the appearance of a graph

The back end of the igraph package is written entirely in C, which makes it very fast. Prefer igraph functions over writing your own wherever possible, since the speed difference is usually large.

Basic network characteristics of a graph computed with the package - Basic graph analytics using igraph - http://horicky.blogspot.ru/2012/04/basic-graph-analytics-using-igraph.html

Create a directed graph from an adjacency matrix

m <- matrix(runif(4*4), nrow=4) # create a random 4x4 matrix
g <- graph.adjacency(m > 0.5) # build the graph with an edge only where the matrix value exceeds the 0.5 threshold


Network visualization in R with the igraph package - http://www.r-bloggers.com/network-visualization-in-r-with-the-igraph-package/


layouts

http://igraph.org/r/doc/plot.common.html

layout.fruchterman.reingold

The Fruchterman and Reingold algorithm (FR) is often described as employing a "nuclear force" metaphor: all nodes repel each other, but connected nodes attract. A given pair of nodes is at an optimal spacing when the attractive and repulsive forces between them cancel out. At each pass of the algorithm, the "force" vectors between all nodes are added, giving the displacement and new position for each node. The displacement is limited by a "temperature" parameter that is gradually decreased until no movement is possible.


layout.kamada.kawai

The Kamada-Kawai algorithm is commonly described as a "spring embedder," meaning that it belongs to a general class of algorithms that represent a network as a virtual collection of weights (nodes) connected by springs (arcs) with a degree of elasticity and a desired resting length.


Individual layouts are described in the article http://www.cmu.edu/joss/content/articles/volume7/deMollMcFarland/
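As a rough illustration of the two layouts described above, the sketch below computes coordinates for a small ring graph. The graph and the seed are hypothetical, and the function names layout_with_fr() and layout_with_kk() assume igraph 1.0 or later (older versions call them layout.fruchterman.reingold and layout.kamada.kawai).

```r
library(igraph)

set.seed(42)                 # both layouts start from random positions
g <- make_ring(10)           # small illustrative graph (hypothetical data)

xy_fr <- layout_with_fr(g)   # Fruchterman-Reingold: one row of (x, y) per vertex
xy_kk <- layout_with_kk(g)   # Kamada-Kawai: same shape of result

dim(xy_fr)                   # number of vertices by number of dimensions
```

Either matrix can then be passed to plot(), for example plot(g, layout = xy_fr), so the same coordinates are reused across several drawings.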

Procedure for representing joint activity as a graph

In this case the source data are the actions of participants in the Galaktika blog in 2010: how it all began and what it grew into. See "History of the Educational Galaxy" (История образовательной Галактики)

Fast, efficient two-mode to one-mode conversion in R

R/Converting a bipartite (two-mode) graph into a one-mode graph


E(mydata.igraph)$label <- mydata[,3] # For example, to show the tie type (rating, edit, or comment) as the edge label
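A minimal sketch of a two-mode to one-mode conversion, assuming the data come as a participant/post edge list. The data frame acts and its column names are hypothetical; bipartite_projection() is the igraph function used here, which is not necessarily the method of the linked article.

```r
library(igraph)

# Hypothetical two-mode data: which participant acted on which post
acts <- data.frame(
  person = c("ann", "bob", "ann", "cat", "bob"),
  post   = c("p1",  "p1",  "p2",  "p2",  "p3")
)

bg <- graph_from_data_frame(acts, directed = FALSE)

# bipartite_projection() needs a logical 'type' vertex attribute:
# FALSE marks participants, TRUE marks posts
V(bg)$type <- V(bg)$name %in% acts$post

proj   <- bipartite_projection(bg)
people <- proj$proj1   # one-mode graph of the type == FALSE vertices
```

In people, two participants are tied when they acted on at least one common post; proj$proj2 holds the corresponding graph of posts.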

Merging vertices

The function contract.vertices() merges several vertices into one. Computing the community structure first gives a way to control how this merging happens. After the contraction, two vertices can be connected by multiple edges.

Contracting and simplifying a network graph
http://blog.revolutionanalytics.com/2015/08/contracting-and-simplifying-a-network-graph.html
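A small sketch of vertex contraction, using a hypothetical ring graph and a hand-written grouping vector in place of a computed community membership. It assumes igraph 1.0 or later, where contract.vertices() is available under the name contract().

```r
library(igraph)

g <- make_ring(6)                  # vertices 1..6 joined in a cycle

# Hypothetical grouping: vertices 1-3 become new vertex 1, 4-6 become new vertex 2
grp <- c(1, 1, 1, 2, 2, 2)

cg <- contract(g, mapping = grp)

vcount(cg)                         # number of merged vertices
ecount(cg)                         # original edges are kept as loops or parallel edges
```

A membership vector returned by a community detection function can be supplied as mapping in the same way.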

Merging edges

The equivalent step for edges is simplify(). A simplified graph contains only a single edge between two nodes. The simplification step can compute summary statistics for the combined edges, for example the sum of edge weights.

edge.attr.comb
http://www.inside-r.org/packages/cran/igraph/docs/attribute.combination

Variants:

  • lt2s.network <- simplify(lt2.network, remove.multiple = TRUE, remove.loops = TRUE, edge.attr.comb = c(weight = "sum", type = "ignore"))
  • g4 <- simplify(g3, edge.attr.comb = list(weight = "sum"))
  • g4 <- simplify(g3, remove.multiple = TRUE, remove.loops = TRUE, edge.attr.comb = c(weight = "sum", type = "ignore"))

Edge filtering: g.edge3 <- subgraph.edges(g4, which(E(g4)$weight > 1))
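The simplification and filtering steps can be tried on a toy edge list. The data frame el below is hypothetical; the names g3 and g4 follow the snippets above only for readability.

```r
library(igraph)

# Hypothetical edge list with a repeated pair (a-b) and a self-loop (c-c)
el <- data.frame(
  from   = c("a", "a", "b", "c"),
  to     = c("b", "b", "c", "c"),
  weight = c(1, 2, 1, 5)
)
g3 <- graph_from_data_frame(el, directed = FALSE)

# Collapse parallel edges, summing their weights, and drop loops
g4 <- simplify(g3, remove.multiple = TRUE, remove.loops = TRUE,
               edge.attr.comb = list(weight = "sum"))
E(g4)$weight

# Keep only edges heavier than 1; vertices left without edges are dropped
strong <- subgraph.edges(g4, which(E(g4)$weight > 1))
```

Note that the filter indexes the edges of the same graph that is being subset; mixing two graphs here would select the wrong edge ids.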

Vertex attributes

  • vertex.color - vertex color
    • vertex.color="gold", vertex.color="dark red"
    • vertex.color="lightsteelblue2"
  • vertex.frame.color - color of the vertex outline
  • vertex.shape - shape of the vertex symbol, one of "none", "circle", "square", "csquare", "rectangle", "crectangle", "vrectangle", "pie", "raster", "sphere"
  • vertex.size - vertex size (default 15)
  • vertex.size2 - second size parameter of the vertex (e.g. for a rectangle)
  • vertex.label - character vector used to label the vertices
    • vertex.label.color="black"
  • vertex.label.family - font family of the vertex labels (e.g. "Times", "Helvetica")
  • vertex.label.font - font: 1 - plain, 2 - bold, 3 - italic, 4 - bold italic, 5 - symbol
  • vertex.label.cex - font size (a multiplier, device-dependent)
    • vertex.label.cex=0.8
  • vertex.label.dist - distance between the label and the vertex

Edge attributes

  • edge.arrow.size=.4 - arrow size
  • edge.color - edge color
  • edge.width - edge width
    • E(g4)$width <- ifelse (E(g4)$weight < 100, 0.1, 1)
  • edge.lty - line type: 1 - solid, 2 - dashed, 3 - dotted
    • E(g4)$lty <- ifelse (E(g4)$weight < 100, 3, 1)
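The attributes listed above can be stored on the graph itself, without the vertex. or edge. prefix, as the E(g4)$width and E(g4)$lty lines do. The following sketch uses a hypothetical ring graph with made-up weights.

```r
library(igraph)

g4 <- make_ring(5)
E(g4)$weight <- c(50, 150, 20, 300, 80)    # hypothetical weights

# Attributes stored on the graph are picked up by plot() automatically
V(g4)$color <- "lightsteelblue2"
V(g4)$size  <- 15
E(g4)$width <- ifelse(E(g4)$weight < 100, 0.1, 1)   # thin vs normal line
E(g4)$lty   <- ifelse(E(g4)$weight < 100, 3, 1)     # dotted vs solid line
```

The same settings can instead be passed for a single drawing, for example plot(g4, vertex.label.color = "black", vertex.label.cex = 0.8); arguments given to plot() take precedence over stored attributes.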

Basic Graph Algorithms

  • mst <- minimum.spanning.tree(g4)

The minimum spanning tree algorithm finds a tree that connects all the nodes of a connected graph while keeping the sum of the edge weights as small as possible.
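A minimal sketch on a hypothetical weighted triangle. It assumes igraph 1.0 or later, where minimum.spanning.tree() is also available as mst(); when the graph has a weight edge attribute, it is used by default.

```r
library(igraph)

# Hypothetical weighted triangle: the a-c edge is the most expensive one
el <- data.frame(
  from   = c("a", "b", "a"),
  to     = c("b", "c", "c"),
  weight = c(1, 2, 3)
)
g <- graph_from_data_frame(el, directed = FALSE)

tree <- mst(g)        # uses E(g)$weight when no weights argument is given

ecount(tree)          # a spanning tree on n vertices has n - 1 edges
is_connected(tree)    # every vertex is still reachable
```

Here the tree keeps the two cheapest edges, since together they already connect all three vertices.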

clusters
clusters(g4, mode="weak")


A connected components algorithm finds the islands of nodes that are interconnected with each other; in other words, within one island it is possible to get from any node to any other along a path.


$membership

$csize
[1] 3 2 3 932 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

2 2 2 2 2 2 2 2 2 2 2 2 2

That is, there is one huge cluster of 932 nodes plus small groups of 2-3 nodes that are not connected to anything else. In practice, a group of 2-3 means that a participant wrote a post or two and received no response.
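The structure of this result can be seen on a toy graph. The sketch below uses hypothetical data and components(), the name under which clusters() is available in igraph 1.0 and later.

```r
library(igraph)

# Hypothetical toy graph: one chain of three vertices and one separate pair
g <- graph_from_literal(a - b - c, d - e)

comp <- components(g, mode = "weak")
comp$no           # number of components
comp$csize        # size of each component
comp$membership   # component id of every vertex

# Extract the largest component as a graph of its own
giant <- induced_subgraph(g, which(comp$membership == which.max(comp$csize)))
```

Applied to the blog data, the same extraction step would isolate the large 932-node cluster from the small disconnected groups.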

Network metrics

Centrality

Centrality functions (vertex level) and centralization functions (graph level). The centralization functions return res - vertex centrality, centralization, and theoretical_max - maximum centralization score for a graph of that size. The centrality function can run on a subset of nodes (set with the vids parameter). This is helpful for large graphs where calculating all centralities may be a resource-intensive and time-consuming task.

Degree (number of ties)

  • degree(net, mode="in")
  • centr_degree(net, mode="in", normalized=T)

Closeness

Centrality based on distance to others in the graph: the inverse of the node’s average geodesic distance to others in the network.

  • closeness(net, mode="all", weights=NA)
  • centr_clo(net, mode="all", normalized=T)

Eigenvector

Centrality proportional to the sum of the centralities of a node's connections: the values of the first eigenvector of the graph matrix.

  • eigen_centrality(net, directed=T, weights=NA)
  • centr_eigen(net, directed=T, normalized=T)

Betweenness

Centrality based on a broker position connecting others: the number of geodesics that pass through the node or the edge.

  • betweenness(net, directed=T, weights=NA)
  • edge_betweenness(net, directed=T, weights=NA)
  • centr_betw(net, directed=T, normalized=T)
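To see how the four measures behave, the sketch below applies them to a hypothetical undirected star, where one hub is tied to four leaves. The graph is made up for illustration, and loops = FALSE in centr_degree() is an assumption chosen here so that self-ties are excluded from the theoretical maximum.

```r
library(igraph)

# Hypothetical toy network: an undirected star, vertex 1 is the hub
net <- make_star(5, mode = "undirected")

degree(net)                           # the hub has 4 ties, each leaf has 1
closeness(net)                        # 1 / (sum of distances to all others)
betweenness(net, directed = FALSE)    # only the hub lies between other vertices
eigen_centrality(net)$vector          # scaled so that the largest value is 1

# Graph-level score; see the note on loops = FALSE above
centr_degree(net, loops = FALSE, normalized = TRUE)$centralization
```

In a star every measure singles out the hub, which makes it a convenient check that the functions are being called as intended before running them on real data.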


Different ways of detecting communities

Basic network analysis with R http://jfaganuk.github.io/2015/01/24/basic-network-analysis/


Finding clusters of CRAN packages using igraph
http://blog.revolutionanalytics.com/2014/12/finding-clusters-of-cran-packages-using-igraph.html


Network basics with R and igraph
https://assemblingnetwork.wordpress.com/2013/06/10/network-basics-with-r-and-igraph-part-ii-of-iii/

gcon <- simplify(gcon, edge.attr.comb = list(weight = "sum", function(x)length(x)))

What are the differences between community detection algorithms in igraph?
http://stackoverflow.com/questions/9471906/what-are-the-differences-between-community-detection-algorithms-in-igraph/9478989#9478989
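As one example of the algorithms compared in the links above, the sketch below runs cluster_fast_greedy() on a hypothetical graph made of two cliques joined by a single edge. The choice of algorithm is only illustrative; other cluster_* functions in igraph 1.0 and later return the same kind of object.

```r
library(igraph)

# Hypothetical toy graph: two 4-vertex cliques joined by a single bridge
g <- disjoint_union(make_full_graph(4), make_full_graph(4))
g <- add_edges(g, c(1, 5))

comm <- cluster_fast_greedy(g)   # greedy modularity optimisation

membership(comm)                 # community id of every vertex
sizes(comm)                      # number of vertices per community
modularity(comm)                 # quality score of the partition
```

The membership vector can be used to color vertices or as the mapping when contracting each community into a single vertex.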

Frequency of term use in a document

An Example of Social Network Analysis with R using Package igraph - https://rdatamining.wordpress.com/2012/05/17/an-example-of-social-network-analysis-with-r-using-package-igraph/

The example looks at terms and the tweets in which they were used.

g <- graph.adjacency(termMatrix, weighted=T, mode = "undirected")
# remove loops
g <- simplify(g)
# set labels and degrees of vertices
V(g)$label <- V(g)$name
V(g)$degree <- degree(g)
# set seed to make the layout reproducible
set.seed(3952)
# compute the layout passed to plot() (Fruchterman-Reingold, as in the linked post)
layout1 <- layout.fruchterman.reingold(g)

Improving the appearance

V(g)$label.cex <- 2.2 * V(g)$degree / max(V(g)$degree)+ .2
V(g)$label.color <- rgb(0, 0, .2, .8)
V(g)$frame.color <- NA
egam <- (log(E(g)$weight)+.4) / max(log(E(g)$weight)+.4)
E(g)$color <- rgb(.5, .5, 0, egam)
E(g)$width <- egam
# plot the graph in layout1
plot(g, layout=layout1)


Video tutorial

