Igraph
A package for working with network data and representations of network relations. Many of its tasks come from SNA (social network analysis).
Detailed documentation:
- Network Analysis and Visualization with R and igraph
- http://kateto.net/networks-r-igraph
- http://kateto.net/network-visualization - network visualization with R
- R and Networks
- http://jfaganuk.github.io/2014/12/29/r-and-networks/
- http://jfaganuk.github.io/2015/01/02/analyzing-a-basic-network/
http://igraph.org/r/doc/plot.common.html - description of the main parameters that control a graph's appearance
The igraph core is written in C, which makes it fast. Prefer igraph's built-in functions to hand-written equivalents wherever possible, because the speed difference can be large.
Basic network characteristics of a graph computed with the package: Basic graph analytics using igraph - http://horicky.blogspot.ru/2012/04/basic-graph-analytics-using-igraph.html
Create a directed graph from an adjacency matrix
m <- matrix(runif(4*4), nrow=4) # build a random 4x4 matrix
g <- graph.adjacency(m > 0.5) # add an edge only where the matrix entry exceeds the threshold
Network visualization in R with the igraph package - http://www.r-bloggers.com/network-visualization-in-r-with-the-igraph-package/
layouts
http://igraph.org/r/doc/plot.common.html
layout.fruchterman.reingold
The Fruchterman and Reingold algorithm (FR) is often described as employing a "nuclear force" metaphor: all nodes repel each other, but connected nodes attract. A given pair of nodes is at an optimal spacing when the attractive and repulsive forces between them cancel out. At each pass of the algorithm, the "force" vectors between all nodes are added, giving the displacement and new position for each node. The displacement is limited by a "temperature" parameter that is gradually decreased until no movement is possible.
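A minimal sketch of computing an FR layout, assuming a recent igraph release where the function is exposed as layout_with_fr() (the newer name for layout.fruchterman.reingold). The toy graph and the parameter values are illustrative assumptions, not taken from the original data.

```r
library(igraph)

set.seed(42)                       # the FR start positions are random
g <- sample_gnp(30, 0.1)           # hypothetical toy graph

# niter is the number of passes; start.temp caps the displacement on the
# first pass and is lowered by the algorithm on later passes
coords <- layout_with_fr(g, niter = 500, start.temp = sqrt(vcount(g)))

plot(g, layout = coords, vertex.size = 8, vertex.label = NA)
```

The result is a plain matrix with one row of x and y coordinates per vertex, so the same layout can be reused across several plots.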
layout.kamada.kawai
The Kamada-Kawai algorithm is commonly described as a "spring-embedder," meaning that it fits with a general class of algorithms that represents a network as a virtual collection of weights (nodes) connected by springs (arcs) with a degree of elasticity and a desired resting length.
Individual layouts are described in the article http://www.cmu.edu/joss/content/articles/volume7/deMollMcFarland/
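For comparison, a short sketch of the spring-embedder layout, assuming the newer function name layout_with_kk() (equivalent to layout.kamada.kawai). The small tree used here is a hypothetical example.

```r
library(igraph)

# Hypothetical toy graph: a binary tree with 15 vertices
tree_graph <- make_tree(15, children = 2, mode = "undirected")

# Kamada-Kawai places vertices so that distances on the plot
# approximate path lengths in the graph
kk_coords <- layout_with_kk(tree_graph)

plot(tree_graph, layout = kk_coords, vertex.size = 10)
```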
Representing collaborative activity as a graph
In this case the source data are the actions of participants in the Galaktika (Галактика) blog in 2010: how it all began and what it grew into. See История образовательной Галактики (History of the Educational Galaxy).
Fast, efficient two-mode to one-mode conversion in R
R/Конверсия биграфа в монограф (converting a two-mode graph to a one-mode graph)
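One way to sketch a two-mode to one-mode conversion is igraph's bipartite_projection(). This is a swapped-in technique, not necessarily the method of the article named above. The participants and posts below are made-up data.

```r
library(igraph)

# Hypothetical two-mode data: which participant acted on which post
edges <- data.frame(
  person = c("A", "B", "A", "B", "C"),
  post   = c("p1", "p1", "p2", "p2", "p2")
)
bg <- graph_from_data_frame(edges, directed = FALSE)

# type = FALSE marks participants, type = TRUE marks posts
V(bg)$type <- V(bg)$name %in% edges$post

# multiplicity = TRUE stores the number of shared posts as edge weight
proj <- bipartite_projection(bg, multiplicity = TRUE)
people <- proj$proj1               # one-mode graph of participants

E(people)$weight
```

Here A and B share two posts, so their tie gets weight 2, while each tie to C gets weight 1.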
- http://kateto.net/network-visualization
- Visualizing static and dynamic networks in R - http://habrahabr.ru/company/infopulse/blog/263953/
E(mydata.igraph)$label <- mydata[,3] # for example, label each edge with its tie type: rating, edit, or comment
Merging vertices
The function contract.vertices() merges several vertices into one. By computing the community structure first, one can control how this merging happens. After the contraction, two vertices can be connected by multiple edges.
- Contracting and simplifying a network graph
- http://blog.revolutionanalytics.com/2015/08/contracting-and-simplifying-a-network-graph.html
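A minimal sketch of contraction driven by community structure, assuming the newer function name contract() (the current name for contract.vertices()) and using the built-in Zachary karate club graph as stand-in data.

```r
library(igraph)

karate <- make_graph("Zachary")            # stand-in example network
comm <- cluster_fast_greedy(karate)        # community structure

# Map every vertex to its community id and merge vertices with equal ids
mapping <- as.integer(membership(comm))
contracted <- contract(karate, mapping, vertex.attr.comb = "ignore")

vcount(contracted)        # one vertex per community
ecount(contracted)        # all original edges are kept
any_multiple(contracted)  # parallel edges between merged vertices
```

Contraction keeps every original edge, so ties inside a community become loops and ties between communities become parallel edges.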
Merging edges
The equivalent step for edges is simplify(). A simplified graph contains only a single edge between two nodes. The simplification step can compute summary statistics for the combined edges, for example the sum of edge weights.
Variants:
- lt2s.network <- simplify(lt2.network, remove.multiple = TRUE, remove.loops = TRUE, edge.attr.comb = c(weight = "sum", type = "ignore"))
- g4 <- simplify(g3, edge.attr.comb = list(weight = "sum"))
- g4 <- simplify(g3, remove.multiple = TRUE, remove.loops = TRUE, edge.attr.comb = c(weight = "sum", type = "ignore"))
Filtering edges: g.edge3 <- subgraph.edges(g4, which(E(g4)$weight > 1))
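The simplify and filter steps can be sketched end to end on a toy multigraph. The edge list below is invented for illustration, and the function names follow the ones used on this page, so newer igraph releases may print deprecation warnings for subgraph.edges().

```r
library(igraph)

# Hypothetical multigraph: three parallel a-b edges, one b-c edge, one loop
raw_edges <- data.frame(
  from   = c("a", "a", "a", "b", "c"),
  to     = c("b", "b", "b", "c", "c"),
  weight = c(1, 2, 3, 1, 5)
)
g3 <- graph_from_data_frame(raw_edges, directed = FALSE)

# Collapse parallel edges, drop loops, and sum the weights
g4 <- simplify(g3, remove.multiple = TRUE, remove.loops = TRUE,
               edge.attr.comb = list(weight = "sum"))

# Keep only edges whose combined weight exceeds 1
g.edge3 <- subgraph.edges(g4, which(E(g4)$weight > 1))
```

Note that the filter condition must be evaluated on the same graph that is being filtered (g4 in both places), otherwise the edge indices do not match.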
Vertex attributes
- vertex.color vertex fill color
- vertex.color="gold", vertex.color="dark red"
- vertex.color="lightsteelblue2"
- vertex.frame.color color of the vertex outline
- vertex.shape shape used to draw the vertex, one of "none", "circle", "square", "csquare", "rectangle", "crectangle", "vrectangle", "pie", "raster", "sphere"
- vertex.size vertex size (15 by default)
- vertex.size2 second size parameter of the vertex (for example, for a rectangle)
- vertex.label character vector of vertex labels
- vertex.label.color="black"
- vertex.label.family font family for vertex labels (for example, "Times", "Helvetica")
- vertex.label.font font face: 1 - plain, 2 - bold, 3 - italic, 4 - bold italic, 5 - symbol
- vertex.label.cex font size (a multiplier, device-dependent)
- vertex.label.cex=0.8
- vertex.label.dist distance between the label and the vertex
Edge attributes
- edge.arrow.size=.4 - arrow size
- edge.color - edge color
- edge.width - edge width
- E(g4)$width <- ifelse (E(g4)$weight < 100, 0.1, 1)
- edge.lty - line type: 1 - solid, 2 - dashed, 3 - dotted
- E(g4)$lty <- ifelse (E(g4)$weight < 100, 3, 1)
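A short sketch that combines several of the vertex and edge parameters listed above in one plot. The ring graph and its weights are hypothetical, and the threshold of 100 simply mirrors the snippets in the list.

```r
library(igraph)

# Hypothetical weighted graph
g <- make_ring(5)
E(g)$weight <- c(50, 150, 20, 300, 80)

# Light edges are drawn thin and dotted, heavy edges thick and solid
E(g)$width <- ifelse(E(g)$weight < 100, 0.5, 2)
E(g)$lty   <- ifelse(E(g)$weight < 100, 3, 1)

plot(g,
     vertex.color = "gold",
     vertex.frame.color = "gray40",
     vertex.shape = "circle",
     vertex.size = 15,
     vertex.label.color = "black",
     vertex.label.cex = 0.8,
     edge.color = "gray50")
```

Attributes stored on the graph with V(g)$... or E(g)$... are picked up by plot() automatically, while arguments such as vertex.color apply to that one call only.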
Basic Graph Algorithms
- http://horicky.blogspot.ru/2012/04/basic-graph-analytics-using-igraph.html - detailed description and interpretation
- mst <- minimum.spanning.tree(g4)
The minimum spanning tree algorithm finds a tree that connects all the nodes of a connected graph while keeping the sum of the edge weights as small as possible.
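A minimal worked example, assuming the newer function name mst() (equivalent to minimum.spanning.tree()). The four-vertex ring and its weights are made up so that the result is easy to check by hand.

```r
library(igraph)

# Hypothetical weighted ring with 4 vertices and 4 edges
ring <- make_ring(4)
E(ring)$weight <- c(1, 2, 3, 4)

# mst() uses the "weight" edge attribute when it is present
tree <- mst(ring)

ecount(tree)         # a tree on 4 vertices has 3 edges
sum(E(tree)$weight)  # the heaviest edge of the cycle is dropped
```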
- clusters
- clusters(g4, mode="weak")
Connected-component algorithms find the islands of nodes that are interconnected with each other; in other words, within a component one can traverse from any node to any other via a path.
$membership
- $csize
- [1] 3 2 3 932 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2
In other words, there is one giant component of 932 vertices plus a number of small groups of 2-3 vertices that are not connected to it. In practice, such a group of 2-3 means that a participant wrote one or two posts and received no response.
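The same inspection can be sketched on a small graph, assuming the newer function name components() (the current name for clusters()). The graph below is a hypothetical stand-in for the blog network: one larger component, one pair, and one isolated vertex.

```r
library(igraph)

# Hypothetical graph: a chain of 4, a pair, and an isolated vertex
g <- make_graph(c(1, 2, 2, 3, 3, 4, 5, 6), n = 7, directed = FALSE)

comp <- components(g, mode = "weak")
comp$no      # number of components
comp$csize   # size of each component

# Extract the largest ("giant") component as its own graph
giant_id <- which.max(comp$csize)
giant <- induced_subgraph(g, which(comp$membership == giant_id))
```

Working on the giant component alone is a common next step, since distance-based metrics are not meaningful across disconnected parts.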
Network metrics
Centrality
Centrality functions work at the vertex level and centralization functions at the graph level. The centralization functions return res (the vertex centralities), centralization (the graph-level score), and theoretical_max (the maximum centralization score for a graph of that size). The centrality functions can run on a subset of nodes (set with the vids parameter). This is helpful for large graphs, where calculating all centralities may be a resource-intensive and time-consuming task.
Degree (number of ties)
- degree(net, mode="in")
- centr_degree(net, mode="in", normalized=T)
Closeness
(centrality based on distance to others in the graph) Inverse of the node’s average geodesic distance to others in the network.
- closeness(net, mode="all", weights=NA)
- centr_clo(net, mode="all", normalized=T)
Eigenvector
(centrality proportional to the sum of connection centralities) Values of the first eigenvector of the graph matrix.
- eigen_centrality(net, directed=T, weights=NA)
- centr_eigen(net, directed=T, normalized=T)
Betweenness
(centrality based on a broker position connecting others) Number of geodesics that pass through the node or the edge.
- betweenness(net, directed=T, weights=NA)
- edge_betweenness(net, directed=T, weights=NA)
- centr_betw(net, directed=T, normalized=T)
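A small sketch of the four measures on a star graph, where the expected values are easy to verify by hand. The graph is a hypothetical example with vertex 1 as the hub, and net is just a local name matching the calls above.

```r
library(igraph)

# Hypothetical star: vertex 1 is the hub, vertices 2-5 are leaves
net <- make_star(5, mode = "undirected")

deg <- degree(net, mode = "all")            # hub has 4 ties, leaves 1
btw <- betweenness(net, directed = FALSE)   # all 6 leaf pairs pass the hub
clo <- closeness(net, vids = 1, mode = "all")  # computed for the hub only
eig <- eigen_centrality(net)$vector         # largest value at the hub

# Graph-level counterpart: a list with res, centralization, theoretical_max
cd <- centr_degree(net, mode = "all", normalized = TRUE)
names(cd)
```

The closeness call illustrates the vids parameter: only the hub is evaluated, which is the pattern to use when a full computation on a large graph is too slow.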
Different ways of detecting communities
Simple network analysis with R http://jfaganuk.github.io/2015/01/24/basic-network-analysis/
- Finding clusters of CRAN packages using igraph
- http://blog.revolutionanalytics.com/2014/12/finding-clusters-of-cran-packages-using-igraph.html
- Network basics with R and igraph
- https://assemblingnetwork.wordpress.com/2013/06/10/network-basics-with-r-and-igraph-part-ii-of-iii/
gcon <- simplify(gcon, edge.attr.comb = list(weight = "sum", function(x)length(x)))
- What are the differences between community detection algorithms in igraph?
- http://stackoverflow.com/questions/9471906/what-are-the-differences-between-community-detection-algorithms-in-igraph/9478989#9478989
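A hedged sketch of how several community detection algorithms can be compared on the same graph. The Zachary karate club graph is stand-in data, and the three algorithms are an arbitrary selection of deterministic ones rather than a recommendation.

```r
library(igraph)

karate <- make_graph("Zachary")   # stand-in example network

# Three deterministic algorithms run on the same graph
results <- list(
  fast_greedy      = cluster_fast_greedy(karate),
  walktrap         = cluster_walktrap(karate),
  edge_betweenness = cluster_edge_betweenness(karate)
)

# Compare the partitions by number of communities and by modularity
n_comm <- sapply(results, length)
mod    <- sapply(results, modularity)
data.frame(communities = n_comm, modularity = round(mod, 3))
```

Higher modularity means denser ties inside communities than expected by chance, but it is only one criterion; the partitions can also be compared directly with membership().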
Term frequency in documents
An Example of Social Network Analysis with R using Package igraph - https://rdatamining.wordpress.com/2012/05/17/an-example-of-social-network-analysis-with-r-using-package-igraph/
The example considers terms and the tweets in which they were used.
g <- graph.adjacency(termMatrix, weighted=T, mode="undirected")
# remove loops
g <- simplify(g)
# set labels and degrees of vertices
V(g)$label <- V(g)$name
V(g)$degree <- degree(g)
# set seed to make the layout reproducible
set.seed(3952)
layout1 <- layout.fruchterman.reingold(g)
Improving the appearance
V(g)$label.cex <- 2.2 * V(g)$degree / max(V(g)$degree) + .2
V(g)$label.color <- rgb(0, 0, .2, .8)
V(g)$frame.color <- NA
egam <- (log(E(g)$weight) + .4) / max(log(E(g)$weight) + .4)
E(g)$color <- rgb(.5, .5, 0, egam)
E(g)$width <- egam
# plot the graph in layout1
plot(g, layout=layout1)
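The snippets in this section start from a term-by-term matrix called termMatrix. A minimal sketch of how such a matrix could be built from a term-document matrix is given here; the three terms, three documents, and the name termDocMatrix are made-up assumptions, and graph_from_adjacency_matrix() is the newer name for graph.adjacency().

```r
library(igraph)

# Hypothetical term-document matrix: rows are terms, columns are tweets
termDocMatrix <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    1, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("r", "mining", "data"), c("d1", "d2", "d3"))
)

# Entry (i, j) counts the documents in which terms i and j co-occur
termMatrix <- termDocMatrix %*% t(termDocMatrix)

g <- graph_from_adjacency_matrix(termMatrix, weighted = TRUE,
                                 mode = "undirected")
g <- simplify(g)   # drop the loops created by the matrix diagonal

E(g)$weight        # co-occurrence counts between distinct terms
```

The diagonal of termMatrix holds each term's own document count, which is why the loops have to be removed before plotting.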