PHYSICAL REVIEW E 99, 022306 (2019)

Entropy-based randomization of rating networks

Carolina Becatti,1,* Guido Caldarelli,1,2,3 and Fabio Saracco1

1 IMT School for Advanced Studies, Piazza S. Francesco 19, 55100 Lucca, Italy
2 Istituto dei Sistemi Complessi (ISC)-CNR UoS Università "Sapienza", Piazzale Aldo Moro 5, 00185 Roma, Italy
3 ECLT San Marco 2940, 30124 Venezia, Italy

(Received 16 April 2018; revised manuscript received 30 November 2018; published 7 February 2019)

In recent years, due to the great diffusion of e-commerce, online rating platforms quickly became a common tool for purchase recommendations. However, instruments for their analysis did not evolve at the same speed. Indeed, interesting information about users' habits and tastes can be recovered just by considering the bipartite network of users and products, in which links represent products' purchases and have different weights due to the score assigned to the item in users' reviews. With respect to other weighted bipartite networks, in these systems we observe a maximum possible weight per link, which limits the variability of the outcomes. In the present article we propose an entropy-based randomization method for this type of network (i.e., bipartite rating networks) by extending the configuration model framework: the randomized network satisfies the constraints of the degree per rating, i.e., the number of given ratings received by the specified product or assigned by the single user. We first show that such a null model is able to reproduce several nontrivial features of the real network better than other null models. Then, using our model as a benchmark, we project the information contained in the real system on one of the layers: to provide an interpretation of the projection obtained, we run the Louvain community detection on the obtained network and discuss the observed division in clusters. We are able to detect groups of music albums due to the consumers' taste or communities of movies due to their audience. Finally, we show that our method is also able to handle the special case of categorical bipartite networks: we consider the bipartite categorical network of scientific journals recognized for the scientific qualification in economics and statistics. In the end, from the outcome of our method, the probability that each user appreciates every product can be easily recovered. Therefore, this information may be employed in future applications to implement a more detailed recommendation system that also takes into account information regarding the topology of the observed network.

DOI:10.1103/PhysRevE.99.022306

I. INTRODUCTION

Network theory [1,2] has proved successful [3] in the description and modeling of a wide variety of systems, ranging from the obvious cases of the internet [4,5], the world wide web [6], and social networks [7], where it formed the evidence on which computational social science is based [8], to cell properties in biology [9] and fMRI imaging in brain analysis [10,11], which contributed to the new field of network medicine [12,13], and to banks in financial systems [14,15]. Networks come in various shapes, from the simplest case of similar vertices connected by binary edges, to weighted and/or directed networks, to multigraphs where more than one edge can connect two vertices, to bipartite graphs where two distinct sets of vertices are present. Simple examples of the latter case are bipartite graphs in which a connection is drawn [...] other set. A lot of work has been done so far to analyze this kind of data, a great part of it focused on different methods to identify the structure of the network (see, for example, Ref. [16], where the authors discuss the drawbacks of finding communities in bipartite networks and then propose a new solution based on bipartite stochastic block models to address this topic).

This work deals with the specific case of bipartite rating networks, where the two sets of nodes are individuals and purchased items, while the edges represent reviews of products given by consumers and are weighted by the numerical score received, as, for example, in the well-known Amazon review system. These kinds of graphs have been mostly studied from a machine learning and computer science perspective, to train models able to recommend items to people based on their taste and preferences. Different methodologies are employed for this purpose; see Ref. [17] for a thorough review of the literature on the topic and the recorded progress.

*carolina.becatti@imtlucca.it
In this paper we focus on a different approach, providing an analytical tool that could prove useful in the development of a recommendation system based on network topology. We follow the stream of literature introduced by Refs. [18-20], which defines an appropriate method to construct benchmark models for the observed networks. Therefore, our attention can be focused on the assessment of the significance of several topological quantities measured on the graph. More specifically, we compare the real system with its randomization, represented by an ensemble of graphs with the same number of nodes and all possible link configurations. To have an effective filter, we constrain the average over the ensemble of some topological quantities (in this case the degree sequence per rating) and check whether other nontrivial measures of the actual network are reproduced. If not, then there is a signal of a behavior that is not captured by the constraints alone.

The method works as follows: first, one defines an appropriate ensemble of graphs with a constant number of nodes; second, one defines a probability distribution over the ensemble through a constrained entropy maximization procedure; then, the maximization of the related likelihood function provides the probability that any possible pair of nodes in the network of interest is connected. The constraints introduced in the first maximization procedure are topological quantities of the real network; e.g., for binary undirected networks the degree of each node is used as a constraint. Once the theoretical framework is complete, we can state whether the real values of some topological quantities substantially deviate from the theoretical distribution, by comparing the actual observations with the expectations of the null model.

Rating networks may be interpreted as classical weighted networks, whose edges are weighted by a finite set of discrete scores. In this context, appropriate constraints are represented by the specification of nodes' strengths only (the weighted configuration model of Ref. [20]). Because of the extremely poor predictive power of vertices' strengths, an enriched version of the previous model has also been introduced (the enhanced configuration model of Ref. [21]); this method adds the topology as additional information. However, the presence (in our framework) of a finite number of discrete weights complicates the problem formulation and increases the required computational effort. For these reasons, a preliminary "binarization" procedure is often employed (it is the approach of Ref. [22], but it is also common in recommendation systems, as in Ref. [23]), by thresholding the edges' weights. In this way, the resulting network is binary and can be easily randomized with the Bipartite Configuration Model of Ref. [24].

In this paper we propose an alternative approach, constraining not only the presence of positive reviews but the exact ratings. Due to its application, we refer to it in the following as the bipartite score configuration model (BiSCM). The peculiarity of our approach is that we avoid the score-related problems by specifying a multidegree for each node in the network, i.e., by specifying the entire distribution of scores received by a node. We will show that adding more constraints allows us to define a more restrictive null model, and thus to reproduce the features of the original network with higher accuracy. Let us highlight that our approach is general enough to randomize bipartite signed networks (as a subset of rating networks) and bipartite categorical networks, the latter being a subject for which, to the best of our knowledge, theoretical tools for analysis are substantially scarce. For instance, while there are different proposals for measuring the similarity among items in categorical datasets [25], nontrivial null models and benchmarks are practically absent. The present methodology tries to fill this gap.

The rest of this paper is structured as follows. In Sec. II we explain the ensemble construction procedure and show how the model can be employed in two different kinds of analysis: evaluation of the significance of some topological quantities and projection of the bipartite network on one of its layers. In Sec. III we briefly review the datasets used to test the methods. The main results regarding the motifs analysis are reported in Sec. IV, where we describe all the performed analyses, while we characterize the communities found in the projected and validated networks in Sec. V. Finally, we discuss possible future developments of the method in Sec. VI.

FIG. 1. A simple bipartite graph. In the following, Latin letters will indicate goods, while Greek letters will denote users.

2470-0045/2019/99(2)/022306(15) ©2019 American Physical Society
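To make the binarization step discussed in the Introduction concrete, the following minimal sketch thresholds a toy rating matrix and contrasts it with a per-score representation that keeps every rating. The toy matrix, the cutoff value of 3, the use of ">=", and all variable names are illustrative assumptions; they are not taken from Refs. [22,23] or from the datasets of this paper.

```python
import numpy as np

# Toy rating matrix: rows are products, columns are users, 0 means "no review".
ratings = np.array([[5, 1, 0],
                    [0, 4, 2]])

threshold = 3  # hypothetical cutoff for a "positive" review

# Binarized network: keep a link only where the score reaches the threshold.
binary = (ratings >= threshold).astype(int)

# Per-score view: one binary layer per score (1..5), so no rating is discarded.
scores = np.arange(1, 6)
layers = (ratings[:, :, None] == scores).astype(int)

print(binary)              # links surviving the threshold
print(layers.sum(axis=1))  # number of reviews per product and per score
```

After thresholding, the score-1 review and the missing review in the first row are indistinguishable, whereas the per-score layers retain all four reviews.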

II. METHODS

In this section, we briefly introduce the notation used and explain the steps needed to construct the null model ensemble. A bipartite network is a network that can be partitioned into two sets of nodes, such that only edges between nodes belonging to different sets can be observed (see Fig. 1). This kind of structure naturally arises whenever considering collaboration networks (e.g., actors in the first set, movies in the second set), export of products, consumers, goods, etc. To distinguish between the two sets, the index running on one set $L$ is typically indicated by Latin letters, while the index running on the second set $\Gamma$ is indicated by Greek letters. The number of nodes belonging to the two sets $L$ and $\Gamma$ will be denoted by the symbols $N_L$ and $N_\Gamma$, respectively. A bipartite rating network with $N = N_L + N_\Gamma$ vertices and $E$ edges can be entirely specified by its $N_L \times N_\Gamma$ adjacency matrix $M$ with entries $m_{i,\alpha} = \beta$ whenever product $i$ has been reviewed and assigned score $\beta$ by user $\alpha$, and $m_{i,\alpha} = 0$ otherwise. In what follows, we only deal with the case in which users are required to assign discrete numerical scores and the number of possible scores is known, denoted from now on as $\beta_{\max}$. Therefore, $\beta \in \{1, \dots, \beta_{\max}\}$. All members of the benchmark ensemble will have a constant number of vertices per layer, equal to $N_L$ and $N_\Gamma$, respectively. For the sake of simplicity, a binary representation of the adjacency matrix entries will be considered, defining $m_{i,\alpha,\beta} = \delta(m_{i,\alpha}, \beta)$ for all $\beta$, where $\delta$ is the Kronecker $\delta$ function. By doing so, the variable $m_{i,\alpha,\beta}$ will be equal to 1 if node $\alpha$ has reviewed node $i$ with the numerical score $\beta$, and $m_{i,\alpha,\beta} = 0$ otherwise. We use the notation

$$k_{i,\beta}(M) = \sum_{\alpha} m_{i,\alpha,\beta}, \qquad i = 1, \dots, N_L, \tag{1}$$

$$k_{\alpha,\beta}(M) = \sum_{i} m_{i,\alpha,\beta}, \qquad \alpha = 1, \dots, N_\Gamma, \tag{2}$$

to indicate the number of reviews with score $\beta$ received by a generic product $i$ in Eq. (1) and assigned by a generic user $\alpha$ in Eq. (2), respectively. The specification of Eqs. (1) and (2) for all scores $\beta$ defines the distribution of scores received by each node and constitutes the fundamental constraints of our problem.
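As an illustration of Eqs. (1) and (2), the sketch below builds the binary tensor with entries m_{i,α,β} from a small rating matrix and sums it over users and over products. The function name `degrees_per_rating`, the array layout (products on rows, users on columns), and the toy data are assumptions made for this example only.

```python
import numpy as np

def degrees_per_rating(M, beta_max):
    """Return (m, k_prod, k_user) for a rating matrix M of shape (N_L, N_Gamma).

    m[i, a, b]   = 1 if user a gave product i the score b + 1, else 0
    k_prod[i, b] = number of reviews with score b + 1 received by product i, Eq. (1)
    k_user[a, b] = number of reviews with score b + 1 assigned by user a, Eq. (2)
    """
    M = np.asarray(M)
    scores = np.arange(1, beta_max + 1)
    # Kronecker delta between each entry of M and each admissible score.
    m = (M[:, :, None] == scores).astype(int)
    k_prod = m.sum(axis=1)  # sum over users
    k_user = m.sum(axis=0)  # sum over products
    return m, k_prod, k_user

# Toy example: 2 products, 3 users, scores from 1 to 5, 0 = no review.
M = np.array([[5, 0, 3],
              [0, 5, 5]])
m, k_prod, k_user = degrees_per_rating(M, beta_max=5)
print(k_prod)
print(k_user)
```

Each row of `k_prod` and `k_user` is the score distribution of one node, which is the quantity the null model constrains.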


At this point we look for the probability distribution maximizing Shannon's entropy,

$$S = -\sum_{M} P(M) \ln P(M), \tag{3}$$

under the constraints $\langle k_{i,\beta} \rangle = k^{*}_{i,\beta}$ and $\langle k_{\alpha,\beta} \rangle = k^{*}_{\alpha,\beta}$ for all $i$, $\alpha$ and for all scores $\beta$. In other words, we consider the probability distribution over the ensemble such that the expected degree of each node, for every possible rating, equals its observed value, while keeping all the rest maximally random. The solution to this bipartite maximization problem gives the following probability distribution over the ensemble,

$$P(M|\vec{x},\vec{y}) = \prod_{i,\alpha} q_{i,\alpha}(m_{i,\alpha,\beta}|\vec{x},\vec{y}), \tag{4}$$

where $\vec{x}$ is an $N_L \times \beta_{\max}$ dimensional vector of Lagrangian multipliers that controls the expected degrees for each possible rating for the set of products, while $\vec{y}$ is the analogous $N_\Gamma \times \beta_{\max}$ dimensional vector of Lagrangian multipliers for the users. The quantity

$$q_{i,\alpha}(m_{i,\alpha,\beta}|\vec{x},\vec{y}) = \frac{\prod_{\beta} (x_{i,\beta}\, y_{\alpha,\beta})^{m_{i,\alpha,\beta}}}{1 + \sum_{\beta} x_{i,\beta}\, y_{\alpha,\beta}} \tag{5}$$

determines, for all scores, the probability of observing one of the entries between nodes $i$ and $\alpha$ (refer to the Appendix for further details). Notice that each node has been assigned a vectorial Lagrangian multiplier ($\vec{x}_i$ if it belongs to the layer $L$, $\vec{y}_\alpha$ if it belongs to the layer $\Gamma$) of dimension $\beta_{\max}$. Thus, the probability of observing a positive outcome, i.e., a link with rating $\beta$, can be expressed as

$$p_{i,\alpha,\beta} = \frac{x_{i,\beta}\, y_{\alpha,\beta}}{1 + \sum_{\beta'} x_{i,\beta'}\, y_{\alpha,\beta'}} \tag{6}$$

for all $i$, $\alpha$, and $\beta$. Therefore, the outcome of our method makes it easy to recover the probability that each user assigns a given score to all items, for all observed rating levels in the network.

To determine the numerical values of our Lagrangian multipliers, let us consider a specific real-world rating network $M^{*}$, for which the degree sequence $\{k_{i,\beta}(M^{*}), k_{\alpha,\beta}(M^{*})\}$ is known for all $i$, $\alpha$ and for each rating level $\beta$. The log-likelihood defined by Eq. (4) is given by

$$\mathcal{L}(\vec{x},\vec{y}|M^{*}) = \sum_{i,\beta} k_{i,\beta}(M^{*}) \ln x_{i,\beta} + \sum_{\alpha,\beta} k_{\alpha,\beta}(M^{*}) \ln y_{\alpha,\beta} - \sum_{i,\alpha} \ln\Big(1 + \sum_{\beta} x_{i,\beta}\, y_{\alpha,\beta}\Big). \tag{7}$$

Then, the maximization procedure consists in finding the specific parameter values $(\vec{x}^{*}, \vec{y}^{*})$ that maximize the probability of observing the network of interest $M^{*}$. Thus, the benchmark model for the real-world network $M^{*}$ is completely specified and it is possible to compare its observed topological properties with the same quantities averaged over the ensemble of graphs.

Let us conclude this section with some remarks: in the whole manuscript we employed the exact result of our procedure, thus the average over the ensemble truly reproduces the score degree sequence observed in the real network. Nevertheless, the null model's calibration (i.e., the determination [...]
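The likelihood maximization described in this section can be sketched numerically. Setting the derivatives of Eq. (7) to zero gives the conditions that the observed degree per rating of every product equals the sum over users of p_{i,α,β}, and likewise for every user with the sum over products. The code below solves these conditions with a simple fixed-point iteration. This solver is a choice made for this illustration and is not stated in the text to be the authors' procedure; the function names, tolerance, starting point, and toy network are likewise assumptions. The toy network is deliberately regular (every node has exactly one review per score), so that the iteration is well behaved; convergence is not guaranteed for arbitrary inputs.

```python
import numpy as np

def link_probabilities(x, y):
    """Eq. (6): p[i, a, b] = x[i, b] y[a, b] / (1 + sum_b' x[i, b'] y[a, b'])."""
    xy = x[:, None, :] * y[None, :, :]           # shape (N_L, N_Gamma, beta_max)
    return xy / (1.0 + xy.sum(axis=2, keepdims=True))

def fit_biscm(k_prod, k_user, tol=1e-10, max_iter=10_000):
    """Fixed-point iteration on the stationarity conditions of Eq. (7).

    Illustrative solver only (an assumption of this sketch).
    """
    k_prod = np.asarray(k_prod, dtype=float)
    k_user = np.asarray(k_user, dtype=float)
    # Start from 1 where the degree is positive; a zero degree forces a zero multiplier.
    x = (k_prod > 0).astype(float)
    y = (k_user > 0).astype(float)
    for _ in range(max_iter):
        denom = 1.0 + np.einsum('ib,ab->ia', x, y)
        sx = np.einsum('ab,ia->ib', y, 1.0 / denom)       # sum_a y[a, b] / denom[i, a]
        x_new = np.divide(k_prod, sx, out=np.zeros_like(x), where=sx > 0)
        denom = 1.0 + np.einsum('ib,ab->ia', x_new, y)
        sy = np.einsum('ib,ia->ab', x_new, 1.0 / denom)   # sum_i x[i, b] / denom[i, a]
        y_new = np.divide(k_user, sy, out=np.zeros_like(y), where=sy > 0)
        change = max(np.abs(x_new - x).max(), np.abs(y_new - y).max())
        x, y = x_new, y_new
        if change < tol:
            break
    return x, y

# Toy network: 4 products x 4 users, scores in {1, 2}, 0 = no review.
M = np.array([[1, 2, 0, 0],
              [0, 1, 2, 0],
              [0, 0, 1, 2],
              [2, 0, 0, 1]])
m = (M[:, :, None] == np.arange(1, 3)).astype(float)
k_prod, k_user = m.sum(axis=1), m.sum(axis=0)

x, y = fit_biscm(k_prod, k_user)
p = link_probabilities(x, y)
print(p[0, 0])  # probabilities of score 1 and score 2 for the pair (product 0, user 0)
```

In this regular example every pair of nodes receives each score with probability 1/4, so the expected degrees per rating, `p.sum(axis=1)` and `p.sum(axis=0)`, coincide with the observed ones. The individual multipliers are defined only up to a rescaling of `x` by a constant and of `y` by its inverse, which leaves the probabilities unchanged.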