MRQAP

MRQAP (multiple regression quadratic assignment procedure) tests associations across the dyads of a network. Typical questions ask whether one relation is associated with another in a multi-relational network (e.g., do marriage ties correlate with business ties?), or whether similarity on dyadic features predicts tie formation (e.g., are same-race pairs more likely to be friends? More generally, what shared characteristics predict friendship nominations among adolescents in an American high school?). MRQAP extends the Mantel test: it uses node permutation to address the non-independence of observations that biases traditional regression estimates on (sociocentric) network data. To illustrate how to use MRQAP with ideanet, we’ll use the Faux Mesa High dataset that ships with the package.
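
To make the node-permutation idea concrete, here is a minimal base R sketch of a bivariate QAP (Mantel-style) test on two toy networks. It is not ideanet’s implementation: the helper names make_net and qap_cor, the toy data, and the two-sided p-value rule are all assumptions made for illustration.

```r
# Minimal QAP sketch in base R. Toy data and helper names are illustrative only.
set.seed(42)
n <- 12

# Build a toy symmetric 0/1 adjacency matrix with an empty diagonal.
make_net <- function(n, p) {
  m <- matrix(0, n, n)
  m[upper.tri(m)] <- rbinom(n * (n - 1) / 2, 1, p)
  m + t(m)
}
x <- make_net(n, 0.3)
y <- make_net(n, 0.3)

qap_cor <- function(x, y, reps = 500) {
  off_diag <- row(x) != col(x)                  # ignore self-ties
  observed <- cor(x[off_diag], y[off_diag])
  permuted <- replicate(reps, {
    idx <- sample(nrow(x))                      # relabel the nodes
    xp <- x[idx, idx]                           # permute rows and columns together
    cor(xp[off_diag], y[off_diag])
  })
  # Share of permuted correlations at least as extreme as the observed one.
  list(observed = observed,
       p_value = mean(abs(permuted) >= abs(observed)))
}

res <- qap_cor(x, y)
res
```

Because rows and columns are shuffled with the same index, each permuted matrix keeps the structure of the original network and only breaks the link between node identities across the two relations.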

Preparing to use ideanet’s MRQAP module is easy: all we need is the igraph object produced by netwrite, which carries both the node-level attributes from a user’s original nodelist and the node-level measurements that netwrite computes.

library(ideanet)

nw_fauxmesa <- netwrite(nodelist = fauxmesa_nodes,
                        node_id = "id",
                        i_elements = fauxmesa_edges$from,
                        j_elements = fauxmesa_edges$to,
                        directed = FALSE,
                        net_name = "faux_mesa",
                        shiny = TRUE)

When we look at the original nodelist passed to netwrite (fauxmesa_nodes), we see that we have information about each student’s grade level, their race/ethnicity, and their sex. This information exists at the individual level in that it pertains to individual students. However, when using MRQAP analysis, we are generally interested in how dyadic measures predict outcomes. In other words, we’re interested in whether similarities or differences between nodes lead to outcomes, both of which are understood at the level of ego-alter relationships. While some users may have dyadic measures stored in their edgelist ahead of time, typically these measures are something one has to generate. The qap_setup function in ideanet allows users to quickly take node-level attributes and generate dyad-level comparisons with only a few arguments.

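The arguments used in this example are net (the igraph object produced by netwrite), variables (a vector naming the node-level attributes to compare), methods (a vector naming the comparison method to apply to each element of variables, in the same order), and directed (whether the network is treated as directed).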

For the Faux Mesa High network, let’s imagine one wants to know whether sharing the same sex, race, or grade level affects the likelihood that adolescents nominate each other as friends.

For similarity in sex, we’ll apply the reduced_category method to the sex variable. For combinations of race, we’ll apply the multi_category method. And for grade, which we’ll treat as a continuous variable, we’ll use the difference method.
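
As a rough picture of what these three kinds of comparison mean at the dyad level, the base R sketch below builds them by hand for a made-up four-node nodelist. It is an illustration only, not how qap_setup is implemented; the toy values are invented, and the column names simply imitate those that appear in the qap_setup output shown later in this vignette.

```r
# Toy nodelist. Values are made up for illustration.
nodes <- data.frame(id = 0:3,
                    sex = c("F", "M", "F", "M"),
                    race = c("Hisp", "White", "Hisp", "NatAm"),
                    grade = c(7, 9, 12, 7),
                    stringsAsFactors = FALSE)

# All ordered pairs of distinct nodes.
dyads <- expand.grid(ego = nodes$id, alter = nodes$id)
dyads <- dyads[dyads$ego != dyads$alter, ]
e <- match(dyads$ego, nodes$id)
a <- match(dyads$alter, nodes$id)

# Single indicator: do ego and alter share the same value?
dyads$same_sex <- as.integer(nodes$sex[e] == nodes$sex[a])

# One indicator per category: are ego and alter both in that category?
for (lvl in unique(nodes$race)) {
  dyads[[paste0("both_race_", lvl)]] <-
    as.integer(nodes$race[e] == lvl & nodes$race[a] == lvl)
}

# Signed and absolute differences for a continuous attribute.
dyads$diff_grade <- nodes$grade[e] - nodes$grade[a]
dyads$abs_diff_grade <- abs(dyads$diff_grade)

dyads[dyads$ego == 0, ]
```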

Given the importance of ensuring that each element in the variables argument corresponds to the correct element in the methods argument, users may find it helpful to store both vectors in a data frame prior to running qap_setup. This allows us to double-check that all elements appear in the correct order.

var_methods <- data.frame(variable = c("sex", "race", "grade"),
                          method = c("reduced_category", "multi_category", "difference"))

var_methods
variable method
sex reduced_category
race multi_category
grade difference

Now that we’ve ensured everything’s in the right order, let’s use qap_setup:

faux_qap_setup <- qap_setup(net = nw_fauxmesa$faux_mesa,
                            variables = var_methods$variable,
                            methods = var_methods$method,
                            directed = FALSE)

qap_setup produces a list object containing a nodelist, an edgelist containing newly-calculated dyadic measures, and an igraph object with these dyadic measures. Let’s quickly inspect our new edgelist to see the kinds of variables we’ve just created:

head(faux_qap_setup$edges)
from to weight sex_ego sex_alter same_sex race_ego race_alter both_race_Hisp both_race_NatAm both_race_White both_race_Other both_race_Black grade_ego grade_alter diff_grade abs_diff_grade
0 24 1 F F 1 Hisp White 0 0 0 0 0 7 7 0 0
0 51 1 F F 1 Hisp NatAm 0 0 0 0 0 7 7 0 0
0 57 1 F F 1 Hisp Hisp 1 0 0 0 0 7 12 -5 5
0 69 1 F F 1 Hisp NatAm 0 0 0 0 0 7 7 0 0
0 86 1 F F 1 Hisp White 0 0 0 0 0 7 7 0 0
0 91 1 F F 1 Hisp NatAm 0 0 0 0 0 7 7 0 0

We now have several new variables. Variables with the _ego and _alter suffixes hold the original values for each edge’s ego and alter, respectively, as recorded in our original nodelist. Our same_sex and both_race variables are binary indicators: same_sex marks whether ego and alter share the same sex, and each both_race variable marks whether ego and alter both belong to the named category. By contrast, diff_grade is a continuous measure showing how many grade levels apart ego and alter are (ego’s grade minus alter’s). Values in diff_grade may therefore be positive or negative depending on whether the node designated as ego is in a higher grade than the node designated as alter. Signed differences of this sort can be useful when working with directed networks: you can imagine younger students being more likely to nominate older students as friends than vice versa. Because ties in our network are undirected, however, the order in which egos and alters appear in our edgelist has no real meaning, and whether values in diff_grade are positive or negative is largely a matter of chance. Consequently, we’re better off using the absolute difference between ego’s and alter’s grade levels in our analysis, as this measure does not depend on the order in which nodes are listed in our edgelist. qap_setup provides this absolute value automatically; here it is stored in abs_diff_grade.
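
The point about ordering can be checked with a two-row toy example in base R (made-up values, not package output): the same undirected tie listed in both orders yields opposite signed differences but the same absolute difference.

```r
# One undirected tie between a 7th grader and a 12th grader, listed both ways.
edges <- data.frame(grade_ego = c(7, 12), grade_alter = c(12, 7))

edges$diff_grade <- edges$grade_ego - edges$grade_alter   # sign depends on ordering
edges$abs_diff_grade <- abs(edges$diff_grade)             # identical in both rows

edges
```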

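With our variables of interest in hand, we turn to the MRQAP analysis itself. ideanet’s qap_run function works directly with output from netwrite and qap_setup. In its current iteration, however, users must select variables produced by qap_setup. The arguments for qap_run used in this example are net (the igraph object returned by qap_setup), dependent (the outcome variable, left as NULL here), variables (the dyad-level predictors), directed (whether the network is treated as directed), and family (the model family, "linear" here).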

NOTE: If the input network is multi-relational, qap_run will automatically merge duplicated rows. This is necessary because, at this time, the MRQAP wrapper does not elegantly handle repeated observations. When merging rows, it sums numeric edge attributes and keeps a randomly chosen value for character edge attributes. If the user is interested in the association between two types of ties (e.g., marriage ties predicting business ties), we recommend that they create a set of binary edge attributes using qap_setup.
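
To see why binary tie-type indicators help, consider the base R sketch below. It is a hedged illustration on a made-up edgelist, not the code qap_run uses internally: summing numeric attributes collapses the repeated dyad into one row, while one 0/1 column per tie type preserves which relations each dyad has.

```r
# Toy multi-relational edgelist: the pair (1, 2) appears once per tie type.
edges <- data.frame(from = c(1, 1, 2),
                    to = c(2, 2, 3),
                    weight = c(1, 1, 1),
                    type = c("marriage", "business", "business"),
                    stringsAsFactors = FALSE)

# Collapse duplicated dyads by summing a numeric attribute.
merged <- aggregate(weight ~ from + to, data = edges, FUN = sum)

# One binary column per tie type, then one row per dyad.
edges$marriage <- as.integer(edges$type == "marriage")
edges$business <- as.integer(edges$type == "business")
by_type <- aggregate(cbind(marriage, business) ~ from + to,
                     data = edges, FUN = max)

merged
by_type
```

With the indicators in place, one tie type can serve as the outcome and the other as a predictor without relying on a randomly retained character value.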

Let’s run qap_run with the default 500 permutations. We could lower the number of permutations substantially to reduce computation time, but doing so may make the permutation-based significance estimates in our results less reliable. In this example we’ll include same_sex, both_race_White, and abs_diff_grade as predictors.

faux_qap <- qap_run(net = faux_qap_setup$graph,
                    dependent = NULL,
                    variables = c("same_sex", "both_race_White", "abs_diff_grade"),
                    directed = FALSE,
                    family = "linear")

qap_run returns a list of two objects. The first (covs_df) is a data frame summarizing model results in a way resembling a traditional regression output. The second (mods_df) is a data frame providing the number of dyadic observations on which the model is computed.

faux_qap$covs_df
covars estimate pvalue
intercept 0.0142453 0.002
same_sex 0.0048255 0.062
both_race_White 0.0014419 0.706
abs_diff_grade 0.0004715 0.582
faux_qap$mods_df
num_obs
10878
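
Because covs_df is an ordinary data frame, it can be filtered like any other. The sketch below rebuilds the printed table by hand so that it runs without the package; the 0.1 threshold and the object names alpha and flagged are illustrative choices.

```r
# Values copied from the printed covs_df above.
covs_df <- data.frame(
  covars = c("intercept", "same_sex", "both_race_White", "abs_diff_grade"),
  estimate = c(0.0142453, 0.0048255, 0.0014419, 0.0004715),
  pvalue = c(0.002, 0.062, 0.706, 0.582),
  stringsAsFactors = FALSE
)

alpha <- 0.1                                   # illustrative threshold
flagged <- covs_df[covs_df$pvalue <= alpha, ]  # terms at or below the threshold
flagged$covars
```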

Treating a p-value of .1 or less as indicating some statistical significance, our results suggest that students are more likely to nominate students of the same sex as friends, holding all else constant (same_sex: p = 0.062). Neither both_race_White nor abs_diff_grade reaches that threshold. We recommend rerunning the same model with different numbers of permutations to confirm that the significance estimates for each variable are stable.
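
One way to gauge how much a permutation p-value can move between runs is its Monte Carlo standard error. The sketch below assumes the p-value is a simple proportion of permuted statistics at least as extreme as the observed one, which is the usual construction for permutation tests but is an assumption about qap_run rather than something documented here. The helper name mc_se is hypothetical.

```r
# Binomial standard error of a p-value estimated from `reps` permutations.
mc_se <- function(p, reps) sqrt(p * (1 - p) / reps)

p_hat <- 0.062                      # same_sex p-value from the output above
reps <- c(100, 500, 1000, 5000)

data.frame(reps = reps,
           se = round(mc_se(p_hat, reps), 4),
           smallest_nonzero_p = 1 / reps)   # resolution of the p-value grid
```

Under this assumption, a p-value near a chosen threshold deserves a rerun with more permutations, since its standard error shrinks with the square root of the number of permutations.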

The above example shows that setting up and using MRQAP in ideanet is fast and easy, allowing users to quickly explore a variety of model specifications with their own data.