SophiaReed & KinshipCode
SophiaReed SophiaReed
Have you ever considered how a family tree might map onto a phylogenetic diagram, with each branch representing both social links and genetic information? I’d love to compare the data structures we use.
KinshipCode KinshipCode
Oh, absolutely! I’ve been sketching those two side by side for months. The family tree is all about human bonds, those “I am your cousin because we share a grandparent” links, while the phylogenetic diagram is the DNA roadmap, each branch a mutation line. If you overlay them, the cousin‑marriage taboos pop up like hidden nodes—those are the sociological “pruning” points. On a napkin I’ve already drawn a little sociogram that shows which cousin pairings are taboo in a matrilineal clan. The interesting part is how the genetic distance often mirrors the social distance, but not always—some cultures allow close‑cousin unions even when the genetics say it’s risky. It’s like a cryptic puzzle where the answer changes with the language you write it in. Do you have a specific clan or dataset you want to compare?
SophiaReed SophiaReed
Sounds like a fascinating project—are you looking at a specific ethnic group, or a broader dataset of multiple clans? I could help with aligning the kinship matrices to the genetic distance tables and checking for correlations.
KinshipCode KinshipCode
I’m actually pulling data from several Pacific Islander clans—Trobriand, Bwanabwana, and a few Melanesian groups—so it’s a broader dataset, but each clan gets its own little sociogram first. I’ll line up the kinship matrix with the mitochondrial haplogroups and see if the cousin‑marriage taboos line up with the genetic distances. If you can help me align the matrices, we’ll finally get that puzzle piece that shows whether the social constraints were driven by genetics or culture. Ready to dive in?
SophiaReed SophiaReed
Let’s start by standardizing each kinship matrix so that every entry counts shared ancestors, then compute the pairwise genetic distances from the mitochondrial haplogroup frequencies. With both matrices indexed by the same clans in the same order, we can run a Mantel test to see whether kinship and genetic distance are correlated. I can script the alignment in R or Python, just let me know which you prefer. Once we have the paired distances, we’ll map the taboo links onto the graph and see if they sit at the outliers. Ready to code?
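For reference, here is a rough sketch of what a Mantel test does under the hood, assuming both inputs are square, symmetric distance matrices over the same clans. The function name `mantel_sketch` and the toy numbers are made up for illustration; a real run would use a library implementation instead.

```python
import numpy as np

def mantel_sketch(x, y, permutations=999, seed=0):
    """Permutation-based Mantel test on two square, symmetric distance matrices."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    iu = np.triu_indices_from(x, k=1)  # upper triangle, diagonal excluded
    observed = np.corrcoef(x[iu], y[iu])[0, 1]

    rng = np.random.default_rng(seed)
    n = x.shape[0]
    hits = 0
    for _ in range(permutations):
        perm = rng.permutation(n)
        # Shuffle the clan labels of one matrix: rows and columns together
        x_perm = x[np.ix_(perm, perm)]
        r_perm = np.corrcoef(x_perm[iu], y[iu])[0, 1]
        if abs(r_perm) >= abs(observed):  # two-sided comparison
            hits += 1
    p_value = (hits + 1) / (permutations + 1)
    return observed, p_value

# Toy data: four hypothetical clans, made-up numbers
kin_dist = np.array([[0, 1, 4, 5],
                     [1, 0, 3, 4],
                     [4, 3, 0, 2],
                     [5, 4, 2, 0]])
gen_dist = np.array([[0.0, 0.2, 0.7, 0.9],
                     [0.2, 0.0, 0.6, 0.8],
                     [0.7, 0.6, 0.0, 0.3],
                     [0.9, 0.8, 0.3, 0.0]])

r, p = mantel_sketch(kin_dist, gen_dist)
print(f"r = {r:.3f}, p = {p:.3f}")
```

The point of permuting rows and columns together is that the entries of a distance matrix are not independent, so an ordinary correlation p-value would be misleading. With only a handful of clans there are very few distinct permutations, so the p-value cannot get very small.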
KinshipCode KinshipCode
Sounds great! I’m more comfortable with Python for the data wrangling, but R’s vegan package does Mantel just fine too. Let’s go with Python—pandas for the kinship matrix, scipy to compute pairwise distances, scikit‑bio for the Mantel test, and networkx to map the taboo links on the graph. Once we have the paired distances, we can highlight the taboo edges in red and see if they sit at the outliers. Let me know if you want me to tweak any assumptions or add a napkin sketch of the sociogram while we code. Ready to dive in?
SophiaReed SophiaReed
Sounds solid—just let me know when you’ve got the matrices ready, and I’ll start the script to compute the pairwise genetic distances, run the Mantel test, and build the network graph with the taboo edges highlighted. We can iterate on any assumptions as we go. Ready when you are.
KinshipCode KinshipCode
Great, I’ve pulled the kinship counts into a tidy CSV and mapped the mitochondrial haplogroup frequencies for each clan. The matrices are ready—just hit me with your script to compute the genetic distances and run the Mantel test. I’ll sketch the sociogram on a napkin while you code. Let’s see if the taboo edges line up with the outliers!
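Before handing the files over, a quick sanity check along these lines might be worth running. It is only a sketch: the helper name `check_inputs`, the third clan `ClanC`, the haplogroup column names, and all the numbers are placeholders, and it assumes the kinship table is a symmetric clan-by-clan matrix and that each clan's haplogroup frequencies sum to 1.

```python
import numpy as np
import pandas as pd

def check_inputs(kinship, haplo, tol=1e-6):
    """Return a list of problems found in the two tables (empty list means OK)."""
    problems = []
    if list(kinship.index) != list(kinship.columns):
        problems.append("kinship rows and columns are not in the same clan order")
    elif not np.allclose(kinship.values, kinship.values.T):
        problems.append("kinship matrix is not symmetric")
    mismatched = set(kinship.index) ^ set(haplo.index)
    if mismatched:
        problems.append(f"clans present in only one table: {sorted(mismatched)}")
    row_sums = haplo.sum(axis=1)
    bad = row_sums[(row_sums - 1).abs() > tol]
    if not bad.empty:
        problems.append(f"frequencies do not sum to 1 for: {list(bad.index)}")
    return problems

# Toy tables with placeholder clan and haplogroup names
clans = ["Trobriand", "Bwanabwana", "ClanC"]
kinship = pd.DataFrame(
    [[8, 3, 1],
     [3, 8, 2],
     [1, 2, 8]],
    index=clans, columns=clans,
)
haplo = pd.DataFrame(
    [[0.5, 0.3, 0.2],
     [0.4, 0.4, 0.2],
     [0.1, 0.2, 0.7]],
    index=clans, columns=["hapA", "hapB", "hapC"],
)

print(check_inputs(kinship, haplo))  # an empty list means both tables look consistent
```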
SophiaReed SophiaReed
Here’s a quick script that pulls everything together. Copy it into a .py file and run it after you’ve put the two CSVs in the same folder. It assumes the kinship matrix has rows/cols labelled by clan, and the haplogroup table has one row per clan with columns for each haplogroup frequency (they should sum to 1).

```python
import pandas as pd
import numpy as np
from scipy.spatial.distance import pdist, squareform
from skbio.stats.distance import mantel
import networkx as nx
import matplotlib.pyplot as plt

# Load data
kinship = pd.read_csv('kinship_matrix.csv', index_col=0)        # [clan, clan]
haplo = pd.read_csv('haplogroup_frequencies.csv', index_col=0)  # [clan, haplogroup]

# Ensure the same ordering
clans = kinship.index.intersection(haplo.index)
kinship = kinship.loc[clans, clans]
haplo = haplo.loc[clans]

# -------------------- 1. Genetic distances --------------------
# Euclidean distance between haplogroup frequency vectors
gen_dist = pd.DataFrame(
    squareform(pdist(haplo, metric='euclidean')),
    index=clans, columns=clans
)

# -------------------- 2. Mantel test --------------------
# mantel() expects square, symmetric matrices with a zero diagonal, so turn
# the kinship counts into a distance: more shared ancestors -> smaller distance
kin_dist = 1.0 / (kinship.to_numpy(dtype=float) + 1.0)
kin_dist = (kin_dist + kin_dist.T) / 2   # enforce exact symmetry
np.fill_diagonal(kin_dist, 0.0)

# mantel() returns three values: statistic, p-value, and matrix size n
mantel_stat, p_value, n = mantel(kin_dist, gen_dist.to_numpy(),
                                 method='pearson', permutations=9999)
print(f'Mantel statistic: {mantel_stat:.4f}, p-value: {p_value:.4f} (n={n})')

# -------------------- 3. Build sociogram graph --------------------
G = nx.Graph()
G.add_nodes_from(clans)

# Add all edges weighted by kinship
for i, clan_i in enumerate(clans):
    for j, clan_j in enumerate(clans):
        if i < j:
            G.add_edge(clan_i, clan_j, weight=kinship.loc[clan_i, clan_j])

# Mark taboo edges (replace with your own list of taboo pairs)
taboo_edges = [('Trobriand', 'Bwanabwana'), ('Melanesian1', 'Melanesian2')]
for u, v in taboo_edges:
    if G.has_edge(u, v):
        G[u][v]['taboo'] = True
    else:
        G.add_edge(u, v, taboo=True)

# -------------------- 4. Plot --------------------
pos = nx.spring_layout(G, seed=42)
edge_colors = ['red' if G[u][v].get('taboo') else 'grey' for u, v in G.edges()]
# Taboo edges added without a kinship entry carry no weight; draw them at width 1
edge_widths = [G[u][v].get('weight', 1.0) for u, v in G.edges()]

nx.draw_networkx_nodes(G, pos, node_color='lightblue', node_size=700)
nx.draw_networkx_labels(G, pos, font_size=12, font_family='sans-serif')
nx.draw_networkx_edges(G, pos, edge_color=edge_colors, width=edge_widths, alpha=0.7)

plt.title('Kinship sociogram – taboo edges in red')
plt.axis('off')
plt.tight_layout()
plt.show()
```

A couple of points to double-check:

1. The kinship counts (“number of shared grandparents” or similar) measure closeness, while the genetic table measures distance. The script converts the counts with 1 / (count + 1) so that larger counts give smaller distances, which matches the intuition that small genetic distances should go with high kinship. Swap in another transformation if it suits your counts better.
2. The genetic distances are Euclidean distances between haplogroup frequency vectors; if you think a different metric (e.g., log-ratio or Jaccard) is more appropriate, swap the `metric` argument.

Run the script, inspect the Mantel statistic and p-value, and look at the plotted graph. The plot only encodes kinship weight, so read the matching genetic distances from `gen_dist`. If the red taboo edges cluster in the high-weight, low-genetic-distance region, that’s a hint that the cultural rules may track genetic closeness. If they’re scattered, the story might be more sociological than genetic. Happy coding!
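To check whether the taboo pairs really sit at the outliers, one option is a paired-distance table with a residual column. This is a rough sketch on toy numbers, not part of the script above: the helper `paired_table` is hypothetical, the values are invented, and the residual comes from a plain least-squares line of genetic distance on kinship count, which is just one possible way to define an outlier.

```python
import itertools
import numpy as np
import pandas as pd

def paired_table(kinship, gen_dist, taboo_edges):
    """One row per clan pair: kinship count, genetic distance, taboo flag, residual."""
    taboo = {frozenset(pair) for pair in taboo_edges}
    rows = []
    for a, b in itertools.combinations(kinship.index, 2):
        rows.append({
            "clan_a": a,
            "clan_b": b,
            "kinship": float(kinship.loc[a, b]),
            "gen_dist": float(gen_dist.loc[a, b]),
            "taboo": frozenset((a, b)) in taboo,
        })
    table = pd.DataFrame(rows)
    # Least-squares line of genetic distance on kinship; large residuals are outliers
    slope, intercept = np.polyfit(table["kinship"], table["gen_dist"], deg=1)
    table["residual"] = table["gen_dist"] - (slope * table["kinship"] + intercept)
    return table

# Toy inputs with placeholder clan names and made-up values
clans = ["Trobriand", "Bwanabwana", "Melanesian1", "Melanesian2"]
kinship = pd.DataFrame(
    [[8, 6, 1, 0],
     [6, 8, 1, 0],
     [1, 1, 8, 5],
     [0, 0, 5, 8]],
    index=clans, columns=clans,
)
gen_dist = pd.DataFrame(
    [[0.0, 0.2, 0.7, 0.9],
     [0.2, 0.0, 0.6, 0.8],
     [0.7, 0.6, 0.0, 0.3],
     [0.9, 0.8, 0.3, 0.0]],
    index=clans, columns=clans,
)
taboo_edges = [("Trobriand", "Bwanabwana"), ("Melanesian1", "Melanesian2")]

table = paired_table(kinship, gen_dist, taboo_edges)
print(table.sort_values("residual", key=lambda s: s.abs(), ascending=False))
```

Sorting by absolute residual puts the pairs that deviate most from the overall trend at the top, so it is easy to see whether the rows with `taboo == True` are among them.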
KinshipCode KinshipCode
That looks almost ready! I’ll drop the two CSVs in the folder, hit run, and see what the Mantel test gives us. I’ll keep an eye on the transformation suggestion: shared-grandparent counts measure closeness, so inverting them (1/(count+1)) turns them into a distance that can be compared directly with the genetic distances. Once it has run, check whether the Mantel statistic is positive and the p-value is low; that would be evidence that the kinship and mitochondrial distances are correlated. Then look at the plot: the red taboo edges should sit where the weights are high and the genetic distance low if the taboo is indeed responding to genetic proximity. If they’re all over the place, maybe it’s a cultural rule independent of DNA. Feel free to tweak the `metric` argument if you think Jaccard or log-ratio makes more sense for our data. Once you’ve got the numbers, I’ll draw a quick napkin sketch of the sociogram and note where the taboo edges lie. Ready to launch the script?
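On the metric question, here is a small sketch of how the choice changes the numbers, using a made-up frequency table for three unnamed clans. Treating Jaccard as a presence/absence comparison (a haplogroup counts as present if its frequency is above zero) is my assumption, and Bray-Curtis is included only as another common option; a log-ratio distance is left out because the zero frequencies would need special handling first.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Made-up haplogroup frequencies: one row per clan, each row sums to 1
freqs = np.array([
    [0.6, 0.4, 0.0],
    [0.5, 0.5, 0.0],
    [0.1, 0.2, 0.7],
])

distances = {
    "euclidean": squareform(pdist(freqs, metric="euclidean")),
    "braycurtis": squareform(pdist(freqs, metric="braycurtis")),
    # Jaccard on presence/absence: ignores how common each haplogroup is
    "jaccard": squareform(pdist(freqs > 0, metric="jaccard")),
}

for name, matrix in distances.items():
    print(name)
    print(np.round(matrix, 3))
```

The first two clans carry the same set of haplogroups, so their Jaccard distance is zero even though their frequencies differ, which shows how much information the presence/absence view throws away.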