data set
(Note that the Sender column was used for the source node, the Recipient column for the target node, and the Date column as an edge attribute; the other two columns were not included.)
For each of three time periods (1500-1650, 1650-1800, and 1800-2000), I will show a view of the entire network as well as a closer view of a particularly interesting section of the network.
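As a rough illustration of the column mapping and the three time periods described above, here is a minimal Python sketch. It is not the workflow actually used (the networks here were built in Cytoscape). It assumes a CSV export with `Sender`, `Recipient`, and `Date` columns, dates that begin with a four-digit year, and half-open periods, so that a boundary year such as 1650 falls into exactly one network. The function names and the sample rows are hypothetical.

```python
import csv
import io

# Assumed period boundaries, treated as half-open [start, end) so that a
# boundary year such as 1650 lands in exactly one period.
PERIODS = [(1500, 1650), (1650, 1800), (1800, 2000)]


def load_edges(csv_file):
    """Turn each letter row into a (source, target, year) edge."""
    edges = []
    for row in csv.DictReader(csv_file):
        date = row["Date"].strip()
        # Assumption: dates start with a four-digit year, e.g. "1723-05-14".
        if len(date) < 4 or not date[:4].isdigit():
            continue  # skip undated or malformed rows
        # Sender -> source node, Recipient -> target node, year -> edge attribute.
        edges.append((row["Sender"], row["Recipient"], int(date[:4])))
    return edges


def split_by_period(edges, periods=PERIODS):
    """Group edges by the period their year falls into."""
    buckets = {period: [] for period in periods}
    for source, target, year in edges:
        for start, end in periods:
            if start <= year < end:
                buckets[(start, end)].append((source, target, year))
                break
    return buckets


# Made-up sample rows, only to show the expected shape of the input.
sample = io.StringIO(
    "Sender,Recipient,Date\n"
    "Person A,Person B,1612-03-01\n"
    "Person B,Person C,1701\n"
    "Person C,Person D,1850-11-20\n"
)
for period, period_edges in split_by_period(load_edges(sample)).items():
    print(period, len(period_edges))
```

Each bucket can then be saved as its own edge list and imported as a separate network.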
All of these network pictures show a lot of data, so they are naturally somewhat cluttered, and individual connections are hard to pick out. Even so, they still show qualitatively significant network features. Two such features worth looking at are:
- The connections or lack thereof between components of the network, which can be observed from the broad view for each time period’s network.
- The connections or lack thereof between nodes with few adjacent edges (in particular, those that are connected to a ‘hub’ node with a large number of adjacent edges), which can be observed from the closer view for each time period’s network.
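Both features can also be checked numerically rather than by eye. The sketch below is one possible way to do so in plain Python, not what was done for the pictures here. It assumes an undirected view of the network, it measures "few adjacent edges" by the number of distinct neighbours, and it uses a cutoff (`max_degree=2`) that is an arbitrary choice for illustration. The function names and the toy edge list are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from (source, target) pairs."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def connected_components(adjacency):
    """Return a list of node sets, one per connected component."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # depth-first walk from the start node
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def low_degree_links(adjacency, hub, max_degree=2):
    """Edges that join two low-degree neighbours of `hub` to each other.

    Degree here means the number of distinct neighbours; the cutoff
    max_degree is an assumed threshold, not one taken from the data.
    """
    neighbours = {n for n in adjacency[hub] if len(adjacency[n]) <= max_degree}
    return sorted(
        (a, b)
        for a in neighbours
        for b in adjacency[a]
        if b in neighbours and a < b  # report each pair once
    )


# Toy network: one hub with three neighbours, plus a separate pair.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b"), ("x", "y")]
adjacency = build_adjacency(toy_edges)
print(len(connected_components(adjacency)))
print(low_degree_links(adjacency, "hub"))
```

An empty result from `low_degree_links` would mean that the hub's low-degree neighbours are connected only through the hub, which is the pattern the closer views are meant to reveal.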
Broad view of letter network including letters from 1500-1650
Close up view of letter network including letters from 1500-1650
Broad view of letter network including letters from 1650-1800
Close up view of letter network including letters from 1650-1800
Broad view of letter network including letters from 1800-2000
Close up view of letter network including letters from 1800-2000
For me, the first step in formulating a research question was to find data suited to network analysis. Since we had read about the Republic of Letters and discussed it in class, this network of letters sent and received stood out to me as a good candidate for research using Cytoscape. Specifically, the site we looked at in class kept metadata on the origin and destination of the letters, so each letter could easily be translated into a connection between location nodes in a social network.
Unfortunately, I couldn’t find an easy way to scrape data from EMLO. Instead, I ended up using data from “correspSearch”, an online database of metadata from scholarly editions of letters. The site appeared to be German, with metadata collected primarily from German websites and databases, which might skew the data towards letters with an origin or destination in Germany. (However, data from EMLO, or from any other source, would also be skewed, just in a different way, since there is always selection bias when the data isn’t 100% complete.)
Once I had settled on the data I was using (letter metadata) and the form of the network (locations as nodes, letters as connections), I began to consider what research questions could be asked about this network. Some questions could not be answered given the limitations of the data. For example, any question about how a letter’s theme or type related to another variable, such as time or origin, was out of reach because there was no theme or type metadata. Consequently, I focused on research questions that related to the metadata that was available.
Specifically, there was metadata on the origin, destination, sender, receiver, and date of each letter. The date stood out to me as a particularly useful piece of information for network analysis since the connections were spread over a large time period. Especially in the context of this class, where we have been looking at the evolution of communication over time, comparing letter networks at different points in time seemed like a very interesting topic.