The Enron email corpus is appealing to researchers because it represents a rich temporal record of internal communication within a large, real-world organization facing a severe and survival-threatening crisis. We describe how we enhanced the original corpus database and present findings from our investigation, undertaken from a social network analysis perspective. We explore the dynamics of the structure and properties of the organizational communication network, as well as the characteristics and patterns of communicative behavior of employees at different organizational levels. We found that during the crisis period, communication among employees became more diverse with respect to established contacts and formal roles. Also during the crisis period, previously disconnected employees began to engage in mutual communication, so that interpersonal communication intensified and spread through the network, bypassing formal chains of communication. The findings of this study provide valuable insight into a real-world organizational crisis, which may be further used for validating or developing theories and dynamic models of organizational crises, thereby leading to a better understanding of the underlying causes of, and responses to, organizational failure.
It was recently reported that men self-cite >50% more often than women across a wide variety of disciplines in the bibliographic database JSTOR. Here, we replicate this finding in a sample of 1.6 million papers from Author-ity, a version of PubMed with computationally disambiguated author names. More importantly, we show that the gender effect largely disappears when accounting for prior publication count in a multidimensional statistical model. Gender has the weakest effect on the probability of self-citation among an extensive set of features tested, including byline position, affiliation, ethnicity, collaboration size, time lag, subject-matter novelty, reference/citation counts, publication type, language, and venue. We find that self-citation is the hallmark of productive authors, of any gender, who cite their novel journal publications early and in similar venues, and more often cross citation-barriers such as language and indexing. As a result, papers by authors with short, disrupted, or diverse careers miss out on the initial boost in visibility gained from self-citations. Our data further suggest that this disproportionately affects women because of attrition and not because of disciplinary under-specialization.
Scholars have often relied on name initials to resolve name ambiguities in large-scale coauthorship network research. This approach bears the risk of incorrectly merging or splitting author identities. The use of initial-based disambiguation has been justified by the assumption that such errors would not substantially affect research findings. This paper tests that assumption by analyzing coauthorship networks from five academic fields (biology, computer science, nanoscience, neuroscience, and physics) and an interdisciplinary journal, PNAS. Name instances in the data sets of this study were disambiguated based on heuristics gained from previous algorithmic disambiguation solutions. We use the disambiguated data as a proxy for ground truth to test the performance of three types of initial-based disambiguation. Our results show that initial-based disambiguation can misrepresent statistical properties of coauthorship networks: it deflates the number of unique authors, number of components, average shortest paths, clustering coefficient, and assortativity, while it inflates average productivity, density, average coauthor number per author, and largest component size. Also, on average, more than half of the top 10 most productive or most collaborative authors drop off the lists. Asian names were found to account for the majority of misidentifications by initial-based disambiguation because of their common surnames and given-name initials.
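The merging and splitting errors described above can be illustrated with a minimal sketch. The abstract does not name its three initial-based variants, so the two keying schemes below (surname plus first initial, and surname plus all given-name initials) are assumptions chosen for illustration, as are the function names and the sample records, which are hypothetical and not taken from the study's data.

```python
from collections import defaultdict

def first_initial_key(surname, given):
    """Assumed scheme 1: surname plus the first initial of the given name."""
    return f"{surname.lower()}, {given.strip()[0].lower()}"

def all_initials_key(surname, given):
    """Assumed scheme 2: surname plus the initial of every given-name token."""
    tokens = given.replace(".", " ").replace("-", " ").split()
    initials = "".join(t[0].lower() for t in tokens)
    return f"{surname.lower()}, {initials}"

def group_by_key(name_instances, key_fn):
    """Map each disambiguation key to the set of true author ids it absorbs."""
    groups = defaultdict(set)
    for author_id, surname, given in name_instances:
        groups[key_fn(surname, given)].add(author_id)
    return dict(groups)

# Hypothetical name instances: (true author id, surname, given name).
records = [
    ("A1", "Kim", "Jin Ho"),
    ("A2", "Kim", "Jae Hyun"),
    ("A3", "Kim", "Ji Won"),
    ("A4", "Smith", "John A."),
    ("A4", "Smith", "John"),
]

by_first = group_by_key(records, first_initial_key)
by_all = group_by_key(records, all_initials_key)

# Merging error: one key that covers more than one true author.
merged_first = {k: v for k, v in by_first.items() if len(v) > 1}
merged_all = {k: v for k, v in by_all.items() if len(v) > 1}

# Splitting error: one true author spread over more than one key.
keys_per_author = defaultdict(set)
for key, ids in by_all.items():
    for author_id in ids:
        keys_per_author[author_id].add(key)
split_all = {a: ks for a, ks in keys_per_author.items() if len(ks) > 1}
```

In this toy data, the first-initial scheme collapses the three distinct Kim authors into one identity, while the all-initials scheme still merges two of them and additionally splits the single Smith author across two keys. Comparing the number of keys with the number of true author ids gives a rough sense of how the count of unique authors, and the network statistics that depend on it, can shift.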