In addition to tackling big-picture architectural questions, one of the really fun projects that I've worked on over the past several weeks is experimenting with extracting cited references from arXiv papers. Today I put up a blog post that describes what we've done so far, user input, and thoughts for the future.
How and why biologists choose to work with the organisms that they do has intrigued me since before I started graduate school. I remember being exposed to behavioral neuroscience research in an undergraduate special topics course offered by one of my favorite professors at Puget Sound. This led to a fascination with how scientists make sense of human brains and bodies by studying non-human brains and bodies. So it's exciting to be able to make some contribution—to inject some data!—into the philosophical debate about the process of organism choice in the biomedical sciences. Last week, a paper that I co-authored with Heather Kropp, Julia Damerow, and Manfred Laubichler was published in the journal BioEssays.
Many projects in digital humanities use name authorities (e.g. VIAF, DBPedia) to encode more precise references to people, places, or other concepts in data, and to leverage linked data. We often run into a situation in which we need to keep track of alignments between entries in different authorities that represent the same concepts. The typical solution is to create one's own master authority system that aggregates those existing authorities, but this can cause major headaches when it comes to data integration later on. Black Goat is an attempt to reduce those headaches through non-hierarchical mappings.
The computational turn in the humanities has precipitated the need for sustainable software development projects that are specifically focused on humanities research problems, and the need for graduate and undergraduate training models that address the trans-disciplinary nature of computational humanities research. In this paper, we describe one approach for addressing those two challenges simultaneously: an interdisciplinary research and development team called the Digital Innovation Group (DigInG). DigInG quickly and necessarily became an experiment in trans-disciplinary education at the interface of digital humanities and computer science. Not only does DigInG play an important role in developing a computational infrastructure for d/cHPS research, it also creates an environment for hands-on training for graduate and undergraduate students in computer science, biology, and history and philosophy of science. We discuss the rationale, benefits, and challenges of DigInG since its inception. Our primary objectives are to broaden the discussion about how digital and computational humanities programs are organized, and to suggest that software development and training in the digital humanities need not be conceived as independent activities.
The 1956 meeting of the Fellows of the National Institute of Agricultural Botany (NIAB) was to be held in July. E. T. Jones was to receive the NIAB Cereal Award for his “Powys” variety of winter oats, and to confer with NIAB leadership about the ongoing transfer of grass and clover seed stocks from the WPBS to NIAB. For members of the SPBS, the annual NIAB meeting was seen as an obligation. “All the plant breeding stations had to get together each year,” one of the Genecology Section staff, David Harberd, recalled, “and you didn't have any choice in the matter: you went. The bosses in London just decided where you were meeting and you went” (Harberd pers. comm.).
Named Entity Recognition is the problem of locating and categorizing chunks of text that refer to...well...entities. 'Entities' usually means things like people, places, organizations, or organisms, but can also include things like currency, recipe ingredients, or any other class of concepts to which a text might refer. In this post I'll describe how to use the Stanford NER classifier to perform Named Entity Recognition on a large collection of texts, in Python.
Black Goat is a configuration-driven platform for non-hierarchical alignments among an unlimited number of name authority systems.
The Genecology Project is an open-note history of the field of ecological genetics, focusing primarily on the Ecology Genetics Group in Britain. Ecological genetics is the study of evolution in action: it shows us how populations of plants and animals are evolving in the pastures, forests, waterways, and neighborhoods where we live. As a scientific field, ecological genetics hasn't been around forever. The story of ecological genetics—where it came from, how it got here, and where it's going now—is also a story about how we as a society make sense of the living world.
Amphora is a light-weight repository solution for resource-centered digital projects.
Tethne is a Python package for parsing and analyzing bibliographic metadata. The overarching goal of the project is to make it easier for scholars to build metadata-based network models, such as co-author and co-citation graphs.
The bedrock of our collective knowledge is the interpretation of texts. The epistemic web moves knowledge-making out of the cloisters and into the light of day by extending the semantic web to support the subjectivity of the interpretive process. Encode your interpretations of texts—from scholarly works to websites—and help build the epistemic web.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.