Genetic data and biodiversity
3 years 7 months ago - 3 years 7 months ago #1104
by hrezabek
Heath Rezabek // librarian and futurist
@ heath_rezabek // sxsw 2015 // linkedin
hrezabek created the topic: Genetic data and biodiversity
Some account difficulties have made me late to the party, but I'm glad to be able to participate at last!
About me: I'm a librarian living and working in Austin, TX, currently entering the Advanced Study program at UT-Austin's School of Information for Digital Libraries, Archives, and Long Term Preservation. I've worked with The Long Now Foundation in SF on their Manual for Civilization project, and am liaison to Long Now for LM1. My particular interest lies in redundant and resilient formats for endangered cultural and biological data, and in the possibility of an open framework for globally mirroring efforts such as LM1's.
(Annoyance alert: I'll tend to embed URLs as visible links, for the sake of possible future printouts or otherwise lossy scrapes of forum content.)
A particular interest of mine is the presence and preservation, within the Public Archive, of recoverable traces of Earth's biodiversity.
Gregory Benford's 1992 thought-experiment paper, "Saving the Library of Life" -- http://www.pnas.org/content/89/22/11098.full.pdf -- has long been an inspiration to me. In it he argues that biodiversity was being depleted so quickly that taxonomy had become a luxury, and that preserving recoverable traces via sampling should be the priority. The rate of depletion is greater now than it was then, though his sampling strategy (likely) remains impractical.
However, I've recently realized that Craig Venter's work on his 'Digital Biological Converter' project has reached a stage where complete genomic data alone could be enough to allow for the eventual recovery of species. See -- http://motherboard.vice.com/read/elon-musk-and-craig-venter-want-to-print-life-on-mars -- and -- http://www.theguardian.com/science/2013/oct/13/craig-ventner-mars
This is good news. It means that we might plausibly turn our energy towards bolstering and expanding existing efforts to compile comprehensive genomic data, and that the Public Archive might strive to mirror such holdings. I'm watching such efforts closely.
At this point it's worth spelling out another opinion I hold: from an archival perspective, I feel we could gain a lot by favoring the inclusion of multiple extant, parallel, intact, and in some cases even partially redundant collections, for two reasons. The first is that it would allow us to leverage and amplify existing efforts, many of which have been at their work for a long time. We'd thus benefit from the lessons they've learned and the decisions they've made. The second is that, by allowing for multiple parallel collections, we avoid having to choose between them, and a future archive explorer gains whatever can be learned through comparative analysis of the various collections. In short: if we can manage the storage capacity to mirror multiple collections of cultural or biodiversity data, we should try to do so. Mercifully, there are a few formats which could allow for immensely high-density storage.
My own favorite is digital DNA storage -- http://hms.harvard.edu/news/writing-book-dna-8-16-12 -- while fused quartz is an interesting contrasting example of physically resilient, write-once digital storage -- http://www.hitachi.com/New/cnews/month/2014/10/141020a.html
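To make the idea of digital DNA storage a little more concrete, here is a minimal sketch, assuming a simple mapping of two bits per nucleotide. The mapping table and the function names are hypothetical, chosen only for illustration; the work linked above uses its own encoding, and practical schemes add addressing and error correction that this sketch omits.

```python
# Hypothetical 2-bits-per-base mapping, for illustration only.
# Real DNA storage schemes use different encodings plus error correction.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_to_dna(data: bytes) -> str:
    """Turn bytes into a string of nucleotide letters (4 bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_from_dna(strand: str) -> bytes:
    """Reverse encode_to_dna: nucleotide letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode_to_dna(b"Library of Life")
    print(strand)
    print(decode_from_dna(strand))
```

Under this toy mapping every byte becomes four bases, which hints at why the medium is attractive for density, though nothing here models synthesis or sequencing errors.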
Whatever media or formats we use, my hope is that we can achieve densities that allow for multiple, parallel collections. With that in mind, in the realm of biodiversity, I've been digging into the Encyclopedia of Life project -- http://eol.org/ -- as championed by E. O. Wilson. It's unclear to me to what extent the project plans to preserve and present genomic data, but if it doesn't, it should; and if it can't, we ought to identify similar projects which can and do.
Food for thought...
- Heath
Last Edit: 3 years 7 months ago by hrezabek.
The following user(s) said Thank You: Mike de Sousa, SteveC, nmvg1468, Doug