Last Updated: Oct 11, 2012 - 10:22:56 PM
Novel Data-Mining Approach Systematically Links Genes to Traits

Apr 6, 2005 - 4:28:00 PM
 
[RxPG] With exponential advances in computing power over the past ten years, data-generating capacity has far outpaced anyone’s ability to mine the rich seams of information. This is especially true in the field of genomics.

So far, the genomes of over 222 prokaryotes (bacteria), 21 archaea (primitive bacteria-like extremophiles), and 17 eukaryotes (from yeast to fly and rat to human) have been sequenced, according to the Center for Biological Sequence Analysis in Denmark (http://www.cbs.dtu.dk/services/GenomeAtlas/). All these genomes promise to provide powerful insights into the biological processes of life, but such insights come only with painstaking analysis by trained experts. Matching genotype to phenotype (the visible or measurable characteristics of species) is a major challenge in what Francis Collins, Director of the United States National Human Genome Research Institute, has called the “post-genomic era.”

In a new study, Peer Bork and a team of bioinformatics-savvy molecular biologists tested a new approach to extracting biologically meaningful information from the massive MEDLINE database. The US National Library of Medicine’s MEDLINE contains over 12 million abstracts from thousands of publications dating back to 1965. Combining automated literature mining with comparative genomics—which compares genome sequences of different organisms to discern differences and similarities in gene content—the authors conducted a systematic search for associations between genes and phenotypic traits. Their approach automates tasks that typically require human curation.

Recognizing that the best source of information on species phenotypic traits is the scientific literature where biologists describe them, the authors first ran a search to identify associations between species and traits in MEDLINE abstracts. Words that tended to occur with subsets of species, the authors reasoned, were more likely to reflect particular traits. From a total of 255,249 MEDLINE abstracts showing any connection to 92 prokaryotic species with sequenced genomes, 172,967 nouns showed meaningful associations related to the species’ traits. “Flagellum” and “motility” showed up more often in self-propelling species, for example, and “endosymbiont” aptly appeared with the intracellular bacterium (Buchnera aphidicola) that inhabits aphids.
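The idea of finding words that occur preferentially with a subset of species can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the whitespace tokenizer, the `min_gap` cutoff, the function names, and the four toy abstracts (including the E. coli examples) are all assumptions made for illustration, and the study itself restricted the analysis to nouns.

```python
from collections import Counter, defaultdict

def species_word_profile(abstracts):
    """Map each word to the fraction of each species' abstracts that contain it.

    abstracts: iterable of (species, text) pairs (hypothetical input format).
    """
    n_abstracts = Counter()
    hits = defaultdict(Counter)
    for species, text in abstracts:
        n_abstracts[species] += 1
        # Count each word once per abstract, using a naive whitespace tokenizer.
        for word in set(text.lower().split()):
            hits[word][species] += 1
    return {
        word: {sp: hits[word][sp] / n_abstracts[sp] for sp in n_abstracts}
        for word in hits
    }

def discriminating_words(profile, min_gap=0.5):
    """Keep words whose frequency differs by at least min_gap between species."""
    return sorted(
        word for word, freqs in profile.items()
        if max(freqs.values()) - min(freqs.values()) >= min_gap
    )

# Toy data, invented for this example.
toy_abstracts = [
    ("E. coli", "flagellum drives motility"),
    ("E. coli", "motility assay results"),
    ("B. aphidicola", "endosymbiont of aphids"),
    ("B. aphidicola", "endosymbiont genome results"),
]

profile = species_word_profile(toy_abstracts)
print(discriminating_words(profile, min_gap=0.75))
```

A word such as "results", which appears equally often for both toy species, is filtered out, while "motility" and "endosymbiont" survive because each is tied to one species.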

Next, Bork and colleagues detected the presence or absence of over 200,000 evolutionarily conserved genes across the 92 species and sorted the results into species–word and species–gene groups. The analysis revealed a number of words and genes with similar distribution in related species, leading to over 2,700 significant associations between trait-descriptive words and orthologous (evolved from a common ancestor) groups of genes. These genes encode over 28,000 proteins. Many were already known—including genes involved in pathogenicity, biodegradation and biosynthesis, and photosynthesis—but many, the authors note, are “novel” or of “unexpected character and complexity.”
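Matching words and genes by their distribution across species can be sketched as a comparison of presence/absence sets. The article does not say which statistic the authors used, so the Jaccard similarity below is a stand-in chosen for simplicity, and the species labels, gene-group names, and `threshold` value are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of species (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def link_words_to_genes(word_species, gene_species, threshold=0.8):
    """Pair each word with gene groups that occur in a similar set of species."""
    links = []
    for word, ws in word_species.items():
        for gene, gs in gene_species.items():
            score = jaccard(ws, gs)
            if score >= threshold:
                links.append((word, gene, score))
    # Strongest links first, then alphabetical for a stable order.
    return sorted(links, key=lambda t: (-t[2], t[0], t[1]))

# Toy data, invented for this example: species in which each word or
# orthologous gene group is found.
word_species = {
    "motility": {"sp1", "sp2", "sp3"},
    "endosymbiont": {"sp4"},
}
gene_species = {
    "OG_flagellin": {"sp1", "sp2", "sp3"},
    "OG_ribosomal": {"sp1", "sp2", "sp3", "sp4"},
    "OG_symbiosis": {"sp4"},
}

links = link_words_to_genes(word_species, gene_species)
for word, gene, score in links:
    print(word, gene, score)
```

The gene group present in every species ("OG_ribosomal") links to no word, which reflects the underlying reasoning: only genes whose distribution tracks a trait-descriptive word are informative about that trait.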

And it is the ability to uncover unexpected relationships across numerous genes and genomes, patterns likely to escape human analysis, that makes this approach so powerful. Among these unexpected match-ups, Bork and colleagues linked a number of food and food-poisoning-related terms with metabolic-enzyme-coding genes. All 37 genes predicted to play a role in food spoilage and toxicity are present in food-borne pathogens but not in most other prokaryotes. By assigning functions to these previously uncharacterized genes, the authors could also assign new roles to pathways that use the genes. For example, by linking two genes with pathways that metabolize propanediol and ethanolamine (compounds found almost exclusively in highly hazardous food-borne pathogens), the authors predict that propanediol and ethanolamine pathways are “crucial genomic determinants of pathogenicity associated with food poisoning.”

That their analysis linked so many predicted genes with bacterial pathogenicity might be expected, the authors note, since both genome sequencing and biological research are heavily focused on human health. Given the weekly increase in the number of genomes sequenced and in MEDLINE entries, the method outlined here should provide a valuable tool to help researchers narrow the gap between the promise and payoff of the genomic revolution.



Publication: (2005) A Novel Data-Mining Approach Systematically Links Genes to Traits. PLoS Biol 3(5): e166.
 Additional information about the news article
DOI: 10.1371/journal.pbio.0030166

Published: April 5, 2005

Copyright: © 2005 Public Library of Science. This is an open-access article distributed under the terms of the Creative Commons Attribution License. PLoS Biology is an open-access journal published by the nonprofit organization Public Library of Science.
© All rights reserved by RxPG Medical Solutions Private Limited (India)