Monday, July 20, 2009
perl problem 9 (HARD)
Teja
PERL PROBLEM 8 (EASY)
HERE IS PERL PROBLEM 8
Write a program to generate 'n' random DNA strings with lengths between 'a' and 'b'. The values of 'n', 'a', and 'b' have to be read from the command line.
Example:
Input:
2
5
7
Output:
AGTCTC
ACCCTGC
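A minimal Perl sketch of one way to solve it. The argument handling and the subroutine name `random_dna` are my own choices, not a posted solution:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @bases = qw(A C G T);

# Build one random DNA string of the requested length.
sub random_dna {
    my ($len) = @_;
    return join '', map { $bases[int rand @bases] } 1 .. $len;
}

# Expect the three values as command-line arguments: n a b.
if (@ARGV == 3) {
    my ($n, $a, $b) = @ARGV;
    for (1 .. $n) {
        my $len = $a + int rand($b - $a + 1);   # uniform length in [a, b]
        print random_dna($len), "\n";
    }
}
```

Run it as, for example, `perl random_dna.pl 2 5 7`.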
Monday, July 6, 2009
No problems this week!
Sorry, there won't be any problems this week. I will be posting problems next Monday.
Thank you
Teja
Wednesday, June 17, 2009
Computer scientists develop model for studying arrangements of tissue networks by cell division
Cambridge, Mass. – June 17, 2009 – Computer scientists at Harvard have developed a framework for studying the arrangement of tissue networks created by cell division across a diverse set of organisms, including fruit flies, tadpoles, and plants.
The finding, published in the June 2009 issue of PLoS Computational Biology, could lead to insights about how multicellular systems achieve (or fail to achieve) robustness from the seemingly random behavior of groups of cells and provide a roadmap for researchers seeking to artificially emulate complex biological behavior.
"We developed a model that allows us to study the topologies of tissues, or how cells connect to each other, and understand how that connectivity network is created through generations of cell division," says senior author Radhika Nagpal, Assistant Professor of Computer Science at the Harvard School of Engineering and Applied Sciences (SEAS) and a core faculty member of the Wyss Institute for Biologically Inspired Engineering. "Given a cell division strategy, even if cells divide at random, very predictable 'signature' features emerge at the tissue level."
Using their computational model, Nagpal and her collaborators demonstrated that the regularity of the tissue, such as the percentage of hexagons and the overall cell shape distribution, can act as an indicator for inferring properties about the cell division mechanism itself. In the epithelial tissues of growing organisms, from fruit flies to humans, the ability to cope with often unpredictable variations (referred to as robustness) is critical for normal development. Rapid growth, entailing large amounts of cell division, must be balanced with the proper regulation of overall tissue and organ architecture.
"Even with modern imaging methods, we can rarely directly 'ask' the cell how it decided upon which way to divide. The computational tool allows us to generate and eliminate hypotheses about cell division. Looking at the final assembled tissue gives us a clue about what assembly process was used," explains Nagpal.
The model also sheds light on a prior discovery made by the team: that many proliferating epithelia, from plants to frogs, show a nearly identical cell shape distribution. While the reasons are not clear, the authors suggest that the high regularity observed in nature requires a strong correlation between how neighboring cells divide. While plants and fruit flies, for example, seem to have conserved cell shape distributions, the two organisms have, based on the computational and experimental evidence, evolved distinct ways of achieving such a pattern.
"Ultimately, the work offers a beautiful example of the way biological development can take advantage of very local and often random processes to create large-scale robust systems. Cells react to local context but still create organisms with incredible global predictability," says Nagpal.
In the future, the team plans to use their approach to detect and study various mutations that adversely affect the cell division process in epithelial tissues. Epithelial tissues are common throughout animals and form important structures in humans, from skin to the inner lining of organs. Deviations from normal division can result in abnormal growth during early development and in the formation of cancers in adults.
"One day we may even be able to use our model to help researchers understand other kinds of natural cellular networks, from tissues to geological crack formations, and, by taking inspiration from biology, design more robust computer networks," adds Nagpal.
###
Nagpal's collaborators included Ankit B. Patel and William T. Gibson, both at Harvard, and Dr. Matthew C. Gibson at the Stowers Institute.
SOLUTION TO PERL PROBLEM 5 (HARD)
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

chomp(my $keyword = <STDIN>);

my $dbh = DBI->connect("DBI:mysql:host=localhost;database=test",
                       "root", "password", { RaiseError => 1 });

# Use a placeholder instead of interpolating $keyword into the SQL,
# which avoids SQL injection.
my $sth = $dbh->prepare("SELECT * FROM table1 WHERE keyword LIKE ?");
$sth->execute("%$keyword%");

while (my @result = $sth->fetchrow_array()) {
    print "@result\n";
}
$dbh->disconnect();
Please post any doubts.
Thank you
Teja
PERL PROBLEM 7 (HARD)
(FROM LEARNING PERL)
Thank you
Teja
PERL PROBLEM 6 ( EASY)
Here is this week's easy perl problem
Write a program which opens up a BLAST output file and spits out the name of the query sequence, the top hit, and how many hits there were.
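A minimal sketch of one way to attack it, assuming classic NCBI BLAST plain-text output, where the query name follows a "Query=" line and each hit's alignment section opens with a ">" line. The subroutine name `summarize_blast` is my own, and a real report may need more careful parsing (e.g. via BioPerl's Bio::SearchIO):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Scan a BLAST text report for the query name, the first (top) hit,
# and the total number of ">" alignment sections.
sub summarize_blast {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my ($query, $top_hit, $hits) = (undef, undef, 0);
    while (<$fh>) {
        $query = $1 if !defined $query && /^Query=\s*(\S+)/;
        if (/^>(\S+)/) {
            $top_hit = $1 unless defined $top_hit;
            $hits++;
        }
    }
    close $fh;
    return ($query, $top_hit, $hits);
}

if (@ARGV) {
    my ($query, $top, $n) = summarize_blast($ARGV[0]);
    print "Query: $query\nTop hit: $top\nHits: $n\n";
}
```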
Thank you
Teja
Tuesday, June 9, 2009
Here is a solution for problem 5 using PHP
Amit has given me a solution for problem 5 using PHP. Just send me an email to get the solution, as I cannot paste HTML code in the blog.
Thank you, Amit
Teja
Proteomics and Bioinformatics Approaches for Identification of Serum Biomarkers to Detect Breast Cancer
Department of Pathology, Johns Hopkins Medical Institutions, Baltimore, MD 21287.
Author for correspondence.
Background: Surface-enhanced laser desorption/ionization (SELDI) is an affinity-based mass spectrometric method in which proteins of interest are selectively adsorbed to a chemically modified surface on a biochip, whereas impurities are removed by washing with buffer. This technology allows sensitive and high-throughput protein profiling of complex biological specimens.
Methods: We screened for potential tumor biomarkers in 169 serum samples, including samples from a cancer group of 103 breast cancer patients at different clinical stages [stage 0 (n = 4), stage I (n = 38), stage II (n = 37), and stage III (n = 24)], from a control group of 41 healthy women, and from 25 patients with benign breast diseases. Diluted serum samples were applied to immobilized metal affinity capture Ciphergen ProteinChip® Arrays previously activated with Ni2+. Proteins bound to the chelated metal were analyzed on a ProteinChip Reader Model PBS II. Complex protein profiles of different diagnostic groups were compared and analyzed using the ProPeak software package.
Results: A panel of three biomarkers was selected based on their collective contribution to the optimal separation between stage 0–I breast cancer patients and noncancer controls. The same separation was observed using independent test data from stage II–III breast cancer patients. Bootstrap cross-validation demonstrated that a sensitivity of 93% for all cancer patients and a specificity of 91% for all controls were achieved by a composite index derived by multivariate logistic regression using the three selected biomarkers.
Conclusions: Proteomics approaches such as SELDI mass spectrometry, in conjunction with bioinformatics tools, could greatly facilitate the discovery of new and better biomarkers. The high sensitivity and specificity achieved by the combined use of the selected biomarkers show great potential for the early detection of breast cancer.
perl problem -5 (hard)
Here is the perl problem 5
A simple search engine.
Write a program which, when given a keyword as input, fetches all the records that contain the given keyword. Assume that you have a MySQL database with a table containing records.
Input:
flu
Output:
Swine flu
Bird flu
influenza
etc etc
Hint: use the DBI module.
solution to perl problem -3
Here is the program.
#!/usr/bin/perl
use strict;
use warnings;

my $File_Path = "c:\\teja\\2UZT.pdb";
my %chain;
my $Total_Seqres = 0;
my $Total_Helix  = 0;

open(my $fh, '<', $File_Path) or die "Cannot open $File_Path: $!";
while (<$fh>) {
    if (/^SEQRES/) {
        # Field 2 is the chain ID, field 3 the residue count;
        # count each chain's residues only once.
        my @Seqres = split ' ', $_;
        if (!exists $chain{$Seqres[2]}) {
            $chain{$Seqres[2]} = $Seqres[3];
            $Total_Seqres += $Seqres[3];
        }
    }
    if (/^HELIX/) {
        # The last field of a HELIX record is the helix length.
        $Total_Helix += (split ' ', $_)[-1];
    }
}
close($fh);

print "\n Helix Value: $Total_Helix \n";
print "\n Total SEQRES: $Total_Seqres \n";
my $Per_Value = ($Total_Helix * 100) / $Total_Seqres;
print "\n The percentage of amino acids in the protein that are in helix regions is: $Per_Value \n";
Thanks to Amit for sending me the initial solution.
Teja
Monday, June 8, 2009
Bioinformatics Successfully Predicts Immune Response To One Of The Most Complex Viruses Known
The use of computers to advance human disease research – known as bioinformatics -- has received a major boost from researchers at the La Jolla Institute for Allergy & Immunology (LIAI), who have used it to successfully predict immune response to one of the most complex viruses known to man – the vaccinia virus, which is used in the smallpox vaccine. Immune responses, which are essentially how the body fights a disease-causing agent, are a crucial element of vaccine development.
"We are excited because this further validates the important role that bioinformatics can play in the development of diagnostic tools and ultimately vaccines," said Alessandro Sette, Ph.D., an internationally known vaccine expert and head of LIAI's Emerging Infectious Disease and Biodefense Center. "We've shown that it can successfully reveal – with a very high degree of accuracy -- the vast majority of the epitopes (targets) that would trigger an effective immune response against a complex pathogen."
Bioinformatics holds significant interest in the scientific community because of its potential to move scientific research forward more quickly and at less expense than traditional laboratory testing.
The findings were published this week in a paper, "A consensus epitope prediction approach identifies the breadth of murine TCD8+-cell responses to vaccinia virus," in the online version of the journal Nature Biotechnology. LIAI scientist Magdalini Moutaftsi was the lead author on the paper.
While bioinformatics – which uses computer databases, algorithms and statistical techniques to analyze biological information -- is already in use as a predictor of immune response, the LIAI research team's findings were significant because they demonstrated an extremely high rate of prediction accuracy (95 percent) in a very complex pathogen – the vaccinia virus. The vaccinia virus is a non-dangerous virus used in the smallpox vaccine because it is related to the variola virus, which is the agent of smallpox. The scientific team was able to prove the accuracy of their computer results through animal testing.
"Before, we knew that the prediction methods we were using were working, but this study proves that they work very well with a high degree of accuracy," Sette said.
The researchers focused their testing on the Major Histocompatibility Complex (MHC), which binds to certain epitopes and is key to triggering the immune system to attack a virus-infected cell. Epitopes are pieces of a virus that the body's immune system focuses on when it begins an immune response. By understanding which epitopes will bind to the MHC molecule and cause an immune attack, scientists can use those epitopes to develop a vaccine to ward off illness – in this case to smallpox.
The scientists were able to find 95 percent of the MHC binding epitopes through the computer modeling. "This is the first time that bioinformatics prediction for epitope MHC binding can account for almost all of the (targeted) epitopes that are existing in very complex pathogens like vaccinia," said LIAI researcher Magdalini Moutaftsi. The LIAI scientists theorize that the bioinformatics prediction approach for epitope MHC binding will be applicable to other viruses.
"The beauty of the virus used for this study is that it's one of the most complex, large viruses that exist," said Moutaftsi. "If we can predict almost all (targeted) epitopes from such a large virus, then we should be able to do that very easily for less complex viruses like influenza, herpes or even HIV, and eventually apply this methodology to larger microbes such as tuberculosis."
The big advantage of using bioinformatics to predict immune system targets, explained Sette, is that it overcomes the need to manufacture and test large numbers of peptides in the laboratory to find which ones will initiate an immune response. Peptides are amino acid pieces that potentially can be recognized by the immune system. "There are literally thousands of peptides," explained Sette. "You might have to create and test hundreds or even thousands of them to find the right ones," he said. "With bioinformatics, the computer does the screening based on very complex mathematical algorithms. And it can do it in much less time and at much less expense than doing the testing in the lab."
The LIAI scientific team verified the accuracy of their computer findings by comparing the results against laboratory testing of the peptides and whole infectious virus in mice. "We studied the total response directed against infected cells," Sette said. "We compared it to the response against the 50 epitopes that had been predicted by the computer. We were pleased to see that our prediction could account for 95% of the total response directed against the virus."
perl problem -4 (easy)
Format of perl problems
Saturday, June 6, 2009
Reply needed !
Friday, June 5, 2009
Geography and history shape genetic differences in humans
New research indicates that natural selection may shape the human genome much more slowly than previously thought. Other factors -- the movements of humans within and among continents, the expansions and contractions of populations, and the vagaries of genetic chance – have heavily influenced the distribution of genetic variations in populations around the world. The study, conducted by a team from the Howard Hughes Medical Institute, the University of Chicago, the University of California and Stanford University, is published June 5 in the open-access journal PLoS Genetics.
In recent years, geneticists have identified a handful of genes that have helped human populations adapt to new environments within just a few thousand years—a strikingly short timescale in evolutionary terms. However, the team found that for most genes, it can take at least 50,000-100,000 years for natural selection to spread favorable traits through a human population. According to their analysis, gene variants tend to be distributed throughout the world in patterns that reflect ancient population movements and other aspects of population history. "We don't think that selection has been strong enough to completely fine-tune the adaptation of individual human populations to their local environments," says co-author Jonathan Pritchard. "In addition to selection, demographic history -- how populations have moved around -- has exerted a strong effect on the distribution of variants."
To determine whether the frequency of a particular variant resulted from natural selection, Pritchard and his colleagues compared the distribution of variants in parts of the genome that affect the structure and regulation of proteins to the distribution of variants in parts of the genome that do not affect proteins. Since these neutral parts of the genome are less likely to be affected by natural selection, they reasoned that studying variants in these regions should reflect the demographic history of populations.
The researchers found that many previously identified genetic signals of selection may have been created by historical and demographic factors rather than by selection. When the team compared closely related populations they found few large genetic differences. If the individual populations' environments were exerting strong selective pressure, such differences should have been apparent.
Selection may still be occurring in many regions of the genome, says Pritchard. But if so, it is exerting a moderate effect on many genes that together influence a biological characteristic. "We don't know enough yet about the genetics of most human traits to be able to pick out all of the relevant variation," says Pritchard. "As functional studies go forward, people will start figuring out the phenotypes that are associated with selective signals," says lead author Graham Coop. "That will be very important, because then we can figure out what selection pressures underlie these episodes of natural selection."
But even with further research, much will remain unknown about the processes that have resulted in human traits. In particular, Pritchard and Coop urge great caution in trying to link selection with complex characteristics like intelligence. "We're in the infancy of trying to understand what signals of selection are telling us," says Coop, "so it's a very long jump to attribute cultural features and group characteristics to selection."
perl problem - 3
Thursday, June 4, 2009
Consortium Publishes Finished Mouse Genome Assembly
NEW YORK (GenomeWeb News) – Researchers from the Mouse Genome Sequencing Consortium and their collaborators have created a finished, high quality assembly of the mouse genome.
In a paper appearing online last night in PLoS Biology, the team used clone-based sequencing and assembly to generate a high quality mouse genome assembly called "Build 36." In so doing, they filled in thousands of gaps in — and added millions of bases of sequence to — the draft version of the mouse genome, published several years ago. By comparing the finished mouse and human genomes, the researchers were able to more fully appreciate conserved regions in the genomes as well as those specific to mice or rodents.
"With the benefit of hindsight, we now see how incomplete our initial summary of the mouse genome was," co-lead author Deanna Church, a staff scientist at the US National Institutes of Health's National Center for Biotechnology Information, said in a statement. "The new findings will allow us to dismiss some commonly held misconceptions and, more importantly, to reveal many hidden secrets of mouse biology."
The Mouse Genome Sequencing Consortium and Mouse Genome Analysis Group published a draft version of the mouse genome in Nature in 2002. That draft assembly, called MGSCv3, was generated using whole genome shotgun sequencing.
But the team also has been working in parallel to create a more refined mouse genome assembly using clone-based sequencing and mapping. To do this, the researchers sequenced BAC clones covering the entire mouse genome, incorporating information from the already available draft genome sequence. Nearly all of the sequencing was done at Washington University's Genome Center, the Broad Institute, Baylor College's Genome Center, and the Wellcome Trust Sanger Institute.
Using this data, the researchers assembled Build 36, closing more than 175,000 gaps in the mouse draft genome. The assembly contains 139 million bases of new sequence as well as millions more bases that appear to have been misassembled in the draft genome.
"The mouse genome assembly shows marked improvements over the MGSCv3," the authors noted, "with an increased amount of ordered and oriented sequence placed on a chromosome … and increased base level accuracy due to the addition of clone-based finished sequence."
Based on this assembly, the team concluded that the mouse genome contains roughly 20,210 protein-coding genes — nearly 1,200 more than the human genome. In particular, the researchers noted, Build 36 contains 1,259 mouse specific genes that were previously missing or misrepresented.
The researchers identified repetitive elements, repeats, and segmentally duplicated regions in the mouse genome that appear to harbor mouse or rodent specific sequence. They also uncovered shared long non-coding RNAs in the mouse and human genomes as well as ncRNAs present in mice but missing in humans.
"These new findings are extremely important in helping us to separate genes that underpin biology that is the same across all mammals, from genes that make humans and mice so different from one another," co-senior author Chris Ponting, a group leader at the University of Oxford's MRC Functional Genomics Unit, said in a statement.
The researchers are continuing to refine the mouse genome. A currently available assembly, Build 37, reportedly offers further improvements over Build 36, though the authors noted that some regions of the mouse genome "remain under review and will be addressed in forthcoming assemblies."
The researchers conceded that the clone-based sequencing and assembly is more expensive and time consuming than whole genome methods, but they argued that the extra investment is warranted in situations where researchers require a more refined view of the genome.
"[I]t's clear from our analysis of the finished mouse genome assembly that draft [whole genome sequence and assemblies] will always poorly reflect lineage-specific biology," the authors noted. "Finished genome sequence has proved essential to understanding the full range of biology for both the human and the mouse genome, and will no doubt prove similarly informative for other vertebrate species."
Perl problem -2
solution to perl problem -1
Wednesday, June 3, 2009
“Longevity” gene behind living beyond 100 uncovered
Scientists at the Albert Einstein College of Medicine of Yeshiva University have found out why some people live to 100 or more, despite the fact that they have as many, or sometimes even more, harmful gene variants as younger people. The scientists have uncovered favourable “longevity” genes that provide very old people with protection against the harmful effects of bad genes.
Reported in the journal PLoS Computational Biology, the novel approach used by the researchers may lead to the development of new drugs to protect against age-related diseases.
“We hypothesized that people living to 100 and beyond must be buffered by genes that interact with disease-causing genes to negate their effects,” says senior study author Dr. Aviv Bergman, a professor in the departments of pathology and neuroscience at Einstein.
In order to test their hypothesis, the researchers examined individuals enrolled in Einstein’s Longevity Genes Project, initiated in 1998 to investigate longevity genes in a selected population, namely Ashkenazi (Eastern European) Jews.
Since Ashkenazi Jews are descended from a founder group of about 30,000 people, they are relatively genetically homogenous. This simplifies the challenge of associating traits, such as age-related diseases and longevity, with the genes that determine them.
In all, 305 Ashkenazi Jews more than 95 years old and a control group of 408 unrelated Ashkenazi Jews participated in the study. They were grouped into cohorts representing each decade of lifespan from the 50s on up.
The researchers tested DNA samples collected from each cohort, and determined the prevalence of 66 genetic markers present in 36 genes associated with ageing. Some disease-related gene variants were as prevalent or even more prevalent in the oldest cohorts of Ashkenazi Jews than in the younger ones.
As Dr. Bergman had predicted, genes associated with longevity also became more common in each succeeding cohort.
“These results indicate that the frequency of deleterious genotypes may increase among people who live to extremely old ages because their protective genes allow these disease-related genes to accumulate,” he says.
The researchers also constructed a network of gene interactions, which enabled them to find that the favourable variant of the gene CETP acts to buffer the harmful effects of the disease-causing gene Lp(a).
If future research also shows that a single longevity gene buffers against several disease-causing genes, the finding may lead to drugs that mimic the action of the longevity gene to protect against cardiovascular disease and other age-related diseases.
“This study shows that our approach, which was inspired by a theoretical model, can reveal underlying mechanisms that explain seemingly paradoxical observations in a complex trait such as aging,” says Dr. Bergman.
“So we’re hopeful that this method could also help uncover the mechanisms, the gene interactions, responsible for other complex biological traits such as cancer and diabetes,” he added.
Meanwhile, from the 66 genetic markers examined in this study, the researchers are now using a high-throughput technology that allows them to assay one million genetic markers throughout the human genome. Their goal is to find additional genetic networks that are involved in the process of ageing. (ANI)