I'm writing some Perl code for a class I'm taking. BioPerl is a very well developed and documented set of tools written in Perl, but I thought it would be cool to write my own set of classes. It's pretty raw, but I think it's a good start.
bio.pm: the basic classes for manipulating biological data
codons.pm: holds some codon data and creates a hash to look up codons
sequences.fasta: contains a sequence you can use for testing
Here are some examples of what you can do.
Create some sequence objects
use bio;
$seq = new bio::DNASeq('ATCGAAATTTGGGC',{'desc'=>'Some optional description'});
$seq = new bio::RNASeq('AUCGAAAUUUGGGC');
$seq = new bio::proteinSeq('MAAW*');
print "My protein sequence is ",$seq->seq,"\n";
Read sequences from FASTA file
$r = new bio::FASTAReader("sequences.fasta");
while($seq = $r->nextSeq()){
print "My Sequence: ",$seq->seq,"\n";
}
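To show what a FASTA reader has to do under the hood, here is a minimal sketch of the parsing step. It is not the code in bio.pm: read_fasta and the record layout (a hash with desc and seq keys) are hypothetical names chosen for illustration, and the input is an in-memory string so that no file is needed.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of bio.pm: parse FASTA text from a filehandle
# into a list of { desc => ..., seq => ... } records.
sub read_fasta {
    my ($fh) = @_;
    my (@records, $current);
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;              # strip the line ending
        next if $line eq '';               # skip blank lines
        if ($line =~ /^>(.*)/) {           # a header line starts a new record
            $current = { desc => $1, seq => '' };
            push @records, $current;
        }
        elsif ($current) {                 # sequence lines are concatenated
            $current->{seq} .= $line;
        }
    }
    return @records;
}

# Read from an in-memory string so the example needs no file on disk.
my $fasta = ">seq1 first test sequence\nATCGAA\nATTTGG\n>seq2\nGGCC\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory file: $!";
my @records = read_fasta($fh);
close $fh;

print "$_->{desc}: $_->{seq}\n" for @records;
```

A sequence that is wrapped over several lines comes back as one string, which is the main thing a reader needs to get right.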
Translate sequences
print "Sequence as RNA: ",($seq->translate('rna'))->seq,"\n";
print "Sequence as Protein: ",($seq->translate('protein'))->seq,"\n";
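The idea behind the translate calls above can be sketched with a codon hash, roughly in the spirit of codons.pm. This is only an illustration under assumptions: dna_to_rna, rna_to_protein and %codon_to_aa are made-up names, the table is deliberately partial (a full one has 64 entries), and unknown codons are written as X, which may not match what bio.pm does.

```perl
use strict;
use warnings;

# Partial codon table for illustration only; a full table has 64 entries.
my %codon_to_aa = (
    AUG => 'M', GCU => 'A', GCC => 'A', GCA => 'A', GCG => 'A',
    UGG => 'W', UAA => '*', UAG => '*', UGA => '*',
);

# DNA (coding strand) to RNA: replace every T with U.
sub dna_to_rna {
    my ($dna) = @_;
    (my $rna = uc $dna) =~ tr/T/U/;
    return $rna;
}

# RNA to protein: read three bases at a time and look each codon up.
# Assumption: codons missing from the table become 'X'.
sub rna_to_protein {
    my ($rna) = @_;
    my $protein = '';
    for (my $i = 0; $i + 3 <= length($rna); $i += 3) {
        my $codon = substr($rna, $i, 3);
        $protein .= $codon_to_aa{$codon} // 'X';
    }
    return $protein;
}

my $rna     = dna_to_rna('ATGGCTGCATGGTAA');
my $protein = rna_to_protein($rna);
print "Sequence as RNA: $rna\n";
print "Sequence as Protein: $protein\n";
```

Any trailing one or two bases that do not form a full codon are ignored in this sketch.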
Get GC Content
# GCContent returns (% of GCs, % of Gs, % of Cs)
print "GC Content: ",($seq->GCContent)[0],"\n";
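For anyone curious how such a three-value list could be computed, here is a small sketch. gc_content is a hypothetical stand-in, not the GCContent method itself, and it assumes the values are percentages on a 0 to 100 scale.

```perl
use strict;
use warnings;

# Hypothetical stand-in for GCContent: returns (% of G+C, % of G, % of C).
# Assumption: percentages on a 0-100 scale; an empty sequence gives zeros.
sub gc_content {
    my ($seq) = @_;
    my $len = length($seq);
    return (0, 0, 0) unless $len;
    my $g = ($seq =~ tr/Gg//);    # tr with an empty replacement only counts
    my $c = ($seq =~ tr/Cc//);
    return (100 * ($g + $c) / $len, 100 * $g / $len, 100 * $c / $len);
}

my ($gc, $g, $c) = gc_content('ATCGAAATTTGGGC');
printf "GC: %.1f%%  G: %.1f%%  C: %.1f%%\n", $gc, $g, $c;
```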
Compute any rate of occurrence
# Get the rate of occurrence of some sequences within your sequence
# returns a hash ref keyed by the sequences you passed in
%h = %{$seq->rateOfOccurence(('A','T','C','G'))};
print "% of As = $h{A}\n";
print "% of Gs = $h{G}\n";
print "% of Cs = $h{C}\n";
print "% of Ts = $h{T}\n";
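As a rough sketch of how a rate like this might be computed for patterns of any length: rate_of_occurrence below is a hypothetical function, not the method in bio.pm. It assumes overlapping matches are counted and that the rate is the match count divided by the number of positions where the pattern could start, times 100. For single letters this is simply the percentage of that base.

```perl
use strict;
use warnings;

# Hypothetical stand-in for rateOfOccurence. Assumptions: overlapping matches
# count, and rate = 100 * matches / possible start positions.
sub rate_of_occurrence {
    my ($seq, @patterns) = @_;
    my %rate;
    for my $p (@patterns) {
        my $windows = length($seq) - length($p) + 1;
        if ($p eq '' || $windows < 1) {    # nothing sensible to count
            $rate{$p} = 0;
            next;
        }
        my ($count, $pos) = (0, 0);
        while (($pos = index($seq, $p, $pos)) != -1) {
            $count++;
            $pos++;    # advance by one so overlapping matches are counted
        }
        $rate{$p} = 100 * $count / $windows;
    }
    return \%rate;
}

my %h = %{ rate_of_occurrence('ATCGAAATTTGGGC', 'A', 'T', 'C', 'G', 'GG') };
printf "%% of %s = %.1f\n", $_, $h{$_} for sort keys %h;
```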
