1. Determine the codon usage for a DNA sequence
- Create a string containing a DNA sequence
- In a for loop:
    - use the function substr($seq, $offset, 3) to extract codons, increasing $offset by 3 on each pass
    - store each codon and the number of times it has occurred in a hash
- Report the codon usage
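
The steps above could be sketched roughly as follows. This is only one possible solution, not the official answer: the sequence is made up for illustration, and the variable names %codon_count and $codon are assumptions that do not come from the exercise.

```perl
use strict;
use warnings;

# Made-up example sequence; its length is a multiple of 3.
my $seq = "ATGGCCATGGCGTAA";

# Maps each codon to the number of times it has been seen.
my %codon_count;

# Step through the sequence three bases at a time.
# The condition stops the loop before an incomplete trailing codon.
for (my $offset = 0; $offset + 3 <= length($seq); $offset += 3) {
    my $codon = substr($seq, $offset, 3);
    $codon_count{$codon}++;
}

# Report the codon usage, sorted alphabetically by codon.
for my $codon (sort keys %codon_count) {
    print "$codon\t$codon_count{$codon}\n";
}
```

Incrementing $codon_count{$codon} works even the first time a codon is seen, because Perl treats a missing hash value as 0 when it is incremented.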
2. Create a report of the expression levels and sequences of genes expressed in liver.
- There is a tab delimited file of expression data here.
- Each line of this file contains the following information: GeneID, tissue the gene is expressed in, expression level, and gene sequence.
- Open this file in your script using open().
- In a loop:
    - read a line from the file
    - remove the "\n" at the end of the line (for example with chomp)
    - split the line on tabs using split(/\t/, $line)
    - store the data from each line in 3 different hashes, as described below

      hash      key      value
      %tissue   GeneID   tissue
      %expr     GeneID   expression level
      %seq      GeneID   sequence
- Now search the %tissue hash for genes that are expressed in liver. Make a list of the GeneIDs corresponding to these genes.
- Create a report of the GeneID and expression level of these genes.
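
A minimal sketch of how this exercise might be approached is shown below. It is not the official solution. Because the real data file is not reproduced here, the sketch reads three made-up lines from a string through an in-memory filehandle; the gene names, values, and the variable names $sample, $fh, and @liver_genes are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Made-up sample data standing in for the real file. Each line holds:
# GeneID <tab> tissue <tab> expression level <tab> sequence
my $sample = join("\n",
    join("\t", "gene1", "liver", "12.5", "ATGGCC"),
    join("\t", "gene2", "brain", "3.1",  "ATGTTT"),
    join("\t", "gene3", "liver", "40.2", "ATGAAA"),
) . "\n";

# Opening a reference to a string lets it be read like a file.
# For the real exercise, replace \$sample with the path to the data file.
open(my $fh, '<', \$sample) or die "Cannot open expression data: $!";

my (%tissue, %expr, %seq);
while (my $line = <$fh>) {
    chomp $line;                 # remove the trailing "\n"
    next if $line eq '';         # skip blank lines
    my ($gene_id, $tissue, $level, $sequence) = split(/\t/, $line);
    $tissue{$gene_id} = $tissue;
    $expr{$gene_id}   = $level;
    $seq{$gene_id}    = $sequence;
}
close($fh);

# Make a list of the GeneIDs whose tissue is liver.
my @liver_genes = grep { $tissue{$_} eq 'liver' } sort keys %tissue;

# Report the GeneID and expression level of each liver gene.
for my $gene_id (@liver_genes) {
    print "$gene_id\t$expr{$gene_id}\n";
}
```

Keying all three hashes on GeneID means that once a gene has been found in %tissue, its expression level and sequence can be looked up directly; adding $seq{$gene_id} to the print statement would include the sequence in the report.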