Question

PERL SCRIPT PROGRAMMING

Write a program that reads a DNA or RNA sequence in FASTA format and determines the count of each nucleotide in the sequence. Your task is to write such a program as a Perl script.

The Perl script should meet the following requirements:

- It works with FASTA files containing only one sequence.

- The name of the file can be given on the command line when the script is invoked. If the name of the FASTA file is not specified on the command line, the script reads the sequence information (in FASTA format) from standard input.

- The script confirms that the record is in FASTA format. If it is not, the script issues an error message and terminates.

- Sequence information in the FASTA file can be in upper or lower case.

- Output information is prefaced by the sequence identifier from the FASTA header.
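
The format check and identifier extraction from the list above can be sketched as follows. This is only a minimal sketch: it assumes that a valid header is a line starting with '>' immediately followed by the identifier, and the helper name fasta_id is hypothetical.

```perl
use strict;
use warnings;

# fasta_id is a hypothetical helper: it returns the sequence identifier from
# a FASTA header line, or dies if the line does not look like a header.
# Assumption: a valid header starts with '>' immediately followed by the identifier.
sub fasta_id {
    my ($header) = @_;
    die "Error: input is empty\n" unless defined $header;
    $header =~ /^>(\S+)/
        or die "Error: input is not in FASTA format\n";
    return $1;
}

print fasta_id(">seq1 example description\n"), "\n";
```

Calling fasta_id on a line that does not start with '>' raises the error, which a script can leave uncaught so that it terminates with the message.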

Notes:

To loop through the characters of a string you can use a construction such as

while( $c = ( chop $str ) )
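
A small sketch of how that loop might be used to tally characters is shown here. The helper name count_chars and the sample string are assumptions made for illustration. Note that chop takes characters from the end of the string, and that the loop stops on any false value, so a literal "0" character would end it early (not a concern for nucleotide letters).

```perl
use strict;
use warnings;

# count_chars is a hypothetical helper that tallies the characters of a
# string, ignoring case, by repeatedly chopping the last character off.
sub count_chars {
    my ($str) = @_;
    my %count;
    # chop removes and returns the final character; once the string is
    # empty it returns an empty (false) value and the loop ends.
    while ( my $c = chop $str ) {
        $count{ uc $c }++;
    }
    return %count;
}

my %n = count_chars("acgTTa");
print "$_: $n{$_}\n" for sort keys %n;
```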

For example

Given a FASTA file with content such as

your output should look like

Do not include newline or carriage-return characters in any of your counts.
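
One thing to watch for: by default chomp removes only a trailing "\n", so a file with Windows line endings leaves a "\r" on each line. A minimal sketch of stripping both follows; the helper name strip_eol is hypothetical.

```perl
use strict;
use warnings;

# strip_eol is a hypothetical helper that removes any trailing newline or
# carriage-return characters from a line, so they are never counted.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/[\r\n]+\z//;
    return $line;
}

print length( strip_eol("ACGT\r\n") ), "\n";   # prints 4
```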

Explanation / Answer

use strict;

use warnings;

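# take the FASTA file name from the command line; if none is given, read standard input
my $fh;
if (@ARGV) {
    my $filename = $ARGV[0];
    # open the file for reading; exit with an error message if that fails
    open($fh, '<', $filename)
        or die "Could not open file '$filename': $!\n";
} else {
    $fh = \*STDIN;
}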

# read the first line, which must be the FASTA header
my $row = <$fh>;

# confirm the record is in FASTA format: the header must start with '>'.
# the sequence identifier is the first word after '>'
my ($id) = defined $row ? $row =~ /^>(\S+)/ : ();
die "Error: input is not in FASTA format\n" unless defined $id;

print "inventory for '$id': ";

# initialise all counters to 0

my $a_count = 0;

my $c_count = 0;

my $g_count = 0;

my $tu_count = 0;

my $total_length = 0;

# loop over the file until it is completely read
while ($row = <$fh>) {
    # strip newline and carriage-return characters so they are never counted
    $row =~ s/[\r\n]+//g;

    # count upper- or lower-case A; similarly for the other nucleotides
    $a_count += () = $row =~ /a/gi;
    $c_count += () = $row =~ /c/gi;
    $g_count += () = $row =~ /g/gi;
    $tu_count += () = $row =~ /[tu]/gi;

    # count all characters
    $total_length += length($row);
}

# get the count of all other characters

my $other_count = $total_length - $a_count - $c_count - $g_count - $tu_count;

# print details.

print "A: $a_count ";

print "C: $c_count ";

print "G: $g_count ";

print "T/U: $tu_count ";

print "other characters: $other_count\n";
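
The script above counts with regular-expression matches. As an alternative, Perl's tr/// operator returns the number of characters it matched, and with an empty replacement list it leaves the string unchanged. The sketch below uses that idiom; the helper name count_nucleotides and the sample sequence are hypothetical.

```perl
use strict;
use warnings;

# count_nucleotides is a hypothetical helper, not part of the answer above.
# It takes a sequence string with line endings already removed.
sub count_nucleotides {
    my ($seq) = @_;
    # tr with an empty replacement list counts matches without modifying $seq
    my %count = (
        A     => ( $seq =~ tr/aA// ),
        C     => ( $seq =~ tr/cC// ),
        G     => ( $seq =~ tr/gG// ),
        'T/U' => ( $seq =~ tr/tTuU// ),
    );
    # everything that is not A, C, G, T or U
    $count{other} = length($seq)
        - $count{A} - $count{C} - $count{G} - $count{'T/U'};
    return %count;
}

my %c = count_nucleotides("acgtACGUnn");
print "$_: $c{$_}\n" for sort keys %c;
```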