PERL SCRIPT PROGRAMMING A researcher has a file containing information about the
Question
PERL SCRIPT PROGRAMMING
A researcher has a file containing information about the number of times particular k-mers (peptide sequences of length k, derived from actual protein sequences) occur in the human proteome. The information for each k-mer is on one line in the file and is divided into columns. The first column is the position of the start of the k-mer in its source protein. The next column is the k-mer itself. Then come two counts: the number of times that the k-mer occurs in the human proteome, and the number of proteins in the human proteome which contain the k-mer. The columns are delimited by tab characters. For example, a portion of the data file might look like:
The researcher is interested in those k-mers for which the counts in the last two columns are both 0; i.e. the researcher is interested in k-mers which do not occur in the human proteome. For instance, given the data above, the researcher would be interested in being informed of the k-mer IDTLQ.
Write a Perl script that will output, on the standard output, the k-mers that do not occur in the human proteome assuming input as described above. Each k-mer is to be on a separate line. The script is to read from standard input. Assume that the input file contains nothing other than lines of k-mer information.
Hint: Use the pattern-extraction facilities of Perl.
Your scripts should be independent of the value of k (provided, of course, that k ≥ 1). That is, your scripts should work for data files of k-mers of any size. Further, k should not be a parameter in/to your scripts.
Note: this must be a Perl script, not a shell script.
Explanation / Answer
To run:
perl dna.pl
perl dna.pl inputfile.txt
If no filename is passed, the program reads inputfile.txt by default.
CODE
--------
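A minimal sketch of such a script, assuming exactly four tab-delimited columns per line as the question describes. The helper name absent_kmer is hypothetical, chosen only for illustration. The k-mer is captured as any run of non-tab characters, so the pattern does not depend on k. This sketch reads standard input, as the question requires, rather than defaulting to inputfile.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# absent_kmer is a hypothetical helper name used for illustration.
# Given one tab-delimited line, return the k-mer if both counts are 0;
# otherwise return undef.
sub absent_kmer {
    my ($line) = @_;
    # Assumed columns: start position, k-mer, occurrence count, protein count.
    # [^\t]+ captures a k-mer of any length, so k is never a parameter.
    if ($line =~ /^[^\t]*\t([^\t]+)\t0\t0\s*$/) {
        return $1;
    }
    return;
}

# Read k-mer records from standard input and print each qualifying
# k-mer on its own line.
while (my $line = <STDIN>) {
    my $kmer = absent_kmer($line);
    print "$kmer\n" if defined $kmer;
}
```

With the script saved as dna.pl, it could be run as: perl dna.pl < inputfile.txt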
OUTPUT