The strategy for using Perl at Big Data Datacenters is almost complete, with the
Question
The strategy for using Perl at Big Data Datacenters is almost complete, with the exception of one last-minute requirement. The team has identified a requirement to parse data so that it can be consumed into a centralized relational database management system (RDBMS).
Again, they are looking to you for answers, and they would like you to provide sample Perl scripts that show the following:
• Show how the split function can be used.
• Show how the chomp function can be used.
• Show how a for loop can be used in Perl.
• Show 2 additional functions of your choosing that can be used to parse data in Perl.
For each code sample, provide some brief commentary on how the functions are being used to parse data.
Explanation / Answer
SPLIT Function: The split function breaks (tokenizes) a string into fields, using the given pattern/delimiter as the separator. The syntax of the split function is as follows:
> split(/PATTERN/, EXPRESSION, LIMIT)    (LIMIT is optional)
In list context, split returns the list of fields; in scalar context, it returns the number of fields. The following example makes this clearer.
e.g. If we want to tokenize a:b:c:d:e by the delimiter ":", then our split function call would be
#!/usr/bin/perl -w
# Perl script for the split function
# The pattern is written as a regex, /:/, not as a bare colon
my @retArray = split(/:/, "a:b:c:d:e");
print "@retArray\n";
The output of the above script is
a b c d e
The third parameter of the split function is LIMIT, which sets the maximum number of fields to produce; the rest of the string is left unsplit in the last field. For example, if we specify @retArray = split(/:/, "a:b:c:d:e", 3), then the output would be a b c:d:e.
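When a record is headed for a database table, empty trailing columns usually matter, and split drops them by default. The sketch below is only an illustration: the helper name parse_record and the sample records are made up for this example. It shows a negative LIMIT keeping trailing empty fields and a positive LIMIT leaving the remainder in the last field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one colon-delimited record into fields.
# A LIMIT of -1 keeps trailing empty fields, which split drops by default.
sub parse_record {
    my ($line) = @_;
    return split /:/, $line, -1;
}

my @fields = parse_record("1001:Alice::");
print scalar(@fields), " fields\n";      # 4 fields: "1001", "Alice", "", ""

# With a positive LIMIT, the remainder stays in the last field.
my ($id, $rest) = split /:/, "1001:Alice:Sales", 2;
print "$id | $rest\n";                   # 1001 | Alice:Sales
```

Without the -1, the first call would return only 2 fields, so a row with empty trailing columns would come out short.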
Chomp Function: This function removes the trailing line ending from the given string. More precisely, it removes a trailing copy of the input record separator ($/), which is the newline character ("\n") by default. The syntax of the chomp function is:
> chomp(VARIABLE)
It returns the number of characters that were removed (chomped). For example, consider the Perl script below.
#!/usr/bin/perl -w
# Perl script for the chomp function
my $string1 = "test string";      # no trailing newline
my $retValue = chomp($string1);
print "$string1 : $retValue\n";
my $string2 = "test string\n";    # one trailing newline
$retValue = chomp($string2);
print "$string2 : $retValue\n";
The output of the above script is
test string : 0
test string : 1
From the above script, we can see that chomp removes the trailing newline character, if there is one, and returns the number of characters it removed from the given expression.
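In data parsing, chomp is typically applied to each line right after it is read, before the line is split into fields. The following is a minimal sketch of that pattern; the helper name read_rows and the sample data are assumptions made for this example, and an in-memory filehandle stands in for a real data file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read every line from a filehandle, strip the
# trailing newline with chomp, and split the line on commas.
sub read_rows {
    my ($fh) = @_;
    my @rows;
    while (my $line = <$fh>) {
        chomp $line;                      # without this, the last field keeps "\n"
        push @rows, [ split /,/, $line ];
    }
    return @rows;
}

# An in-memory filehandle stands in for a real data file here.
my $data = "1,Alice,Sales\n2,Bob,IT\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @rows = read_rows($fh);
close $fh;

print "Last field of row 1: [" . $rows[0][2] . "]\n";   # [Sales]
```

If the chomp line were removed, the closing bracket would print on the next line, because the last field would still end in a newline.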
For Loop: The for loop repeats a block of statements for as long as a condition remains true. If we want some statements to execute a certain number of times, or until some condition fails, we can use a for loop. The syntax of the for loop is as follows:
for (Initialization; Condition; Increment) {
    [Statements to execute]
}
In the initialization part, we initialize a variable. In the condition part, we check whether the condition is true; as soon as the condition becomes false, the loop terminates. In the increment part, we update the value of the variable (typically by incrementing it). Consider the following example: suppose we want to print the values from 1 to 5. In that case the for loop would be
#!/usr/bin/perl -w
# Perl script for the "for" loop
for (my $i = 1; $i <= 5; $i = $i + 1) {
    print "$i\n";
}
The output of the above script is
1
2
3
4
5
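When parsing data, the loop usually runs over the fields that split produced. As a rough sketch (number_fields is a hypothetical helper invented for this example), the code below shows the C-style form, which is handy when the index is needed, next to the foreach-style form, which is simpler when only the values matter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: label each field with its column number.
sub number_fields {
    my @fields = @_;
    my @labeled;
    # C-style loop: useful when the index itself is needed.
    for (my $i = 0; $i < @fields; $i++) {
        push @labeled, ($i + 1) . "=" . $fields[$i];
    }
    return @labeled;
}

my @fields = split /:/, "a:b:c";

# foreach-style loop: simpler when only the values matter.
for my $field (@fields) {
    print "$field\n";
}

print join(", ", number_fields(@fields)), "\n";   # 1=a, 2=b, 3=c
```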
As we are dealing with data processing using Perl, we will discuss two more functions, "pos" and "m", one by one.
> POS Function: This function returns the offset at which the last global match (m//g) on the given variable ended. The syntax of the pos function is as follows:
> pos(SCALAR)
where SCALAR is the variable that was searched (not the regex); if it is omitted, pos uses $_. Consider the example below for better understanding.
#!/usr/bin/perl -w
# Perl script for the "pos" function
my $string = "aaaa bbbb cccc dddd";
$string =~ m/cccc/g;    # pos() is only set by a match that uses /g
print pos($string), "\n";
The output of the above script is
14
The above program prints 14, which is the offset just past the end of the matched substring "cccc" (the match itself starts at offset 10).
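Because pos is updated after every successful /g match, it can be combined with a while loop to locate every occurrence of a keyword. The sketch below assumes a hypothetical helper named match_offsets and made-up sample text; it is one possible way to use pos, not the only one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the start offset of every match of $word.
sub match_offsets {
    my ($text, $word) = @_;
    my @offsets;
    while ($text =~ /\Q$word\E/g) {
        # pos() is the offset just past the match; subtract the length
        # of the matched word to get where the match began.
        push @offsets, pos($text) - length($word);
    }
    return @offsets;
}

my @found = match_offsets("aaaa bbbb cccc bbbb", "bbbb");
print "@found\n";   # 5 15
```

The \Q...\E pair quotes the keyword so that any regex metacharacters in it are matched literally.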
> m Function: As we are dealing with data processing, we often need to match a keyword against the given data. The match operator "m" does that for us: it tests whether a pattern occurs in the given expression. The syntax of the m operator is as follows:
> EXPRESSION =~ m/PATTERN/
If EXPRESSION is omitted, the match is done against $_. In scalar context it returns true (1) on a successful match and false (the empty string) on failure.
Let's look at the program below for better understanding.
#!/usr/bin/perl -w
# Perl script for the "m" function
$_ = "aaaa bbbb cccc dddd";
if (m/bbbb/) {
    print "Found bbbb\n";
}
The output of the above script is
Found bbbb