Question

In Matlab

In linguistics, stemming is the process of reducing inflected words to their word stem, base, or root form. In this assignment, you are to write a simple word stemmer for English. The input is a string of text that may contain punctuation or other non-alphabetical characters. Your program should stem the words in the text and return these words as a cell array.

Here are the steps your program should perform to derive and filter the word stems:

Convert any upper case letter to lower case.

Replace every character that is neither a letter nor a space with a space character. e.g., "My 1st NLP program!!!" should become: "my st nlp program "

Extract the words from the string. e.g., "my st nlp program " will result in the list: "my", "st", "nlp", and "program".

Strip the following suffixes from the words that have them: -ly, -ed, -ing, -es, -s. Each suffix should be considered once and in that order (first strip -ly, then strip -ed, then strip -ing, etc.). e.g., the word "excitedly" turns into "excit"; the word "feeding" turns into "feed".

Remove any word from the list that is 2 characters or less.

Remove the following common words from the list: the, and, that, have, for, not
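
The suffix step is the easiest one to misread, so here is a minimal sketch of it in isolation, assuming the suffixes are tried one after another on the same word (which is what the "excitedly" example implies). The variable names are illustrative and not part of the assignment.

```matlab
% Sketch of the suffix step only: each suffix is tried once, in the given order.
suffixes = {'ly', 'ed', 'ing', 'es', 's'};
words = {'excitedly', 'feeding'};
stems = words;
for i = 1:numel(words)
    w = words{i};
    for k = 1:numel(suffixes)
        % '$' anchors the suffix to the end of the word
        w = regexprep(w, [suffixes{k} '$'], '');
    end
    stems{i} = w;
end
% stems is now {'excit', 'feed'}:
% "excitedly" loses -ly and then -ed; "feeding" loses only -ing.
```

Note that "feeding" keeps its final "ed" because -ed was already considered before -ing was stripped.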



Note that the stemming strategies used in this program are over-simplistic and may not give sensible results.

Explanation / Answer

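function out = simplestemmer(a)
% SIMPLESTEMMER  Stem the words in the string a and return them as a cell array.
a = lower(a);                                  % step 1: convert to lower case
a(~((a >= 'a' & a <= 'z') | a == ' ')) = ' ';  % step 2: non-letters become spaces

c = strsplit(strtrim(a));                      % step 3: split on whitespace
suffixes = {'ly', 'ed', 'ing', 'es', 's'};
stopwords = {'the', 'and', 'that', 'have', 'for', 'not'};
out = {};
for i = 1:numel(c)
    w = c{i};
    for k = 1:numel(suffixes)                  % step 4: try each suffix once, in order
        w = regexprep(w, [suffixes{k} '$'], '');
    end
    % steps 5 and 6: drop short words and exact matches of the common words
    if length(w) > 2 && ~any(strcmp(w, stopwords))
        out{end+1} = w;
    end
end

end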
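
One detail worth checking by hand is the common-word filter: the list is meant to remove whole words only. An unanchored pattern such as 'the|and|that|have|for|not' also matches inside longer words like "then" or "format", so an exact comparison is safer. The following is a small sketch of that comparison; the sample words are made up for illustration.

```matlab
% Sketch: exact (whole-word) filtering of the common words.
stopwords = {'the', 'and', 'that', 'have', 'for', 'not'};
words = {'the', 'then', 'format', 'not', 'nothing'};

keep = ~ismember(words, stopwords);   % true where the word is not in the list
filtered = words(keep);               % {'then', 'format', 'nothing'}

% For contrast, an unanchored regexp finds "the" inside "then":
hit = regexp('then', 'the|and|that|have|for|not', 'match', 'once');  % 'the'
```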

I kept the code simple and commented it to make it easy to follow. If you have trouble understanding the code, please feel free to comment below. I shall be glad to help you.
