Question

A non-trivial web page written in HTML will typically contain hundreds of HTML elements enclosed by (defined by) HTML tags. Your program's job is to parse (read) an HTML file line by line, strip out (remove or ignore) all the HTML tags, and print only the text that appears in headings, paragraphs, ordered lists and unordered lists. Images (<img>) should be ignored.

The title, given in the <title> tags, should be printed on a line by itself, followed by a blank line. Headings should be printed on lines by themselves, with a blank line after each heading. A blank line should be printed after each paragraph. The list items (<li>) should also be printed on separate lines, with a dash preceding each item in an unordered list and a number preceding each item in an ordered list, increasing with each list item. If there are two or more ordered lists, the first item in each list should be numbered one and the numbering should increase from there. Finally, blank lines and extraneous spaces in the input file should be ignored completely.

Your program should be able to handle HTML files (web pages) that contain the following HTML tags, both opening and closing tags, as appropriate (keep in mind that these are enclosed in angle brackets): html; head; body; title; h1, h2, h3, h4, h5, h6; p; img; ul, ol, li.

For the sake of simplicity you may assume that matched opening and closing tags for the title, h1, h2, h3, h4, h5, h6 and li tags are on the same line of the file. This is not true of any of the other tags, however. You may assume that any img element is fully contained on one line.

The input filename should be provided as a command-line argument in the main() method. For testing purposes you may hard-code the filename, but before submitting, make sure that the filename is given as in the following example:

Explanation / Answer

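A regular expression can be used to strip out anything between a < and the next > on each line. What remains is the text content, which can then be trimmed, prefixed (a dash or a number for list items), and printed.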
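The idea can be sketched in Java as follows. This is a minimal illustration, not a complete solution and not a general HTML parser: the class name HtmlTextSketch and the methods stripTags and format are hypothetical names chosen here, and the sketch assumes what the question allows (title, heading, li and img elements each sit on one line) plus two extra assumptions: no tag is split across lines, and lists are not nested. Paragraph text that spans several input lines is printed one line per input line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class HtmlTextSketch {
    // Matches anything from '<' up to the next '>' (assumes a tag never spans lines).
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Removes every tag from one line and collapses extraneous whitespace. */
    public static String stripTags(String line) {
        String text = TAG.matcher(line).replaceAll("");
        return text.trim().replaceAll("\\s+", " ");
    }

    /** Turns the lines of an HTML file into the output lines described in the question. */
    public static List<String> format(List<String> lines) {
        List<String> out = new ArrayList<>();
        boolean inOrdered = false; // true while inside an <ol> list
        int counter = 0;           // restarts at every <ol>
        for (String raw : lines) {
            String lower = raw.toLowerCase();
            if (lower.contains("<ol")) {
                inOrdered = true;
                counter = 0;
            } else if (lower.contains("<ul")) {
                inOrdered = false;
            }
            String text = stripTags(raw);
            if (!text.isEmpty()) {
                if (lower.contains("<li")) {
                    // Number for ordered lists, dash for unordered lists.
                    text = inOrdered ? (++counter) + ". " + text : "- " + text;
                }
                out.add(text);
            }
            // A blank line follows the title, each heading, and each paragraph.
            boolean heading = lower.contains("</title>") || lower.matches(".*</h[1-6]>.*");
            if (heading || lower.contains("</p>")) {
                out.add("");
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HtmlTextSketch <file.html>");
            return;
        }
        // The filename comes from the command line, as the question requires.
        for (String line : format(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(line);
        }
    }
}
```

The pattern <[^>]*> is deliberately simple: it treats the first > after a < as the end of the tag, so it would misbehave on attribute values that contain a > character. For the restricted set of tags in the question that is a reasonable trade-off.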
