Write a method in Java called stripHtmlTags that accepts a Scanner representing

Question

Write a method in Java called stripHtmlTags that accepts a Scanner representing an input file containing an HTML web page as its parameter, then reads that file and prints the file's text with all HTML tags removed. A tag is any text between the characters < and >.

For example, consider the following text:

<html>

<head>

<title>My web page</title>

</head>

<body>

<p>There are many pictures of my cat here,

as well as my <b>very cool</b> blog page,

which contains <font color="red">awesome

stuff about my trip to Vegas.</p>

Here's my cat now:<img src="cat.jpg">

</body>

</html>

If the file contained these lines, your program should output the following text:

My web page

There are many pictures of my cat here,

as well as my very cool blog page,

which contains awesome

stuff about my trip to Vegas.

Here's my cat now:

You may assume that the file is a well-formed HTML document and that there are no < or > characters inside tags.

Explanation / Answer

Here is Java code to solve the given problem. Let me know if you have any issues with it.
Replace file.html with the name of your HTML file.


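import java.io.*;

public class Html2TextWithRegExp {
    private Html2TextWithRegExp() {}

    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader("file.html"))) {
            String line;
            while ((line = br.readLine()) != null) {
                // readLine() drops the line terminator, so add it back
                sb.append(line).append('\n');
            }
        }
        // A tag is any text between < and >; [^>]* also matches tags that span lines.
        // "<" needs no escape in a Java regex ("\<" is an invalid string escape).
        String nohtml = sb.toString().replaceAll("<[^>]*>", "");
        System.out.print(nohtml);
    }
}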
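
The question asks for a method named stripHtmlTags that takes a Scanner, while the program above reads the file itself in main. A minimal sketch of the requested form might look like the following. The class name StripHtml, the helper strip, and the inline sample string are assumptions made for illustration and are not part of the original question or answer. It keeps a flag that records whether the current character is inside a tag, so a tag that continues onto the next line is still removed.

```java
import java.util.Scanner;

public class StripHtml {
    // Hypothetical helper: returns the input text with every <...> tag removed.
    // Each input line produces one output line, terminated by '\n'.
    static String strip(Scanner input) {
        StringBuilder out = new StringBuilder();
        boolean inTag = false; // kept across lines so multi-line tags are handled
        while (input.hasNextLine()) {
            String line = input.nextLine();
            for (int i = 0; i < line.length(); i++) {
                char c = line.charAt(i);
                if (c == '<') {
                    inTag = true;       // a tag starts here
                } else if (c == '>') {
                    inTag = false;      // the tag ends here
                } else if (!inTag) {
                    out.append(c);      // ordinary text outside any tag
                }
            }
            out.append('\n');
        }
        return out.toString();
    }

    // The method the question asks for: prints the text with tags removed.
    public static void stripHtmlTags(Scanner input) {
        System.out.print(strip(input));
    }

    public static void main(String[] args) {
        // Inline sample standing in for the input file (assumption for the demo).
        String page = "<title>My web page</title>\n"
                + "as well as my <b>very cool</b> blog page,\n"
                + "Here's my cat now:<img src=\"cat.jpg\">\n";
        stripHtmlTags(new Scanner(page));
    }
}
```

To run it on a real file, pass new Scanner(new File("file.html")) instead of the sample string; this needs import java.io.File and a handler or throws clause for FileNotFoundException. The sketch relies on the question's guarantee that the document is well formed, so a stray > outside a tag would simply be dropped.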
