Data manipulation using pattern matching in R
Question
Data manipulation using pattern matching in R: We have an input dataset (.csv file) with only one field, which contains a job description. The dataset has nearly identical descriptions (with small differences) and misspellings among its entries. Our goal is to compare those nearly identical job descriptions against a pattern and replace them with a single, uniform job description in the input dataset. Below is a detailed explanation, with an example, of how the code should work.

Varied job description entries (125 rows in total) in the input file, all of which should actually be 'manager':
manager (say, occurs 100 times)
management (occurs 10 times)
manager-in-training (occurs 5 times)
manager-pmac (occurs 3 times)
manager) (occurs 4 times)
managet (occurs 3 times)

Steps:
1. Read the .csv input file for the job description column.
2. Create a look-up table/file which will contain the patterns we will try to match against the dataset's job description entries, e.g. Our pattern can b...

Explanation / Answer
// Sample code for the given problem (Java).
// 1. Take a file as input.
// 2. Read the file line by line.
// 3. Match each line against the patterns.
// 4. Print the output.
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatchingFile {
    public static void main(String[] args) {
        // Only lines that start with "data" are processed
        String input = "data.*";
        // Captures text between single quotes
        Pattern quoted = Pattern.compile("'(.*?)'");
        // Captures text between angle brackets
        Pattern bracketed = Pattern.compile("<(.*?)>");
        // try-with-resources closes the reader when the block exits
        try (BufferedReader br = new BufferedReader(new FileReader("thomas.txt"))) {
            String dataLine;
            while ((dataLine = br.readLine()) != null) {
                if (!Pattern.matches(input, dataLine)) {
                    continue;
                }
                Matcher mt = quoted.matcher(dataLine);
                while (mt.find()) {
                    String x = mt.group(1);
                    // Quote the captured text so it is matched literally, then allow any suffix
                    String y = Pattern.quote(x) + ".*";
                    System.out.println(y);
                    if (Pattern.matches(y, dataLine)) {
                        Matcher match = bracketed.matcher(dataLine);
                        while (match.find()) {
                            System.out.println(match.group(1));
                        }
                    } else {
                        System.out.println("There is no matching with specified string file input");
                    }
                }
            }
        } catch (Exception e) {
            System.err.println("e: " + e.getMessage());
        }
    }
}
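The replacement step described in the question (match each entry against a look-up table and substitute one uniform description) could be sketched roughly as follows. This is only a minimal illustration in Java, not an R solution. The class name JobTitleNormalizer, the method normalize, the pattern ^manag.*, and the extra "engineer" entry are all assumptions made for this example, and a real run would read the entries from the .csv column instead of a hard-coded list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: every name and pattern here is illustrative only.
class JobTitleNormalizer {
    // Look-up table: pattern -> uniform job description.
    // LinkedHashMap keeps insertion order, so earlier patterns take priority.
    private static final Map<Pattern, String> LOOKUP = new LinkedHashMap<>();
    static {
        // Assumed pattern: any entry starting with "manag" is a 'manager' variant.
        LOOKUP.put(Pattern.compile("^manag.*", Pattern.CASE_INSENSITIVE), "manager");
    }

    // Returns the uniform description of the first matching pattern,
    // or the trimmed entry unchanged when no pattern matches.
    static String normalize(String entry) {
        String trimmed = entry.trim();
        for (Map.Entry<Pattern, String> rule : LOOKUP.entrySet()) {
            if (rule.getKey().matcher(trimmed).matches()) {
                return rule.getValue();
            }
        }
        return trimmed;
    }

    public static void main(String[] args) {
        // Sample entries from the question, plus one assumed non-matching entry.
        List<String> entries = Arrays.asList("manager", "management",
                "manager-in-training", "manager-pmac", "manager)", "managet", "engineer");
        List<String> cleaned = new ArrayList<>();
        for (String e : entries) {
            cleaned.add(normalize(e));
        }
        // prints [manager, manager, manager, manager, manager, manager, engineer]
        System.out.println(cleaned);
    }
}
```

A prefix pattern this broad also rewrites entries such as "managing director", so in practice each pattern in the look-up table would need to be checked against the distinct values in the data before replacing anything.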