Question

Python

Write a Python program that can find and display the paths of all of the files in a directory (and, potentially, all of its subdirectories, and their subdirectories, and so on), and then take action on some of those files that have interesting characteristics. Both the notion of interesting characteristics and the action taken will be configurable, and each can work in a few different ways, but the core act of finding files will always be the same. One of your goals should be to avoid rewriting the same code over and over again (e.g., multiple functions that each perform a search for files with slightly different characteristics).

The input

Your program will take input from the console in the following format. It should not prompt the user in any way. The intent here is not to write a user-friendly user interface; what you're actually doing is building a program that we can test automatically, so it's vital that your program reads inputs and writes outputs precisely as specified below.

First, the program reads a line of input that specifies which files are eligible to be found. That can be specified in one of two ways:

The letter D, followed by a space, followed (on the rest of the line) by the path to a directory. In this case, all of the files in that directory will be under consideration, but no subdirectories (and no files in those subdirectories) will be. (You can think of the letter D here as standing for "directory.")

The letter R, followed by a space, followed (on the rest of the line) by the path to a directory. In this case, all of the files in that directory will be under consideration, along with all of the files in its subdirectories, all of the files in their subdirectories, and so on. (You can think of the letter R here as standing for "recursive.")

If this line of input does not follow this format, or if the directory specified does not exist, print the word ERROR on a line by itself and repeat reading this line of input; continue until the input is valid.
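A minimal sketch of this validation loop. The helper names parse_scope_line and read_scope are hypothetical (they do not come from the assignment), and the read_line parameter is an assumption added so the loop can be exercised without a real console:

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def parse_scope_line(line: str) -> Optional[Tuple[bool, Path]]:
    """Return (recursive, directory) for a valid 'D <path>' or 'R <path>' line.

    Returns None when the line is malformed or the directory does not exist.
    """
    if len(line) < 3 or line[0] not in 'DR' or line[1] != ' ':
        return None

    directory = Path(line[2:])

    if not directory.is_dir():
        return None

    return (line[0] == 'R', directory)


def read_scope(read_line: Callable[[], str] = input) -> Tuple[bool, Path]:
    """Read lines until one is valid, printing ERROR after each invalid line."""
    while True:
        scope = parse_scope_line(read_line())

        if scope is not None:
            return scope

        print('ERROR')
```

Calling read_scope() with no arguments reads from the console via input(), which prints no prompt when called without one.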

Next, the program prints the paths to every file that is under consideration. Each path is printed on its own line, with no whitespace preceding or following it, and with every line ending in a newline. Note, also, that the order in which the files' paths are printed is relevant; you must print them in the following order:

First, the paths to all of the files in the directory are printed. These are printed in lexicographical order of the files' names. (More on lexicographical order a bit later, but note that this is the default way that strings are sorted.)

Next, if the files in subdirectories are being considered, the files in each of the subdirectories are printed according to the same ordering rules here, with all of the files in one subdirectory printed before any of the others, and with the subdirectories printed in lexicographical order of their names.
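One way to produce this ordering is a short recursive function. The sketch below is an illustration rather than the required design: find_files is a hypothetical name, and it assumes that sorting names with Python's default string comparison is the lexicographical order the assignment means.

```python
from pathlib import Path
from typing import List


def find_files(directory: Path, recursive: bool) -> List[Path]:
    """List the files in directory, then (optionally) those in each subdirectory."""
    # Sort every entry by name once; files and subdirectories keep that order
    entries = sorted(directory.iterdir(), key=lambda p: p.name)

    # Files directly in this directory come first
    files = [p for p in entries if p.is_file()]

    if recursive:
        # Then each subdirectory is handled completely before the next one
        for subdirectory in (p for p in entries if p.is_dir()):
            files.extend(find_files(subdirectory, recursive))

    return files
```

Because default string comparison is case-sensitive, a name such as Sub sorts before a name such as test1.txt; that only matters among files or among subdirectories, since files are always listed before subdirectories.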

Now that the program has displayed the paths of every file under consideration, it's time to narrow our search. The program now reads a line of input that describes the search characteristics that will be used to decide whether files are "interesting" and should have action taken on them. There are six different characteristics, and this line of input chooses one of them.

If this line of input is the letter A alone on a line, all of the files found in the previous step are considered interesting.

If this line of input begins with the letter N, the search will be for files whose names exactly match a particular name. The N will be followed by a space; after the space, the rest of the line will indicate the name of the files to be searched for.

Note that filenames include extensions, so a search for boo would not find a file named boo.doc.

If this line of input begins with the letter E, the search will be for files whose names have a particular extension. The E will be followed by a space; after the space, the rest of the line will indicate the desired extension.

For example, if the desired extension is py, all files whose names end in .py will be considered interesting. The desired extension may be specified with or without a dot preceding it (e.g., E .py or E py would mean the same thing in the input), and your search should behave the same either way.

Note, also, that there is a difference between what you might call a name ending and an extension. In our program, if the search is looking for files with the extension oc, a file named iliveinthe.oc would be found, but a file named invoice.doc would not.
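The name and extension searches might be sketched with pathlib as follows. has_name and has_extension are hypothetical helper names, and the sketch assumes that "extension" means the text from the last dot in the file name onward, which is what Path.suffix reports.

```python
from pathlib import Path


def has_name(path: Path, name: str) -> bool:
    """True when the file's full name, extension included, equals name."""
    return path.name == name


def has_extension(path: Path, extension: str) -> bool:
    """True when the file's extension equals extension, given with or without a dot."""
    wanted = extension if extension.startswith('.') else '.' + extension

    # Comparing the whole suffix (not just the name's ending) keeps 'oc'
    # from matching 'invoice.doc'
    return path.suffix == wanted
```

Under this assumption a file named archive.tar.gz has the extension gz, not tar.gz.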

If this line of input begins with the letter T, the search will be for text files that contain the given text. The T will be followed by a space; after the space, the rest of the line will indicate the text that the file should contain in order to be considered interesting.

For example, if this line of input reads T while True, any text file containing the text "while True" would be considered interesting.

One thing to note is that not all files are text files, but that you can't determine that by their name or their extension. Any file that can be opened and read as a text file is considered a text file for our purposes here, regardless of its name. Any file that cannot be opened and read as a text file should be skipped (i.e., it is not considered interesting).
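A hedged sketch of the text search: it treats a file as text if its contents can be decoded while reading, and skips it otherwise. contains_text is a hypothetical name, and decoding as UTF-8 is an assumption, since the assignment does not name an encoding.

```python
from pathlib import Path


def contains_text(path: Path, text: str) -> bool:
    """True when the file can be read as text and some line contains text."""
    try:
        # Assumption: UTF-8 decides what counts as readable text
        with path.open('r', encoding='utf-8') as text_file:
            return any(text in line for line in text_file)
    except (OSError, UnicodeDecodeError):
        # Unreadable or undecodable files are skipped, not reported as errors
        return False
```

Searching line by line is enough here because the search text itself always fits on a single line of input.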

If this line of input begins with the character <, the search will be for files whose size, measured in bytes, is less than a specified threshold. The < will be followed by a space; after the space, the rest of the line will be a non-negative integer value specifying the size threshold.

For example, the input < 65536 means that files whose sizes are no more than 65,535 bytes (i.e., less than 65,536 bytes) will be considered interesting.

If this line of input begins with the character >, the search will be for files whose size, measured in bytes, is greater than a specified threshold. The > will be followed by a space; after the space, the rest of the line will be a non-negative integer value specifying the size threshold.

For example, the input > 2097151 means that files whose sizes are at least 2,097,152 bytes (i.e., greater than 2,097,151 bytes) will be considered interesting.
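Both size searches reduce to a strict comparison against the size that Path.stat() reports. A small sketch, with hypothetical helper names:

```python
from pathlib import Path


def is_smaller_than(path: Path, threshold: int) -> bool:
    """True when the file's size in bytes is strictly less than threshold."""
    return path.stat().st_size < threshold


def is_larger_than(path: Path, threshold: int) -> bool:
    """True when the file's size in bytes is strictly greater than threshold."""
    return path.stat().st_size > threshold
```

A file whose size equals the threshold matches neither search.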

If this line of input does not match one of the formats described above, print the word ERROR on a line by itself and repeat reading a line of input; continue until the input is valid. Note that it is not an error to specify a search characteristic that matches no files; it's only an error if this line of input is structurally invalid (i.e., it does not match one of the formats above).
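Structural validation of this line can be kept separate from the search itself. The sketch below, using the hypothetical name parse_search_line, returns a (kind, argument) pair for a well-formed line and None for a malformed one, leaving the caller to print ERROR and read again.

```python
from typing import Optional, Tuple, Union

# (kind, argument): argument is None for A, an int for < and >, else a str
SearchSpec = Tuple[str, Union[str, int, None]]


def parse_search_line(line: str) -> Optional[SearchSpec]:
    """Return (kind, argument) for a well-formed search line, or None otherwise."""
    if line == 'A':
        return ('A', None)

    # Every other form is one character, a space, and a non-empty argument
    if len(line) < 3 or line[1] != ' ':
        return None

    kind, argument = line[0], line[2:]

    if kind in ('N', 'E', 'T'):
        return (kind, argument)

    # Size thresholds must be non-negative integers, so only digits are allowed
    if kind in ('<', '>') and argument.isdecimal():
        return (kind, int(argument))

    return None
```

Returning a small description of the search, rather than searching immediately, lets one file-finding routine serve every characteristic.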

Next, the program prints the paths to every file that is considered interesting, based on the search characteristic. Each path is printed on its own line, with no whitespace preceding or following it, and with every line ending in a newline. The paths should be printed using the same ordering rules as the last time you printed them (i.e., lexicographical ordering, as described above), though, of course, you will likely print fewer this time, since not every file will necessarily meet the search characteristic.

If there were no interesting files, the program ends; there is no action to take.

Now that we've narrowed down our search, it's time to take action on the files we found. The actions are to be taken on the files in the same order as you printed them previously. The program now reads a line of input that describes the action that will be taken on each interesting file. There are three different actions, and this line of input chooses one of them.

If this line of input contains the letter F by itself, print the first line of text from the file if it's a text file; print NOT TEXT if it is not.
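A sketch of this action, assuming that successful UTF-8 decoding is what makes a file a text file (the assignment does not name an encoding). The helper name first_line_or_not_text is hypothetical.

```python
from pathlib import Path


def first_line_or_not_text(path: Path) -> str:
    """Return the file's first line without its newline, or 'NOT TEXT'."""
    try:
        # Assumption: a file that fails to decode as UTF-8 is not text
        with path.open('r', encoding='utf-8') as text_file:
            return text_file.readline().rstrip('\n')
    except (OSError, UnicodeDecodeError):
        return 'NOT TEXT'
```

The caller would print the returned string on its own line for each interesting file.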

If this line of input contains the letter D by itself, make a duplicate copy of the file and store it in the same directory where the original resides, but the copy should have .dup (short for "duplicate") appended to its filename. For example, if the interesting file is C:\pictures\boo.jpg, you would copy it to C:\pictures\boo.jpg.dup.
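Duplicating a file might look like the sketch below. duplicate is a hypothetical name, and using shutil.copy (which copies the file's contents and permission bits) is one reasonable choice among several.

```python
import shutil
from pathlib import Path


def duplicate(path: Path) -> Path:
    """Copy path to a sibling file whose name has '.dup' appended."""
    # with_name keeps the parent directory and replaces only the final name
    copy_path = path.with_name(path.name + '.dup')
    shutil.copy(path, copy_path)
    return copy_path
```

Appending to path.name, rather than replacing the suffix, keeps the original extension in place, so boo.jpg becomes boo.jpg.dup.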

If this line of input contains the letter T by itself, "touch" the file, which means to modify its last modified timestamp to be the current date/time.
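For an existing file, pathlib's Path.touch updates the modification time to the current time, so a sketch of this action can be very short. The wrapper name touch is hypothetical.

```python
from pathlib import Path


def touch(path: Path) -> None:
    """Set the file's last-modified timestamp to the current date and time."""
    # exist_ok=True is the default; it is spelled out because the file
    # is expected to exist already
    path.touch(exist_ok=True)
```

The file's contents are left unchanged; only its timestamp moves.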

If this line of input does not match one of the formats described above, print the word ERROR on a line by itself and repeat reading a line of input; continue until the input is valid.

Once an action has been taken on each file, the program ends.

Output example

The following is an example of the program's execution, as it should work when you're done. The line beginning with R and the lines N, N test1.txt, Q, and F are input, while all other lines are output. The directories and files shown are hypothetical, but the structure of the input and output is demonstrated as described above.

To reiterate a point from earlier, your program should not prompt the user in any way; it should read input, assuming that the user is aware of the proper format to use.

R C:\Test\Project1\Example

C:\Test\Project1\Example\test1.txt

C:\Test\Project1\Example\test2.txt

C:\Test\Project1\Example\Sub\meee.txt

C:\Test\Project1\Example\Sub\test1.txt

C:\Test\Project1\Example\Sub\youu.txt

C:\Test\Project1\Example\Zzz\zzz.py

N

ERROR

N test1.txt

C:\Test\Project1\Example\test1.txt

C:\Test\Project1\Example\Sub\test1.txt

Q

ERROR

F

This is a line of text

Hello, my name is Sam

Explanation / Answer

import datetime
import pathlib
import queue
import subprocess
import sys
import threading
import time
import traceback
import typing

class TextProcessReadTimeout(Exception):
    pass


class TextProcess:
    # How long to sleep between checks for a newly read line of output
    _READ_INTERVAL_IN_SECONDS = 0.025

    def __init__(self, args: [str], working_directory: str):
        self._process = subprocess.Popen(
            args, cwd = working_directory, bufsize = 0,
            stdin = subprocess.PIPE, stdout = subprocess.PIPE,
            stderr = subprocess.STDOUT)

        self._stdout_read_trigger = queue.Queue()
        self._stdout_buffer = queue.Queue()

        # Output is read on a background thread so reads can time out
        self._stdout_thread = threading.Thread(
            target = self._stdout_read_loop, daemon = True)

        self._stdout_thread.start()

    def __enter__(self):
        return self

    def __exit__(self, tr, exc, val):
        self.close()

    def close(self):
        self._stdout_read_trigger.put('stop')
        self._process.terminate()
        self._process.wait()
        self._process.stdout.close()
        self._process.stdin.close()

    def write_line(self, line: str) -> None:
        try:
            self._process.stdin.write((line + '\n').encode(encoding = 'utf-8'))
            self._process.stdin.flush()
        except OSError:
            pass

    def read_line(self, timeout: float = None) -> typing.Optional[str]:
        # Ask the background thread to read one line, then poll for it
        self._stdout_read_trigger.put('read')

        sleep_time = 0

        while timeout is None or sleep_time < timeout:
            try:
                next_result = self._stdout_buffer.get_nowait()

                if next_result is None:
                    return None
                elif isinstance(next_result, Exception):
                    raise next_result
                else:
                    return next_result.decode(encoding = 'utf-8')

            except queue.Empty:
                time.sleep(TextProcess._READ_INTERVAL_IN_SECONDS)
                sleep_time += TextProcess._READ_INTERVAL_IN_SECONDS

        raise TextProcessReadTimeout()

    def _stdout_read_loop(self):
        try:
            while self._process.returncode is None:
                if self._stdout_read_trigger.get() == 'read':
                    line = self._process.stdout.readline()

                    # An empty read means the process closed its output
                    if line == b'':
                        self._stdout_buffer.put(None)
                    else:
                        self._stdout_buffer.put(line)
                else:
                    break

        except Exception as e:
            self._stdout_buffer.put(e)


class TestFailure(Exception):
    pass


class TestInputLine:
    def __init__(self, text: str):
        self._text = text

    def execute(self, process: TextProcess) -> None:
        try:
            process.write_line(self._text)

        except Exception as e:
            print_labeled_output(
                'EXCEPTION',
                *[tb_line.rstrip() for tb_line in traceback.format_exc().split('\n')])

            raise TestFailure()

        print_labeled_output('INPUT', self._text)


class TestOutputLine:
    def __init__(self, text: str, timeout_in_seconds: float):
        self._text = text
        self._timeout_in_seconds = timeout_in_seconds

    def execute(self, process: TextProcess) -> None:
        try:
            output_line = process.read_line(self._timeout_in_seconds)

        except TextProcessReadTimeout:
            output_line = None

        except Exception as e:
            print_labeled_output(
                'EXCEPTION',
                *[tb_line.rstrip() for tb_line in traceback.format_exc().split('\n')])

            raise TestFailure()

        if output_line is not None:
            # Strip the line ending, whether Windows-style or Unix-style
            if output_line.endswith('\r\n'):
                output_line = output_line[:-2]
            elif output_line.endswith('\n'):
                output_line = output_line[:-1]

            print_labeled_output('OUTPUT', output_line)

            if output_line != self._text:
                print_labeled_output('EXPECTED', self._text)

                # Find the first position at which the two lines differ
                index = min(len(output_line), len(self._text))

                for i in range(min(len(output_line), len(self._text))):
                    if output_line[i] != self._text[i]:
                        index = i
                        break

                print_labeled_output('', (' ' * index) + '^')

                print_labeled_output(
                    'ERROR',
                    'This line of output did not match what was expected. The first',
                    'incorrect character is marked with a ^ above.',
                    "(If you don't see a difference, perhaps your program printed",
                    'extra whitespace on the end of this line.)')

                raise TestFailure()

        else:
            print_labeled_output('EXPECTED', self._text)

            print_labeled_output(
                'ERROR',
                'This line of output was expected, but the program did not generate',
                'any additional output after waiting for {} second(s).'.format(self._timeout_in_seconds))

            raise TestFailure()


class TestEndOfOutput:
    def __init__(self, timeout_in_seconds: float):
        self._timeout_in_seconds = timeout_in_seconds

    def execute(self, process: TextProcess) -> None:
        output_line = process.read_line(self._timeout_in_seconds)

        if output_line is not None:
            print_labeled_output('OUTPUT', output_line)

            print_labeled_output(
                'ERROR',
                'Extra output was printed after the program should not have generated',
                'any additional output')

            raise TestFailure()


def write_test_file(dir_path: pathlib.Path, sub_path: pathlib.Path, lines: [str]) -> None:
    path = dir_path / sub_path

    # Create any missing parent directories (e.g., Sub or Zzz)
    if not path.parent.exists():
        path.parent.mkdir(parents = True)

    with path.open('w') as test_file:
        for line in lines:
            test_file.write(line + '\n')


# Each entry is a path (relative to the test directory) and the file's lines
TEST_FILES = [
    (pathlib.Path('test1.txt'), [
        'This is a line of text',
        'and this is another'
    ]),
    (pathlib.Path('test2.txt'), [
        'There are a few lines of text',
        'in this file',
        'instead of just a couple',
        'of them'
    ]),
    (pathlib.Path('Sub/meee.txt'), [
        'I am Boo',
        'and it is all about me',
        'and everything is about me',
        'so everyone should be focused on me'
    ]),
    (pathlib.Path('Sub/test1.txt'), [
        'Hello, my name is Boo',
        'How are you today?'
    ]),
    (pathlib.Path('Sub/youu.txt'), [
        'Or maybe it should be about you',
        'I cannot decide'
    ]),
    (pathlib.Path('Zzz/zzz.py'), [
        "print('Sleep...')",
        'for i in range(10):',
        "    print('ZZZZZZZZZZ')"
    ])
]


def create_test_directory() -> pathlib.Path:
    now = datetime.datetime.now()

    # A timestamped name keeps each run's test directory separate
    test_directory_name = 'project1_test_{:04}-{:02}-{:02}-{:02}-{:02}-{:02}'.format(
        now.year, now.month, now.day, now.hour, now.minute, now.second)

    test_directory_path = pathlib.Path.cwd() / pathlib.Path(test_directory_name)
    test_directory_path.mkdir(parents = True)

    for sub_path, lines in TEST_FILES:
        write_test_file(test_directory_path, sub_path, lines)

    return test_directory_path


def make_test_lines(test_directory_path: pathlib.Path) -> ['TestLine']:
    test_lines = []

    # Recursive search: every file should be listed in the required order
    test_lines.append(TestInputLine(
        'R {}'.format(str(test_directory_path))))

    test_lines.append(TestOutputLine(
        str(test_directory_path / pathlib.Path('test1.txt')), 10.0))

    test_lines.append(TestOutputLine(
        str(test_directory_path / pathlib.Path('test2.txt')), 10.0))

    test_lines.append(TestOutputLine(
        str(test_directory_path / pathlib.Path('Sub/meee.txt')), 10.0))

    test_lines.append(TestOutputLine(
        str(test_directory_path / pathlib.Path('Sub/test1.txt')), 10.0))

    test_lines.append(TestOutputLine(
        str(test_directory_path / pathlib.Path('Sub/youu.txt')), 10.0))

    test_lines.append(TestOutputLine(
        str(test_directory_path / pathlib.Path('Zzz/zzz.py')), 10.0))

    # An invalid search line, then a search by exact name
    test_lines.append(TestInputLine('N'))
    test_lines.append(TestOutputLine('ERROR', 1.0))
    test_lines.append(TestInputLine('N test1.txt'))

    test_lines.append(TestOutputLine(
        str(test_directory_path / pathlib.Path('test1.txt')), 10.0))

    test_lines.append(TestOutputLine(
        str(test_directory_path / pathlib.Path('Sub/test1.txt')), 10.0))

    # An invalid action line, then the first-line action
    test_lines.append(TestInputLine('Q'))
    test_lines.append(TestOutputLine('ERROR', 1.0))
    test_lines.append(TestInputLine('F'))
    test_lines.append(TestOutputLine('This is a line of text', 10.0))
    test_lines.append(TestOutputLine('Hello, my name is Boo', 10.0))

    return test_lines


def run_test() -> None:
    test_directory_path = create_test_directory()
    process = None

    try:
        process = start_process()
        test_lines = make_test_lines(test_directory_path)
        run_test_lines(process, test_lines)
        print_labeled_output(
            'PASSED',
            'Your "project1.py" passed the sanity checker. Note that there are',
            "many other tests you'll want to run on your own, because there are",
            'many different combinations of inputs that are legal.')

    except TestFailure:
        print_labeled_output(
            'FAILED',
            'The sanity checker has failed, for the reasons described above.')

    finally:
        # Always shut the child process down, even after a failure
        if process is not None:
            process.close()


def start_process() -> TextProcess:
    filenames_in_dir = [p.name for p in list(pathlib.Path.cwd().iterdir()) if p.is_file()]

    if 'project1.py' not in filenames_in_dir:
        print_labeled_output(
            'ERROR',
            'Cannot find file "project1.py" in this directory.',
            'Make sure that the sanity checker is in the same directory as the',
            '"project1.py" that comprises your Project #1 solution. Also, be',
            "sure that you've named your \"project1.py\" file correctly, noting",
            'that capitalization and spacing matter.')

        raise TestFailure()

    else:
        # Run project1.py with the same interpreter that runs this checker
        return TextProcess(
            [sys.executable, str(pathlib.Path.cwd() / 'project1.py')],
            str(pathlib.Path.cwd()))


def print_labeled_output(label: str, *msg_lines: str) -> None:
    showed_first = False

    # The label appears only beside the first line of the message
    for msg_line in msg_lines:
        if not showed_first:
            print('{:10}|{}'.format(label, msg_line))
            showed_first = True
        else:
            print('{:10}|{}'.format(' ', msg_line))

    if not showed_first:
        print(label)


def run_test_lines(process: TextProcess, test_lines: ['TestLine']) -> None:
    for line in test_lines:
        line.execute(process)


if __name__ == '__main__':
    run_test()