Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

The skeleton code on below is the start of a program to perform sequence alignme

ID: 3732032 • Letter: T

Question

The skeleton code on below is the start of a program to perform sequence alignment. It is used in applications such as BLAST for DNA comparison, and diff for lines of a text file. Two text files are given as input arguments and the output should be the best alignment of the two strings. A simple depth-first search is provided, but it performs very poorly on larger inputs. Use dynamic programming to improve on the current solution’s runtime. (DFS will only run on text/a.txt and text/b.txt, not any of the other file pairs.)

There is one algorithm already implemented:

findLCSdfs Recursively searches through each string for character matches.

For this assignment, you will implement the findLCSdyn method (and any helper methods you want) in LAB7.java. Your method should return an array with two aligned strings, for example:

BEN-T-

--NOTE

Use the ‘-’ character to represent a gap. That character will not show up in any input text. In addition to returning an array with the two aligned string values, you should also update the class variable longest with the optimal number of overlapping characters in a solution.

The algorithm in the book shows how to reconstruct the subsequence exactly, but you can modify this algorithm to build the aligned strings with gaps in them. You can do this by adding a gap to one string and letter to the other string whenever their current characters don’t match. (So: if you move up, prepend the row’s character to one string and a dash to the other. If you move left, prepend a dash to the first string and the column’s character to the second string.)

Note: Java’s default stack size is not very large, so I recommend implementing iterative methods rather than recursive ones.

//Here are the files that come with the lab for testing

A.txt:

ACGGAT

B.txt:

CCGCTT

melania.txt:

From a young age, my parents impressed on me the values that you work hard for what you want in life: that your word is your bond and you do what you say and keep your promise; that you treat people with respect. They taught and showed me values and morals in their daily life. That is a lesson that I continue to pass along to our son, and we need to pass those lessons on to the many generation to follow. I travelled the world while working hard in the incredible arena of fashion.

michelle.txt:

And you know, what struck me when I first met Barack was that even though he had this funny name, even though he'd grown up all the way across the continent in Hawaii, his family was so much like mine. He was raised by grandparents who were workingclass folks just like my parents, and by a single mother who struggled to pay the bills just like we did. Like my family, they scrimped and saved so that he could have opportunities they never had themselves. And Barack and I were raised with so many of the same values: that you work hard for what you want in life; that your word is your bond and you do what you say you're going to do; that you treat people with dignity and respect, even if you don't know them, and even if you don't agree with them. And Barack and I set out to build lives guided by these values, and pass them on to the next generation. Because we want our children — and all children in this nation — to know that the only limit to the height of your achievements is the reach of your dreams and your willingness to work for them.

znc1.txt:

/*

* Copyright (C) 20042013 ZNC, see the NOTICE file for details.

*

* Licensed under the Apache License, Version 2.0 (the "License");

* you may not use this file except in compliance with the License.

* You may obtain a copy of the License at

*

* http://www.apache.org/licenses/LICENSE2.0

*

* Unless required by applicable law or agreed to in writing, software

* distributed under the License is distributed on an "AS IS" BASIS,

* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

* See the License for the specific language governing permissions and

* limitations under the License.

*/

#include <znc/User.h>

#include <znc/IRCNetwork.h>

#include <signal.h>

unsigned int CSockManager::GetAnonConnectionCount(const CString &sIP) const {

const_iterator it;

unsigned int ret = 0;

for (it = begin(); it != end(); ++it) {

Csock *pSock = *it;

// Logged in CClients have "USR::<username>" as their sockname

if (pSock>GetType() == Csock::INBOUND && pSock>GetRemoteIP() == sIP

&& pSock>GetSockName().Left(5) != "USR::") {

ret++;

}

}

DEBUG("There are [" << ret << "] clients from [" << sIP << "]");

return ret;

}

int CZNCSock::ConvertAddress(const struct sockaddr_storage * pAddr, socklen_t iAddrLen, CS_STRING & sIP, u_short * piPort) {

int ret = Csock::ConvertAddress(pAddr, iAddrLen, sIP, piPort);

if (ret == 0)

sIP.TrimPrefix("::ffff:");

return ret;

}

#ifdef HAVE_PTHREAD

class CSockManager::CTDNSMonitorFD : public CSMonitorFD {

public:

CTDNSMonitorFD() {

Add(CThreadPool::Get().getReadFD(), ECT_Read);

}

virtual bool FDsThatTriggered(const std::map<int, short>& miiReadyFds) {

if (miiReadyFds.find(CThreadPool::Get().getReadFD())>second) {

CThreadPool::Get().handlePipeReadable();

}

return true;

}

};

#endif

#ifdef HAVE_THREADED_DNS

void CSockManager::CDNSJob::runThread() {

int iCount = 0;

while (true) {

addrinfo hints;

memset(&hints, 0, sizeof(hints));

hints.ai_family = AF_UNSPEC;

hints.ai_socktype = SOCK_STREAM;

hints.ai_protocol = IPPROTO_TCP;

hints.ai_flags = AI_ADDRCONFIG;

iRes = getaddrinfo(sHostname.c_str(), NULL, &hints, &aiResult);

if (EAGAIN != iRes) {

break;

}

iCount++;

if (iCount > 5) {

iRes = ETIMEDOUT;

break;

}

sleep(5); // wait 5 seconds before next try

}

}

void CSockManager::CDNSJob::runMain() {

if (0 != this>iRes) {

DEBUG("Error in threaded DNS: " << gai_strerror(this>iRes));

if (this>aiResult) {

DEBUG("And aiResult is not NULL...");

}

this>aiResult = NULL; // just for case. Maybe to call freeaddrinfo()?

}

pManager>SetTDNSThreadFinished(this>task, this>bBind, this>aiResult);

}

void CSockManager::StartTDNSThread(TDNSTask* task, bool bBind) {

CString sHostname = bBind ? task>sBindhost : task>sHostname;

CDNSJob* arg = new CDNSJob;

arg>sHostname = sHostname;

arg>task = task;

arg>bBind = bBind;

arg>iRes = 0;

arg>aiResult = NULL;

arg>pManager = this;

CThreadPool::Get().addJob(arg);

}

void CSockManager::SetTDNSThreadFinished(TDNSTask* task, bool bBind, addrinfo* aiResult) {

if (bBind) {

task>aiBind = aiResult;

task>bDoneBind = true;

} else {

task>aiTarget = aiResult;

task>bDoneTarget = true;

}

// Now that something is done, check if everything we needed is done

if (!task>bDoneBind || !task>bDoneTarget) {

return;

}

// All needed DNS is done, now collect the results

addrinfo* aiTarget = NULL;

addrinfo* aiBind = NULL;

try {

addrinfo* aiTarget4 = task>aiTarget;

addrinfo* aiBind4 = task>aiBind;

while (aiTarget4 && aiTarget4>ai_family != AF_INET) aiTarget4 = aiTarget4>ai_next;

while (aiBind4 && aiBind4>ai_family != AF_INET) aiBind4 = aiBind4>ai_next;

addrinfo* aiTarget6 = NULL;

addrinfo* aiBind6 = NULL;

#ifdef HAVE_IPV6

aiTarget6 = task>aiTarget;

aiBind6 = task>aiBind;

while (aiTarget6 && aiTarget6>ai_family != AF_INET6) aiTarget6 = aiTarget6>ai_next;

while (aiBind6 && aiBind6>ai_family != AF_INET6) aiBind6 = aiBind6>ai_next;

#endif

if (!aiTarget4 && !aiTarget6) {

throw "Can't resolve server hostname";

} else if (task>sBindhost.empty()) {

#ifdef HAVE_IPV6

aiTarget = task>aiTarget;

#else

aiTarget = aiTarget4;

#endif

} else if (!aiBind4 && !aiBind6) {

throw "Can't resolve bind hostname. Try /znc clearbindhost and /znc clearuserbindhost";

} else if (aiBind6 && aiTarget6) {

aiTarget = aiTarget6;

aiBind = aiBind6;

} else if (aiBind4 && aiTarget4) {

aiTarget = aiTarget4;

aiBind = aiBind4;

} else {

throw "Address family of bindhost doesn't match address family of server";

}

CString sBindhost;

CString sTargetHost;

if (!task>sBindhost.empty()) {

char s[INET6_ADDRSTRLEN] = {};

getnameinfo(aiBind>ai_addr, aiBind>ai_addrlen, s, sizeof(s), NULL, 0, NI_NUMERICHOST);

sBindhost = s;

}

char s[INET6_ADDRSTRLEN] = {};

getnameinfo(aiTarget>ai_addr, aiTarget>ai_addrlen, s, sizeof(s), NULL, 0, NI_NUMERICHOST);

sTargetHost = s;

DEBUG("TDNS: " << task>sSockName << ", connecting to [" << sTargetHost << "] using bindhost [" << sBindhost << "]");

FinishConnect(sTargetHost, task>iPort, task>sSockName, task>iTimeout, task>bSSL, sBindhost, task>pcSock);

} catch (const char* s) {

DEBUG(task>sSockName << ", dns resolving error: " << s);

task>pcSock>SetSockName(task>sSockName);

task>pcSock>SockError(1, s);

delete task>pcSock;

}

if (task>aiTarget) freeaddrinfo(task>aiTarget);

if (task>aiBind) freeaddrinfo(task>aiBind);

delete task;

}

#endif /* HAVE_THREADED_DNS */

CSockManager::CSockManager() {

#ifdef HAVE_PTHREAD

MonitorFD(new CTDNSMonitorFD());

#endif

}

CSockManager::~CSockManager() {

}

void CSockManager::Connect(const CString& sHostname, u_short iPort, const CString& sSockName, int iTimeout, bool bSSL, const CString& sBindHost, CZNCSock *pcSock) {

#ifdef HAVE_THREADED_DNS

DEBUG("TDNS: initiating resolving of [" << sHostname << "] and bindhost [" << sBindHost << "]");

TDNSTask* task = new TDNSTask;

task>sHostname = sHostname;

task>iPort = iPort;

task>sSockName = sSockName;

task>iTimeout = iTimeout;

task>bSSL = bSSL;

task>sBindhost = sBindHost;

task>pcSock = pcSock;

task>aiTarget = NULL;

task>aiBind = NULL;

task>bDoneTarget = false;

if (sBindHost.empty()) {

task>bDoneBind = true;

} else {

task>bDoneBind = false;

StartTDNSThread(task, true);

}

StartTDNSThread(task, false);

#else /* HAVE_THREADED_DNS */

// Just let Csocket handle DNS itself

FinishConnect(sHostname, iPort, sSockName, iTimeout, bSSL, sBindHost, pcSock);

#endif

}

void CSockManager::FinishConnect(const CString& sHostname, u_short iPort, const CString& sSockName, int iTimeout, bool bSSL, const CString& sBindHost, CZNCSock *pcSock) {

CSConnection C(sHostname, iPort, iTimeout);

C.SetSockName(sSockName);

C.SetIsSSL(bSSL);

C.SetBindHost(sBindHost);

TSocketManager<CZNCSock>::Connect(C, pcSock);

}

/////////////////// CSocket ///////////////////

CSocket::CSocket(CModule* pModule) : CZNCSock() {

m_pModule = pModule;

if (m_pModule) m_pModule>AddSocket(this);

EnableReadLine();

SetMaxBufferThreshold(10240);

}

CSocket::CSocket(CModule* pModule, const CString& sHostname, unsigned short uPort, int iTimeout) : CZNCSock(sHostname, uPort, iTimeout) {

m_pModule = pModule;

if (m_pModule) m_pModule>AddSocket(this);

EnableReadLine();

SetMaxBufferThreshold(10240);

}

CSocket::~CSocket() {

CUser *pUser = NULL;

// CWebSock could cause us to have a NULL pointer here

if (m_pModule) {

pUser = m_pModule>GetUser();

m_pModule>UnlinkSocket(this);

}

if (pUser && m_pModule && (m_pModule>GetType() != CModInfo::GlobalModule)) {

pUser>AddBytesWritten(GetBytesWritten());

pUser>AddBytesRead(GetBytesRead());

} else {

CZNC::Get().AddBytesWritten(GetBytesWritten());

CZNC::Get().AddBytesRead(GetBytesRead());

}

}

void CSocket::ReachedMaxBuffer() {

DEBUG(GetSockName() << " == ReachedMaxBuffer()");

if (m_pModule) m_pModule>PutModule("Some socket reached its max buffer limit and was closed!");

Close();

}

void CSocket::SockError(int iErrno, const CString& sDescription) {

DEBUG(GetSockName() << " == SockError(" << sDescription << ", " << strerror(iErrno) << ")");

if (iErrno == EMFILE) {

// We have too many open fds, this can cause a busy loop.

Close();

}

}

bool CSocket::ConnectionFrom(const CString& sHost, unsigned short uPort) {

return CZNC::Get().AllowConnectionFrom(sHost);

}

bool CSocket::Connect(const CString& sHostname, unsigned short uPort, bool bSSL, unsigned int uTimeout) {

if (!m_pModule) {

DEBUG("ERROR: CSocket::Connect called on instance without m_pModule handle!");

return false;

}

CUser* pUser = m_pModule>GetUser();

CString sSockName = "MOD::C::" + m_pModule>GetModName();

CString sBindHost;

if (pUser) {

sSockName += "::" + pUser>GetUserName();

sBindHost = pUser>GetBindHost();

CIRCNetwork* pNetwork = m_pModule>GetNetwork();

if (pNetwork) {

sSockName += "::" + pNetwork>GetName();

sBindHost = pNetwork>GetBindHost();

}

}

// Don't overwrite the socket name if one is already set

if (!GetSockName().empty()) {

sSockName = GetSockName();

}

m_pModule>GetManager()>Connect(sHostname, uPort, sSockName, uTimeout, bSSL, sBindHost, this);

return true;

}

bool CSocket::Listen(unsigned short uPort, bool bSSL, unsigned int uTimeout) {

if (!m_pModule) {

DEBUG("ERROR: CSocket::Listen called on instance without m_pModule handle!");

return false;

}

CUser* pUser = m_pModule>GetUser();

CString sSockName = "MOD::L::" + m_pModule>GetModName();

if (pUser) {

sSockName += "::" + pUser>GetUserName();

}

// Don't overwrite the socket name if one is already set

if (!GetSockName().empty()) {

sSockName = GetSockName();

}

return m_pModule>GetManager()>ListenAll(uPort, sSockName, bSSL, SOMAXCONN, this);

}

CModule* CSocket::GetModule() const { return m_pModule; }

/////////////////// !CSocket ///////////////////

znc2.txt:

/*

* Copyright (C) 20042012 See the AUTHORS file for details.

*

* This program is free software; you can redistribute it and/or modify it

* under the terms of the GNU General Public License version 2 as published

* by the Free Software Foundation.

*/

#include <znc/Socket.h>

#include <znc/Modules.h>

#include <znc/User.h>

#include <znc/znc.h>

unsigned int CSockManager::GetAnonConnectionCount(const CString &sIP) const {

const_iterator it;

unsigned int ret = 0;

for (it = begin(); it != end(); ++it) {

Csock *pSock = *it;

// Logged in CClients have "USR::<username>" as their sockname

if (pSock>GetType() == Csock::INBOUND && pSock>GetRemoteIP() == sIP

&& pSock>GetSockName().Left(5) != "USR::") {

ret++;

}

}

DEBUG("There are [" << ret << "] clients from [" << sIP << "]");

return ret;

}

int CZNCSock::ConvertAddress(const struct sockaddr_storage * pAddr, socklen_t iAddrLen, CS_STRING & sIP, u_short * piPort) {

int ret = Csock::ConvertAddress(pAddr, iAddrLen, sIP, piPort);

if (ret == 0)

sIP.TrimPrefix("::ffff:");

return ret;

}

/////////////////// CSocket ///////////////////

CSocket::CSocket(CModule* pModule) : CZNCSock() {

m_pModule = pModule;

if (m_pModule) m_pModule>AddSocket(this);

EnableReadLine();

SetMaxBufferThreshold(10240);

}

CSocket::CSocket(CModule* pModule, const CString& sHostname, unsigned short uPort, int iTimeout) : CZNCSock(sHostname, uPort, iTimeout) {

m_pModule = pModule;

if (m_pModule) m_pModule>AddSocket(this);

EnableReadLine();

SetMaxBufferThreshold(10240);

}

CSocket::~CSocket() {

CUser *pUser = NULL;

// CWebSock could cause us to have a NULL pointer here

if (m_pModule) {

pUser = m_pModule>GetUser();

m_pModule>UnlinkSocket(this);

}

if (pUser && m_pModule && (m_pModule>GetType() != CModInfo::GlobalModule)) {

pUser>AddBytesWritten(GetBytesWritten());

pUser>AddBytesRead(GetBytesRead());

} else {

CZNC::Get().AddBytesWritten(GetBytesWritten());

CZNC::Get().AddBytesRead(GetBytesRead());

}

}

void CSocket::ReachedMaxBuffer() {

DEBUG(GetSockName() << " == ReachedMaxBuffer()");

if (m_pModule) m_pModule>PutModule("Some socket reached its max buffer limit and was closed!");

Close();

}

void CSocket::SockError(int iErrno, const CString& sDescription) {

DEBUG(GetSockName() << " == SockError(" << sDescription << ", " << strerror(iErrno) << ")");

if (iErrno == EMFILE) {

// We have too many open fds, this can cause a busy loop.

Close();

}

}

bool CSocket::ConnectionFrom(const CString& sHost, unsigned short uPort) {

return CZNC::Get().AllowConnectionFrom(sHost);

}

bool CSocket::Connect(const CString& sHostname, unsigned short uPort, bool bSSL, unsigned int uTimeout) {

if (!m_pModule) {

DEBUG("ERROR: CSocket::Connect called on instance without m_pModule handle!");

return false;

}

CUser* pUser = m_pModule>GetUser();

CString sSockName = "MOD::C::" + m_pModule>GetModName();

CString sBindHost;

if (pUser) {

sSockName += "::" + pUser>GetUserName();

sBindHost = m_pModule>GetUser()>GetBindHost();

}

// Don't overwrite the socket name if one is already set

if (!GetSockName().empty()) {

sSockName = GetSockName();

}

m_pModule>GetManager()>Connect(sHostname, uPort, sSockName, uTimeout, bSSL, sBindHost, this);

return true;

}

bool CSocket::Listen(unsigned short uPort, bool bSSL, unsigned int uTimeout) {

if (!m_pModule) {

DEBUG("ERROR: CSocket::Listen called on instance without m_pModule handle!");

return false;

}

CUser* pUser = m_pModule>GetUser();

CString sSockName = "MOD::L::" + m_pModule>GetModName();

if (pUser) {

sSockName += "::" + pUser>GetUserName();

}

// Don't overwrite the socket name if one is already set

if (!GetSockName().empty()) {

sSockName = GetSockName();

}

return m_pModule>GetManager()>ListenAll(uPort, sSockName, bSSL, SOMAXCONN, this);

}

CModule* CSocket::GetModule() const { return m_pModule; }

/////////////////// !CSocket ///////////////////

Explanation / Answer

Main.java

import java.io.IOException;

import java.nio.file.Files;

import java.nio.file.Paths;

import java.util.Scanner;

public class Main {

public static String[] findLCSdyn(String text1, String text2) {

int m = text1.length()+1; //+1 to account for row of 0s

int n = text2.length()+1;

int[][] c = new int[m][n];

int[][] b = new int[m][n];

String[] alignment = new String[]{"",""};

for(int i=0;i<m;i++){

c[i][0] = 0;

}

for(int j=0;j<n;j++){

c[0][j] = 0;

}

for(int i=1;i<m;i++){

for(int j=1;j<n;j++){

if(text1.charAt(i-1)==text2.charAt(j-1)){

c[i][j] = c[i-1][j-1] + 1;

b[i][j] = 'd'; //d for diagonal

}

else if(c[i-1][j] >= c[i][j-1]){

c[i][j] = c[i-1][j];

b[i][j] = 'u'; //u for up

}

else{

c[i][j] = c[i][j-1];

b[i][j] = 'l'; //l for left

}

}

}

longest = c[m-1][n-1];

print(b,alignment,text1,text2);

StringBuffer a0 = new StringBuffer(alignment[0]);

a0.reverse();

alignment[0] = a0.toString();

StringBuffer a1 = new StringBuffer(alignment[1]);

a1.reverse();

alignment[1] = a1.toString();

return alignment;

}

public static void print(int b[][], String[] alignment, String text1, String text2){

int i = text1.length();

int j = text2.length();

while(i>=0 && j>=0){

if(i==0){ //if i is 0 and j isnt, then chars need to be added to a[1]

while(j!=0){

alignment[0] += '-';

alignment[1] += text2.charAt(j-1);

j--;

}

return;

}

else if(j==0){

while(i!=0){

alignment[0] += text1.charAt(i-1);

alignment[1] += '-';

i--;

}

return;

}

else{

if(b[i][j] == 'd'){

alignment[0] += text1.charAt(i-1);

alignment[1] += text2.charAt(j-1);

i--;

j--;

}

else if(b[i][j] == 'u'){

alignment[0] += text1.charAt(i-1);

alignment[1] += '-';

i--;

}

else{

alignment[0] += '-';

alignment[1] += text2.charAt(j-1);

j--;

}

}

}

}

private static int longest = 0;

// recursive helper for DFS

private static void dfs_solve(int i1, int i2, String s1, String s2, char[] out1, char[] out2, int score, int index) {

  

if ((i1 >= s1.length()) && (i2 >= s2.length())) {

if (score > longest) {

out1[index] = '';

out2[index] = '';

longest = score;

sol1 = String.valueOf(out1).substring(0, String.valueOf(out1).indexOf(''));

sol2 = String.valueOf(out2).substring(0, String.valueOf(out2).indexOf(''));

}

}

else if ((i1 >= s1.length()) && (i2 < s2.length())) { // at the end of first string

out1[index] = '-';

out2[index] = s2.charAt(i2);

dfs_solve(i1, i2 + 1, s1, s2, out1, out2, score, index+1);

}

else if ((i1 < s1.length()) && (i2 >= s2.length())) { // at the end of second string

out1[index] = s1.charAt(i1);

out2[index] = '-';

dfs_solve(i1 + 1, i2, s1, s2, out1, out2, score, index+1);

}

else {

if (s1.charAt(i1) == s2.charAt(i2)) { // matching next character

out1[index] = s1.charAt(i1);

out2[index] = s2.charAt(i2);

dfs_solve(i1 + 1, i2 + 1, s1, s2, out1, out2, score + 1, index + 1);

}

  

out1[index] = '-';

out2[index] = s2.charAt(i2);

dfs_solve(i1, i2 + 1, s1, s2, out1, out2, score, index + 1);

  

out1[index] = s1.charAt(i1);

out2[index] = '-';

dfs_solve(i1 + 1, i2, s1, s2, out1, out2, score, index + 1);

}

}

// Used for DFS solution

private static String sol1, sol2;

// recursively searches for longest substring, checking all possible alignments

public static String[] findLCSdfs(String text1, String text2) {

int max_len = text1.length() + text2.length() + 1;

char[] out1 = new char[max_len];

char[] out2 = new char[max_len];

  

dfs_solve(0, 0, text1, text2, out1, out2, 0, 0);

  

String[] ret = new String[2];

ret[0] = sol1; ret[1] = sol2;

return ret;

}

// returns the length of the longest string

public static int getLongest() {

return longest;

}

public static void main(String[] args) {

Scanner s = new Scanner(System.in);

String file1, file2, text1 = "", text2 = "";

System.out.printf("Enter <text1> <text2> <algorithm>, ([dfs] - depth first search, [dyn] - dynamic programming): ");

System.out.printf("(e.g: text/a.txt text/b.txt dfs) ");

file1 = s.next();

file2 = s.next();

try {

text1 = new String(Files.readAllBytes(Paths.get(file1)));

text2 = new String(Files.readAllBytes(Paths.get(file2)));

} catch (IOException e) {

System.err.println("Cannot open files " + file1 + " and " + file2 + ". Exiting.");

System.exit(0);

}

String algo = s.next();

String[] result = {""};

switch (algo) {

case "dfs":

result = findLCSdfs(text1, text2);

break;

case "dyn":

result = findLCSdyn(text1, text2);

break;

default:

System.out.println("Invalid algorithm");

System.exit(0);

break;

}

s.close();

System.out.printf("Best cost: %d Longest string alignment: %s %s ", longest, result[0], result[1]);

}

}

Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote