


Question

“How screwed am I?” asked a recent user on Reddit, before sharing a mortifying story. On the first day as a junior software developer at a first salaried job out of college, his or her copy-and-paste error inadvertently erased all data from the company’s production database.

Posting under the heartbreaking handle cscareerthrowaway567, the user wrote:

The CTO told me to leave and never come back. He also informed me that apparently legal would need to get involved due to severity of the data loss. I basically offered and pleaded to let me help in someway to redeem my self and i was told that i “completely fucked everything up”…

I haven’t heard from HR, or anything and i am panicking to high heavens. I just moved across the country for this job, is there anything i can even remotely do to redeem my self in this situation? Can i possibly be sued for this? Should i contact HR directly? I am really confused, and terrified.

In December, a coding error in Snap’s latest iOS update accidentally jammed the network that keeps more than 15 million computer systems synchronized to the clock. A typo from a busy Clinton campaign aide inadvertently opened the door to the Russian hack of John Podesta’s emails. The British Airways power outage that disrupted tens of thousands of flights last week was reportedly caused by a tech support worker accidentally flipping the power off.

The point is, any system in which humans are involved will at some point be disrupted by human error. Organizations distinguish themselves not by stamping out the possibility of error, but by handling the inevitable mistake well.

As subRedditors saw it, cscareerthrowaway567 made one mistake. The company made several. It didn’t back up the database. It had poor security procedures and a sloppily-organized system that encouraged the very error cscareerthrowaway567 made. Then, rather than taking accountability for those problems, the CTO fired the rookie who revealed them. Of all the errors this company made, that last might be the most destructive to their future success.

An extensive review of employee teams at Google found that the most successful were those with a high level of psychological safety. In other words, when employees felt safe enough to take risks (and make mistakes) without being shamed or criticized, they did better work.

“The wisdom of learning from failure is incontrovertible. Yet organizations that do it well are extraordinarily rare,” wrote Amy Edmondson, the Harvard Business School professor who coined the term “psychological safety.”

For a rare example of a better company response, let’s look back at the engineer error that caused an Amazon server outage on Christmas Eve 2012, which disrupted Netflix and other services that relied on the company for cloud computing. Amazon wrote a detailed account of the event, explaining how the outage occurred, how it was resolved, and what had been changed to prevent the problem from happening again. The focus was on fixing the problem, not blaming the individual.

“For all that’s wrong with Amazon, the best part was when someone fucked up, the team and the company focused only on how we make it never happen again,” a former employee wrote on the forum. “A human mistake was a collective failure, not an individual one.”

Another respondent related all too well.

“Hi, guy here who accidentally nuked GitLab.com’s database earlier this year,” wrote Yorick Peterse, the software developer who accidentally wiped out live production data during a late night work session and nearly melted down the site. GitLab chronicled its recovery efforts live on YouTube and in a Google doc, and treated it as a company problem instead of an individual one.

“GitLab handled this very well,” wrote Peterse, who is still with the company. “Nobody got fired or yelled at, everybody realised this was a problem with the organisation as a whole.”

Peterse now has enough distance from his own experience to also see humor in the sheer scale of such screw-ups, he told Quartz over email. Still, he recognized the pain of the young software developer.

“I also felt quite annoyed by how the company of the story supposedly handled the situation, potentially scarring the junior’s career and in particular confidence,” he wrote from his home in the Netherlands. “Somebody new should be guided through a setup procedure (especially when it involves production access), not thrown into the depths, only to be told to ‘leave and never come back’ when they make a mistake.”

Indeed, the unfortunate young developer’s experience seems to have struck a chord for many, and whether there’s legal action or not, the court of public opinion is on the new guy’s side: In a poll on the tech site the Register, less than 1% of 5,400 respondents thought the new developer should be fired. Forty-five percent thought the CTO should go.

Based on this article (what happened when a "simple" copy/paste action created a major incident), the change management process, and your own research, what do you think should be done to improve not only the security and quality of the code an employee works on but also the deployment of that code into the "live" production environment, so as to avoid risk?

Submit a two-page summary of your approach to ensuring secure coding and the safe deployment of that code into a production environment.

Explanation / Answer

The following steps can be taken to improve code quality, make code more secure, and deploy code to the live environment safely.

1) To improve code quality, focus on code reviews. Two levels of review can be introduced using the Gerrit tool.
2) To improve code quality, a coding guidelines sheet must be circulated among new developers.
3) New and junior developers must attend sessions such as "how to code" and "how to write effective code".
4) Development teams must hold retrospectives from time to time, to increase work effectiveness and to avoid commonly made coding mistakes.

5) For code security, the code repository must have access policies, so that only authorized users can view the code and check in changes.
6) To protect the company's important code, devices used to access it, such as PCs and laptops, must have their USB ports disabled, so that no one can copy company code to an external hard disk or pen drive.
7) Various test suites must be developed and run daily. These test suites will test the functionality of the company's products.
8) It is advisable to run all the test cases on a test bed before deploying the code to the live environment. This catches code-related bugs early.
9) New joiners working on the live environment must be given proper knowledge of the live servers and should be allowed to work only under the guidance of senior members.
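
As a rough illustration of points 8 and 9, the sketch below shows one way such rules could be enforced in code rather than left to habit. It is only a minimal example under stated assumptions: the `Target` class, the function names, the approver field, and the list of destructive keywords are all hypothetical and are not taken from the article or from any particular tool.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical list of statement prefixes treated as destructive.
DESTRUCTIVE_PREFIXES = ("drop", "truncate", "delete")


@dataclass(frozen=True)
class Target:
    """An environment that code or SQL can be applied to (hypothetical model)."""
    name: str
    is_production: bool


def may_deploy(target: Target, tests_ok: bool, approved_by: Optional[str]) -> bool:
    """Decide whether a deployment is allowed.

    Nothing is deployed anywhere if the test suite failed (point 8).
    Production additionally needs a named approver, for example a
    senior team member (point 9).
    """
    if not tests_ok:
        return False
    if target.is_production and not approved_by:
        return False
    return True


def is_destructive(sql: str) -> bool:
    """Return True if the statement starts with a destructive keyword."""
    return sql.strip().lower().startswith(DESTRUCTIVE_PREFIXES)


def check_statement(target: Target, sql: str, typed_name: Optional[str] = None) -> None:
    """Raise PermissionError for a destructive statement on production
    unless the operator has retyped the target's name as confirmation."""
    if target.is_production and is_destructive(sql) and typed_name != target.name:
        raise PermissionError(
            f"refusing destructive statement on {target.name!r} without confirmation"
        )
```

With a guard like this, a command copied from a setup document and pasted against the production target would fail with an error instead of silently erasing data, while the same command on a test bed would still run. A prefix check is a crude filter, so it complements backups and restricted production credentials and does not replace them.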
