How to reduce the cost of software errors
Hello, folks! I am Alexey, CEO of Verdat. Today, I'm going to talk to you about the concept of a secure environment.
So, you're a business manager, and your company is automated by software developed by your engineers. This means that the software makes a lot of operational decisions. Engineers are humans who can make mistakes, so their involvement in development leads to errors in the software and, consequently, in operational decisions. What if errors are unacceptable because of their high cost?
We will call operations with an unacceptably high cost of error "sensitive".
For example, in an electronic payment system, it is unacceptable for millions of dollars to be credited to customers by mistake. But this can happen easily, for example, if a critical section is not coded properly and deposits are duplicated because of a race condition. Congratulations, you just lost a lot of money because of some kind of geek gibberish, which the development team's leader confusingly explains to you before heading to the cashier to collect his rather large salary.
It must be understood that errors can result either from an engineer's mistake or from the malicious intent of someone who has decided to damage the company or get rich through fraud. Does it make much difference whether you lose money by accident or by design?
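To make the "geek gibberish" a little less mysterious, here is a minimal sketch of a properly coded critical section. The `Account` class, the `credit` method, and the deposit identifiers are hypothetical names invented for this illustration, and a real payment system would keep this state in a database rather than in memory.

```python
import threading


class Account:
    """Hypothetical account that credits each deposit at most once."""

    def __init__(self):
        self.balance = 0
        self._applied_deposits = set()
        self._lock = threading.Lock()  # guards the critical section below

    def credit(self, deposit_id, amount):
        """Credit the deposit once; return False if it was already applied."""
        with self._lock:
            # The check and the update happen under one lock, so two threads
            # handling the same deposit cannot both pass the check.
            if deposit_id in self._applied_deposits:
                return False
            self._applied_deposits.add(deposit_id)
            self.balance += amount
            return True


account = Account()

# Eight workers all try to apply the same deposit of 100.
workers = [
    threading.Thread(target=account.credit, args=("dep-1", 100))
    for _ in range(8)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(account.balance)  # 100: the deposit was credited exactly once
```

Without the lock, two workers could both see that the deposit has not been applied yet and both credit it, which is exactly the kind of duplication described above.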
Well, enough rhetorical questions. Let's describe what to do. There are only three steps.
First step.
You should start with a list of sensitive automated operations. In the example of an electronic payment system, this can include crediting clients' account balances and withdrawing funds to an external payment system.
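Such a list can live in the code itself, so that engineers and business people review the same thing. The sketch below is only an assumption about how that might look: the operation names and the `is_sensitive` helper are made up for illustration.

```python
from enum import Enum


class Operation(Enum):
    # Illustrative operation names, not taken from a real system.
    CREDIT_BALANCE = "credit_balance"
    WITHDRAW_EXTERNAL = "withdraw_external"
    SEND_NEWSLETTER = "send_newsletter"
    RENDER_STATEMENT = "render_statement"


# The list agreed on jointly by engineers and business people.
SENSITIVE_OPERATIONS = frozenset({
    Operation.CREDIT_BALANCE,
    Operation.WITHDRAW_EXTERNAL,
})


def is_sensitive(operation):
    """Return True if an error in this operation has an unacceptable cost."""
    return operation in SENSITIVE_OPERATIONS


for operation in Operation:
    label = "sensitive" if is_sensitive(operation) else "ordinary"
    print(operation.value, label)
```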
This first step will only succeed if programmers and business people work together, since, as practice shows, the full picture can only be obtained by combining the technical and the business view.
Step two.
Ask engineers to create an additional environment (the servers on which the programs run) and to move all components of the system that make sensitive operational decisions into it. Remove ordinary engineers' access to the new environment, and grant access only to a small group of engineers whom you trust. Do not pay attention to the indignation. From now on, changes in the new environment can take place only after they have been reviewed and approved by trusted engineers.
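The approval rule can be enforced by the delivery tooling rather than by good intentions. Here is a minimal sketch of such a gate. The names `ChangeRequest`, `TRUSTED_ENGINEERS`, "alice", "bob" and "carol" are hypothetical, and the rule that authors cannot approve their own changes is an extra assumption of this sketch, not something stated above.

```python
from dataclasses import dataclass, field

# Hypothetical small group of engineers trusted by the business.
TRUSTED_ENGINEERS = frozenset({"alice", "bob"})


@dataclass
class ChangeRequest:
    """A proposed change to the secure environment."""
    author: str
    description: str
    approvals: set = field(default_factory=set)

    def approve(self, reviewer):
        if reviewer not in TRUSTED_ENGINEERS:
            raise PermissionError(f"{reviewer} is not a trusted engineer")
        # Assumption of this sketch: nobody approves their own change.
        if reviewer == self.author:
            raise PermissionError("authors cannot approve their own change")
        self.approvals.add(reviewer)

    def can_deploy(self):
        """A change may be deployed only after a trusted approval."""
        return len(self.approvals) >= 1


change = ChangeRequest(author="carol", description="Fix deposit rounding")
print(change.can_deploy())  # False: nobody has reviewed it yet
change.approve("alice")
print(change.can_deploy())  # True: a trusted engineer approved it
```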
Great, you created your first secure environment.
Third step, the most important and the most difficult.
Any logic that does not affect sensitive business decisions must be moved out of the secure environment. Only the simple, thin parts of the system that are directly involved in making sensitive operational decisions should remain. Here is why:
- The bigger the program, the more errors it contains. For example, if the program has 0 lines of code, then there are definitely no errors in it. As a rough rule of thumb, an engineer makes on average one error per 1000 lines of code.
- The bigger the program, the harder it is for your trusted engineers to find errors during review, before an update is delivered to the secure environment.
- The bigger the program, the more often it is changed. Each update risks destabilizing the system that makes sensitive operational decisions.
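To see what the first point means in numbers, here is a back-of-the-envelope sketch that uses the average of one error per 1000 lines quoted above. The system sizes are hypothetical and chosen only for illustration.

```python
LINES_PER_ERROR = 1000  # the average quoted above: one error per 1000 lines


def expected_errors(lines_of_code):
    """Rough estimate of how many errors a program of this size contains."""
    return lines_of_code / LINES_PER_ERROR


# Hypothetical sizes, for illustration only.
whole_system_lines = 200_000
secure_core_lines = 3_000

print(expected_errors(whole_system_lines))  # 200.0
print(expected_errors(secure_core_lines))   # 3.0
```

Under these assumptions, a thin secure core leaves your trusted engineers with a handful of potential errors to hunt for instead of hundreds.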
Creating a secure environment will almost certainly meet resistance from your engineers. Let us examine the typical objections and how to respond to them.
The unnatural division of components into parts that need to work in and outside a secure environment will complicate development and make the system sophisticated, leading to many errors.
True, in the end we get a more complex system with more errors. But what matters is not the number of errors but their total cost. And the total cost of errors in a system that uses a secure environment is much lower than in a system without one, because far fewer errors reach sensitive operational decisions.
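A small calculation shows how more errors can still cost less. Every figure in this sketch is hypothetical: the cost per error, the error counts, and the assumption that the secure environment lets one sensitive error through instead of five.

```python
# All figures are hypothetical, chosen only to illustrate the argument.
COST_ORDINARY_ERROR = 1_000        # e.g. a broken report
COST_SENSITIVE_ERROR = 1_000_000   # e.g. duplicated deposits


def total_cost(ordinary_errors, sensitive_errors):
    """Total cost of errors, weighting each kind by its price."""
    return (ordinary_errors * COST_ORDINARY_ERROR
            + sensitive_errors * COST_SENSITIVE_ERROR)


# Without a secure environment: 200 errors, 5 of them in sensitive operations.
without_secure_env = total_cost(ordinary_errors=195, sensitive_errors=5)

# With a secure environment: 261 errors, only 1 in a sensitive operation.
with_secure_env = total_cost(ordinary_errors=260, sensitive_errors=1)

print(without_secure_env)  # 5195000
print(with_secure_env)     # 1260000
```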
Creating a secure environment means that programmers are considered to be bad specialists and people who are likely to cheat.
That's not true. The point is that even the most competent and talented engineers are human beings, and human beings make mistakes. Incidents in which a system administrator mistakenly erases data from a critical database, or a group of programmers fails to notice a critical error, are too frequent and cost companies too much to be dismissed as negligible fluctuations. Sometimes a single such error is enough to bankrupt a company.
As far as cheating is concerned, take the following example. Imagine that a team of builders is renovating your apartment. They look honest and professional. But what could be more unnatural and foolish than leaving a cash box open while the builders are in your apartment? And not because you think the builders are thieves. It's just a matter of "hygiene".
Don't panic! We develop high-quality software, and the risk is greatly exaggerated.
First of all, only the business leader can estimate the risk and decide what level of it is acceptable, not the engineers, who will still receive their salaries even after an error that knocks out the business. Second, you must follow the rule "The weapon is always loaded." Even if you're 200% sure there are no bullets in the gun and no danger, you should still treat the weapon as if it could fire. And maybe this will save your life one day. Remember Murphy's law: if something can go wrong, it will.
So, let's sum up. If you want to reduce the cost caused by software errors, you must use a secure environment and put all components which make sensitive operational decisions in it. Any logic that does not affect these decisions should be removed from the secure environment.
Next time, I will share more recipes for reducing the risk of errors and explain how Verdat's product makes using such recipes a piece of cake.