System

static/images/da_Vinci_exploded_view.jpeg

Leonardo da Vinci. 1478 - 1519. Codex Atlanticus.

/static/images/system.jpg

A system (from Greek synistanai "to place together, organize, form in order", from syn- "together" + histanai "cause to stand") is a set of interacting or interdependent components that form an integrated whole. For example, an automobile, an airplane, a bicycle, an organism, a building, a communication system, a computer program, a garment, a human organization, a `mechanical system`_, the nervous system, the `Solar System`_, a spacecraft_, or a tire.

1   Function

The function of a system is to solve a problem for its commissioner.

Function = Design goals. The design goals of a system influence its structure.

2   Behavior

Behavior is the way in which a system acts in response to a particular situation or stimulus.

3   Substance

static/images/complex_koala.jpg

A system consists of components, connectors, and data.

3.1   Component

Merge component into this.

A component is an abstract mechanism that provides a transformation of data via its interface; the implementation of a component is unimportant.

Components may generate data, as in the case of a software encapsulation of a clock or sensor.

3.2   Connector

A connector is an abstract mechanism that mediates communication among components by transferring data elements from one interface to another without changing the data. Informally, a connector "wires" two components together.

For example, shared representations, remote procedure calls, message-passing protocols, and data streams.

Making connectors explicit means that much more complex structures can be wired together using an architecture description language (ADL).
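
A rough sketch of these three elements in code (the names sensor, doubler, and connect are made up for illustration, not taken from any ADL): the connector transfers data between component interfaces without changing it:

    from typing import Callable, Iterator

    def sensor() -> Iterator[int]:
        """A component that generates data (e.g., a software-encapsulated sensor)."""
        yield from (1, 2, 3)

    def doubler(values: Iterator[int]) -> Iterator[int]:
        """A component that transforms data via its interface."""
        return (v * 2 for v in values)

    def connect(source: Callable[[], Iterator[int]],
                sink: Callable[[Iterator[int]], Iterator[int]]) -> Iterator[int]:
        """A connector: mediates communication, transferring data unchanged."""
        return sink(source())

    print(list(connect(sensor, doubler)))  # [2, 4, 6]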

3.3   Datum

A datum is an element of information that is transferred between components via connectors.

Examples include byte-sequences, messages, marshalled parameters, and serialized objects, but do not include information that is permanently resident or hidden within a component.

3.4   Documentation

Blend Labs documents the architecture of their project by putting READMEs in every directory of their application. This works well because READMEs display nicely in GitHub.

3.5   Vestige

A vestige is a trace of something that is disappearing or no longer exists.

The QWERTY arrangement was originally intended to slow down typists on mechanical typewriters to keep the keys from jamming. This gave rise to the expression "qwerty effect" to describe an arrangement that one is stuck with even though the original reason for it has evaporated.

4   Properties

The distinction between external and internal factors was introduced in a 1977 General Electric study commissioned by the US Air Force (McCall 1977). [3]

4.1   External factors

External factors are those whose presence or absence in a software product may be detected by its users. [3]

4.1.1   Compatibility

Compatibility is the ability of a system to easily combine with others.

4.1.2   Ease of use (Usability)

Ease of use is the ease with which people of various backgrounds can learn to use software products and apply them to solve problems. [3] It also includes the ease of installation, operation, and monitoring.

Benefits:

  • Minimized need for training
  • Improved efficiency
  • Decreased error rate

4.1.3   Functionality

Functionality is the extent of possibilities provided by the system. [3] It is important to note that functionality must be managed carefully to avoid introducing a level of complexity that makes the system hard to understand.

The pressure for more facilities, known in industry parlance as "featurism," is constantly there. It is actually the combination of two problems: loss of consistency that may result from new features, affecting ease of use, and being so focused on features as to forget the other qualities.

4.1.4   Integrity

Integrity is the ability of software systems to protect their various components against unauthorized access and modification. [3]

4.1.5   User-perceived performance

User-perceived performance measures the performance of an action in terms of its impact on the user in front of an application. The primary measures for user-perceived performance are latency and completion time.

4.1.5.1   Latency

Latency is the time period between initial stimulus and the first indication of a response.

Latency is the minimum time required to get any form of response, even if the work to be done is nonexistent. It's usually the big issue in remote systems. If I ask a program to do nothing, but to tell me when it's done doing nothing, I should get an almost instantaneous response on my laptop. However, if I run the program on a remote computer, I may wait a few seconds just because of the time taken for the request and response to make their way across the wire. As an application developer, there is usually little I can do to improve latency. Latency is the reason why you should minimize remote calls. [5]

Responsiveness is about how quickly the system acknowledges a request as opposed to processing it. This is important in many systems because users may become frustrated if a system has low responsiveness, even if its response time is good. If your system waits during the whole request, then your responsiveness and response time are the same. However, if you indicate that you've received the request before you complete, then your responsiveness is better. Providing a progress bar during a file copy improves the responsiveness of your user interface, even though it doesn't improve response time. [5]
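
A minimal sketch of the distinction, with a toy task standing in for a real file copy: the handler acknowledges the request immediately (responsiveness), while the work and its progress reporting continue in the background (response time):

    import threading
    import time

    def handle_request(task):
        """Acknowledge at once, then let processing continue in the background."""
        print("request received")             # acknowledgment: responsiveness
        worker = threading.Thread(target=task)
        worker.start()                        # processing: response time
        return worker

    def slow_copy():
        for pct in (25, 50, 75, 100):         # stand-in for a file copy
            time.sleep(0.1)
            print(f"progress: {pct}%")        # a progress bar improves responsiveness

    handle_request(slow_copy).join()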

4.1.5.2   Response time

Response time is the amount of time it takes for a system to process a request from the outside. This may be a UI action, such as pressing a button, or a server API call. [5]

Completion time is the amount of time taken to complete an application action. Completion time is dependent upon all of the aforementioned measures. [3]

For example, a Web browser that can render a large image while it is being received provides significantly better user-perceived performance than one that waits until the entire image is completely received prior to rendering, even though both experience the same network performance. [3]

It is important to note that design considerations for optimizing latency will often have the side-effect of degrading completion time, and vice versa. [3]
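
One hedged way to observe both measures: treat time-to-first-chunk as an approximation of latency and the full read as completion time. The URL below is a placeholder, and a slower network or larger resource would show a more meaningful gap:

    import time
    import urllib.request

    start = time.perf_counter()
    with urllib.request.urlopen("https://example.com/") as resp:
        resp.read(1024)                            # first indication of a response
        latency = time.perf_counter() - start
        while resp.read(64 * 1024):                # keep reading until exhausted
            pass
        completion = time.perf_counter() - start   # full action completed

    print(f"latency ~ {latency:.3f}s, completion time ~ {completion:.3f}s")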


From "The Economic Value of Rapid Response Time": [9]

When a computer and its users interact at a pace that ensures that neither has to wait on the other, productivity soars, the cost of the work done on the computer tumbles, employees get more satisfaction from their work, and its quality tends to improve. Few online computer systems are this well balanced; few executives are aware that such a balance is economically and technically feasible.

In fact, at one time it was thought that a relatively slow response, up to two seconds, was acceptable because the person was thinking about the next task. Research on rapid response time now indicates that this earlier theory is not borne out by the facts: productivity increases in more than direct proportion to a decrease in response time.

A transaction consists of a user command from a terminal and the system's reply. It is the fundamental unit of work for online system users. It can be divided into two time sequences (Figure 1):

User Response Time. This is the time span between the moment a user receives a complete reply to one command and enters the next command. People often refer to this as think time.

System Response Time. This is the time span between the moment the user enters a command and the moment a complete response is displayed on the terminal. System response time can be further divided into:

  • Computer response time, the time the computer actually spends processing and servicing the user's command
  • Communication time, the transit time for a command to go to the computer and the time for the reply to come back

When online systems first began to spread throughout the business world, psychologists such as Robert B. Miller, then of IBM's Poughkeepsie laboratory, argued that two seconds was the longest a person should wait for a response from the computer. This interval became a challenge that designers and managers of online systems strove to meet. With those early online systems, this was not easy, but people comforted themselves with the thought that the user was thinking out the next step in the transaction stream while waiting for the computer to reply. Implicit was the belief that users were thinking as rapidly as they could, uninfluenced by how long the system took to respond.

Today's online systems, easily performing many millions of instructions per second with memories far larger than the largest available with the most powerful of IBM's System/360 machines, can now respond to hundreds of users in less than two seconds each. Walter J. Doherty, of IBM's Thomas J. Watson Research Center, was one of the first to see the significance of this rapid improvement in system capability. He and Richard P. Kelisky, Director of Computing Systems for IBM's Research Division, wrote about their observations in 1979: "...each second of system response degradation leads to a similar degradation added to the user's time for the following [command]. This phenomenon seems to be related to an individual's attention span. The traditional model of a person thinking after each system response appears to be inaccurate. Instead, people seem to have a sequence of actions in mind, contained in a short-term mental memory buffer. Increases in SRT [system response time] seem to disrupt the thought processes, and this may result in having to rethink the sequence of actions to be continued."

In a pioneering article, inspired by Doherty's work, Arvind J. Thadhani, of IBM's San Jose Laboratory, suggests that the number of transactions a programmer completes in an hour increases noticeably as system response time falls, and rises dramatically once system response time falls below one second. To illustrate (Figure 2), with system response of three seconds, Thadhani found that a programmer executes about 180 transactions per hour. But, bring system response time down to 0.3 seconds and the number of transactions the programmer can execute in an hour jumps to 371, an increase of 106 percent. Put another way, a reduction of 2.7 seconds in system response saves 10.3 seconds of the user's time (Figure 3). This seemingly insignificant time saving is the springboard for sizable increases in productivity.

Saving a few seconds of a person's time here and there may seem to be of little matter, but these seconds accumulate rapidly and build quickly to represent large dollar amounts, large enough to more than justify the cost of installing a larger processor if one is needed to provide more rapid system response.

4.1.6   Portability

Portability is the ease of transferring software products to various hardware and software environments. [3]

4.1.7   Reliability

4.1.7.1   Correctness

Correctness is the ability of a system to function according to its specification. [3]

Correctness is the prime quality: if a system doesn't do what it's supposed to do, everything else is irrelevant.

Must be able to specify requirements concisely.

Methods for ensuring correctness will usually be conditional. A serious software system touches on so many areas that it would be impossible to guarantee its correctness by dealing with all components and properties on a single level. Instead, a layered approach is necessary, each layer relying on lower ones. In the conditional approach, we only worry about guaranteeing correctness on the assumption that the lower levels are correct. This is the only realistic technique, as it achieves a separation of concerns and lets us concentrate at each stage on a limited set of problems.

4.1.7.2   Robustness

Robustness is the ability of a system to react appropriately to abnormal conditions. [3] (The opposite might be called fragility.)

Taleb illustrates this with an analogy about a child who is raised by its parents in a completely sterile environment and has a perfect life without any hard times. That child will likely grow up with many allergies and will not be able to navigate the real world.

There will always be cases the specification does not explicitly address. The role of the robustness requirement is to make sure that if such cases do arise, the system does not cause catastrophic events; it should produce appropriate error messages, terminate its execution cleanly, or enter a so-called "graceful degradation" mode.

Complements correctness. Correctness addresses the behavior of a system in cases covered by its specification; robustness characterizes what happens outside of the specification.

(This notion is most useful in the case of exception handling. The point is to address how bad exceptions are: should the program terminate if user privacy is breached, or should there be some mode of graceful degradation?) Of course, widening the specification reduces what counts as abnormal.

I wonder if we could not only throw exceptions, but also tag them with levels of severity; e.g., if X fails, should it kill the program, or is it okay if it fails?
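
A sketch of that idea, with Severity and TaggedError as hypothetical names rather than anything from a real library: a top-level handler uses the tag to choose between terminating cleanly and degrading gracefully:

    from enum import Enum

    class Severity(Enum):
        RECOVERABLE = 1   # fine to continue in a degraded mode
        FATAL = 2         # e.g., user privacy breached: terminate cleanly

    class TaggedError(Exception):
        def __init__(self, message: str, severity: Severity):
            super().__init__(message)
            self.severity = severity

    def run(step):
        try:
            step()
        except TaggedError as err:
            if err.severity is Severity.FATAL:
                raise SystemExit(f"fatal: {err}")   # kill the program
            print(f"degraded mode: {err}")          # graceful degradation

    def flaky_step():
        raise TaggedError("cache unavailable", Severity.RECOVERABLE)

    run(flaky_step)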

For example: a maps application is robust when it can parse addresses in various formats with various misspellings and return a useful location. [1]

4.1.7.3   Fault-tolerance

A system is fault-tolerant if it can work consistently in an inconsistent environment.

For example, a database application is fault-tolerant when it can access an alternate shard when the primary is unavailable. [1]

Netflix uses Chaos Monkey to test their systems for reliability.
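
A minimal failover sketch of the shard example above; the shard callables and ShardUnavailable are illustrative stand-ins for a real database client:

    class ShardUnavailable(Exception):
        pass

    def query(shards, sql):
        """Try the primary shard first, then fall back to the alternates."""
        errors = []
        for shard in shards:
            try:
                return shard(sql)
            except ShardUnavailable as err:
                errors.append(err)      # remember why each shard failed
        raise ShardUnavailable(f"all shards failed: {errors}")

    def primary(sql):
        raise ShardUnavailable("primary down")

    def replica(sql):
        return [("row", 1)]

    print(query([primary, replica], "SELECT 1"))  # served by the replica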

4.2   Internal factors

Internal factors are factors that only computer professionals who have access to the actual software text can perceive. [3]

Internal factors matter only in the end as a means to achieving external factors. [3]

4.2.1   Capacity

Capacity is an indication of maximum effective throughput or load. This might be an absolute maximum or a point at which the performance dips below an acceptable threshold. [5]

4.2.2   Complexity

See complexity.

Simplicity implies ease of use, but ease of use does not imply simplicity.

There are three means of reducing complexity in any type of system: partitioning the system into parts having identifiable and understandable boundaries, representing the system as a hierarchy, and maximizing the independence among the parts of the system. [7]

Partitioning a system into individual parts can reduce its complexity.

4.2.3   Efficiency

Efficiency is performance divided by resources. [5]

Efficiency is the ability of a software system to minimize demands on hardware resources, [3] for example processor time and space occupied in internal and external memory.

  • The concern for efficiency must be balanced with other goals such as extendibility and reusability; extreme optimizations may make the software so specialized as to be unfit for change and reuse.

    "The issue reflects a major characteristic of software engineering: software construction is difficult precisely because it requires taking into account many different requirements, some of which, such as correctness, are abstract and conceptual, whereas others, such as efficiency, are concrete and bound to the properties of computer hardware."

  • The constant improvement in computer power is not an excuse for overlooking efficiency.

4.2.4   Modifiability

Modifiability refers to the ease with which a change can be made to an application architecture.

Perfect modifiability would imply that the cost of a change to a system is proportional to the size (in some sense) of the change.

4.2.4.1   Customizability

Customizability refers to the ability to temporarily specialize the behavior of an architectural element, such that it can then perform an unusual service.

4.2.5   Load

Load is a statement of how much stress a system is under, which might be measured in how many users are currently connected to it. The load is usually a context for some other measurement, such as a response time. For example, you might say that the response time for some request is 0.5 second with 10 users and 2 seconds with 20 users. [5]

4.2.6   Load sensitivity

Load sensitivity is an expression of how the response time varies with the load. We might use the term degradation to say that system B degrades more than system A. [5]
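
A toy measurement of load sensitivity, with handle() standing in for a real request and a fixed pool of four workers standing in for system capacity; the time to serve a batch of concurrent users degrades as the load exceeds that capacity:

    import time
    from concurrent.futures import ThreadPoolExecutor

    def handle(_):
        time.sleep(0.01)                # pretend work for one request

    def response_time(users: int) -> float:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=4) as pool:   # fixed capacity
            list(pool.map(handle, range(users)))
        return time.perf_counter() - start

    for users in (10, 20, 40):
        print(f"{users} users: {response_time(users):.3f}s")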

4.2.7   Modularity

Modularity is the degree to which a system's components may be separated and recombined.

Modularity is the sum of extendibility and reusability. [3]

4.2.7.1   Extensibility

Extensibility is the ease of adapting a system to a change of specification. [3]

Extensibility is also defined as the ability to add functionality to a system.

The problem of extensibility is one of scale: change becomes increasingly difficult with size. This matters largely because requirements change as the domain the software addresses is better understood, usually undermining earlier assumptions. This is a relatively new value: traditional software development worked like other engineering disciplines (waterfall).

  • The problem of extensibility is one of scale. For small programs change is usually not a difficult issue; but as software grows bigger, it becomes harder and harder to adapt.
  • Two principles are essential for improving extensibility (see the sketch below):
    • Design simplicity - A simple architecture will always be easier to adapt to changes than a complex one.
    • Decentralization - The more autonomous the module, the higher the likelihood that a simple change will affect just one module, or a small number of modules, rather than triggering a chain reaction of changes over the whole system.
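
A sketch of the decentralization principle as a hypothetical handler registry (the names are made up): extending the system means registering one new function, not editing existing modules:

    from typing import Callable, Dict

    HANDLERS: Dict[str, Callable[[str], str]] = {}

    def handler(name: str):
        """Register a handler under a name; adding behavior touches no other code."""
        def register(fn: Callable[[str], str]):
            HANDLERS[name] = fn
            return fn
        return register

    @handler("upper")
    def to_upper(text: str) -> str:
        return text.upper()

    @handler("reverse")      # a later extension: no existing module changed
    def to_reverse(text: str) -> str:
        return text[::-1]

    print(HANDLERS["upper"]("system"), HANDLERS["reverse"]("system"))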

4.2.7.2   Evolvability

Evolvability represents the degree to which a component implementation can be changed without negatively impacting other components.

4.2.7.3   Reusability

Reusability is the ability of software elements to serve for the construction of many different applications. [3]

Reusability refers to the ability of a component, connector, or data element to be reused without modification in other applications.

The primary mechanisms for inducing reusability within architectural styles are reducing coupling (knowledge of identity) between components and constraining the generality of component interfaces.

  • Software systems often follow similar patterns; it should be possible to exploit this commonality and avoid reinventing solutions to problems that have been encountered before.
  • Solving the reusability problem essentially means that less software must be written, and hence that more effort may be devoted to improving other factors.

4.2.8   Performance

Performance is either throughput or response time -- whichever matters more to you. It can be difficult to discuss performance when a technique improves one of these at the cost of the other, so it's best to use the more precise term. [5]

4.2.9   Scalability

Scalability is a measure of how adding resources (usually hardware) affects performance. Vertical scalability, or scaling up, means adding more power to a single server, such as more memory. Horizontal scalability, or scaling out, means adding more servers. [5]

Scalability refers to the ability of the architecture to support large numbers of components or interactions among components.

Scalability can be improved by simplifying components, distributing services across many components, and controlling interactions and configurations as a result of monitoring.

4.2.10   Testability

Testability is the relative ease and expense of revealing software faults.

For a system to be testable, it must be possible to control the variables of an environment during an experiment and to observe the result after modification. This does not necessarily imply measurement; for example, an object may change colors when a treatment is applied.
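
A sketch of that controllability, assuming a made-up Greeter class: by injecting the clock (a variable of the environment), a test can fix it during the experiment and observe the result deterministically:

    class Greeter:
        def __init__(self, clock):
            self.clock = clock            # injected, so tests can control it

        def greeting(self) -> str:
            return "good morning" if self.clock() < 12 else "good afternoon"

    def test_greeting():
        assert Greeter(lambda: 9).greeting() == "good morning"
        assert Greeter(lambda: 15).greeting() == "good afternoon"

    test_greeting()
    print("ok")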

4.2.11   Throughput

Throughput is how much work you can do in a given amount of time. If you're timing the copying of a file, throughput might be measured in bytes per second. [5]

Throughput can be improved by using the appropriate algorithms and data structures.
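
A sketch of measuring throughput for the file-copy example, in bytes per second; the 10 MiB test file is an arbitrary choice:

    import os
    import shutil
    import tempfile
    import time

    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(10 * 1024 * 1024))   # 10 MiB of test data

        start = time.perf_counter()
        shutil.copyfile(src, dst)
        elapsed = time.perf_counter() - start

        throughput = os.path.getsize(src) / elapsed  # bytes per second
        print(f"throughput: {throughput / 1024 / 1024:.1f} MiB/s")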

There are a few pieces of conventional wisdom that we often hear about code optimization:

  • Only optimize the bottlenecks, and only after they are identified
  • Constant factors don't matter
  • Engineering time costs more than CPU time
  • The machine is so fast it won't matter

Performance tuning is neither necessary nor desirable once the performance is within the required limits. Modern computers are so fast that the threshold of adequacy is easily reached. Typical applications can and do waste much of their capacity and get away with it.

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

—Donald Knuth, Structured Programming with go to Statements (1974) http://web.archive.org/web/20130731202547/http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf

This quote is arguably taken out of context. Later in the same article, Knuth writes:

The improvement in speed from Example 2 to Example 2a is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today's software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise-and-pound-foolish programmers, who can't debug or maintain their "optimized" programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn't bother making such optimizations on a one-shot job, but when it's a question of preparing quality programs, I don't want to restrict myself to tools that deny me such efficiencies.

Performance faults often show up as programs that fail to stop immediately when exited, and as progress bars and splash screens.

4.2.12   Visibility

Visibility refers to the ability of a component to monitor or mediate the interaction between two other components.

Visibility can enable improved performance via shared caching of interactions, scalability through layered services, reliability through reflective monitoring, and security by allowing the interactions to be inspected by mediators.
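
A sketch of visibility as a mediating wrapper that can inspect and cache the interactions passing between components; the names here are illustrative:

    from typing import Callable, Dict

    def caching_mediator(component: Callable[[str], str]) -> Callable[[str], str]:
        cache: Dict[str, str] = {}

        def mediated(request: str) -> str:
            print(f"inspecting: {request}")          # interactions are inspectable
            if request not in cache:
                cache[request] = component(request)  # shared caching of interactions
            return cache[request]

        return mediated

    backend = caching_mediator(lambda req: req.upper())
    print(backend("hello"))   # computed by the component
    print(backend("hello"))   # served from the mediator's cache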

5   Classification

5.1   Distributed system

See distributed system.

6   Grading

Of course, no engineering solution is truly objective. It's likely that even two engineers of equivalent skill might come up with different but equally valid solutions to a problem.

We can recognize quality software via external factors, perceptible to users and clients, and internal factors, perceptible to designers and implementors.

6.1   External quality

External quality is how well a system meets the needs of its users. [4]

Correctness is usually the most important quality. If a system does not do what it is supposed to do, everything else matters little. (On the other hand, knowing a system is correct is often not particularly important.)

6.2   Internal quality

Internal quality is how well a system meets the needs of its developers and administrators. [4] Internal quality is what lets us cope with continual and unanticipated change. The end of maintaining internal quality is to allow us to modify the system's behavior safely and predictably, because it minimizes the risk that a change will force major rework. [4]

7   Production

A person who produces systems is called an engineer.

The process of producing systems is called an engineering project. It consists of several steps:

  1. Requirements analysis, which produces a functional specification
  2. Architecture, which given a functional specification, produces a technical specification
  3. Construction
  4. Maintenance

These steps may be iterative or ....

Producing a system should be like wiring together electronic components. (Move to construction?)

8   Maintenance

This section needs work

There are at least four reasons to change software:

  1. Enhancement: Adding a feature
  2. Patch: Fixing a bug
  3. Improving the design
  4. Optimizing resource usage

There are three kinds of changes:

  1. Behavioral change
  2. Non-behavioral change
    1. Structural change ("refactoring") (changes internal structure to increase clarity or ease of modification)
    2. Performance change

Typically, we want to preserve most of the existing behavior in all cases.

To mitigate change risk, we ask three questions:

  1. What changes do we have to make?
  2. How will we know that we've done them correctly?
  3. How will we know that we haven't broken anything?

Many teams manage risk by minimizing the number of changes they make to the system. Maxim: "If it ain't broke, don't fix it." This approach tends to cause technical debt.


Changes in a system can be made in two ways:

  1. Edit and pray (carefully plan change, make sure you understand code, make the changes, then poke around to make sure it works and nothing broke)
  2. Cover and modify

Preparatory refactoring is where I'm adding a new feature, and I see that the existing code is not structured in a way that makes adding the feature easy. So first I refactor the code into the structure that makes it easy to add the feature, or as Kent Beck pithily put it, "make the change easy, then make the easy change". [6]

It’s like I want to go 100 miles east but instead of just traipsing through the woods, I’m going to drive 20 miles north to the highway and then I’m going to go 100 miles east at three times the speed I could have if I just went straight there. When people are pushing you to just go straight there, sometimes you need to say, “Wait, I need to check the map and find the quickest route.” The preparatory refactoring does that for me.

—Jessica Kerr

Another good metaphor I've come across is putting tape over electrical sockets, door frames, skirting boards and the like when painting a wall. The taping isn't doing the painting, but by spending the time to cover things up first, the painting can be much quicker and easier. [6]

9   Configuration

A configuration is the structure of architectural relationships among components, connectors, and data during a period of system run-time.

10   Representation

Systems are represented by multiple artifacts.

10.1   Functional specification

A functional specification describes the function of a system.

10.2   Technical specification

A technical specification describes the internal implementation of the program. It talks about data structures, relational database models, choice of programming languages and tools, algorithms, etc. For example, a `schematic diagram`_.

11   Cost

12   Maintenance

Maintenance is ...

Maintenance actions are subdivided into groups or categories in several different ways; for example:

The operational-technical and overhaul-repair categories can be grouped together according to the technical knowledge and skill needed to do the work.

> Operational maintenance is a type of preventative maintenance used to extend the life of equipment and maximize performance. It includes many types of minor adjustments, cleaning, and inspections, depending on the machine. While major repairs are typically handled by trained technicians, operational maintenance is performed during the normal course of operations by the equipment operator himself. By training operators to handle these routine tasks, companies can help reduce downtime and cut costs associated with repairs and replacement parts.

> Depending on the type of equipment in use, operators may also be responsible for replacing worn out filters or cartridges, or removing and replacing a worn belt, cutting tool, or grinding stone. Operational maintenance may entail keeping machinery well lubricated to reduce the risk of friction or failure. Many basic machine adjustments needed during the course of operation also fall within this category of preventative maintenance.

This is what IT does!

12.1   Reliability-centered maintenance (RCM)

Reliability centered maintenance is an engineering framework that enables the definition of a complete maintenance regime. It regards maintenance as the means to maintain the functions a user may require of machinery in a defined operating context. As a discipline it enables machinery stakeholders to monitor, assess, predict and generally understand the working of their physical assets. This is embodied in the initial part of the RCM process which is to identify the operating context of the machinery, and write a Failure Mode Effects and Criticality Analysis (FMECA). The second part of the analysis is to apply the "RCM logic", which helps determine the appropriate maintenance tasks for the identified failure modes in the FMECA. Once the logic is complete for all elements in the FMECA, the resulting list of maintenance is "packaged", so that the periodicities of the tasks are rationalised to be called up in work packages; it is important not to destroy the applicability of maintenance in this phase. Lastly, RCM is kept live throughout the "in-service" life of machinery, where the effectiveness of the maintenance is kept under constant review and adjusted in light of the experience gained.

12.2   Maintenance, repair, and operations

Operations people value maximum production. Maintenance people value preservation of the equipment.

From an Operations point of view, running the equipment and producing product 100% of the time is the ultimate goal. However, from a Maintenance point of view, taking the equipment down for repair or renewal is equally important.

13   Second-system effect

The second-system effect is the tendency of small, elegant, and successful systems to have elephantine feature-laden monstrosities as their successors due to inflated expectations.

The phrase was first used by `Fred Brooks`_ in his classic The Mythical Man-Month. It described the jump from a set of simple operating systems on the IBM 700/7000 series to OS/360 on the 360 series.

Less well known, because less common, is the ‘third-system effect’; sometimes, after the second system has collapsed of its own weight, there is a chance to go back to simplicity and get it really right.

14   Further reading

Distributed systems:

15   References

[1] http://programmers.stackexchange.com/questions/219976/whats-the-difference-between-robustness-and-fault-tolerance
[2] http://www.ics.uci.edu/~fielding/pubs/dissertation/net_app_arch.htm
[3] Meyer 1988
[4] Freeman & Pryce 2010
[5] Fowler 2002
[6] Martin Fowler. Jan 05 2015. An example of preparatory refactoring. http://martinfowler.com/articles/preparatory-refactoring-example.html
[7] Glenford J. Myers. 1978. Composite/Structured Design.
[8] Andrew Tanenbaum. 1981. Computer Networks.
[9] Walter J. Doherty and Arvind J. Thadhani. November 1982. The Economic Value of Rapid Response Time. IBM. http://jlelliotton.blogspot.com/p/the-economic-value-of-rapid-response.html

A system is deployable when the acceptance tests all pass. [4]


Have I missed a step in the engineering process? Namely, deploying.


Engineering is difficult to break down into steps because there are many ideas of how to break it down. Six Sigma is popular for most engineering, but there are also iterative approaches. The only essential stage seems to be construction.

Under the auspices of the Society, in 1959 the industrialist Henry Kremer offered the first Kremer Prizes of £5,000 for the first human-powered aircraft to fly a figure-of-eight course round two markers half-a-mile apart.

In 1973, Kremer increased his prize money tenfold to £50,000. At that time, the human-powered aircraft had flown only in straight (or nearly straight) line courses, and no-one had yet even attempted his more challenging figure-eight course, which required a fully controllable aircraft. He also opened the competition to all nationalities; previously it was restricted to British entries only.

On 23 August 1977, the Gossamer Condor 2 flew the first figure-eight, a distance of 2,172 metres, winning the first Kremer prize. It was built by Dr Paul B. MacCready and piloted by amateur cyclist and hang-glider pilot Bryan Allen. Although slow, cruising at only 11 mph (18 km/h), it achieved that speed with only 0.35 hp (0.26 kW).

The second Kremer prize of £100,000 was won on June 12, 1979, again by Paul MacCready, when Bryan Allen flew MacCready's Gossamer Albatross from England to France: a straight distance of 35.82 km (22 miles 453 yards) in 2 hours, 49 minutes.

The Gossamer Condor is built around a large wing with a gondola for the pilot underneath and a canard control surface on a fuselage extension in front, and is mostly built of lightweight plastics with an aluminum spar.

The difference, according to Eric Allen, was that the engineers asked not how to build an aircraft, but how to solve the underlying problem itself.

The Flight of the Gossamer Condor, a 1979 short documentary film.


The mill and the lathe are the minimum tools needed for mechanical engineering. With those two, an engineer could build an entire workshop.


Software is unique because, unlike every other engineering discipline, it does not depend on any other discipline. For example, electrical engineers can build their own oscilloscopes, which tend to be fantastic because they are built by and for electrical engineers; but they cannot make their own software, which is why their software tools, such as Verilog_, still suck. Software engineers get the benefit of tools built by and for software engineers being excellent, and they don't require tools from anyone else.

Software engineers have also automated so much of the process of engineering that the creative part, typically a small part in most engineering disciplines, is actually the majority of it.


When building enterprise systems, it often makes sense to build for hardware scalability rather than capacity or even efficiency. Scalability gives you the option of better performance if you need it and is often easier to do. New hardware is often cheaper than making software run on less powerful systems, and more servers is cheaper than more programmers. [5]


One trap that's easy to fall into is looking at the full feature set of a product during evaluation. Who cares about everything a product can do? Our project has a specific list of requirements, and those are the only requirements we should care about.


The Swiss cheese model of accident causation illustrates that, although many layers of defense lie between hazards and accidents, there are flaws in each layer that, if aligned, can allow the accident to occur.

Therefore, in theory, lapses and weaknesses in one defense do not allow a risk to materialize, since other defenses also exist, to prevent a single point of failure. The model was originally formally propounded by Dante Orlandella and James T. Reason of the University of Manchester, and has since gained widespread acceptance. It is sometimes called the cumulative act effect.