As computing devices become more pervasive, the software systems that control them have become increasingly complex and sophisticated. Computers are increasingly being introduced into safety-critical systems and, as a consequence, have been involved in accidents. The world has become reliant on software-enabled systems and components. In addition, software is now embedded in the cyberspace domain that enables defense, military, intelligence, and business operations. Consequently, despite the tremendous resources devoted to making software more robust and resilient, ensuring that programs are correct, especially at scale, remains a difficult endeavor.
The $500 million Ariane 5 rocket self-destructed in 1996 after a fault occurred 40 seconds after launch. The main cause was inappropriate software reuse: the code was taken from the Ariane 4 without proper analysis. Two Mars probes were lost in 1999. The Mars Climate Orbiter disintegrated on entry into the Martian atmosphere, and the Mars Polar Lander's landing sequence was triggered prematurely during its descent. The Mars Climate Orbiter loss was attributed to a mismatch between Anglo-American (imperial) and metric units, multiple process errors, and a lack of formal interfaces. The Mars Polar Lander loss was attributed to a lack of integration testing: a shock was interpreted as a landing, the engines were stopped, and the lander fell.
Between June 1985 and January 1987, six known accidents involved massive overdoses by the Therac-25, with resultant deaths and serious injuries. The causes were attributed to errors in the control program, excessive trust in software when designing the system, a lack of hardware safety measures (interlocks), and a lack of appropriate software engineering practices.
In October 2018, Lion Air Flight 610 crashed just minutes after taking off from Jakarta, Indonesia. It was the first fatal accident involving a 737 Max, and 189 people died. On March 10, 2019, Ethiopian Airlines Flight 302, involving the same Max jet model, also crashed minutes after takeoff, killing all 157 people on board.
In both accidents, the automated Maneuvering Characteristics Augmentation System, or MCAS, pushed the planes’ noses down while the pilots struggled to regain control. The Max has larger engines, which alter the plane’s aerodynamics and make it more likely to stall in some flight conditions. Boeing developed an automatic system, known as MCAS, that pushes the plane’s nose down in some circumstances in order to stabilize the aircraft.
Initial data from the investigation of the crash of Lion Air Flight 610 indicates that the angle-of-attack (AOA) sensor was providing “erroneous input,” according to a Boeing statement. The faulty AOA sensor data may have caused the aircraft’s trim system to lower the nose in order to avoid a stall that never happened.
Boeing said in May 2019 that it had finished developing a software fix for its troubled 737 Max. The plane maker said in a statement that it had flown the aircraft with the updated software on 207 flights for more than 360 hours. The company has said its fix will feed MCAS with data from two sensors rather than just one, making the plane less susceptible to a crash caused by bad data. The fix will also make the system less potent, which is expected to prevent the steep dives seen in the two crashes, and Boeing will provide additional training materials.
A recent NTT Threat Report spoke about the ‘digital wild west’ and how digital transformation has meant that organizations, irrespective of their core business function, are ‘all software companies now’. COVID-19 has exacerbated this, being an unexpected driver of further digital adoption, as organizations were forced to react to a very different 2020.
Software Assurance
The purpose of software assurance is to assure that software products are of sufficiently high quality and operate safely, securely, and reliably. Software assurance assures that the software and its related products meet their specified requirements; conform to standards and regulations; are consistent, complete, correct, safe, secure, and as reliable as warranted for the system and operating environment; and satisfy customer needs.
DOD defines Software Assurance as the level of confidence that software functions as intended and is free of vulnerabilities, either intentionally or unintentionally designed or inserted as part of the software throughout the lifecycle. Failures in software assurance can be of particularly high consequence for defense systems due to their growing roles in protecting human lives, in war fighting, and in safeguarding national assets.
NASA’s Approach to Software Assurance
The software assurance process is the planned and systematic set of activities that ensure conformance of software life cycle processes and products to requirements, standards, and procedures.
Software assurance assists in risk mitigation by helping expose potential defects in products and processes, thus preventing problems from evolving. Through its metrics, tracking, and analysis activities, it also enables improvement of future products and services. Software assurance often serves as the corporate memory from project to project, sharing potential problem areas and lessons learned. It provides a consistent, uniform basis for defining the requirements for software assurance programs to be applied and maintained throughout the life of the software, that is, from project conception through acquisition, development, operations, and maintenance, and it evaluates whether the software is properly retired.
Software engineering is a core capability and key enabling technology for NASA’s missions and supporting infrastructure. It applies to the complete software development life cycle, including software planning, development, testing, maintenance, retirement, operations, management, acquisition, and assurance activities. NASA has defined the software development process through a set of standards. A series of software safety characteristics have been incorporated into the development and certification efforts to ensure readiness for use and compatibility with the space systems.
Software life cycle planning covers the software aspects of a project from inception through retirement. It is an organizing process that considers the software as a whole and provides the planning activities required to ensure a coordinated, well-engineered process for defining and implementing project activities. These processes, plans, and activities are coordinated within the project. At project conception, software needs for the project are analyzed, including acquisition, supply, development, operation, maintenance, retirement, decommissioning, and supporting activities and processes. The software effort is scoped, the development processes and measurements are defined, and activities are documented in software planning documents.
NPD 7120.4 is an overarching document that establishes top-level policies for all software created, acquired, and maintained by or for NASA, including Commercial-off-the-shelf (COTS) software, Government-off-the-shelf (GOTS) software, Modified-off-the-shelf (MOTS) software, open-source software, embedded software, reused software, legacy software, and heritage software. NPR 7150.2 defines the Agency’s software engineering requirements for software acquisition, development, maintenance, retirement, operations, and management. It provides a common framework of Software Engineering and Software Assurance requirements that allows engineers to communicate effectively and work seamlessly across the agency. This policy establishes the agency-level Software Engineering requirements, Software Assurance requirements, software safety-critical requirements, and the scoping requirements for IV&V support for all software acquired or developed at NASA.
NASA performs high-risk functions in the process of achieving its goals and objectives. The Program/Project Manager plans the best risk mitigation strategy for the entire project, of which software is a part. Software assurance is an umbrella risk mitigation strategy for safety and mission assurance of all of NASA’s software.
The project manager shall classify each system and subsystem containing software in accordance with the highest applicable software classification definitions for Classes A, B, C, D, E, and F software. These definitions are based on (1) usage of the software with or within a NASA system, (2) criticality of the system to NASA’s major programs and projects, (3) extent to which humans depend upon the system, (4) developmental and operational complexity, and (5) extent of the Agency’s investment. Defining software safety criticality involves the determination of whether the software is performing a safety-critical function, including verification of safety-critical software, hardware, or operations component, subsystem, or system.
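The "highest applicable classification" rule can be illustrated with a minimal Python sketch. It assumes that Class A is the most stringent class and Class F the least; the subsystem names are hypothetical, and the real class definitions depend on the five criteria listed above rather than on a simple lookup.

```python
from enum import IntEnum


class SoftwareClass(IntEnum):
    # Assumption: a lower number means a higher (more stringent) class.
    A = 1
    B = 2
    C = 3
    D = 4
    E = 5
    F = 6


def classify_system(applicable_classes):
    """Return the highest applicable class for a system or subsystem."""
    classes = list(applicable_classes)
    if not classes:
        raise ValueError("at least one applicable class is required")
    # The highest class has the smallest numeric value under the assumption above.
    return min(classes)


# Hypothetical system whose parts qualify for different classes.
subsystem_classes = {
    "telemetry_display": SoftwareClass.C,
    "engine_control": SoftwareClass.A,
}
print(classify_system(subsystem_classes.values()).name)
```

Because the system is classified by its most demanding part, adding a lower-class component never lowers the overall classification.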
The project manager shall plan and implement software assurance per NASA-STD-8739.8. [SWE-022]
Software assurance reviews and analyzes all processes used to acquire, develop, assure, operate and maintain the software independently; evaluating if those processes are appropriate, sufficient, planned, reviewed, and implemented according to an adequate plan, meeting any required standards, regulations, and quality requirements.
Software safety characteristics are built into the design and development process to enable human-rated systems to begin their missions safely and successfully. Exploration missions beyond Earth are inherently risky; however, with solid safety approaches in both hardware and software, the boldness of these missions can be realized for all on the home planet. The project manager, in conjunction with the SMA organization, shall determine if each software component is considered to be safety-critical per the criteria defined in NASA-STD-8739.8. [SWE-205] If a project has safety-critical software, the project manager shall implement the safety-critical software requirements contained in NASA-STD-8739.8. [SWE-023]
Software Development Processes and Practices
The CMMI model is an industry-accepted model of software development practices. It is used to assess how well NASA projects are supported by software development organizations that have the necessary skills, practices, and processes in place to produce reliable products within cost and schedule estimates.
The CMMI model provides NASA with a methodology to:
a. Measure software development organizations against an industry-wide set of best practices that address software development and maintenance activities applied to products and services.
b. Measure and compare the maturity of an organization’s product development and acquisition processes with the industry state of the practice.
c. Measure and ensure compliance with the intent of the directive’s process related requirements using an industry standard approach.
d. Assess the processes and practices of internal and external software development organizations.
e. Identify potential risk areas within a given organization’s software development processes and practices.
The CMMI-DEV is an internationally used framework for process improvement in development organizations. It is an organized collection of best practices and proven processes that thousands of software organizations have both used and been appraised against over the past two decades. CMMI ratings can cover a team, a group, a project, a division, or an entire organization.
Software Reuse
The project manager shall specify reusability requirements that apply to its software development activities to enable future reuse of the software, including the models, simulations, and associated data used as inputs for auto-generation of software, for United States Government purposes. The project manager shall evaluate software for potential reuse by other projects across NASA and contribute reuse candidates to the NASA Internal Sharing and Reuse Software systems.
Safety-Critical Software or Mission-Critical Software Requirements
If a project has safety-critical software or mission-critical software, the project manager shall implement the following items in the software: [SWE-134]
a. The software is initialized, at first start and restarts, to a known safe state.
b. The software safely transitions between all predefined known states.
c. Termination performed by software functions is performed to a known safe state.
d. Operator overrides of software functions require at least two independent actions by an operator.
e. Software rejects commands received out of sequence when execution of those commands out of sequence can cause a hazard.
f. The software detects inadvertent memory modification and recovers to a known safe state.
g. The software performs integrity checks on inputs and outputs to/from the software system.
h. The software performs prerequisite checks prior to the execution of safety-critical software commands.
i. No single software event or action is allowed to initiate an identified hazard.
j. The software responds to an off-nominal condition within the time needed to prevent a hazardous event.
k. The software provides error handling.
l. The software can place the system into a safe state.
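Several of the items above (a, b, d, e, h, and l) can be sketched as a small state machine. This is an illustrative Python sketch only: the states, the allowed transitions, and the `Controller` class are hypothetical and are not taken from any NASA standard.

```python
from enum import Enum, auto


class State(Enum):
    SAFE = auto()
    ARMED = auto()
    ACTIVE = auto()


# Hypothetical table of allowed transitions between predefined states (item b).
ALLOWED = {
    State.SAFE: {State.ARMED},
    State.ARMED: {State.ACTIVE, State.SAFE},
    State.ACTIVE: {State.SAFE},
}


class Controller:
    def __init__(self):
        # Item a: initialize to a known safe state at first start.
        self.state = State.SAFE
        self._override_requested = False

    def restart(self):
        # Item a: a restart also returns to the known safe state.
        self.safe()

    def safe(self):
        # Item l: the software can always place the system into a safe state.
        self.state = State.SAFE
        self._override_requested = False

    def command(self, target, prerequisites_ok=True):
        """Apply a state-change command; return True if it was accepted."""
        # Item e: reject commands that arrive out of sequence.
        if target not in ALLOWED[self.state]:
            return False
        # Item h: check prerequisites before executing the command.
        if not prerequisites_ok:
            return False
        self.state = target
        return True

    def request_override(self):
        # Item d: first of two independent operator actions.
        self._override_requested = True

    def confirm_override(self, target):
        # Item d: the second action only works after the first one.
        if not self._override_requested:
            return False
        self._override_requested = False
        self.state = target
        return True


controller = Controller()
print(controller.command(State.ACTIVE))  # out of sequence from SAFE, so rejected
print(controller.command(State.ARMED))   # allowed transition, so accepted
```

Keeping the allowed transitions in a single table makes the set of legal state changes easy to review and to test exhaustively.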
Software Cybersecurity
The project manager shall identify cybersecurity risks, along with their mitigations, in flight and ground software systems and plan the mitigations for these systems. The project manager shall implement protections for software systems with communications capabilities against unauthorized access. [SWE-157] The project manager shall ensure that space flight software systems are assessed for possible cybersecurity vulnerabilities and weaknesses. [SWE-158] The project manager shall address identified cybersecurity vulnerabilities and weaknesses. [SWE-155] The project manager shall test the software and record test results for the required software cybersecurity mitigation implementations identified from the security vulnerabilities and security weaknesses analysis. [SWE-159]
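One common way to protect a command link against unauthorized access is to authenticate each command with a keyed hash. The sketch below uses Python's standard `hmac` and `hashlib` modules; the shared key, the command bytes, and the function names are hypothetical, and the requirements above do not prescribe this particular mechanism.

```python
import hashlib
import hmac


def sign_command(key: bytes, command: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for a command using a shared secret key."""
    return hmac.new(key, command, hashlib.sha256).digest()


def accept_command(key: bytes, command: bytes, tag: bytes) -> bool:
    """Accept a command only if its tag matches the expected HMAC."""
    expected = sign_command(key, command)
    # compare_digest avoids leaking timing information about the tag.
    return hmac.compare_digest(expected, tag)


# Hypothetical key and command, for illustration only.
shared_key = b"example-key-not-for-real-use"
command = b"SET_MODE SAFE"
tag = sign_command(shared_key, command)

print(accept_command(shared_key, command, tag))
print(accept_command(shared_key, b"SET_MODE ACTIVE", tag))
```

This sketch only checks that a command came from a holder of the key and was not altered. It does not prevent replay of an old, valid command, which would need something like a sequence counter or timestamp inside the signed message.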
https://www.youtube.com/watch?v=UB9q6vXkMMw
Software Bi-Directional Traceability
The project manager shall perform, record, and maintain bi-directional traceability between the following:
a. Higher-level requirements and the software requirements.
b. Software requirements and the system hazards.
c. Software requirements and the software design components.
d. Software design components and the software code.
e. Software requirements and the software test procedures.
f. Software requirements and the software non-conformances.
The project manager will maintain bi-directional traceability between the software requirements and software-related system hazards, including hazardous controls, hazardous mitigations, hazardous conditions, and hazardous events.
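A bi-directional trace can be checked mechanically in both directions. The following Python sketch, using hypothetical requirement and test procedure identifiers, reports software requirements that trace to no test procedure and test procedures that trace back to no requirement.

```python
def build_backward(forward):
    """Invert a requirement -> artifacts mapping into artifact -> requirements."""
    backward = {}
    for requirement, artifacts in forward.items():
        for artifact in artifacts:
            backward.setdefault(artifact, set()).add(requirement)
    return backward


def trace_gaps(requirements, artifacts, forward):
    """Return (requirements with no artifact, artifacts with no requirement)."""
    backward = build_backward(forward)
    untraced = sorted(r for r in requirements if not forward.get(r))
    orphans = sorted(a for a in artifacts if a not in backward)
    return untraced, orphans


# Hypothetical identifiers, for illustration only.
requirements = {"SRS-1", "SRS-2", "SRS-3"}
test_procedures = {"TP-1", "TP-2", "TP-9"}
forward_trace = {
    "SRS-1": {"TP-1"},
    "SRS-2": {"TP-1", "TP-2"},
}

untraced, orphans = trace_gaps(requirements, test_procedures, forward_trace)
print("Requirements without a test procedure:", untraced)
print("Test procedures without a requirement:", orphans)
```

The same two functions apply to any of the traced pairs, for example requirements to design components or design components to code, by swapping in the relevant identifier sets.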
Software Engineering Life-Cycle Requirements
The requirements phase is one of the most critical phases of software engineering. Studies show that the top problems in the software industry are due to poor requirements elicitation, inadequate requirements specification, and inadequate management of changes to requirements. Requirements provide the foundation for the entire life cycle, as well as for the software product. Requirements also provide a basis for planning, estimating, and monitoring. Requirements are based on customer, user, and other stakeholder needs and design and development constraints.
The development of requirements includes elicitation, analysis, documentation, verification, and validation. Ongoing customer validation of the requirements to ensure the end products meet customer needs is an integral part of the life-cycle process. Customer validation can be accomplished via rapid prototyping and customer-involved reviews of iterative and final software requirements.
The software technical requirements definition process is used to transform the baselined stakeholder expectations into unique, quantitative, and measurable technical software requirements that can be used for defining a design solution for the software end products and related enabling products. This process also includes validation of the requirements to ensure that the requirements are well-formed (clear and unambiguous), complete (agrees with customer and stakeholder needs and expectations), consistent (conflict free), and individually verifiable and traceable to a higher-level requirement. Recommended content for a software specification can be found in NASA-HDBK-2203.
Software Architecture
Experience confirms that the quality and longevity of a software-reliant system is primarily determined by its architecture. The software architecture underpins a system’s software design and code; it represents the earliest design decisions, ones that are difficult and costly to change later. The transformation of the derived and allocated requirements into the software architecture results in the basis for all software development work.
A documented software architecture describes the software’s structure; identifies the software qualities (such as performance, modifiability, and security); identifies the known interfaces between the software components and the components external to the software (both software and hardware); identifies the interfaces between the software components; and identifies the software components.
Software Design
Software design is the process of defining the software architecture, components, modules, interfaces, and data for a software system to satisfy specified requirements. The software architecture is the fundamental organization of a system embodied in its components, their relationships to each other and the environment, and the principles guiding its design and evolution. The software architectural design is concerned with creating a strong overall structure for software entities that fulfill the allocated system and software-level requirements.
Typical views captured in an architectural design include the decomposition of the software subsystem into design entities, computer software configuration items, definitions of external and internal interfaces, dependency relationships among entities and system resources, and finite state machines. The design should be further refined into lower-level entities that permit the implementation by coding in a programming language.
Software Implementation
Software implementation consists of implementing the requirements and design into code, data, and records. Software implementation also consists of following coding methods and standards. Unit testing is also usually a part of software implementation (unit testing can also be conducted during the testing phase).
The project manager shall implement the software design into software code. [SWE-060]
The project manager shall select and adhere to software coding methods, standards, and criteria. [SWE-061]
The project manager shall use static analysis tools to analyze the code during the development and testing phases to detect defects, software security vulnerabilities, and coding errors. [SWE-135]
The project manager shall unit test the software code. [SWE-062]
The project manager shall assure that the unit test results are repeatable. [SWE-186]
The project manager shall provide a software version description for each software release. [SWE-063]
The project manager shall validate and accredit the software tool(s) required to develop or maintain software. [SWE-136]
All software development tools contain some number of software defects. Validation and accreditation of the critical software development and maintenance tools ensure that the tools being used during the software development life cycle do not generate or insert errors in the software executable components. Software tool accreditation is the certification that a software tool is acceptable for use for a specific purpose.
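Returning to the requirement that unit test results be repeatable [SWE-186], a frequent source of non-repeatable results is hidden randomness. The Python sketch below shows one way to control it by passing an explicit seed; the function under test and its name are hypothetical.

```python
import random


def simulate_sensor_noise(seed, n=5):
    """Hypothetical function under test: generate n noise samples.

    A private, explicitly seeded generator is used, so results do not
    depend on global random state or on the order in which tests run.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def test_same_seed_gives_same_result():
    # Running the same test twice with the same inputs gives identical output.
    assert simulate_sensor_noise(42) == simulate_sensor_noise(42)


def test_different_seeds_are_independent():
    # Each call builds its own generator, so calls do not disturb each other.
    first = simulate_sensor_noise(1)
    simulate_sensor_noise(2)
    assert simulate_sensor_noise(1) == first


test_same_seed_gives_same_result()
test_different_seeds_are_independent()
print("repeatability checks passed")
```

The same idea applies to other hidden inputs such as the system clock, environment variables, or file system state: pass them in explicitly so that the test fully controls them.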
Software Testing
The purpose of testing is to verify the software functionality and remove defects. Testing verifies the code against the requirements and the design to ensure that the requirements are implemented. Testing also identifies problems and defects that are corrected and tracked to closure before product delivery. Testing also validates that the software operates appropriately in the intended environment.
The project manager shall establish and maintain: [SWE-065]
a. Software test plan(s).
b. Software test procedure(s).
c. Software test report(s).
The project manager shall test the software against its requirements.
The project manager shall place software items under configuration management prior to testing. [SWE-187] This includes the software components being tested and the software components being used to test the software, including components like support software, models, simulations, ground support software, COTS, GOTS, MOTS, OSS, or reused software components.
Software Operations, Maintenance, and Retirement
Planning for operations, maintenance, and retirement is typically considered throughout the software life cycle. Operational concepts and scenarios are derived from customer requirements and validated in the operational or simulated environment. Software maintenance activities sustain the software product after the product is delivered to the customer until retirement.
The project manager shall plan and implement software operations, maintenance, and retirement activities. [SWE-075] The project manager shall complete and deliver the software product to the customer with appropriate records, including as-built records, to support the operations and maintenance phase of the software’s life cycle.
The project manager shall maintain the software using standards and processes per the applicable software classification throughout the maintenance phase. [SWE-195] The project manager shall identify the records and software tools to be archived, the location of the archive, and procedures for access to the products for software retirement or disposal.
Supporting Software Life-Cycle Requirements
Software Configuration Management (SCM)
Software Configuration Management (SCM) is the process of applying configuration management throughout the software life cycle to ensure the completeness and correctness of software configuration items. SCM applies technical and administrative direction and surveillance to identify and record the functional and physical characteristics of software configuration items, control changes to those characteristics, record and report change processing and implementation status, and verify compliance with specified requirements. SCM establishes and maintains the integrity of the products of a software project throughout the software life-cycle. Use of standard Center or organizational SCM processes and procedures is encouraged where applicable.
The project manager shall develop a software configuration management plan that describes the functions, responsibilities, and authority for the implementation of software configuration management for the project.
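One small piece of SCM, detecting uncontrolled changes to configuration items, can be sketched by recording a cryptographic hash of each item at baseline and comparing later. This Python sketch keeps item contents in memory with hypothetical names; a real project would normally rely on its version control and SCM tooling rather than a hand-written script.

```python
import hashlib


def baseline(items):
    """Map each configuration item name to the SHA-256 hash of its content."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in items.items()}


def changed_items(recorded, current_items):
    """List items that were added, removed, or modified since the baseline."""
    current = baseline(current_items)
    names = recorded.keys() | current.keys()
    return sorted(n for n in names if recorded.get(n) != current.get(n))


# Hypothetical configuration items, for illustration only.
released = {"fsw.bin": b"flight build 1", "sim_model.dat": b"model v1"}
recorded_baseline = baseline(released)

under_test = {"fsw.bin": b"flight build 1", "sim_model.dat": b"model v2"}
print(changed_items(recorded_baseline, under_test))
```

Comparing against the recorded baseline before a test run gives a quick check that the items being tested, and the items used to test them, are the ones that were placed under configuration control.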
Software Risk Management
The project manager shall record, analyze, plan, track, control, and communicate all of the software risks and mitigation plans. [SWE-086]
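A simple risk register illustrates the record, analyze, and track steps. In this Python sketch, the 1-to-5 likelihood and consequence scales and the score computed as their product are assumptions chosen for illustration; the requirement above does not prescribe a scoring method, and the example risks are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    risk_id: str
    description: str
    likelihood: int   # assumed scale: 1 (low) to 5 (high)
    consequence: int  # assumed scale: 1 (low) to 5 (high)
    mitigation: str = ""
    status: str = "open"

    @property
    def score(self):
        # Assumed scoring rule: likelihood multiplied by consequence.
        return self.likelihood * self.consequence


def open_risks_by_priority(register):
    """Return open risks ordered from highest to lowest score."""
    open_risks = [r for r in register if r.status == "open"]
    return sorted(open_risks, key=lambda r: r.score, reverse=True)


# Hypothetical register entries, for illustration only.
register = [
    Risk("R-1", "Reused library lacks unit tests", 3, 4, "Add tests before integration"),
    Risk("R-2", "Late delivery of simulator", 2, 2),
    Risk("R-3", "Timing margin on control loop", 4, 5, "Profile on target hardware"),
    Risk("R-4", "Compiler upgrade", 5, 5, status="closed"),
]

for risk in open_risks_by_priority(register):
    print(risk.risk_id, risk.score, risk.description)
```

Sorting open risks by score gives a ready-made agenda for status meetings, which supports the "communicate" part of the requirement.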
Software Peer Reviews and Inspections
Software peer reviews and inspections are the in-process technical examination of work products by peers to find and eliminate defects early in the life-cycle. Software peer reviews and inspections are performed following defined procedures that cover preparing for the review, conducting the review, recording the results, reporting the results, and certifying that the completion criteria are met. When planning the composition of a software peer review or inspection team, consider including software testing, system testing, software assurance, software safety, software cybersecurity, and software IV&V personnel.
The project manager shall perform and report the results of software peer reviews or software inspections for: [SWE-087]
a. Software requirements.
b. Software plans.
c. Any design items that the project identified for software peer review or software inspections according to the software development plans.
d. Software code as defined in the software and/or project plans.
e. Software test procedures.
Software Measurements
Software measurement is a primary tool for managing software processes and evaluating the quality of software products. Analysis of measures provides insight into the capability of the software organization and identifies opportunities for software process and product improvements. Software measurement programs at multiple levels are established to meet measurement objectives.
Software Non-conformance or Defect Management
The project manager shall track and maintain software non-conformances (including defects in tools and appropriate ground software). [SWE-201] The project manager shall define and implement clear software severity levels for all software non-conformances (including tools, COTS, GOTS, MOTS, OSS, reused software components, and applicable ground systems). [SWE-202]
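Tracking non-conformances with defined severity levels can be sketched as follows. The three severity levels, the record fields, and the example entries in this Python sketch are hypothetical; each project defines its own levels and tracking fields.

```python
from collections import Counter
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical levels; a lower number means a more severe problem.
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3


@dataclass
class NonConformance:
    nc_id: str
    item: str           # affected software, tool, or COTS/GOTS/MOTS/OSS component
    severity: Severity
    closed: bool = False


def open_counts(records):
    """Count open non-conformances per severity level."""
    return Counter(r.severity for r in records if not r.closed)


# Hypothetical records, for illustration only.
records = [
    NonConformance("NC-1", "flight software", Severity.CRITICAL),
    NonConformance("NC-2", "code generator tool", Severity.MAJOR),
    NonConformance("NC-3", "ground display", Severity.MINOR, closed=True),
    NonConformance("NC-4", "open-source parser", Severity.MAJOR),
]

counts = open_counts(records)
for level in Severity:
    print(level.name, counts[level])
```

Recording defects in tools and off-the-shelf components in the same structure as defects in project code keeps all non-conformances visible in one report.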
References and resources also include:
https://nodis3.gsfc.nasa.gov/npg_img/N_PR_7150_002C_/N_PR_7150_002C_.pdf