Checkpoints: Process View
Resource Utilization
- Can there exist "race conditions" between processes? Do processes compete for
critical resources? What happens if they cannot get them?
- What happens when I/O queues or buffers are full?
- Does the system monitor itself (capacity threshold, critical performance threshold,
resource exhaustion)? What actions does it take? (A minimal sketch of such self-monitoring
follows this list.)
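The sketch below is one way the last two questions might be answered in code, assuming a
hypothetical bounded work queue with an 80% capacity alarm; the names CAPACITY, HIGH_WATER,
raise_alarm, and submit are illustrative, not taken from any particular system.

```python
import queue

CAPACITY = 1000                      # assumed buffer size
HIGH_WATER = int(0.8 * CAPACITY)     # capacity threshold for self-monitoring

work_queue = queue.Queue(maxsize=CAPACITY)

def raise_alarm(message: str) -> None:
    # Stand-in for the system's real alarm mechanism.
    print(f"ALARM: {message}")

def submit(item) -> bool:
    """Enqueue an item, with explicit behavior for the full-buffer case."""
    if work_queue.qsize() >= HIGH_WATER:
        raise_alarm("work queue above 80% capacity")    # self-monitoring action
    try:
        work_queue.put_nowait(item)                     # never block the producer
        return True
    except queue.Full:
        raise_alarm("work queue full; rejecting item")  # defined full-buffer behavior
        return False
```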
Performance
- What are the response time requirements for each message? Is there a diagnostic mode for
the system that allows message response times to be measured (see the timing sketch after
this list)?
- Have you specified the nominal and maximal performance thresholds? Are there test sets
that represent them? How will testing assess that the requirements have been met?
- Is there a performance model (back of the envelope, queuing model, discrete-event
simulation) to predict whether the performance requirements will be met (see the
queuing-model sketch after this list)? Is the model fed with realistic or measured data?
- Are the tests and the performance model taking care of only the steady state mode, or do
they also take into account startup and major failures?
- Where are the performance bottlenecks (and there always are some)? Every system has points at
which performance will drop precipitously if any more workload is added; it's better to
know where these are in advance. Clues to look for include:
- Use of some finite shared resource such as (but not limited to) semaphores, file
handles, locks, latches, shared memory, etc.
- Excessive inter-process communication. Communication across process boundaries is always
more expensive than in-process communication.
- Excessive inter-processor communication. Communication across processor boundaries is
always more expensive than inter-process communication on the same processor.
- The point at which the system runs out of physical memory and starts using virtual
memory is a point at which performance usually drops precipitously. Avoid using virtual
memory if at all possible.
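One way to provide the diagnostic mode asked about above is a lightweight timing wrapper
around each message handler. This is only a sketch, assuming handlers are plain Python
callables and that a diagnostics flag can be toggled at run time; DIAGNOSTICS_ON,
timed_handler, and process_order are hypothetical names.

```python
import time
from functools import wraps

DIAGNOSTICS_ON = False  # toggled by an operator command in a real system

def timed_handler(name):
    """Decorator that records a handler's response time when diagnostics are on."""
    def decorate(handler):
        @wraps(handler)
        def wrapper(*args, **kwargs):
            if not DIAGNOSTICS_ON:
                return handler(*args, **kwargs)
            start = time.perf_counter()
            try:
                return handler(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                print(f"{name}: {elapsed_ms:.2f} ms")   # or feed a histogram
        return wrapper
    return decorate

@timed_handler("process_order")
def process_order(message):
    ...  # real work here
```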
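For the back-of-the-envelope queuing model mentioned above, a single-server M/M/1
approximation is often enough to sanity-check a response-time requirement. The arrival and
service rates below are made-up figures, not measurements.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean response time (waiting + service) of an M/M/1 queue, in the same
    time unit as the rates. Only valid when utilization is below 1."""
    utilization = arrival_rate / service_rate
    if utilization >= 1.0:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)

# Example: 80 messages/s arriving at a server that can handle 100 messages/s
# gives 1 / (100 - 80) = 0.05 s = 50 ms mean response time, at 80% utilization.
print(mm1_response_time(80, 100))   # 0.05
```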
Fault Tolerance
- If you have a redundant system, with both primary and backup processes, can two
or more processes "think" that they are primary? What happens then? How is the
situation resolved? Can there be a point in time at which no process is primary?
- Are there external processes or programs that can clean up when things are left in an
inconsistent state?
- Is the system tolerant of errors and exceptions? When an error or exception occurs, can
the system revert to a consistent state?
- Can you run diagnostic routines on a running system if necessary?
- Can the system be upgraded while running? Does it need to be?
- Where do alarms go? Is there a single alarm mechanism? Can you "tune" it to
prevent false or redundant alarms? Can the users determine which alarms they want to
monitor?
- Can some tracing facility be turned on or off to help with troubleshooting (see the
tracing sketch after this list)? What is the added overhead? Does the facility require
special tools or training?
- How much "head room" (free memory/free CPU cycles) is allowed in the CPU
utilization? How is it assessed?
- Are the load or performance requirements reasonable? (e.g. can a user really enter X bytes
per minute? Does the user really need to see the result in less than Y
milliseconds?)
- Are there memory budgets? How do you detect or prevent memory leaks (see the
memory-tracking sketch after this list)? How do you use the virtual memory system? Monitor
it? Tune it?
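A tracing facility that can be switched on and off at run time can be as simple as a logger
whose level is changed dynamically. This sketch uses the standard logging module; the
set_tracing and handle_request names are hypothetical.

```python
import logging

trace_log = logging.getLogger("trace")
trace_log.addHandler(logging.StreamHandler())
trace_log.setLevel(logging.WARNING)          # tracing off by default

def set_tracing(enabled: bool) -> None:
    """Turn detailed tracing on or off without restarting the process."""
    trace_log.setLevel(logging.DEBUG if enabled else logging.WARNING)

def handle_request(request_id: str) -> None:
    # isEnabledFor avoids the cost of formatting when tracing is off,
    # which keeps the added overhead small.
    if trace_log.isEnabledFor(logging.DEBUG):
        trace_log.debug("handling request %s", request_id)
    ...  # real work here
```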
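For the memory-budget question, the standard tracemalloc module can compare heap snapshots
taken before and after a workload and point at the allocation sites that grew. This is a
sketch; run_workload stands in for whatever the system actually does.

```python
import tracemalloc

def run_workload():
    ...  # stand-in for the system's real processing loop

tracemalloc.start()
before = tracemalloc.take_snapshot()
run_workload()
after = tracemalloc.take_snapshot()

# Allocation sites whose retained memory grew the most; a site that keeps
# growing across repeated runs is a likely leak.
for stat in after.compare_to(before, "lineno")[:10]:
    print(stat)
```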
Modularity
- Are the processes sufficiently independent of one another that they can be easily
distributed across processors or nodes? Do the throughput or response time requirements
virtually dictate that certain processes remain co-located? Does the inter-process
communication mechanism (e.g. semaphores or shared memory) virtually require the processes
to be co-located?
- Can certain messages be made asynchronous, so that they can be processed when resources
are more readily available (see the sketch after this list)?
- Can the system be scaled up by adding processes and nodes?
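Making a message asynchronous can be as simple as queueing it for a background worker
instead of handling it in the caller's thread. A minimal sketch, assuming the work really
can be deferred; the handle function and the generate_report message are hypothetical.

```python
import queue
import threading

deferred = queue.Queue()

def handle(message):
    ...  # real processing here

def worker():
    """Drain deferred messages when the process has spare cycles."""
    while True:
        message = deferred.get()
        handle(message)
        deferred.task_done()

threading.Thread(target=worker, daemon=True).start()

# The caller returns immediately; the report is produced later.
deferred.put({"type": "generate_report", "period": "daily"})
```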