Papers on Analytical System Administration

  • Computer Immunology

    Proceedings of the Twelfth Systems Administration Conference (LISA XII) (USENIX Association: Berkeley, CA), page 283, 1998

    Present day computer systems are fragile and unreliable. Human beings are involved in the care and repair of computer systems at every stage in their operation. This level of human involvement will be impossible to maintain in future. Biological and social systems of comparable and greater complexity have self-healing processes which are crucial to their survival. It will be necessary to mimic such systems if our future computer systems are to prosper in a complex and hostile environment. This paper describes strategies for future research and summarizes concrete measures for the present, building upon existing software systems.

  • Measuring Host Normality

    ACM Transactions on Computing Systems, 20, p125-160 (2001)

    A comparative analysis of transaction time-series is made, for light to moderately loaded hosts, motivated by the problem of anomaly detection in computers. Criteria for measuring the statistical state of hosts are examined. Applying a scaling transformation to the measured data, it is found that the distribution of fluctuations about the mean is closely approximated by a steady-state, maximum-entropy distribution, modulated by a periodic variation. The shape of the distribution, under these conditions, depends on the dimensionless ratio of the daily/weekly periodicity and the correlation length of the data. These values are persistent or even invariant. We investigate the limits of these conclusions, and how they might be applied in anomaly detection.

  • The kinematics of distributed computer transactions

    International Journal of Modern Physics C12, p759-789 (2000)

    A causal, stochastic model of networked computers, based on information theory and non-equilibrium dynamical systems, is presented. This provides a simple explanation for recent experimental results revealing the structure of information in network transactions. The model is based on non-Poissonian stochastic variables, and pseudo-periodic functions. It explains the measured patterns seen in resource variables on computers in network communities. Weakly non-Poissonian behaviour can be eliminated by a conformal scaling transformation, and leads to a mapping onto statistical field theory. From this it is possible to calculate the exact profile of the spectrum of fluctuations. This work has applications to anomaly detection and time-series analysis of computer transactions.

  • Theoretical System Administration

    Proceedings of the Fourteenth Systems Administration Conference (LISA XIV) (USENIX Association: Berkeley, CA), page 1, 2000

    In order to develop system administration strategies which can best achieve organizations' goals, impartial methods of analysis need to be applied, based on the best information available about needs and user practices. This paper draws together several threads of earlier research to propose an analytical method for evaluating system administration policies, using statistical dynamics and the theory of games.

  • Scaling behaviour of peer configuration in logically ad hoc networks

    IEEE eTransactions on Network and Service Management, p1 (2004)

    Current interest in ad hoc and peer-to-peer networking technologies prompts a re-examination of models for configuration management, within these frameworks. In the future, network management methods may have to scale to millions of nodes within a single organization, with complex social constraints. In this paper, we discuss whether it is possible to manage the configuration of large numbers of network devices using well-known and not-so-well-known configuration models, and we discuss how the special characteristics of ad hoc and peer-to-peer networks are reflected in this problem.

  • On the Theory of System Administration

    Science of Computer Programming 49, p1 (2003)

    This paper describes a mean field approach to defining and implementing policy-based system administration. The concepts of regulation and optimization are used to define the notion of maintenance. These are then used to evaluate stable equilibria of system configuration that are associated with sustainable policies for system management. Stable policies are thus associated with fixed points of a mapping that describes the evolution of the system. In general, such fixed points are the solutions of strategic games. A consistent system policy is not sufficient to guarantee compliance; the policy must also be implementable and maintainable. The paper proposes two types of model to understand policy-driven management of Human-Computer systems: i) average dynamical descriptions of computer system variables which provide a quantitative basis for decision, and ii) competitive game theoretical descriptions that select optimal courses of action by generalizing the notion of configuration equilibria. It is shown how models can be formulated and simple examples are given.

  • Configurable immunity for evolving human-computer systems

    Science of Computer Programming 51, p197-213 (2004)

    The immunity model, as used in the GNU cfengine project, is a distributed framework for performing policy-conformant system administration, used on hundreds of thousands of Unix-like and Windows systems. This paper describes the idealized approach to policy-guided maintenance that is approximated by cfengine, building on the notion of `convergent' operations, i.e. those that reach stable equilibrium. Agents gravitate towards policy-determined configurations through the repeated application of unintelligent `anti-body' operations or discrete, coded countermeasures. The distributed agents turn passive discovery of state into active strategy for `curing' systems of policy transgressions. Keywords: autonomous computer management, cfengine, immunity model.

  • Autonomic Computing Approximated by Fixed-Point Promises

    Proceedings of First IEEE International Workshop on Modelling Autonomic Communication Environments (MACE2006). p197-222

    We use the concept of promises to develop a service oriented abstraction of the primitive operations that make an autonomic computer system. Convergent behaviour does not depend on centralized control. We summarize necessary and sufficient conditions for maintaining a convergently enforced policy without sacrificing autonomy of decision, and we discuss whether the idea of versioning control or ``rollback'' is compatible with an autonomic framework.

  • A Risk Analysis of Disk Backup or Repository Maintenance

    Science of Computer Programming (to appear 2006/7)

    We discuss a simple model of disk backups and other maintenance processes that include change to computer data. We determine optimal strategies for scheduling such processes. A maximum entropy model of random change provides a simple and intuitive guide to the process of sector-based disk change and leads to an easily computable optimum time for backup that is robust to changes in the model. We conclude with some theoretical considerations about strategies for organizing backup information.

  • Modelling Next Generation Configuration Management Tools

    Proceedings of the XX Large Installation System Administration Conference, LISA 2006. p131-147

    There are several current theoretical models used to discuss configuration management, including aspects, closures, and promises. We examine how these models relate to one another, and develop an overall theoretical framework within which to discuss configuration management solutions. We apply this framework to classify the capabilities of current tools, and develop requirements for the next generation of configuration management tools.