SATAN Architecture

Architecture Overview

SATAN has an extensible architecture. At the center is a relatively small generic kernel that knows little to nothing about system types, network service names, vulnerabilities, or other details. Knowledge about the details of network services, system types, etc. is built into small, dedicated, data collection tools and rule bases. The behaviour of SATAN is controlled from a configuration file. Settings may be overruled via command-line options of via a hypertext user interface.

The SATAN kernel consists of the following main parts:

Magic cookie generator: Each time SATAN is started up in interactive mode, the magic cookie generator generates a pseudorandom string that the HTML browser must send to the SATAN custom http server as part of all commands.
Policy engine.: Given the constraints specified in the SATAN configuration file, this subsystem determines whether a host may be scanned, and what scanning level is appropriate for that host.
Target acquisition.: Given a list of target hosts, this SATAN subsystem generates a list of probes to be run on those hosts. The list of probes serves as input to the data acquisition subsystem. The target acquisition module also keeps track of a host's proximity level, and handles the so-called subnet expansions.
Data acquisition.: Given a list of probes, this SATAN subsystem runs the corresponding data collection tools and generates new facts. These facts serve as input to the inference engine.
Inference engine.: Given a list of facts, this subsystem generates new target hosts, new probes, and new facts. New target hosts serve as input to the target acquisition subsystem; new probes are handled by the data acquisition subsystem, and new facts are processed by the inference engine.
Report and analysis.: This subsystem takes the collected data and builds a virtual hyperspace that you can explore with your favourite HTML browser.

Once SATAN is given an initial target host, the target acquisition, data acquisition and inference engine subsystems keep feeding each other new data until nothing new comes up. Technically speaking, the system does a breadth-first search.

Magic cookie generator

When you start SATAN in interactive mode, i.e., using the HTML user interface, SATAN performs the following actions before starting up the HTML browser:

Start the SATAN httpd daemon. This is a very limited subset of the typical httpd daemon, sufficient to support all activities that SATAN can perform.
Generate a (hopefully "good") 32 byte cryptographic magic cookie for the upcoming SATAN run. SATAN runs several system utilities in parallel and compresses their quasi-random output with the MD5 hashing function. The HTML browser must specify this magic cookie as part of the URLs that it sends to the custom SATAN httpd daemon. If this key is ever compromised, intruders could potentially execute any programs that the SATAN program can run, with the same privileges as the user that started the SATAN program. SATAN generates a new magic cookie for each session. SATAN and the HTML browser always run on the same host, so there is no need to send the magic cookie over the network.
Read in any previously collected scan data. By default, SATAN will read data in the $satan_data database. In the mean time HTML browser comes up, but it will not be ale to communicate with SATAN until the database has been read in. This can take anywhere from a few seconds to several minutes, depending on the size of the database, the speed of the machine you're using to run SATAN on, the amount of available RAM, etc.

Policy engine

The policy engine controls what hosts SATAN may probe. The probing intensity depends on the host's proximity level, which is basically a measure for the distance from the initial target host(s). Probing intensities and probing constraints are specified in the configuration file. This file can direct SATAN to stay within certain internet domains, or to stay away from specific internet domains.

Proximity levels

While SATAN gathers information from the so-called primary target(s) that you specified, the program may learn about the existence of other hosts. Examples of such non-primary systems are:

hosts found in remote login information from the finger service,
hosts that import file systems from the target, according to the showmount command.

For each host, SATAN maintains a proximity count. The proximity of a primary host is zero; for hosts that SATAN finds while probing a primary host, the proximity is one, and so on. By default, SATAN stays away from hosts with non-zero proximity, but you can override this policy by editing the configuration file, via command-line switches, or from the hypertext user interface.

Target acquisition

SATAN can gather data about just one host, or it can gather data about all hosts within a subnet (a block of 256 adjacent network addresses). The latter process is called a subnet scan . Target hosts may be specified by the user, or may be generated by the inference engine when it processes facts that were generated by the data acquisition module.

Once a list of targets is available, the target acquisition module generates a list of probes, according to the scanning level derived by the policy engine. The actual data collection is done under control of the data acquisition module.

Subnet scan

When requested to scan all hosts in a subnet (a block of 256 internet addresses), SATAN uses the fping utility to find out what hosts in that subnet actually are available. This is to avoid wasting time talking to hosts that no longer exist or that happen to be down at the time of the measurement. The fping scan also may discover unregistered systems that have been attached to the network without permission from the network administrator.

Data acquisition

The data acquisition engine takes a list of probes and executes each probe, after it has verified that the probe may be run at the target's scanning level. What tool may be run at a given scanning level is specified in the configuration file. The software keeps a record of what probes it has already executed, to avoid doing unnecessary work. The result of data acquisition is a list of new facts that is processed by the inference engine.

SATAN comes with a multitude of little tools. Each tool implements one type of network probe. By convention, the name of a data collection tool ends in .satan. Often these tools are just a few lines of PERL or shell script language. All tools produce output according to the same common tool record format. SATAN derives a great deal of power from this toolbox approach. When a new network feature becomes of interest, it is relatively easy to add your own probe.

Scanning levels

SATAN can probe hosts at various levels of intensity. The scanning level is controlled with the configuration file, but can be overruled with command-line switches or via the graphical user interface.

light: This is the least intrusive scan. SATAN collects information from the DNS (Domain Name System), tries to establish what RPC (Remote Procedure Call) services the host offers, and what file systems it shares via the network. With this information, SATAN finds out the general character of a host (file server, diskless workstation).
normal (includes light scan probes): At this level, SATAN probes for the presence of common network services such as finger, remote login, ftp, WWW, Gopher, email and a few others. With this information, SATAN establishes the operating system type and, where possible, the software release version.
heavy (includes normal scan probes): After it has found out what services the target offers, SATAN looks at them in more depth, and does a more exhaustive scan for network services offered by the target. At this scanning level SATAN finds out if the anonymous FTP directory is writable, if the X Windows server has its access control disabled, if there is a wildcard in the /etc/hosts.equiv file, and so on.

The fourth level, breaking into systems, has not been implemented.

At each level SATAN may discover that critical access controls are missing or defective, or that the host is running a particular software version that is known to have problems. SATAN takes a conservative approach and does not exploit the problem.

Inference engine

The heart of SATAN is a collection of little inference engines. Each engine is controlled by its own rule base. The rules are applied in real time, while data is being collected. The result of these inferences are lists of new facts for the inference engine, new probes for the data acquisition engine, or new targets for the target acquisition engine.

rules/todo: Rules that decide what probe to perform next. For example, when the target host offers the FTP service, and when the target is being scanned at a sufficient level, SATAN will attempt to determine if the host runs anonymous FTP, and if the FTP home directory is writable for anonymous users.
rules/hosttype: Rules that deduce the system class (example: DEC HP SUN) and, where possible, the operating system release version, from telnet, ftp and other banners.
rules/facts: Rules that deduce potential vulnerabilities. For example, several versions of the FTP or sendmail daemons are known to have problems. Daemon versions can be recognized by their greeting banners.
rules/services: Rules that translate cryptic daemon banners and/or network port numbers to more user-friendly names such as WWW server, or diskless NFS client.
rules/trust: Like the services rules, these rules help SATAN to classify the data that was collected by the tools on NFS service, DNS, NIS, and other cases of trust.
rules/drop: What data-collection tool output SATAN should ignore. This can be used to shut up SATAN about things that you do not care about. Implemented by the drop_fact.pl module.

Application of these rules in real time, to each tool output record, and within the context of all information that has been collected sofar, offers an amazing potential that we are only beginning to understand.

Report and Analysis

When SATAN scans a network with hundreds or thousands of hosts, it can collect a tremendous amount of information. As we have found, it does not make much sense to simply present all that information as huge tables. You need the power of hypertext technology, combined with some unusual implementation techniques to generate a dynamic hyperspace on the fly.

With a minimal amount of effort (at least, by you; your computer may disagree), SATAN allows you to navigate though your networks. You can break down the information according to:

Domain or subnet,
Network service,
System type or operating system release,
Trust relationships,
Vulnerability type, danger level, or count.

Breakdowns by combinations of these properties are also possible. SATAN's reporting capabilities makes it relatively easy to find out, for example:

What subnets have diskless workstations,
What hosts offer anonymous FTP,
Who runs Linux or FreeBSD on their PC,
What unregistered (no DNS hostname) hosts are attached to your network.

Questions like these can be answered with only a few mouse clicks. Printing a report is a matter of pressing the print button of your favourite hypertext viewer.

Back to the Reference TOC/Index