Domains versus Branches
ISconf provides two mechanisms for isolating machines from each other: domains and branches. A domain can contain multiple branches, and a branch can describe multiple machines. Domains are designed to isolate legal entities (specifically, their intellectual property) from each other, while branches are designed to isolate development paths. Below we describe domains, then branches, then their relationship to each other, and, in the last section, we describe when it's appropriate to create one versus the other.
About ISconf Domains
(extracted from the isconf man page, changeset:200)
An ISconf domain name is more or less equivalent to a NIS domain name, an AFS cell name, or a Kerberos realm name. This name is an arbitrary string, but by convention it is usually based on the DNS domain name.
ISconf domains are a security mechanism, primarily for information hiding. All of the machines sharing the same ISconf domain name will share the same distributed cache, so root users on all of these machines will be able to read the contents of the cache. Likewise, machines that are in different domains will not share the same cache, so root users of these machines will not have access to the cache contents of the other domain. This becomes important if there is any proprietary or sensitive information stored in the ISconf cache, for example via a 'snap' or 'exec' command.
Normally you'd want all of the machines in a given legal entity (the same corporation, for instance) to use the same domain name. For example, a small company using ISconf might use an ISconf domain name of 'example.com' on all of their machines. A larger company might have multiple divisions or subsidiaries and legal or security reasons for segregating machines. The large company might put most of their machines in 'example.com', but for regulatory or security reasons might isolate a subsidiary into 'foo.example.com', and might put their bastion and firewall machines into 'security.example.com'. Note again that there doesn't need to be a 'security.example.com' DNS domain for this to work.
The idea of ISconf domains is to completely isolate legal entities from each other when sharing the same network. Machines in different domains refuse to cache each other's data, answer each other's queries, and so on. Domains really come into play in the TCP crypto and user auth code (ISconf 4.3 and later), where each domain has its own PGP keyring and its own database of hosts and users, and all of the wire traffic is encrypted accordingly.
Establishing two machines in different domains means "I don't want these machines to ever cooperate at all. I will never merge their branches, I don't want them to be able to share or see each other's packages, cache space, or wire traffic."
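As a minimal sketch of the isolation rule above, the following Python models a machine's decision to cooperate with a peer. The names Machine and will_cooperate are hypothetical and are not part of ISconf; the only behavior taken from the text is that domain names are arbitrary strings and that machines in different domains refuse to cooperate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    hostname: str
    domain: str  # ISconf domain name: an arbitrary string, compared exactly


def will_cooperate(requester: Machine, responder: Machine) -> bool:
    """Return True only when both machines carry the same domain name."""
    # Plain string equality: 'security.example.com' is NOT treated as
    # being inside 'example.com', because ISconf domains are not DNS zones.
    return requester.domain == responder.domain


web = Machine("web1", "example.com")
db = Machine("db1", "example.com")
fw = Machine("fw1", "security.example.com")

print(will_cooperate(web, db))  # True: same domain
print(will_cooperate(web, fw))  # False: different domains
```

Note that the comparison is deliberately flat. Nothing in the text suggests a hierarchy between domain names that happen to look like DNS subdomains.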
Future ISconf features, such as monitoring, inventory, and asset management, will also operate on domain boundaries.
About ISconf Branches
(derived from the isconf man page, changeset:200)
An ISconf branch is a set of identical machines. The word 'branch' as used here means more or less the same thing as in software version control -- a branch of development. A different branch is normally used for each set of hosts that need their own disk image and that do wildly different or conflicting things. For example, a DNS server and a database server might be on different branches.
A branch is described by the sequence of transactions in a journal. A new branch is created by forking an existing branch, then creating a checkpoint image.
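To illustrate the idea of a branch as a journal, here is a small Python sketch. The Branch class, its fork method, and the journal entry strings are all illustrative assumptions; they do not reflect ISconf's real journal format, and the checkpoint image step is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    name: str
    # Ordered transactions. The entry strings are placeholders,
    # not ISconf's actual on-disk journal format.
    journal: list[str] = field(default_factory=list)

    def fork(self, new_name: str) -> "Branch":
        """Start a new branch whose journal begins as a copy of this one."""
        return Branch(new_name, list(self.journal))


base = Branch("base", ["snap base-os.tar.gz", "exec configure-network"])
db = base.fork("db")
db.journal.append("snap database-server.tar.gz")

# The fork shares history with its parent but diverges afterwards.
print(len(base.journal), len(db.journal))  # 2 3
```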
The Relationship between Domains and Branches
One domain can (and normally does) contain multiple branches. The 'example.com' domain can contain branches named 'db', 'dns', 'firewall', and so on.
Each branch exists in exactly one domain. A branch named 'foo' in the 'example.com' domain is not the same branch as a branch named 'foo' in 'anotherexample.com' -- these two 'foo' branches have no relation to each other.
Each machine is described by exactly one branch, which means it lives in exactly one domain.
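The containment rules so far can be sketched as a small data model. This is only an illustration under assumed names (Branch, Machine); the point it shows is that a branch is identified by its domain together with its name, and that a machine's domain follows from its branch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    domain: str  # a branch exists in exactly one domain
    name: str


@dataclass(frozen=True)
class Machine:
    hostname: str
    branch: Branch  # a machine is described by exactly one branch

    @property
    def domain(self) -> str:
        # The machine's domain is implied by its branch.
        return self.branch.domain


foo_here = Branch("example.com", "foo")
foo_there = Branch("anotherexample.com", "foo")
host = Machine("db1", Branch("example.com", "db"))

print(foo_here == foo_there)  # False: same name, unrelated branches
print(host.domain)            # example.com
```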
The ISconf cache is distributed to all of the members of the same domain, no more and no less. Machines not in the same domain cannot share cache space.
Branches in the same domain do share cache space. This means that if you do an 'isconf snap some_large.tar.gz' in multiple branches in the same domain, that large tarball only gets copied into cache space once. Any later 'snap' commands simply refer to the first one. Likewise, that large tarball only needs to be shipped across WAN links once; nearby machines on different branches, but in the same domain, will fetch it from each other.
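One way to picture why a repeated 'snap' costs nothing extra is a cache keyed by content. The sketch below is a toy model only: the DomainCache class is hypothetical, and keying by SHA-256 digest is an assumption made for illustration, not a statement about how ISconf actually names its cache blocks.

```python
import hashlib


class DomainCache:
    """Toy model of the cache shared by one domain (not ISconf's real layout)."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}    # content hash -> stored data
        self.refs: dict[str, list[str]] = {}  # branch name -> hashes it refers to
        self.copies_made = 0

    def snap(self, branch: str, data: bytes) -> str:
        """Record data for a branch, storing the bytes only if they are new."""
        key = hashlib.sha256(data).hexdigest()
        if key not in self.blocks:
            self.blocks[key] = data
            self.copies_made += 1
        # Later snaps of identical content just add a reference.
        self.refs.setdefault(branch, []).append(key)
        return key


cache = DomainCache()
tarball = b"pretend this is some_large.tar.gz"
first = cache.snap("db", tarball)
second = cache.snap("dns", tarball)

print(first == second)    # True: both branches refer to the same block
print(cache.copies_made)  # 1: the data was stored only once
```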
Branches in the same domain can be easily merged, by merging the contents of the journal file. Because branches in the same domain share cache space, a merge does not require copying any more cache blocks around.
Stroll around in the /var/is/fs tree on a few of your machines to get a feel for how this all works at a nuts-and-bolts level.
When to Create Branches versus Domains
Domains are about security. Branches are about development paths.
You usually want to create exactly one ISconf domain, then put all of your machines in it. If you personally have root access on machines that are in two different domains, then you've probably created too many domains. The purpose of domains is to keep the other guy from being able to see what's in your domain.
Domains only exist so that multiple legal entities can share the same network without sharing the same intellectual property. Entities which need to hide intellectual property from each other need to be in different domains. If this hiding isn't necessary, then put all of your machines in the same domain. If this hiding is necessary, then you, a single person, likely should not have root on both sets of machines in the first place, unless you're an external consultant for both entities, under mutual NDA, insured and/or bonded, and so on.
You usually want to create as few branches within your domain as possible, but you'll probably still have a few. If you find yourself frequently running the same 'snap' and 'exec' commands in multiple branches at the same time, then you've probably created too many branches. Ideally, all of your machines which are of the same operating system and hardware family should be on the same branch.
Branches exist only to allow different machines to follow different development paths. Machines which need completely different, mutually exclusive packages might want to be on different branches; otherwise, you should try to keep them on the same branch. Even if two machines need to run completely different daemons, they can still exist on the same branch, by using a configuration file to control which daemons start on which machines. See the Cluster::Init Perl module for one example of how to do this.
Branches can also be used for staging changes into production. For example, you can install and test changes by using 'snap' and 'exec' commands on a staging branch machine, then when you're happy, simply copy the staging journal file over the production journal file. (This assumes that you have not made independent changes in production, and that you know how to navigate around in the /var/is tree.)
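As a rough sketch of the single-branch approach, a per-host table can decide which daemons start where. The table and the daemons_for function below are hypothetical and do not represent the Cluster::Init configuration format; hostnames and daemon names are made up for illustration.

```python
# Hypothetical table: every machine on the branch has the same packages
# installed, and this mapping alone decides which daemons each host runs.
DAEMONS_BY_HOST = {
    "ns1": ["named"],
    "db1": ["postgres"],
    "util1": ["named", "postgres"],
}


def daemons_for(hostname: str) -> list[str]:
    """Return the daemons a host should start; unknown hosts start none."""
    return list(DAEMONS_BY_HOST.get(hostname, []))


print(daemons_for("ns1"))      # ['named']
print(daemons_for("unknown"))  # []
```

The design choice here is that differences between machines live in data rather than in separate journals, so a single 'snap' or 'exec' still reaches every machine on the branch.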
To use merges for staging, however, the staging and production branches must be in the same domain, due to the way cache files are shared. More domains also means more network bandwidth consumption, as machines copy the same packages over the WAN again and again.
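The staging workflow described above depends on production having no independent changes. A minimal sketch of that precondition, assuming journals can be modeled as ordered lists of entries (the function names and entry strings are hypothetical, not ISconf's format), might look like this:

```python
def can_promote(production: list[str], staging: list[str]) -> bool:
    """True when staging is production's journal plus zero or more new entries."""
    return staging[: len(production)] == production


def promote(production: list[str], staging: list[str]) -> list[str]:
    """Return the new production journal, or raise if production has diverged."""
    if not can_promote(production, staging):
        raise ValueError(
            "production has independent changes; a plain copy would lose them"
        )
    return list(staging)


production = ["snap base-os.tar.gz"]
staging = ["snap base-os.tar.gz", "exec install-new-daemon"]

print(promote(production, staging))
```

If production had gained an entry that staging lacks, promote would raise instead of silently discarding it, which is the situation the caveat about independent changes warns against.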
It's worth repeating that in order to use branch merges or comparisons, inventory or asset management, or any other current or future ISconf features, all of your machines that need to be managed as a corporate set need to be in the same domain.
