Using vimage jails in FreeBSD 8.0+

description

There are two common approaches to managing jails. One is to alias multiple IP addresses to a single interface and bind jails to these addresses. The other is to clone the loopback interface for each jail and use NAT to direct traffic to them. Neither approach treats applications running in jails as having unique traffic and security needs.

The virtualized network stack, which is stable enough for general use in FreeBSD 8.0, solves this and many other common networking problems. You may want to use vimage to provide a more complete hosting environment, to test and troubleshoot entire networks before you deploy them, or to replace stacks of networking equipment with a single box that has several routing contexts. Imagine every jail having its own firewall and traffic shaper. However, other than two lines in the detailed release announcement, there is little documentation available for this feature. This article shows you how to use it by example.

stage 0: compile kernel

I trust you have compiled your own kernel before, so just add options VIMAGE to it and remove options SCTP. SCTP is the Stream Control Transmission Protocol; it does not work together with VIMAGE yet, and support for it is coming soon. You might also want to compile in multicast routing support by adding options MROUTING while you are at it. I do not recommend compiling a firewall into the kernel. The handbook is actually wrong on this one: you don't need options IPDIVERT to use NAT, because divert support can be loaded dynamically as a module. If you haven't compiled your own kernel before, you can find good documentation on it in the handbook (http://www.freebsd.org/doc/en/books/handbook/kernelconfig.html).
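
As a minimal sketch of the kernel changes described above, the helper below writes a config file that extends GENERIC. The function name write_kernconf and the config name VIMAGE are assumptions made for illustration; only the three option lines come from this article.

```sh
# write_kernconf FILE: write a kernel config that extends GENERIC
# with the options discussed in this stage.
# "VIMAGE" as the ident/config name is a hypothetical choice.
write_kernconf() {
    cat > "$1" <<'EOF'
include   GENERIC
ident     VIMAGE
options   VIMAGE
nooptions SCTP
options   MROUTING
EOF
}

# Usage (as root, assuming sources live in /usr/src):
#   write_kernconf /usr/src/sys/$(uname -m)/conf/VIMAGE
#   cd /usr/src
#   make buildkernel KERNCONF=VIMAGE && make installkernel KERNCONF=VIMAGE
```

Keeping the changes in a small file that includes GENERIC means you do not have to re-edit a full copy of GENERIC after every source update.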

stage 1: prepare jails

File-based jails allow easy space quotas and portability between servers. Here is a way to set this up without wasting too much space. Adjust the commands to your own configuration (vi is your friend). This is tested to work with UFS and ZFS as the underlying filesystem. If using ZFS, consider turning on compression for your jail volume: a 4G file full of zeros does not take up 4G when compressed. Mounting devfs is optional but recommended.

First, install the base jail.

cd /usr/src
mkdir -p /usr/jail/base
make buildworld installworld distribution DESTDIR=/usr/jail/base

Then prepare images and directories for the upcoming jails.

mkdir /usr/jail/img
mkdir /usr/jail/dir
cd /usr/jail
dd if=/dev/zero of=img/one bs=1M count=4096
dd if=/dev/zero of=img/two bs=1M count=4096
dd if=/dev/zero of=img/three bs=1M count=4096
dd if=/dev/zero of=img/four bs=1M count=4096
dd if=/dev/zero of=img/five bs=1M count=4096
mkdir dir/one
mkdir dir/two
mkdir dir/three
mkdir dir/four
mkdir dir/five

Attach the files to device nodes and format them.

mdconfig -a -t vnode -f img/one -u 1
mdconfig -a -t vnode -f img/two -u 2
mdconfig -a -t vnode -f img/three -u 3
mdconfig -a -t vnode -f img/four -u 4
mdconfig -a -t vnode -f img/five -u 5
newfs -U /dev/md1
newfs -U /dev/md2
newfs -U /dev/md3
newfs -U /dev/md4
newfs -U /dev/md5

Mount everything.

mount /dev/md1 dir/one
mount /dev/md2 dir/two
mount /dev/md3 dir/three
mount /dev/md4 dir/four
mount /dev/md5 dir/five
mount_unionfs -o below base dir/one
mount_unionfs -o below base dir/two
mount_unionfs -o below base dir/three
mount_unionfs -o below base dir/four
mount_unionfs -o below base dir/five
mount -t devfs devfs dir/one/dev
mount -t devfs devfs dir/two/dev
mount -t devfs devfs dir/three/dev
mount -t devfs devfs dir/four/dev
mount -t devfs devfs dir/five/dev
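
The image, format and mount steps above repeat the same commands for each jail, so they can be wrapped in one function. This is only a sketch: prep_jail is a hypothetical helper name, and it assumes you run it from /usr/jail with the base/ tree already installed as described earlier. It also copies resolv.conf, which every jail needs before it is started.

```sh
# prep_jail NAME UNIT [SIZE_MB]
# Build, format and mount one file-backed jail, using the same
# commands as the article. Run from /usr/jail (img/, dir/, base/).
prep_jail() (
    set -e                                  # stop at the first failing step
    name=$1 unit=$2 size=${3:-4096}
    mkdir -p img "dir/$name"
    dd if=/dev/zero of="img/$name" bs=1M count="$size"
    mdconfig -a -t vnode -f "img/$name" -u "$unit"
    newfs -U "/dev/md$unit"
    mount "/dev/md$unit" "dir/$name"
    mount_unionfs -o below base "dir/$name"
    mount -t devfs devfs "dir/$name/dev"
    cp /etc/resolv.conf "dir/$name/etc/resolv.conf"
)

# Usage (as root):
#   cd /usr/jail
#   unit=1
#   for name in one two three four five; do
#       prep_jail "$name" "$unit"
#       unit=$((unit + 1))
#   done
```

The body runs in a subshell, so its variables and the set -e setting do not leak into the calling shell.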

Copy the resolv.conf file from the host system and start the jails.

cp /etc/resolv.conf dir/one/etc/resolv.conf
cp /etc/resolv.conf dir/two/etc/resolv.conf
cp /etc/resolv.conf dir/three/etc/resolv.conf
cp /etc/resolv.conf dir/four/etc/resolv.conf
cp /etc/resolv.conf dir/five/etc/resolv.conf
jail -c vnet host.hostname=one.domain.tld path=/usr/jail/dir/one persist
jail -c vnet host.hostname=two.domain.tld path=/usr/jail/dir/two persist
jail -c vnet host.hostname=three.domain.tld path=/usr/jail/dir/three persist
jail -c vnet host.hostname=four.domain.tld path=/usr/jail/dir/four persist
jail -c vnet host.hostname=five.domain.tld path=/usr/jail/dir/five persist
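
The later stages refer to jails by numeric ID and assume the five jails received IDs 1 through 5. If other jails are already running, the numbers will differ. A small sketch that avoids guessing: jail -i prints the ID of the jail it just created, so a script can capture it. The helper name start_vnet_jail and the name= parameter are additions for illustration; domain.tld is the placeholder domain used in this article.

```sh
# start_vnet_jail NAME PATH
# Create a persistent vnet jail and print its jail ID on stdout.
start_vnet_jail() {
    jail -i -c vnet name="$1" host.hostname="$1.domain.tld" path="$2" persist
}

# Usage (as root):
#   jid_one=$(start_vnet_jail one /usr/jail/dir/one)
#   echo "jail one has ID $jid_one"
#   jls        # lists all running jails with their IDs
```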

If you only need separate routing contexts rather than full jails, and all the software you need is installed on the host system, you can skip the steps above and just do the following.

jail -c vnet host.hostname=one.domain.tld path=/ persist
jail -c vnet host.hostname=two.domain.tld path=/ persist
jail -c vnet host.hostname=three.domain.tld path=/ persist
jail -c vnet host.hostname=four.domain.tld path=/ persist
jail -c vnet host.hostname=five.domain.tld path=/ persist

stage 2: configure network interfaces

This assumes that you have one working interface already. I will use lagg0 (a link aggregation interface) as an example. In your case it can be re0, or fxp0, or xl0, or something else entirely. Once again, adjust this to your needs. I assume that you are behind NAT, that your local network is 192.168.2.0/24, and that you have a gateway at 192.168.2.254.

Assign an address to your initial interface and create a few virtual ones. A bridge interface acts like a switch, and an epair is a pair of virtual Ethernet interfaces connected back to back, like the two ends of a cable.

ifconfig lagg0 192.168.2.100 netmask 255.255.255.0 broadcast 192.168.2.255
ifconfig bridge create
ifconfig epair create
ifconfig epair create
ifconfig epair create
ifconfig epair create
ifconfig epair create

Add the appropriate interfaces to the bridge and assign some initial IP addresses. I use a different network for these because they serve a different logical function and it all still works.

ifconfig bridge0 addm lagg0 addm epair0a addm epair1a addm epair2a addm epair3a addm epair4a
ifconfig bridge0 10.1.0.1
ifconfig epair0a 10.0.0.1
ifconfig epair1a 10.0.0.2
ifconfig epair2a 10.0.0.3
ifconfig epair3a 10.0.0.4
ifconfig epair4a 10.0.0.5

Share the other ends of the epairs with the jails. The number after vnet is the jail ID; run jls to check the IDs if yours differ. You can put more than one interface in a jail if you want a more complex topology.

ifconfig epair0b vnet 1
ifconfig epair1b vnet 2
ifconfig epair2b vnet 3
ifconfig epair3b vnet 4
ifconfig epair4b vnet 5

And finally configure the network settings for the jails.

jexec 1 ifconfig epair0b 192.168.2.101
jexec 2 ifconfig epair1b 192.168.2.102
jexec 3 ifconfig epair2b 192.168.2.103
jexec 4 ifconfig epair3b 192.168.2.104
jexec 5 ifconfig epair4b 192.168.2.105
jexec 1 route add default 192.168.2.254
jexec 2 route add default 192.168.2.254
jexec 3 route add default 192.168.2.254
jexec 4 route add default 192.168.2.254
jexec 5 route add default 192.168.2.254
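
The same wiring can be done one jail at a time without hard-coding interface numbers, because ifconfig epair create prints the name of the interface it created. The following is a sketch rather than a drop-in replacement: wire_jail is a hypothetical helper, it brings the host end up instead of giving it a 10.0.0.x address, and it passes an explicit netmask matching the /24 network assumed in this stage.

```sh
# wire_jail JID ADDR GATEWAY [BRIDGE]
# Create an epair, plug the "a" end into the bridge, move the "b"
# end into jail JID, then set the jail's address and default route.
wire_jail() (
    set -e
    jid=$1 addr=$2 gw=$3 bridge=${4:-bridge0}
    a=$(ifconfig epair create)     # prints the new name, e.g. "epair0a"
    b="${a%a}b"                    # matching other end, e.g. "epair0b"
    ifconfig "$bridge" addm "$a"
    ifconfig "$a" up               # assumption: up only, no host-side address
    ifconfig "$b" vnet "$jid"
    jexec "$jid" ifconfig "$b" "$addr" netmask 255.255.255.0
    jexec "$jid" route add default "$gw"
)

# Usage (as root, after "ifconfig bridge create" and adding lagg0):
#   wire_jail 1 192.168.2.101 192.168.2.254
#   wire_jail 2 192.168.2.102 192.168.2.254
```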

stage 3: create hierarchical jails

Here I will show you how to set up hierarchical jails. Once you can do that, you can make a star, a ring, a mesh, a spaghetti, or be really creative. You can set up separate firewalls and traffic shapers. You can also connect ancestor jails to children of their children directly.

First, modify a jail so that it can have children.

jail -m jid=1 children.max=2

Create the relevant interfaces.

jexec 1 ifconfig bridge create
jexec 1 ifconfig epair create
jexec 1 ifconfig epair create

Start the jails.

jexec 1 jail -c vnet host.hostname=one.one.domain.tld path=/ persist
jexec 1 jail -c vnet host.hostname=two.one.domain.tld path=/ persist

And configure everything like you did in the previous example. The names of virtual interfaces keep incrementing, but you can rename them using ifconfig oldnameN name newnameN.

jexec 1 ifconfig bridge1 addm epair0b addm epair5a addm epair6a
jexec 1 ifconfig bridge1 10.1.0.2
jexec 1 ifconfig epair5a 10.0.1.1
jexec 1 ifconfig epair6a 10.0.1.2
jexec 1 ifconfig epair5b vnet 6
jexec 1 ifconfig epair6b vnet 7
jexec 6 ifconfig epair5b 192.168.2.111
jexec 7 ifconfig epair6b 192.168.2.112
jexec 6 route add default 192.168.2.254
jexec 7 route add default 192.168.2.254
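
Since the interface numbers keep incrementing, descriptive names can make a nested setup easier to follow. Here is a sketch of the rename mentioned above, applied to both ends of an epair before the b end is handed to a child jail. The helper name rename_epair and the label kid0 are made up for this example.

```sh
# rename_epair JID BASE LABEL
# Inside jail JID, rename BASEa and BASEb to LABELa and LABELb.
rename_epair() {
    jexec "$1" ifconfig "${2}a" name "${3}a" &&
    jexec "$1" ifconfig "${2}b" name "${3}b"
}

# Usage (as root), instead of working with epair5a/epair5b directly:
#   rename_epair 1 epair5 kid0
#   jexec 1 ifconfig bridge1 addm kid0a
#   jexec 1 ifconfig kid0b vnet 6
```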

stage 4: use proper routing

All those bridges can get messy. How about setting one jail up as a router? I will give a short and very simple example of multicast jail routing with PIM-SM.

First, enable IP forwarding in the kernel.

sysctl net.inet.ip.forwarding=1

Then install XORP (eXtensible Open Routing Platform). You can opt for something else, but there are not many multicast IPv4 options.

cd /usr/ports/net/xorp
make install clean

Create some interfaces.

ifconfig bridge create
ifconfig epair create
ifconfig epair create
ifconfig epair create
ifconfig epair create

Create some jails.

jail -c vnet host.hostname=router.domain.tld path=/ persist
jail -c vnet host.hostname=one.domain.tld path=/ persist
jail -c vnet host.hostname=two.domain.tld path=/ persist
jail -c vnet host.hostname=three.domain.tld path=/ persist

Plug everything in.

ifconfig bridge0 addm lagg0 addm epair0a
ifconfig epair0b vnet 1
ifconfig epair1a vnet 1
ifconfig epair2a vnet 1
ifconfig epair3a vnet 1
ifconfig epair1b vnet 2
ifconfig epair2b vnet 3
ifconfig epair3b vnet 4

Configure some IP addresses.

ifconfig epair0a 192.168.2.101
jexec 2 ifconfig epair1b 192.168.3.1
jexec 3 ifconfig epair2b 192.168.4.1
jexec 4 ifconfig epair3b 192.168.5.1
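
A vnet jail has its own network stack, so the forwarding sysctl set on the host at the start of this stage may not apply inside the router jail. The sketch below assumes net.inet.ip.forwarding is kept per stack and can be changed from inside the jail; I have not verified this on every release, so check the value it prints. The helper name enable_forwarding is hypothetical.

```sh
# enable_forwarding JID
# Turn on IPv4 forwarding inside jail JID and print the resulting
# value (1 means forwarding is enabled in that jail's stack).
enable_forwarding() {
    jexec "$1" sysctl net.inet.ip.forwarding=1 >/dev/null &&
    jexec "$1" sysctl -n net.inet.ip.forwarding
}

# Usage (as root), for the router jail in this stage:
#   enable_forwarding 1
```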

Now the tricky part is configuring the router. I don't advise you to start it as a daemon right away as diagnostic messages can be helpful. The default config file location is /usr/local/config.boot and you should just leave it there. Check out http://www.xorp.org for official documentation. If you want something to get you started, check out a sample config relevant to this example at http://lifanov.com/doc/config.boot.txt. XORP also comes with a firewall module (a front-end to IPFW) that you can use.

Once you have your router configured, it's time to start it and configure default routes.

jexec 1 xorp_rtrmgr

Wait for it to start. It can take up to a minute depending on your network. Then switch to another terminal.

jexec 2 route add default 192.168.3.254
jexec 3 route add default 192.168.4.254
jexec 4 route add default 192.168.5.254

Want something simpler or just don't care for XORP? It is super-easy to set up RIP. Configure your IP addresses as desired and then execute jexec 1 routed. If you still want multicast, install net/mrouted and execute jexec 1 mrouted. In both cases, routing will be enabled on all capable interfaces by default. You might also consider IPv6. Describing it here would overshadow the rest of this article, but you can install net/mcast-tools and read the description if you are curious.

stage 5: miss 7.2?

Some people prefer the way vimage behaved before all of its features were merged into jail. The legacy vimage interface is still in the source tree. If you compiled your kernel earlier, you should already have the sources.

cd /usr/src/tools/tools/vimage
make install clean

You can read man 8 vimage once you have it installed to find out how it works.

stage 6: that's it

You are basically done. You can now ping all jails from all jails, get to the rest of your network from your jails and get to your jails from the rest of your network. You might want something different (for example, a jail that provides local services only). You can accomplish that using the same method as well. If you want to boot one of the jails for real, issue jexec 1 /bin/sh /etc/rc, but make sure you have devfs mounted before you do. Feel free to paste what you just did into a text file and source it as a part of your boot process. I rarely reboot my systems and prefer services not to start automatically, but it's up to you. It should be pretty easy to adjust for an Internet-facing server, such as a VPS.
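
To check the claim that every jail can reach every other jail, a short loop can ping each address from each jail. This is a sketch: ping_matrix is a hypothetical helper, and the jail IDs and addresses in the usage lines are the ones assumed in stage 2.

```sh
# ping_matrix JID:ADDR [JID:ADDR ...]
# From every listed jail, ping every other listed address once.
# Prints one line per pair; the exit status is non-zero if any ping failed.
ping_matrix() (
    rc=0
    for src in "$@"; do
        for dst in "$@"; do
            [ "$src" = "$dst" ] && continue
            # FreeBSD ping: -c 1 sends one packet, -t 2 gives up after 2 seconds
            if jexec "${src%%:*}" ping -c 1 -t 2 "${dst#*:}" >/dev/null 2>&1; then
                echo "ok   ${src#*:} -> ${dst#*:}"
            else
                echo "FAIL ${src#*:} -> ${dst#*:}"
                rc=1
            fi
        done
    done
    exit $rc
)

# Usage (as root):
#   ping_matrix 1:192.168.2.101 2:192.168.2.102 3:192.168.2.103
```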