03/2005 by Zadig.
Tools needed:
Introduction:
This text presents how to encrypt the data stored on your partitions. The aim is not to show how a single file can be encrypted, but how to protect the content of a whole partition.
Most modern OSes allow users to encrypt their data. Unfortunately such a feature is not (yet) available on BeOS. Encrypting data is useful if you want to ensure that nobody will read your "secret projects". Moreover, in the last few years USB mass storage dongles have become more and more common. Since you usually carry these devices in a pocket and use them everywhere, it is very easy to lose one and let someone read its content. For this reason alone you should not store unencrypted private data on such devices.
Such a feature must be available without any cipher support in the filesystems. So let's see how this can be achieved.
II- Implementation in BeOS/Dano.
III- Integration with the Tracker and DriveSetup.
A first solution is to encrypt the file data. This way, nobody will be able to read the content of your files. To do this we must hook the filesystem read, write, and attribute functions (since attributes are data too, we must also encrypt them). The main drawback is that metadata is not encrypted. Metadata reveals a lot of interesting information: the directory listing and the position of files in the partition. The partition can even be mounted. Only the file contents are unusable.
Another way to encrypt the data is to do it at the lowest level: when reading from or writing to the device. Now all the data is encrypted and the partition cannot be mounted without decryption. This is much better and is what is used on Linux, for example. However it has a potential vulnerability: parts of the metadata are predictable. Someone who wants to crack your data will have both the encrypted and the clear form of these parts, which may make it easier to find the cipher key. However, if the algorithm used is secure enough, these two pieces of information should not be sufficient to find the key. This was heavily discussed on Linux mailing lists concerning cryptoloop security (see I.4).
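To see why predictable metadata matters, here is a minimal sketch using a deliberately weak cipher (a repeating XOR key). This toy cipher and the function names are assumptions made for illustration only; it is not what cryptoloop or the driver described here uses. With such a cipher, knowing the clear and encrypted form of a single block is enough to recover the key. A well designed cipher is built precisely so that this does not work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy cipher (hypothetical): XOR every byte with a repeating key.
 * It is NOT secure and only serves to illustrate the attack. */
void toy_xor_crypt(const uint8_t *in, uint8_t *out, size_t len,
                   const uint8_t *key, size_t key_len)
{
    assert(key_len > 0);
    for (size_t i = 0; i < len; i++)
        out[i] = in[i] ^ key[i % key_len];
}

/* Known-plaintext attack on the toy cipher: XORing the clear bytes
 * with the encrypted bytes gives back the key bytes directly. */
void toy_recover_key(const uint8_t *clear, const uint8_t *crypted,
                     uint8_t *key_out, size_t key_len)
{
    for (size_t i = 0; i < key_len; i++)
        key_out[i] = clear[i] ^ crypted[i];
}
```

If the attacker can guess the first bytes of a metadata block, toy_recover_key() returns the key used by toy_xor_crypt() for the whole partition.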
For paranoid people (or realistic ones :p), we could encrypt at 2 different
levels: first encrypt the data, and then encrypt the metadata. However
encrypting only the metadata is not easy (and probably impossible) without
modifications in each filesystem.
A solution to encrypt at both layers is to encrypt all file data, and then
encrypt all low level data. By doing this the file data is encrypted twice,
which should bring more security. Obviously the keys, and possibly the algos,
should be different.
There are two existing encryption implementations in Linux: Cryptoloop and
Dm-crypt. Cryptoloop is an extension of the loop driver: the driver exports
some devices (/dev/loopxx) that are linked to another device or file. When you
read from or write to the loop device, the request is forwarded to the linked
one, and the data is decrypted or encrypted on the way. There has been a lot
of discussion about this implementation, and especially about its crypto
implementation. See here for more details on this.
Dm-crypt is the new way to encrypt: it is based on the device mapper feature.
This solution is considered more secure than cryptoloop due to its buffer
management and its key/IV/crypto management.
The chosen solution is similar to cryptoloop: all data is encrypted at the low level (it is easier to implement) through a loop device. So a loop driver that exports several loop devices is needed. A high level encryption layer can still be added later.
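The forwarding idea can be sketched as follows. This is a user space illustration under stated assumptions, not the driver's code: the toy_loop structure, the function names and the XOR transform are all hypothetical, the target is a plain memory buffer standing in for a partition or file, and the transform is a placeholder where a real cipher from the crypto library would be called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK_SIZE 512

/* Hypothetical loop device state: the target storage plus a key. */
typedef struct {
    uint8_t *target;       /* stands in for the target partition or file */
    size_t   target_size;
    uint8_t  key[16];
} toy_loop;

/* Placeholder transform: XOR with the key and the block number.
 * It is its own inverse. NOT a real cipher. */
static void toy_transform(const toy_loop *lp, size_t pos,
                          uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        size_t off = pos + i;
        buf[i] ^= lp->key[off % sizeof lp->key]
                ^ (uint8_t)(off / TOY_BLOCK_SIZE);
    }
}

/* Write hook: forward the data to the target, encrypted. */
int toy_loop_write(toy_loop *lp, size_t pos, const void *data, size_t len)
{
    assert(lp != NULL && data != NULL);
    if (pos > lp->target_size || len > lp->target_size - pos)
        return -1;                      /* request outside the target */
    memcpy(lp->target + pos, data, len);
    toy_transform(lp, pos, lp->target + pos, len);
    return 0;
}

/* Read hook: fetch from the target, then decrypt for the caller. */
int toy_loop_read(const toy_loop *lp, size_t pos, void *data, size_t len)
{
    assert(lp != NULL && data != NULL);
    if (pos > lp->target_size || len > lp->target_size - pos)
        return -1;
    memcpy(data, lp->target + pos, len);
    toy_transform(lp, pos, data, len);
    return 0;
}
```

The caller only ever sees clear data, while the target only ever holds transformed data, which is the property the loop driver has to provide.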
The first problem to handle is the configuration of the loop devices: drivers are only kept in memory while at least one of their devices is open (there is no option to force a driver to stay in memory). However we must keep the target of the loop, the encryption algo used and the cipher key. These have to be kept in non-swappable memory (using driver settings to save the key on the disk would not be very secure!). The solution is to use one device to manage all the others: this one is always open and is used to configure all other devices. The configuration is done through ioctls on this device: set/get loop target, set/get encryption algo, and set passphrase.
A command line tool is used to easily configure the loop devices: losetup. It takes the crypto algo and the loop target as parameters. It then asks for the passphrase before configuring the driver. So the general design is:
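A possible shape for this control interface is sketched below. The opcode names and values, the request structure and the function name are hypothetical (the text does not give the real ones), and the table is an ordinary static array here, whereas in the driver it would live in kernel memory. Note that there is deliberately no opcode to read the passphrase back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOOP_COUNT     4
#define LOOP_TEXT_MAX  256

/* Hypothetical opcodes: the real values are not given in the text. */
enum {
    LOOP_SET_TARGET = 1, LOOP_GET_TARGET,
    LOOP_SET_ALGO,       LOOP_GET_ALGO,
    LOOP_SET_PASSPHRASE
};

/* Hypothetical request passed with each ioctl. */
typedef struct {
    int  index;                 /* which loop device to configure */
    char text[LOOP_TEXT_MAX];   /* target path, algo name or passphrase */
} loop_request;

typedef struct {
    char target[LOOP_TEXT_MAX];
    char algo[32];
    char passphrase[64];        /* kept in memory only, never on disk */
} loop_config;

static loop_config g_loops[LOOP_COUNT];

/* Copy a string only if it is terminated and fits in the destination. */
static int copy_text(char *dst, size_t dst_size,
                     const char *src, size_t src_size)
{
    const char *end = memchr(src, '\0', src_size);
    if (end == NULL || (size_t)(end - src) >= dst_size)
        return -1;
    memcpy(dst, src, (size_t)(end - src) + 1);
    return 0;
}

/* Control hook of the manager device: dispatch one request. */
int loop_man_control(uint32_t op, loop_request *req)
{
    assert(req != NULL);
    if (req->index < 0 || req->index >= LOOP_COUNT)
        return -1;
    loop_config *cfg = &g_loops[req->index];
    switch (op) {
    case LOOP_SET_TARGET:
        return copy_text(cfg->target, sizeof cfg->target,
                         req->text, sizeof req->text);
    case LOOP_GET_TARGET:
        return copy_text(req->text, sizeof req->text,
                         cfg->target, sizeof cfg->target);
    case LOOP_SET_ALGO:
        return copy_text(cfg->algo, sizeof cfg->algo,
                         req->text, sizeof req->text);
    case LOOP_GET_ALGO:
        return copy_text(req->text, sizeof req->text,
                         cfg->algo, sizeof cfg->algo);
    case LOOP_SET_PASSPHRASE:
        return copy_text(cfg->passphrase, sizeof cfg->passphrase,
                         req->text, sizeof req->text);
    default:
        return -1;              /* unknown opcode */
    }
}
```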
     _____________                        ____________
U   |             |                      |            |
s   |     RAM     |                      |  losetup   |
e   |_____________|                      |____________|
r         /|\                                 /|\
           |/dev/misc/loop/loop0               |/dev/misc/loop/loop_man
=======================================================================
           |______________________________     |
K                                         |    |
e    _____________                       _|____|_
r   |    Crypt    |                     |  Loop  |
n   |   module    |<------------------->| driver |
e   |_____________|                     |________|
l
The crypto implementation is obviously the critical part of the driver. Since it is very hard to write a good implementation of crypto algos, an existing library has to be used. Cryptlib is used, although this choice has drawbacks as well as advantages.
Advantages:
- It is used in a lot of projects, so it has been widely tested.
- It supports a lot of different algos.
- The API is very easy to use.
Drawbacks:
- There is a lot of code (the final library size is 900kB).
- The default build uses threads; this has to be removed (we are running in kernel land).
- There is a lot of stuff that is useless for us (asymmetric algos, certificates...).
Using a dumb loop driver that just forwards requests to its target, we see that mounting partitions is not possible: BeOS says that the device does not have the correct type or size. This is because the filesystem calls "stat" on the source device before mounting a partition. In our case the returned values are those of our loop device. To be able to mount a partition we must hook the system stat calls. For example if there is a call to stat("/dev/misc/loop/loop0") and loop0 is linked to "/boot/home/test" then stat must return the info of "/boot/home/test". To do this we must first find where the stat functions (stat, fstat, and lstat) end up.
By disassembling the kernel we find several stat references: filesystem ones (rootfs_statxx, devfs_statxx), and others. The function sys_rstat looks interesting. It is called by: sys_readdir, getcwd (3 times), fstat, lstat, stat, dev_for_path, and user_dup2. Looking at these call sites we see that sys_rstat takes 5 parameters. Each caller gives some information about these parameters:
fstat: sys_rstat(1, int filedes, 0, struct stat *buf, 1)
lstat: sys_rstat(1, 0xFFFFFFFF, const char* path, struct stat *buf, 0)
stat: sys_rstat(1, 0xFFFFFFFF, const char* path, struct stat *buf, 1)
So it seems that the prototype is:
sys_rstat(int unknown_param, int filedes, const char *path, struct stat *buf,
int get_link_info);
Note that only one of parameters 2 and 3 is used at a time. All we have to do is replace the content of the filled stat struct with that of the loop target. The easiest way to do this is to call our hook at the end of sys_rstat, with the stat buffer as an argument. This way we just have to test whether the stat struct belongs to one of our loop devices (we can check this using the st_dev and st_ino fields), and overwrite it with the info of the loop target if necessary. Note however that this must not break the lstat call: to handle it properly, the get_link_info parameter should also be examined by the hook.
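A minimal sketch of such a hook is given below. The table layout and the names are hypothetical. The way get_link_info is used is also an assumption: the loop device is treated like a link to its target, so the buffer is left untouched on the lstat path (get_link_info == 0) and redirected otherwise. The stat info of the target is assumed to have been stored when the loop was configured.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

#define LOOP_COUNT 4

/* Hypothetical table: identity of each loop device and the stat
 * info of its target, filled when the loop is configured. */
typedef struct {
    int         in_use;
    dev_t       loop_dev;
    ino_t       loop_ino;
    struct stat target_stat;
} loop_stat_entry;

static loop_stat_entry g_loop_stats[LOOP_COUNT];

/* Called at the end of sys_rstat with the stat buffer already filled. */
void loop_rstat_hook(struct stat *buf, int get_link_info)
{
    assert(buf != NULL);

    /* Assumption: on the lstat path (flag == 0) the caller wants the
     * info of the loop device itself, so nothing is changed. */
    if (!get_link_info)
        return;

    for (size_t i = 0; i < LOOP_COUNT; i++) {
        const loop_stat_entry *e = &g_loop_stats[i];
        if (e->in_use && buf->st_dev == e->loop_dev
                      && buf->st_ino == e->loop_ino) {
            *buf = e->target_stat;   /* report the target instead */
            return;
        }
    }
    /* Not one of our loop devices: leave the result as it is. */
}
```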
All the operations to configure the loop devices are accessible via a command
line tool: losetup. They must also be available from the usual tools:
DriveSetup to enable encryption on a partition, and the Tracker for easy
mounting. All these tools will use a common config file that contains the list
of encrypted partitions. This file will be created/modified by DriveSetup and
read by the Tracker:
When the user asks to mount a partition, the Tracker will check if this
partition is encrypted. If it is, the Tracker will configure a new loop
device, and ask the user for the encryption algo used and the passphrase. Then
it will mount the device. When unmounting the partition, it will free the loop
device.
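The check done at mount time could look like the sketch below. The format of the config file is not defined in this text, so the one used here (one device path per line) is purely an assumption, as are the function name and the example paths. The function works on the content of the file once it has been read into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `device` appears as a full line of `config_text`,
 * 0 otherwise. Assumed format: one device path per line. */
int is_crypted_partition(const char *config_text, const char *device)
{
    assert(config_text != NULL && device != NULL);

    size_t dev_len = strlen(device);
    if (dev_len == 0)
        return 0;

    const char *line = config_text;
    while (*line != '\0') {
        size_t len = strcspn(line, "\n");   /* length of this line */
        if (len == dev_len && memcmp(line, device, len) == 0)
            return 1;
        line += len;
        if (*line == '\n')
            line++;                         /* skip the line break */
    }
    return 0;
}
```

With a config content of "/dev/disk/example/0_1\n/dev/disk/example/0_2\n", the call is_crypted_partition(text, "/dev/disk/example/0_2") reports the partition as encrypted, while a path that only shares a prefix with a listed one does not match.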
Porting the code as is to Haiku should be quite simple. However there is the
problem of hooking "stat": this is fine for a hack, but not a nice way to
code! It may be possible to create a hook module that could hook any kernel
space function, but that is just "less bad".
There may be several other solutions:
As you saw, the cryptoloop design will probably change in the future. However the crypto part should be kept in future designs. This means that "binary compatibility" will be kept. The most important work will probably be to integrate encrypted mounting into the system: where to store the encryption info of partitions, how to detect that a partition is encrypted...