Using Terraform with libvirt in 2022

2022/12/24

Introduction

Following on with the theme of “looking for things to use my new server for,” I decided to see what other options are out there for managing and provisioning VMs. While Vagrant and Ansible together are fantastic at this, Vagrant really isn’t a good option for “production” VMs (production in quotes here because… it’s my lab server and production is quite relative in that context). I use a lot of Terraform in my day job, so I thought I’d check whether there was a libvirt or KVM provider for it. It turns out there is, but it’s a little… well, we’ll call it undermaintained: it’s not quite dead, but development seems to have slowed considerably in the last year. You can find the provider here.

Now, credit where credit is due: this provider works very well. It just appears to be a side project for a single developer, and we all know we have infinite time for these kinds of things, so it has probably fallen on the back burner for one reason or another. That said, with a little tinkering and outside-the-box thinking, we can still use this provider today. Here’s how.

Getting Started

You’ll need a couple of things to get started:

Terraform

I’ve come to like tfenv for installing and managing Terraform executables on my system. I recommend installing it and then using it to install the latest Terraform version. Installation steps are in the README.md found at the link above, but here’s a summary for the lazy folks who are also on macOS.

$ brew install tfenv

I forget whether this adds it to your ~/.bash_profile automatically, so read the message brew prints when the install finishes. You can confirm whether the PATH entry was added with:

$ grep tfenv ~/.bash_profile

If not, add it:

$ test -d $HOME/.tfenv || mkdir $HOME/.tfenv
$ echo 'export PATH="$HOME/.tfenv/bin:$PATH"' >> ~/.bash_profile

Now install the latest Terraform version and activate it:

$ tfenv install
$ tfenv use 1.3.6

If you’re following this tutorial in the future, you may have to run tfenv list to see which version “latest” resolved to and substitute that version for the one shown above.
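
If you’d rather pin the Terraform version per project instead of globally, tfenv can also read a .terraform-version file in the working directory. A minimal sketch (the version number below is just an example; use whichever version tfenv installed for you):

$ echo "1.3.6" > .terraform-version
$ terraform version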

Terraform-provider-libvirt

The libvirt provider is configured and used just like any other Terraform provider. Here is an example directly from the project README:

terraform {
  required_providers {
    libvirt = {
      source = "dmacvicar/libvirt"
    }
  }
}

provider "libvirt" {
  # Configuration options
}
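
If your KVM host is a remote server like mine, the provider is pointed at it through a libvirt connection URI. Here’s a minimal sketch assuming SSH access to the host; the user and hostname are placeholders:

provider "libvirt" {
  # connect to the system libvirt daemon on the remote host over SSH
  uri = "qemu+ssh://user@kvmhost/system"
}

For a local daemon, uri = "qemu:///system" works as well.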

You’ll notice, however, that the examples in the repo are organized by Terraform version, and the newest version listed is 0.13… well, we’ve moved a bit beyond that, so some of the examples don’t work perfectly. Thankfully, yours truly has spent some time with this and figured out at least one issue and how to work around it:

Inside the 0.13 Ubuntu example there are two template files: one for the network config and one for the cloud-init config. The example uses the now-deprecated template_file data source. We can safely remove those data blocks and replace the references to them in the libvirt_cloudinit_disk resource with direct templatefile() calls:

resource "libvirt_cloudinit_disk" "commoninit" {
  name           = "commoninit.iso"
  user_data      = templatefile("${path.module}/cloud_init.cfg", {})
  network_config = templatefile("${path.module}/network_config.cfg", {})
  pool           = libvirt_pool.ubuntu.name
}

This should now allow the Ubuntu 0.13 example to work with Terraform >= 1.0.
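
For context, here’s roughly how the cloud-init disk and volume get wired into the domain in that example. This is a sketch based on the repo’s Ubuntu example, so the exact names and attribute values may differ slightly from what’s in the repo:

resource "libvirt_domain" "domain-ubuntu" {
  name   = "ubuntu-terraform"
  memory = "512"
  vcpu   = 1

  # attach the cloud-init ISO built above
  cloudinit = libvirt_cloudinit_disk.commoninit.id

  network_interface {
    network_name = "default"
  }

  disk {
    volume_id = libvirt_volume.ubuntu-qcow2.id
  }
}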

Libvirt permissions problems

The last issue I encountered with the Ubuntu example was:

libvirt_pool.ubuntu: Refreshing state... [id=1422d683-633e-4c85-abfc-c182bbd46b43]
libvirt_cloudinit_disk.commoninit: Refreshing state... [id=/tmp/terraform-provider-libvirt-pool-ubuntu/commoninit.iso;12cd7610-9528-439c-8884-c702aa29984a]
libvirt_volume.ubuntu-qcow2: Refreshing state... [id=/tmp/terraform-provider-libvirt-pool-ubuntu/ubuntu-qcow2]

when attempting to apply. Everything appeared to get created properly on the target server, but the domain wouldn’t start up. It turns out this is not a libvirt provider issue but an issue with the qemu configuration shipped with Ubuntu 22.04 by default. I’m not 100% sure whether it’s because I don’t have selinux or apparmor installed on the server and the qemu package for Ubuntu expects one of them to be enabled, or whether it’s legitimately a bug in the pre-packaged qemu installation for Ubuntu, but either way, it’s an easy fix.

Simply edit /etc/libvirt/qemu.conf and find the security_driver directive. Set it to “selinux” if you have selinux installed, “apparmor” if you have apparmor installed, or “none” if you have neither.
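
On my server, with neither installed, the relevant line ends up looking like this (uncomment it if it’s commented out):

security_driver = "none"

Once you’ve set that parameter, save and close the file and restart the libvirtd service: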

$ sudo systemctl restart libvirtd.service

Once the service is restarted, issuing virsh start <domain> should work, as should subsequent terraform apply runs made to create additional new domains.

Additional Notes and Conclusion

One additional thing to make note of (and I’m not exactly sure yet whether this is a shortcoming of the provider or whether I just haven’t found the right option in the docs) is that terraform destroy did not completely clean up the created domain. After terraform apply created a domain, subsequent destroy and apply runs would fail with a complaint that the target domain already existed. I’ve yet to find a way to have Terraform fully clean this up and have had to either use Ansible or manually ssh to the KVM host and undefine the domain.
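
For reference, the manual cleanup on the KVM host looks roughly like this; the domain name here is just an example, so use whatever name your libvirt_domain resource created:

$ virsh destroy ubuntu-terraform     # stop the domain if it's still running
$ virsh undefine ubuntu-terraform    # remove the domain definition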

In conclusion, I think the libvirt provider for Terraform needs some work. I am certainly glad it exists, but with how little attention the project seems to be receiving currently, and how many issues I ran into while trying to configure a simple example domain, I am not sure I’d want to use it for production. The right answer here is probably to use Packer to build a static VM image, Ansible to build and deploy the domain and image, and cloud-init to handle any first-run steps that can’t be baked into the VM image during the Packer build.
