Use the ProfitBricks Cloud API with Python Part 4 – Import VMs

ProfitBricks not only offers an easy-to-use GUI – the Data Center Designer (DCD) – we also provide an API that is as capable as the DCD – the Cloud API (formerly named REST API). Via this API, you can easily automate tasks such as resource management or the provisioning of entire data centers.

Preceding parts of this tutorial series demonstrated how to retrieve information about your virtual environment and how to manage your virtual resources.

This article explains how the Cloud API can ease the migration of your virtual machines from other virtualization solutions to ProfitBricks.

This article is accompanied by a sample script which is available in the ProfitBricks GitHub repository.

VM Formats

ProfitBricks allows you to upload the following disk image formats:

  • *.vmdk - VMware HDD images

  • *.vhd - Hyper-V HDD images

  • *.cow, *.qcow, *.qcow2 - QEMU HDD images

  • *.raw - binary HDD images

  • *.vpc - VirtualPC HDD images

  • *.vdi - VirtualBox HDD images

When exporting virtual machines (VMs) from another virtual environment, these images are usually accompanied by additional files containing metadata. That metadata describes the VM’s properties, such as RAM or the connected network interfaces.

Besides solution-specific internal formats, a de facto standard has been established: the Open Virtualization Format (OVF), published by the DMTF. OVF is an open standard that is supported, for example, by the following products (for a more extensive list, see https://en.wikipedia.org/wiki/Open_Virtualization_Format):

  • VirtualBox

  • VMware

  • Amazon Elastic Compute Cloud (AWS EC2)

  • IBM Cloud

A so-called OVF package consists of a directory with an XML-based description file (*.ovf), typically one or more disk images, and possibly some additional files. An OVF package may also be provided as an OVA package, which is simply a tar archive of that directory (but with the extension *.ova).
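Since an OVA package is just a tar archive, it can be unpacked with Python's standard tarfile module before starting the import; a minimal sketch (the file names used later are examples):

```python
import tarfile

def extract_ova(ova_path, target_dir):
    """Unpack an OVA (a plain tar archive) into target_dir and
    return the names of the contained files, e.g. the *.ovf
    description file and the *.vmdk disk images."""
    with tarfile.open(ova_path) as tar:
        tar.extractall(path=target_dir)
        return tar.getnames()
```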

In the following example, we will see how to import a VMware VM from an OVF package and use its metadata to ease the process, with the example script pb_importVM.py.

Import OVF packaged VMs

In this example we are using the following files:

  • ACME-VM.ovf

  • ACME-VM-disk1.vmdk

  • ACME-VM-disk2.vmdk

  • ACME-VM-disk3.vmdk

First of all, note that the accompanying script pb_importVM.py does not upload the VM’s disk files. This step has to be done manually, as described in ProfitBricks’ online help. Upload the disk files to the location where you want to create the VM later on, and do not change the names of the uploaded images.

When the upload is finished, the script pb_importVM.py can be used to create the VM according to the data in ACME-VM.ovf. The script offers the following options for the import:

('-t', dest='metatype', default="OVF", help='type of VM meta data')
('-m', dest='metafile', help='metadata file')
('-d', dest='datacenterid', help='datacenter of the new server')
('-D', dest='dcname', help='new datacenter name')
('-l', dest='location', help='location for new datacenter')

You can either use the options -D <dcname> and -l <location> to create a new data center, or the option -d <datacenterid> to create the VM in an existing data center.
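The option tuples above map one-to-one onto Python's argparse. A minimal sketch of the corresponding parser setup, fed with an example command line (the location ID is just an example):

```python
import argparse

# Sketch of the option handling behind pb_importVM.py; the tuples
# shown above translate directly into add_argument() calls.
parser = argparse.ArgumentParser(description='import a VM from an OVF package')
parser.add_argument('-t', dest='metatype', default='OVF', help='type of VM meta data')
parser.add_argument('-m', dest='metafile', help='metadata file')
parser.add_argument('-d', dest='datacenterid', help='datacenter of the new server')
parser.add_argument('-D', dest='dcname', help='new datacenter name')
parser.add_argument('-l', dest='location', help='location for new datacenter')

# Example invocation: create a new data center for the imported VM
args = parser.parse_args(['-m', 'ACME-VM.ovf', '-D', 'ACME-DC', '-l', 'us/las'])
print(args.metatype, args.dcname)  # → OVF ACME-DC
```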

The script parses the specified metadata file and creates the VM in several steps similar to the procedure in the previous part of this series:

  • Create the data center if necessary

  • Create the server

  • Create the network interfaces

  • Create the volumes

  • Attach the volumes to the server

The class OVFData is responsible for parsing the OVF file using Python module xml.etree.ElementTree and storing the needed information.

To parse an OVF file, you only need to instantiate OVFData and call its parse() method:

if args.metatype == 'OVF':
    metadata = OVFData(args.metafile)
    metadata.parse()

The collected data is saved in the following fields of OVFData:

self.name = None
self.licenseType = "OTHER"
self.cpus = None
self.ram = None
self.disks = []
self.lans = dict()
self.nics = []

Collect System Data

First, parsing collects the system data from the following parts of the OVF file:

<VirtualSystem ovf:id="ACME-VM">
  <Info>A virtual machine</Info>
  <OperatingSystemSection ovf:id="101" vmw:osType="centos64Guest">
    <Info>The kind of installed guest operating system</Info>
      <rasd:AllocationUnits>hertz * 10^6</rasd:AllocationUnits>
      <rasd:Description>Number of Virtual CPUs</rasd:Description>
      <rasd:ElementName>2 virtual CPU(s)</rasd:ElementName>
      <vmw:CoresPerSocket ovf:required="false">2</vmw:CoresPerSocket>
      <rasd:AllocationUnits>byte * 2^20</rasd:AllocationUnits>
      <rasd:Description>Memory Size</rasd:Description>
      <rasd:ElementName>3072MB of memory</rasd:ElementName>

The VM’s name is simply read from element <Name>:

virtsys = self.root.find('ovf:VirtualSystem', self._ns)
self.name = virtsys.find('ovf:Name', self._ns).text

In the element <OperatingSystemSection>, the VM’s OS is specified by ovf:id. How this ID maps to a specific OS can be taken from the DMTF documentation; ID 101, for example, refers to ‘Linux 64-Bit’. You can also see an additional attribute vmw:osType, a VMware extension to OVF that names the VM’s OS. We don’t use these vendor-specific extensions; instead, OVFData contains three dictionaries that already implement the mapping. These are also used to determine the correct licenseType for the VM:

virtos = virtsys.find('ovf:OperatingSystemSection', self._ns)
self.osid = virtos.get(self._nsattr('id', 'ovf'))
if self.osid in self.osTypeLinux:
    self.licenseType = "LINUX"
    osname = self.osTypeLinux[self.osid]
elif self.osid in self.osTypeWindows:
    self.licenseType = "WINDOWS"
    osname = self.osTypeWindows[self.osid]
else:
    osname = self.osTypeOther[self.osid]
print("VM '{}' has {}-type OS '{}' (id:{})"
      .format(self.name, self.licenseType, osname, self.osid))
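The three mapping dictionaries themselves are not reproduced here. A toy-sized sketch of the decision logic with stand-in dictionaries (only ID '101', 'Linux 64-Bit', is taken from the DMTF value map; the other entries are illustrative):

```python
# Stand-in mapping dictionaries; the real OVFData ships the full
# DMTF OS-type value map. Only id '101' is taken from the spec.
osTypeLinux = {'101': 'Linux 64-Bit'}
osTypeWindows = {'999': 'Some Windows version'}   # illustrative entry
osTypeOther = {'1': 'Other'}

def license_for(osid):
    """Return (licenseType, osname) for a DMTF OS id."""
    if osid in osTypeLinux:
        return 'LINUX', osTypeLinux[osid]
    if osid in osTypeWindows:
        return 'WINDOWS', osTypeWindows[osid]
    return 'OTHER', osTypeOther.get(osid, 'Unknown')

print(license_for('101'))  # → ('LINUX', 'Linux 64-Bit')
```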

The number of CPUs and the amount of RAM can be taken from the corresponding <Item>s in <VirtualHardwareSection>. Each hardware category has a specific ResourceType, which we can look up in the DMTF documentation. To collect the data, we only need to know that ResourceTypes 3 and 4 map to ‘Processor’ and ‘Memory’, respectively. (OVFData contains the dictionary resourceTypes for your information.) When collecting the data with

virtcpu = virtsys.find('./ovf:VirtualHardwareSection/ovf:Item/[rasd:ResourceType="3"]', self._ns)
self.cpus = virtcpu.find('rasd:VirtualQuantity', self._ns).text
virtmem = virtsys.find('./ovf:VirtualHardwareSection/ovf:Item/[rasd:ResourceType="4"]', self._ns)
self.ram = virtmem.find('rasd:VirtualQuantity', self._ns).text

there are some points to be aware of:

  • VMware also allows specifying a number of cores per processor in <vmw:CoresPerSocket>, which is currently ignored as a vendor-specific extension. So the number of CPUs may be wrong.

  • In contrast, the CPU type (Intel or AMD) is not specified in OVF.

  • <rasd:AllocationUnits> contains the unit for the quantities, e.g. ‘byte * 2^20’ for RAM. The script currently assumes that the amount of RAM is given in MB (and for hard disks in GB).
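The unit assumption in the last bullet could be made explicit by parsing <rasd:AllocationUnits>. A hypothetical helper (not part of pb_importVM.py) that converts such a unit string into a byte multiplier:

```python
import re

def unit_multiplier(alloc_units):
    """Translate an OVF AllocationUnits string such as 'byte * 2^20'
    into a byte multiplier. Hypothetical helper, shown only to make
    the script's MB/GB assumption explicit."""
    s = alloc_units.strip()
    if s == 'byte':
        return 1
    m = re.match(r'byte\s*\*\s*2\^(\d+)$', s)
    if m:
        return 2 ** int(m.group(1))
    raise ValueError('unsupported AllocationUnits: ' + s)

print(unit_multiplier('byte * 2^20'))  # → 1048576
```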

Collect Disk Data

The next step in parsing is to collect the data for the hard disks from the following parts of the OVF file:

  <File ovf:href="ACME-VM-disk1.vmdk" ovf:id="file1" ovf:size="2217720832" />
  <File ovf:href="ACME-VM-disk2.vmdk" ovf:id="file2" ovf:size="515232768" />
  <File ovf:href="ACME-VM-disk3.vmdk" ovf:id="file3" ovf:size="586605568" />
  <Info>Virtual disk information</Info>
  <Disk ovf:capacity="10" ovf:capacityAllocationUnits="byte * 2^30" ovf:diskId="vmdisk1" ovf:fileRef="file1" ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" ovf:populatedSize="3909550080" />
  <Disk ovf:capacity="6" ovf:capacityAllocationUnits="byte * 2^30" ovf:diskId="vmdisk2" ovf:fileRef="file2" ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" ovf:populatedSize="988151808" />
  <Disk ovf:capacity="8" ovf:capacityAllocationUnits="byte * 2^30" ovf:diskId="vmdisk3" ovf:fileRef="file3" ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" ovf:populatedSize="2240479232" />
<VirtualSystem ovf:id="ACME-VM">
      <rasd:ElementName>Hard Disk 1</rasd:ElementName>
      <vmw:Config ovf:required="false" vmw:key="backing.writeThrough" vmw:value="false" />
      <rasd:ElementName>Hard Disk 2</rasd:ElementName>
      <vmw:Config ovf:required="false" vmw:key="backing.writeThrough" vmw:value="false" />
      <rasd:ElementName>Hard Disk 3</rasd:ElementName>
      <vmw:Config ovf:required="false" vmw:key="backing.writeThrough" vmw:value="false" />

The exported disk files are external to the OVF file and thus are listed in <References>. This is easy to parse:

filerefs = self.root.findall('./ovf:References/ovf:File', self._ns)
files = dict()
for ref in filerefs:
    name = ref.get(self._nsattr('href', 'ovf'))
    id = ref.get(self._nsattr('id', 'ovf'))
    files[id] = name

The <DiskSection> lists all virtual disks and refers to the files in <References>. <Disk> uses the attribute ovf:capacity for the capacity of the disk which, in this example, is measured in GiB (‘byte * 2^30’). As mentioned earlier, the script currently assumes the capacity is given in this unit. <Disk> also defines the attribute ovf:diskId, which is needed to relate a disk to a specific VM (OVF allows multiple VMs in one OVF file).

diskrefs = self.root.findall('./ovf:DiskSection/ovf:Disk', self._ns)
disks = dict()
for ref in diskrefs:
    # Note: we assume ovf:capacityAllocationUnits="byte * 2^30" == GiB
    capacity = ref.get(self._nsattr('capacity', 'ovf'))
    # reference to file references above
    fref = ref.get(self._nsattr('fileRef', 'ovf'))
    # the virt. HW section refers to '/disk/vmdisk1' not 'vmdisk1'
    diskid = 'ovf:/disk/'+ref.get(self._nsattr('diskId', 'ovf'))
    # we resolve fref here, we only need the name from now on
    disks[diskid] = {'capacity' : capacity, 'file' : files[fref]}
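To see both resolution loops in action outside the script, here is a self-contained snippet run against a minimal inline OVF fragment (the URI is the standard OVF envelope namespace; the fragment is trimmed down from the example above):

```python
import xml.etree.ElementTree as ET

NS = {'ovf': 'http://schemas.dmtf.org/ovf/envelope/1'}

def nsattr(attr):
    # expand 'attr' into the '{uri}attr' form ElementTree expects
    return '{' + NS['ovf'] + '}' + attr

doc = """<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"
          xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1">
  <References>
    <File ovf:href="ACME-VM-disk1.vmdk" ovf:id="file1" ovf:size="2217720832"/>
  </References>
  <DiskSection>
    <Disk ovf:capacity="10" ovf:diskId="vmdisk1" ovf:fileRef="file1"/>
  </DiskSection>
</Envelope>"""

root = ET.fromstring(doc)
# first loop: file id -> file name
files = {ref.get(nsattr('id')): ref.get(nsattr('href'))
         for ref in root.findall('./ovf:References/ovf:File', NS)}
# second loop: disk id -> capacity and resolved file name
disks = {}
for ref in root.findall('./ovf:DiskSection/ovf:Disk', NS):
    diskid = 'ovf:/disk/' + ref.get(nsattr('diskId'))
    disks[diskid] = {'capacity': ref.get(nsattr('capacity')),
                     'file': files[ref.get(nsattr('fileRef'))]}

print(disks)
# → {'ovf:/disk/vmdisk1': {'capacity': '10', 'file': 'ACME-VM-disk1.vmdk'}}
```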

A VM's hard disks can be taken from the corresponding <Item>s in <VirtualHardwareSection>. In this case, we have to look for ResourceType 17. <rasd:HostResource> contains the reference to the disks in <DiskSection>, with the constant prefix “ovf:/disk/” prepended to the referenced ID. The disk order is given in <rasd:AddressOnParent>. In this example, the disks’ parent is a SCSI controller not listed here, which we can simply ignore.

In the code we have to resolve the nested references to get a sorted list of disks:

virtsys = self.root.find('ovf:VirtualSystem', self._ns)
virthds = virtsys.findall('./ovf:VirtualHardwareSection/ovf:Item/[rasd:ResourceType="17"]', self._ns)
devices = dict()
for hdd in virthds:
    diskref = hdd.find('rasd:HostResource', self._ns).text
    address = hdd.find('rasd:AddressOnParent', self._ns)
    if address is None:
        # we use the InstanceId as fallback in this case
        devNr = hdd.find('rasd:InstanceId', self._ns).text
    else:
        devNr = address.text
    devices[devNr] = disks[diskref]
self.disks = [devices[devNr] for devNr in sorted(devices)]

With this procedure we end up with a sorted list of disks like the following:

[{'file': 'ACME-VM-disk1.vmdk', 'capacity': '10'},
 {'file': 'ACME-VM-disk2.vmdk', 'capacity': '6'},
 {'file': 'ACME-VM-disk3.vmdk', 'capacity': '8'}]
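One caveat worth knowing about the sorted(devices) call: since <rasd:AddressOnParent> values are strings, sorting is lexicographic, which mis-orders two-digit addresses. A small demonstration, with a numeric sort key as a possible fix:

```python
# AddressOnParent values are strings, so plain sorted() compares them
# lexicographically: '10' sorts before '2'. For VMs with ten or more
# disks, a numeric sort key avoids the mis-ordering.
devices = {'0': 'first', '2': 'second', '10': 'third'}
assert sorted(devices) == ['0', '10', '2']          # lexicographic order
ordered = [devices[k] for k in sorted(devices, key=int)]
print(ordered)  # → ['first', 'second', 'third']
```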

Collect Network Data

The last step in parsing is to collect the network data from the following parts of the OVF file:

  <Info>The list of logical networks</Info>
  <Network ovf:name="TC-WAN">
    <Description>The TC-WAN network</Description>
  <Network ovf:name="VM Network">
    <Description>The VM Network network</Description>
<VirtualSystem ovf:id="ACME-VM">
      <rasd:Description>E1000 ethernet adapter on "TC-WAN"</rasd:Description>
      <rasd:ElementName>Ethernet 1</rasd:ElementName>
      <rasd:Connection>VM Network</rasd:Connection>
      <rasd:Description>E1000 ethernet adapter on "VM Network"</rasd:Description>
      <rasd:ElementName>Ethernet 2</rasd:ElementName>

<NetworkSection> contains only the names of the used networks. The network interfaces are specified in <Item>s with ResourceType 10. The Items include a reference to the network in <rasd:Connection> and can be sorted by <rasd:AddressOnParent>.

When we collect the networks, an increasing lanid is added to each network, because a LAN name alone is not a unique identifier:

vnets = self.root.findall('./ovf:NetworkSection/ovf:Network', self._ns)
lanid = 1
for net in vnets:
    self.lans[net.get(self._nsattr('name', 'ovf'))] = lanid
    lanid += 1

The lanid of the referenced network is also added to the parsed interfaces:

virtsys = self.root.find('ovf:VirtualSystem', self._ns)
virtnics = virtsys.findall('./ovf:VirtualHardwareSection/ovf:Item/[rasd:ResourceType="10"]', self._ns)
devices = dict()
for nic in virtnics:
    nicname = nic.find('rasd:ElementName', self._ns).text
    connection = nic.find('rasd:Connection', self._ns).text
    address = nic.find('rasd:AddressOnParent', self._ns)
    if address is None:
        # use the InstanceId as fallback, as for the disks
        devNr = nic.find('rasd:InstanceId', self._ns).text
    else:
        devNr = address.text
    devices[devNr] = {'nic': nicname, 'lan': connection,
                      'lanid': self.lans[connection]}
self.nics = [devices[devNr] for devNr in sorted(devices)]

With this procedure we end up with a sorted list of interfaces like the following:

[{'nic': 'Ethernet 1', 'lanid': 1, 'lan': 'TC-WAN'},
 {'nic': 'Ethernet 2', 'lanid': 2, 'lan': 'VM Network'}]

Create the VM

As mentioned before we create the VM in several steps. After parsing the OVF file, the data center is created if necessary or the location is determined for an existing data center.

Then all uploaded images are checked to find a unique image for each disk, and the image's ID is added to the disk data:

for disk in metadata.disks:
    disk_name = disk['file']
    images = get_disk_image_by_name(pbclient, location, disk_name)
    if len(images) == 0:
        raise ValueError("No HDD image with name '{}' found in location {}"
                         .format(disk_name, location))
    if len(images) > 1:
        raise ValueError("Ambiguous image name '{}' in location {}"
                         .format(disk_name, location))
    disk['image'] = images[0]['id']
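The helper get_disk_image_by_name() is not shown in this article. A plausible sketch, under the assumption that it filters the account's image list by name, location, and image type (the field names follow the Cloud API's image resource; verify against the real script):

```python
def get_disk_image_by_name(pbclient, location, image_name):
    """Sketch: list all images visible to the account and keep the
    HDD images whose name and location match exactly."""
    all_images = pbclient.list_images()
    return [img for img in all_images['items']
            if img['properties']['imageType'] == 'HDD'
            and img['properties']['location'] == location
            and img['properties']['name'] == image_name]
```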

The remaining tasks are to create the server:

server = Server(name=metadata.name,
                cores=metadata.cpus, ram=metadata.ram)
response = pbclient.create_server(dc_id, server)
srv_id = response['id']

create the NICs:

for nic in metadata.nics:
    dcnic = NIC(name=nic['nic'], lan=nic['lanid'])
    response = pbclient.create_nic(dc_id, srv_id, dcnic)

create the volumes:

for disk in metadata.disks:
    dcvol = Volume(name=disk['file'], size=disk['capacity'],
                   image=disk['image'],
                   licence_type=metadata.licenseType)
    response = pbclient.create_volume(dc_id, dcvol)
    # remember the volume ID for the attach step below
    disk['volume_id'] = response['id']

and attach the volumes to the server:

for disk in metadata.disks:
    response = pbclient.attach_volume(dc_id, srv_id, disk['volume_id'])

The resulting VM will then show up in the DCD.


Configure the VM

After importing the VM some further configuration tasks should be performed.

If your VM does not boot successfully, this is most likely caused by missing virtio drivers. The tutorials Migrate a VMware Virtual Machine Running Linux to ProfitBricks and Migrate a VMware Virtual Machine Running Windows to ProfitBricks provide a detailed description on how to fix this.

If you want to import more VMs, you should make sure that the uploaded image names are unique. You can resolve conflicts by deleting the images that are already used, or by renaming the disk images both in the DCD and in the OVF file.

The NICs of the imported VM are not yet connected to other resources. You should configure public and private LANs, DHCP settings, and connect additional VMs.

Depending on the VM's OS, it might be necessary to reconfigure the network interfaces in the OS, because the MAC addresses have changed.

As pointed out, the number of cores might be wrong if the original VM was assigned two or more cores per processor.

Of course, you are free to adjust the values for cores, RAM, and more as you like on ProfitBricks.
