Bug: race condition with using for_each in incus_resource #93

Open
tregubovav-dev opened this issue Jun 27, 2024 · 1 comment
Labels
Bug Confirmed to be a bug

Comments

tregubovav-dev commented Jun 27, 2024

Issue

A race condition can occur when multiple instances that use the same image name are created concurrently on different target nodes in a cluster.
Some of the instance creations fail with the error: Failed instance creation: Failed creating image record: Failed saving main image record: UNIQUE constraint failed: images.project_id, images.fingerprint.
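
The error text has the form SQLite reports when a UNIQUE constraint is violated, which would be consistent with several creators each finding the image record missing and then each trying to save it. The sketch below is a toy Python model of that check-then-insert pattern, not Incus code. It assumes a table reduced to the two columns named in the error, and the helpers `image_exists` and `insert_image` are hypothetical names invented for illustration. The two creators are interleaved by hand so the result is deterministic.

```python
import sqlite3

# Toy schema (assumption): only the two columns named in the error message.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    " project_id INTEGER NOT NULL,"
    " fingerprint TEXT NOT NULL,"
    " UNIQUE (project_id, fingerprint))"
)

# Fingerprint taken from the log output in this issue.
FINGERPRINT = "45ec164abe54425db3622fada8f2bd639313efff8cdc14298a9d16cbab0dd835"


def image_exists(project_id, fingerprint):
    """Hypothetical helper: check whether an image record is already stored."""
    row = conn.execute(
        "SELECT 1 FROM images WHERE project_id = ? AND fingerprint = ?",
        (project_id, fingerprint),
    ).fetchone()
    return row is not None


def insert_image(project_id, fingerprint):
    """Hypothetical helper: store the image record."""
    conn.execute(
        "INSERT INTO images (project_id, fingerprint) VALUES (?, ?)",
        (project_id, fingerprint),
    )


# Both creators check before either one inserts, so both see "missing".
seen_by_a = image_exists(1, FINGERPRINT)
seen_by_b = image_exists(1, FINGERPRINT)

errors = []
for creator, seen in (("A", seen_by_a), ("B", seen_by_b)):
    if not seen:
        try:
            insert_image(1, FINGERPRINT)
        except sqlite3.IntegrityError as exc:
            # The second insert violates the UNIQUE constraint.
            errors.append(f"{creator}: {exc}")

print(errors)
```

In this model the first insert succeeds and the second one fails on the constraint. Making the check and the insert a single atomic step, or serializing the creators, removes the window in the toy version. Whether the same applies inside Incus is not established here.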

Steps to reproduce

  1. Deploy an Incus cluster with 3+ nodes and shared storage (I use a 7-node Raspberry Pi 4 cluster with Ceph storage).
  2. Deploy the configuration below:
locals {
  instances = {
    "inst-01" = {
      target = "node-01"
    },
    "inst-02" = {
      target = "node-02"
    },
    "inst-03" = {
      target = "node-03"
    },
    "inst-04" = {
      target = "node-04"
    },
    "inst-05" = {
      target = "node-05"
    }
  }
}

resource "incus_instance" "app_dns_instance" {
  for_each = local.instances

  project          = "test"
  image            = "images:alpine/edge"
  target           = each.value.target
  name             = each.key
  wait_for_network = false

  device {
    type = "disk"
    name = "root"
    properties = {
      path = "/"
      pool = "remote"
    }
  }
}

In my environment, two to four of the instance deployments fail with the error: Failed instance creation: Failed creating image record: Failed saving main image record: UNIQUE constraint failed: images.project_id, images.fingerprint.

Below is the worst-case output, where the image was deleted before it could be cloned for the inst-03 root device:

Plan: 5 to add, 0 to change, 0 to destroy.
incus_instance.app_dns_instance["inst-03"]: Creating...
incus_instance.app_dns_instance["inst-02"]: Creating...
incus_instance.app_dns_instance["inst-01"]: Creating...
incus_instance.app_dns_instance["inst-04"]: Creating...
incus_instance.app_dns_instance["inst-05"]: Creating...
╷
│ Error: Failed to create instance "inst-05"
│
│   with incus_instance.app_dns_instance["inst-05"],
│   on main.tf line 21, in resource "incus_instance" "app_dns_instance":
│   21: resource "incus_instance" "app_dns_instance" {
│
│ Failed instance creation: Failed creating image record: Failed saving main image record: UNIQUE constraint failed: images.project_id, images.fingerprint
╵
╷
│ Error: Failed to create instance "inst-04"
│
│   with incus_instance.app_dns_instance["inst-04"],
│   on main.tf line 21, in resource "incus_instance" "app_dns_instance":
│   21: resource "incus_instance" "app_dns_instance" {
│
│ Failed instance creation: Failed creating image record: Failed saving main image record: UNIQUE constraint failed: images.project_id, images.fingerprint
╵
╷
│ Error: Failed to create instance "inst-03"
│
│   with incus_instance.app_dns_instance["inst-03"],
│   on main.tf line 21, in resource "incus_instance" "app_dns_instance":
│   21: resource "incus_instance" "app_dns_instance" {
│
│ Failed instance creation: Failed creating instance from image: Failed to run: rbd --id admin --cluster ceph --image-feature layering clone
│ lxd/image_45ec164abe54425db3622fada8f2bd639313efff8cdc14298a9d16cbab0dd835_ext4@readonly lxd/container_test_inst-03: exit status 2 (2024-06-27T14:10:19.732-0700 ffff82a0a3c0 -1
│ librbd::image::OpenRequest: failed to find snapshot readonly
│ 2024-06-27T14:10:19.733-0700 ffff7560a3c0 -1 librbd::image::CloneRequest: 0xaaaaf0c92640 handle_open_parent: failed to open parent image: (2) No such file or directory
│ rbd: clone error: (2) No such file or directory)
╵
╷
│ Error: Failed to create instance "inst-02"
│
│   with incus_instance.app_dns_instance["inst-02"],
│   on main.tf line 21, in resource "incus_instance" "app_dns_instance":
│   21: resource "incus_instance" "app_dns_instance" {
│
│ Failed instance creation: Failed creating image record: Failed saving main image record: UNIQUE constraint failed: images.project_id, images.fingerprint
╵
╷
│ Error: Failed to create instance "inst-01"
│
│   with incus_instance.app_dns_instance["inst-01"],
│   on main.tf line 21, in resource "incus_instance" "app_dns_instance":
│   21: resource "incus_instance" "app_dns_instance" {
│
│ Failed instance creation: Failed creating instance from image: Error inserting volume "45ec164abe54425db3622fada8f2bd639313efff8cdc14298a9d16cbab0dd835" for project "default" in pool
│ "remote" of type "images" into database "UNIQUE constraint failed: index 'storage_volumes_unique_storage_pool_id_node_id_project_id_name_type'"

@stgraber
Member

That's an Incus issue more than an issue with the provider but that should make it pretty easy to reproduce.

@stgraber stgraber added the Bug Confirmed to be a bug label Aug 22, 2024