
Infinite Loop Symptoms on Prefect Server #15607

Open
a14e opened this issue Oct 8, 2024 · 4 comments
Labels
bug

Comments


a14e commented Oct 8, 2024

Bug summary

While working locally, I ran into an issue with Prefect that looks like an endless loop on the backend.

I've observed the following behavior locally several times:

  1. After several dozen manual runs of my script, the workflows stop displaying correctly in the web UI. Specifically, the green squares for new flows are no longer visible.
  2. At the same time, I see a large stream of logs in the PostgreSQL terminal.
  3. Restarting Prefect resolves the issue.

Logs:

prefect-postgres-1  | 2024-10-07 23:57:37.075 UTC [2893] STATEMENT:  INSERT INTO task_run_state (task_run_id, type, timestamp, name, message, state_details, data, id, created, updated) VALUES ($1::UUID, $2::state_type, $3::TIMESTAMP WITH TIME ZONE, $4::VARCHAR, $5::VARCHAR, $6, $7::JSON, $8::UUID, $9::TIMESTAMP WITH TIME ZONE, $10::TIMESTAMP WITH TIME ZONE) ON CONFLICT (id) DO NOTHING
prefect-postgres-1  | 2024-10-07 23:57:37.082 UTC [2893] ERROR:  duplicate key value violates unique constraint "uq_task_run_state__task_run_id_timestamp_desc"
prefect-postgres-1  | 2024-10-07 23:57:37.082 UTC [2893] DETAIL:  Key (task_run_id, "timestamp")=(3649e07c-9883-42ac-8819-76c821f5f8eb, 2024-10-07 23:45:17.096643+00) already exists.
prefect-postgres-1  | 2024-10-07 23:57:37.082 UTC [2893] STATEMENT:  INSERT INTO task_run_state (task_run_id, type, timestamp, name, message, state_details, data, id, created, updated) VALUES ($1::UUID, $2::state_type, $3::TIMESTAMP WITH TIME ZONE, $4::VARCHAR, $5::VARCHAR, $6, $7::JSON, $8::UUID, $9::TIMESTAMP WITH TIME ZONE, $10::TIMESTAMP WITH TIME ZONE) ON CONFLICT (id) DO NOTHING
....
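For context, my reading of the logs (not confirmed against the Prefect code): the INSERT uses ON CONFLICT (id) DO NOTHING, which only suppresses conflicts on id, so a collision on the separate unique index over (task_run_id, timestamp) still raises the duplicate key error, and the server appears to keep retrying the same statement. A minimal sketch of that PostgreSQL behavior, using a made-up demo table rather than the real Prefect schema:

-- Hypothetical demo table, not the real Prefect schema.
CREATE TABLE demo_state (
    id uuid PRIMARY KEY,
    task_run_id uuid NOT NULL,
    "timestamp" timestamptz NOT NULL
);
CREATE UNIQUE INDEX uq_demo__task_run_id_timestamp_desc
    ON demo_state (task_run_id, "timestamp" DESC);

INSERT INTO demo_state VALUES
    ('11111111-1111-1111-1111-111111111111',
     'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa',
     '2024-10-07 23:45:17+00');

-- Different id, same (task_run_id, timestamp): ON CONFLICT (id) does not cover
-- this unique index, so the statement errors instead of being silently skipped.
INSERT INTO demo_state VALUES
    ('22222222-2222-2222-2222-222222222222',
     'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa',
     '2024-10-07 23:45:17+00')
ON CONFLICT (id) DO NOTHING;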

Version info (prefect version output)

Version:             3.0.4
API version:         0.8.4
Python version:      3.12.6
Git commit:          c068d7e2
Built:               Tue, Oct 1, 2024 11:54 AM
OS/Arch:             linux/x86_64
Profile:             ephemeral
Server type:         server
Pydantic version:    2.9.2

Additional context

PostgreSQL version: 16
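
If it helps, the retrying can also be observed from the database side with a standard catalog query (nothing Prefect-specific); while the loop is running, the same INSERT keeps showing up for the same backend pid:

SELECT pid, state, query_start, left(query, 80) AS query
FROM pg_stat_activity
WHERE query ILIKE '%INSERT INTO task_run_state%';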

a14e added the bug label Oct 8, 2024

a14e commented Oct 8, 2024

Here is an example of the error in the UI. You can see that tasks are not displayed even though artifacts were saved; the Task Runs list is also empty.

[screenshot: Artifacts]

[screenshot: Task Runs]


a14e commented Oct 8, 2024

Workaround:
Run the following statements on Prefect's PostgreSQL database:

DROP INDEX IF EXISTS uq_task_run_state__task_run_id_timestamp_desc;
CREATE INDEX uq_task_run_state__task_run_id_timestamp_desc
    ON task_run_state (task_run_id, timestamp DESC);
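
Note that this recreates the index without UNIQUE, so it only stops the error/retry loop; it does not restore the original constraint. To double-check what the index looks like afterwards:

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'task_run_state'
  AND indexname = 'uq_task_run_state__task_run_id_timestamp_desc';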

desertaxle (Member) commented

Thanks for the bug report @a14e! Do you have a flow that you can reliably reproduce this with? I suspect that there's something strange happening with the task run recorder, but it will be easier to confirm with a clear way to reproduce the issue.


a14e commented Oct 8, 2024

@desertaxle Thank you for your reply!
Yes, I can reproduce the issue, but there is a significant amount of randomness involved. Below, I've provided the flow code, and on my machine, it took around 420 flow runs to reproduce the issue twice (I was running it from the UI).

A simplified version of my flow looks like this:

from prefect import flow, task, serve
from typing import List


class DirectusLoaderTasks:

    @task(log_prints=True, name="Directus. Load Topic")
    async def get_topic_by_id(self, topic_id: str) -> str:
        return "123"

class GrammarDBTasks:

    @task(log_prints=True, name="DB. Load items for topic")
    async def load_all_items_from_db(self, topic_id: str) -> list[str]:
        return []

    @task(log_prints=True, name="DB. Load item groups for topic")
    async def load_all_groups_from_db(self, topic_id: str) -> list[str]:
        return []


@task(log_prints=True, name="Simple filter")
async def simple_filter(from_db: List[str]) -> List[str]:
    return []



class OpenAiTasks:
    @task(name="Open AI. filter3", retries=3)
    async def filter3(self,
                      groups: List[str],
                      topic: str) -> List[str]:
        return []

    @task(name="Open AI. filter1", retries=3)
    async def filter1(self,
                      groups: List[str],
                      topic: str) -> List[str]:

        return []

    @task(name="Open AI. filter2", retries=3)
    async def filter2(self,
                      groups: List[str],
                      groups_in_db: List[int],
                      topic: str) -> List[str]:

        return []

    @task(name="Open AI. Normalize", retries=3)
    async def normalize(self,
                        groups: List[str],
                        topic: str) -> List[str]:
        return []




@flow(log_prints=True, name="Item Generation flow")
async def generate_items_flow():

    directus_tasks = DirectusLoaderTasks()
    grammar_tasks = GrammarDBTasks()
    open_ai_tasks = OpenAiTasks()

    topic_id = await directus_tasks.get_topic_by_id("123")

    item_groups_from_db = await grammar_tasks.load_all_groups_from_db(topic_id)
    items_from_db = await grammar_tasks.load_all_items_from_db(topic_id)
    new_item_groups = await simple_filter(items_from_db)

    deduplicated_response: List[str] = await open_ai_tasks.filter1(
        new_item_groups,
        topic_id
    )

    filtered_response: List[str] = await open_ai_tasks.filter2(
        deduplicated_response,
        item_groups_from_db,
        topic_id
    )

    filtered_response = await open_ai_tasks.filter3(filtered_response,
                                                    topic_id)

    normalize_forms = await open_ai_tasks.normalize(filtered_response,
                                                    topic_id)

    return


if __name__ == "__main__":
    example_deploy = generate_items_flow.to_deployment(
        "Generate Grammar Item Groups",
        tags=["tag1", "tag2", "tag3"]
    )
    serve(example_deploy)

Here’s a video showing what the PostgreSQL logs look like in the terminal:

Recording.2024-10-08.232607.mp4
