[DOC-557] add example for triggering job with GraphQL endpoint #25288

Merged: 4 commits, Oct 18, 2024
168 changes: 152 additions & 16 deletions docs/docs-beta/docs/guides/automation.md
@@ -25,6 +25,26 @@ Dagster offers several ways to automate pipeline execution:
1. [Schedules](#schedules) - Run jobs at specified times
2. [Sensors](#sensors) - Trigger runs based on events
3. [Asset Sensors](#asset-sensors) - Trigger jobs when specific assets materialize
4. [GraphQL Endpoint](#graphql-endpoint) - Trigger materializations and jobs from the GraphQL endpoint

## How to choose the right automation method

Consider these factors when selecting an automation method:

1. **Pipeline Structure**: Are you working primarily with assets, ops, or a mix?
2. **Timing Requirements**: Do you need regular updates or event-driven processing?
3. **Data Characteristics**: Is your data partitioned? Do you need to update historical data?
4. **System Integration**: Do you need to react to external events or systems?

Use this table to help guide your decision:

| Method | Best For | Works With |
| ---------------------- | -------------------------------------- | ------------------- |
| Schedules | Regular, time-based job runs | Assets, Ops, Graphs |
| Sensors | Event-driven automation | Assets, Ops, Graphs |
| Declarative Automation | Asset-centric, condition-based updates | Assets only |
| Asset Sensors | Cross-job/location asset dependencies | Assets only |
| GraphQL Triggers | Event triggers from external systems | Assets, Ops, Jobs |

## Schedules

@@ -75,23 +95,139 @@ For more examples of how to create asset sensors, see the [How-To Use Asset Sens

{/* TODO: add content */}

## How to choose the right automation method

Consider these factors when selecting an automation method:

1. **Pipeline Structure**: Are you working primarily with assets, ops, or a mix?
2. **Timing Requirements**: Do you need regular updates or event-driven processing?
3. **Data Characteristics**: Is your data partitioned? Do you need to update historical data?
4. **System Integration**: Do you need to react to external events or systems?

Use this table to help guide your decision:

| Method | Best For | Works With |
| ---------------------- | -------------------------------------- | ------------------- |
| Schedules | Regular, time-based job runs | Assets, Ops, Graphs |
| Sensors | Event-driven automation | Assets, Ops, Graphs |
| Declarative Automation | Asset-centric, condition-based updates | Assets only |
| Asset Sensors | Cross-job/location asset dependencies | Assets only |
## GraphQL Endpoint

External services can trigger jobs and asset materializations through Dagster's GraphQL endpoint.

### When to use the GraphQL endpoint

- You want to integrate Dagster with an external system or tool
- You need to trigger a materialization or job over an HTTP endpoint
- You are creating a custom script for batching operations

### Triggering a job

To trigger a job run through the Dagster GraphQL endpoint, use the `launchRun` mutation. Here's an example using the `requests` library:

```python
import requests


graphql_endpoint = "http://localhost:3000/graphql"

query = """
mutation LaunchRunMutation(
  $repositoryLocationName: String!
  $repositoryName: String!
  $jobName: String!
  $runConfigData: RunConfigData!
) {
  launchRun(
    executionParams: {
      selector: {
        repositoryLocationName: $repositoryLocationName
        repositoryName: $repositoryName
        jobName: $jobName
      }
      runConfigData: $runConfigData
    }
  ) {
    __typename
    ... on LaunchRunSuccess {
      run {
        runId
      }
    }
    ... on RunConfigValidationInvalid {
      errors {
        message
        reason
      }
    }
    ... on PythonError {
      message
    }
  }
}
"""

response = requests.post(
    graphql_endpoint,
    json={
        "query": query,
        "variables": {
            "repositoryLocationName": "<replace-with-code-location-name>",
            "repositoryName": "__repository__",  # default if using `Definitions`
            "jobName": "<replace-with-job-name>",
            "runConfigData": {},
        },
    },
)
```
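
The example above sends the mutation but does not inspect the result. As a minimal sketch, the response could be interpreted by branching on the `__typename` values the query selects. The helper name `summarize_launch_result` is hypothetical and not part of Dagster; it assumes the standard GraphQL response envelope (`data` and, on failure, a top-level `errors` list) and only the fields requested in the `launchRun` query.

```python
from typing import Any


def summarize_launch_result(payload: dict[str, Any], field: str = "launchRun") -> str:
    """Return a one-line summary of a launch mutation's JSON response.

    Hypothetical helper: it relies only on the fields selected by the
    `launchRun` query in this guide.
    """
    # Request-level failures (bad query, unknown field) arrive in a
    # top-level "errors" list rather than under "data".
    if payload.get("errors"):
        messages = "; ".join(e.get("message", "unknown error") for e in payload["errors"])
        return f"GraphQL error: {messages}"

    result = (payload.get("data") or {}).get(field) or {}
    typename = result.get("__typename", "<missing __typename>")

    if typename == "LaunchRunSuccess":
        return f"Launched run {result['run']['runId']}"
    if typename == "RunConfigValidationInvalid":
        messages = "; ".join(e["message"] for e in result.get("errors", []))
        return f"Invalid run config: {messages}"
    # PythonError, or any other result type that exposes a `message` field.
    return f"{typename}: {result.get('message', 'no message returned')}"


# Sample payload shaped like a successful `launchRun` response.
example = {"data": {"launchRun": {"__typename": "LaunchRunSuccess", "run": {"runId": "abc123"}}}}
print(summarize_launch_result(example))  # prints "Launched run abc123"
```

With the request from the example, this would be called as `summarize_launch_result(response.json())`, ideally after `response.raise_for_status()` so that HTTP-level failures surface first.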

### Triggering an asset materialization
Review comment (contributor, PR author):

Added both asset materialization via LaunchPipelineExecution and run request with LaunchRunMutation.

How do we feel about the presentation of these two options?


To trigger an asset materialization through the Dagster GraphQL endpoint, use the `launchPipelineExecution` mutation. Here's an example using the `requests` library:

```python
import requests


graphql_endpoint = "http://localhost:3000/graphql"

query = """
mutation LaunchPipelineExecution(
  $executionParams: ExecutionParams!
) {
  launchPipelineExecution(executionParams: $executionParams) {
    ... on LaunchRunSuccess {
      run {
        id
        pipelineName
        __typename
      }
      __typename
    }
    ... on PipelineNotFoundError {
      message
      __typename
    }
    ... on InvalidSubsetError {
      message
      __typename
    }
    ... on RunConfigValidationInvalid {
      errors {
        message
        __typename
      }
      __typename
    }
  }
}
"""

response = requests.post(
    graphql_endpoint,
    json={
        "query": query,
        "variables": {
            "executionParams": {
                "mode": "default",
                "runConfigData": "{}",
                "selector": {
                    "assetCheckSelection": [],
                    "assetSelection": [{"path": ["<replace-with-asset-key>"]}],
                    "pipelineName": "__ASSET_JOB",
                    "repositoryLocationName": "<replace-with-code-location-name>",
                    "repositoryName": "__repository__",
                },
            }
        },
    },
)
```
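
For batching scripts, the `executionParams` variable from the example above can be assembled programmatically. The sketch below is one possible approach, not an official Dagster utility: the function name `build_asset_execution_params` is hypothetical, and it assumes that each asset key is given as its list of path components and that all selected assets can be materialized together in a single run of the implicit `__ASSET_JOB`.

```python
from typing import Any, Sequence


def build_asset_execution_params(
    asset_keys: Sequence[Sequence[str]],
    location_name: str,
    repository_name: str = "__repository__",
) -> dict[str, Any]:
    """Build the `executionParams` variable for `launchPipelineExecution`.

    Hypothetical helper that mirrors the payload shown in this guide.
    Each asset key is a sequence of path components, for example
    ["my_asset"] or ["my_prefix", "my_asset"].
    """
    return {
        "mode": "default",
        "runConfigData": "{}",
        "selector": {
            "assetCheckSelection": [],
            # One {"path": [...]} entry per asset to materialize.
            "assetSelection": [{"path": list(key)} for key in asset_keys],
            "pipelineName": "__ASSET_JOB",
            "repositoryLocationName": location_name,
            "repositoryName": repository_name,
        },
    }


# Example: select two assets; the names here are placeholders.
params = build_asset_execution_params(
    [["<first-asset-key>"], ["<prefix>", "<second-asset-key>"]],
    location_name="<replace-with-code-location-name>",
)
print(len(params["selector"]["assetSelection"]))  # prints 2
```

The returned dictionary can be passed as `{"executionParams": params}` in the `variables` of the request shown above.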

## Next steps

6 changes: 5 additions & 1 deletion docs/docs-beta/docs/guides/sensors.md
@@ -6,6 +6,10 @@ sidebar_position: 20

Sensors enable you to trigger Dagster runs in response to events from external systems. They run at regular intervals, either triggering a run or explaining why a run was skipped. For example, you can trigger a run when a new file is added to an Amazon S3 bucket or when a database row is updated.

:::tip
An alternative to polling with sensors is to push events to Dagster using the [Dagster API](/guides/automation#graphql-endpoint).
:::

<details>
<summary>Prerequisites</summary>

@@ -74,4 +78,4 @@ By understanding and effectively using these automation methods, you can build m

- Run pipelines on a [schedule](/guides/schedules)
- Trigger cross-job dependencies with [asset sensors](/guides/asset-sensors)
- Explore [Declarative Automation](/concepts/automation/declarative-automation) as an alternative to sensors
- Explore [Declarative Automation](/concepts/automation/declarative-automation) as an alternative to sensors
1 change: 1 addition & 0 deletions docs/docs-beta/sidebars.ts
@@ -57,6 +57,7 @@ const sidebars: SidebarsConfig = {
'guides/schedules',
'guides/sensors',
'guides/asset-sensors',
'guides/automation',
//'guides/declarative-automation',
],
},