Backport post-setup healthcheck from agent to alloy #213

Merged (2 commits) on Jun 3, 2024
5 changes: 5 additions & 0 deletions roles/alloy/handlers/main.yml
@@ -3,3 +3,8 @@
     name: "{{ service_name }}"
     state: restarted
   become: true
+  listen: "Restart alloy"
+
+- name: Check alloy is started properly
+  ansible.builtin.include_tasks: ga-started.yml
+  listen: "Restart alloy"
29 changes: 29 additions & 0 deletions roles/alloy/tasks/ga-started.yml
@@ -0,0 +1,29 @@
+---
+- name: Health check Grafana Alloy
+  ansible.builtin.uri:
+    url: "{{ _alloy_healthcheck_endpoint }}"
+    follow_redirects: none
+    method: GET
+  register: _result
+  failed_when: false
+  until: _result.status == 200
+  retries: 3
+  delay: 5
+  changed_when: false
+  when: not ansible_check_mode
+
+- name: Check system logs if Grafana Alloy is not started
+  when: not ansible_check_mode and _result.status != 200
+  block:
+    - name: Run journalctl
+      ansible.builtin.shell:
+        cmd: "journalctl -u {{ service_name }} -b -n20 --no-pager"
+      register: journal_ret
+      changed_when: false
+    - name: Output Grafana Alloy logs
+      ansible.builtin.debug:
+        var: journal_ret.stdout_lines
+    - name: Raise alerts
+      ansible.builtin.assert:
+        that: false
+        fail_msg: "Service {{ service_name }} hasn't started."
5 changes: 4 additions & 1 deletion roles/alloy/tasks/service.yml
@@ -2,7 +2,7 @@
   ansible.builtin.template:
     src: alloy.service.j2
     dest: /etc/systemd/system/{{ service_name }}.service
-    mode: '0644'
+    mode: "0644"
   become: true
   notify: Restart alloy

@@ -11,6 +11,9 @@
     daemon_reload: yes
   become: true

+- name: Flush handlers
+  ansible.builtin.meta: flush_handlers
+
 - name: Ensure alloy service is enabled and running
   ansible.builtin.service:
     name: "{{ service_name }}"
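
The changes above rely on three Ansible mechanisms working together: the template task notifies `Restart alloy`, both handlers subscribe to that topic through `listen`, and `flush_handlers` makes them run before the next task instead of at the end of the play. A minimal, self-contained sketch of that pattern follows. The play, the task names, and the `Restart demo` topic are hypothetical stand-ins that are not part of this PR, and `debug` takes the place of the real restart and health check.

```yaml
# Hypothetical demo play, not taken from the role.
- name: Illustrate notify, listen and flush_handlers
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Pretend a unit file was changed
      ansible.builtin.debug:
        msg: "unit file templated"
      changed_when: true        # force "changed" so the notification fires
      notify: Restart demo

    - name: Run pending handlers now instead of at the end of the play
      ansible.builtin.meta: flush_handlers

    - name: Runs only after the notified handlers have finished
      ansible.builtin.debug:
        msg: "service restarted and checked"

  handlers:
    # Both handlers subscribe to the same topic; they run in the
    # order they are defined here, not the order of notification.
    - name: Restart the service (stand-in)
      ansible.builtin.debug:
        msg: "restarting"
      listen: "Restart demo"

    - name: Check the service (stand-in)
      ansible.builtin.debug:
        msg: "health check"
      listen: "Restart demo"
```

Because handlers run in definition order, placing the check after the restart handler, as `handlers/main.yml` does, is what ensures the health check sees the restarted service.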
2 changes: 2 additions & 0 deletions roles/alloy/vars/main.yml
@@ -0,0 +1,2 @@
+# Server http address, used in self health check after start
+_alloy_healthcheck_endpoint: "http://{{ alloy_flags_extra['server.http.listen-addr'] if alloy_flags_extra['server.http.listen-addr'] is defined else '127.0.0.1:12345' }}/-/ready"
Collaborator:

Wouldn't it be simpler to just do this?

Suggested change
-_alloy_healthcheck_endpoint: "http://{{ alloy_flags_extra['server.http.listen-addr'] if alloy_flags_extra['server.http.listen-addr'] is defined else '127.0.0.1:12345' }}/-/ready"
+_alloy_healthcheck_endpoint: "http://{{ alloy_flags_extra.server.http.listen-addr | default('127.0.0.1:12345') }}/-/ready"

Collaborator (author):

AFAIK, the check fails on some specific Ansible version when written that way. I found 'is defined' to be more reliable.
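
One detail worth noting about the two expressions discussed above: the flag name `server.http.listen-addr` is a single flat key that contains dots and a hyphen. In Jinja2, dotted access such as `alloy_flags_extra.server.http.listen-addr` is parsed as nested attribute lookups followed by a subtraction (`listen - addr`), so the key has to be addressed with bracket or `get` syntax. The sketch below shows one possible way to keep the flat-key lookup while still getting a one-line fallback. It is an illustration under stated assumptions, not the role's code: the play and the sample value `127.0.0.1:9999` are hypothetical, and it has not been checked against the Ansible version the author refers to.

```yaml
# Hypothetical demo play, not taken from the role.
- name: Build the health check endpoint with a fallback address
  hosts: localhost
  gather_facts: false
  vars:
    # Sample override; remove it to fall back to 127.0.0.1:12345.
    alloy_flags_extra:
      "server.http.listen-addr": "127.0.0.1:9999"
  tasks:
    - name: Show the resulting endpoint
      ansible.builtin.debug:
        # default({}) covers an undefined alloy_flags_extra;
        # dict.get covers a missing key.
        msg: "http://{{ (alloy_flags_extra | default({})).get('server.http.listen-addr', '127.0.0.1:12345') }}/-/ready"
```

With the sample override in place, the message resolves to `http://127.0.0.1:9999/-/ready`. The merged `is defined` form expresses the same intent and remains the version the author found reliable.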
