Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL input plugin that only updates recent data #16013

Open
wolfgangr opened this issue Oct 12, 2024 · 0 comments
Open

SQL input plugin that only updates recent data #16013

wolfgangr opened this issue Oct 12, 2024 · 0 comments
Labels
feature request Requests for new plugin and for new features to existing plugins

Comments

@wolfgangr
Copy link

Use Case

e.g. migrate an existing setup of time series collection to (my)SQL to influx

I have a number of Databases running for years, where measurements (Climate, Energy, Photovoltaics, Heating, ...) are written to MySQL tables. Some of them in a remote location, with MySQL binary log based replication as a reliable bridge even for temporary connection outage.

The Idea was to setup a telegraf input.SQL to monitor the new data and send them to influx.
Then all the visualisation, aggregation, downsampling etc might performed there.

Expected behavior

I hoped that input.SQL would manage some mechanism to discern between data already captured and data still to transfer.

Compare my old SQL aggregation:

SET @last_intvl := IFNULL( (SELECT MAX(idx_hr) FROM `hourly`), '0000-00-00');
REPLACE into `hourly`
   ....
SELECT FROM raw ....
WHERE idx >= @last_intvl  

Actual behavior

telegraf is always collecting all the rows of a configured query and sends them to influxDB.
In the end, they are discarded there, and the final result is ok.

However, just a simple 30 day period of weather data yields in some 200000 rows. At an update interval of 300 seconds, this puts considerable CPU and memory load on a 4-core machine. I'd expect it to knockle down at intervals as long as 30 seconds and/or source data extending over years.

I don't see any hook to query influx target bucket for the last update and limit the query to all data after that point.
I don't even see a hook to keep track of data sent through the telegraf plugin.

As a cruede temporary workaround, I added a "LIMIT 10" clause to the end of my SQL source query.
At least, this works well as long as all machines and connections are up and available.
It does not replay any backlog that has occured for whatever reasons.

Additional info

No response

@wolfgangr wolfgangr added the feature request Requests for new plugin and for new features to existing plugins label Oct 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Requests for new plugin and for new features to existing plugins
Projects
None yet
Development

No branches or pull requests

1 participant