
MNT: Create loggers.py and add Elasticsearch handler #49

Open · wants to merge 10 commits into base: master

Conversation

ke-zhang-rd
Contributor

@ke-zhang-rd ke-zhang-rd commented Jan 9, 2019

Create loggers for carproto, bluesky and ophyd to Elasticsearch

@codecov-io

codecov-io commented Jan 9, 2019

Codecov Report

Merging #49 into master will decrease coverage by 0.09%.
The diff coverage is 30.55%.


@@            Coverage Diff            @@
##           master      #49     +/-   ##
=========================================
- Coverage   42.46%   42.37%   -0.1%     
=========================================
  Files          10       11      +1     
  Lines         584      616     +32     
=========================================
+ Hits          248      261     +13     
- Misses        336      355     +19
Impacted Files Coverage Δ
nslsii/common/ipynb/animation.py 18.75% <0%> (ø) ⬆️
nslsii/common/ipynb/info.py 55% <100%> (ø) ⬆️
nslsii/loggers.py 23.33% <23.33%> (ø)
nslsii/__init__.py 6.72% <50%> (+0.73%) ⬆️
nslsii/_version.py 44.4% <0%> (+1.8%) ⬆️

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0ff403e...94b3de8. Read the comment docs.

@danielballan
Contributor

To test, we need access to a beamline machine (so it can see elasticsearch.cs.nsls2.local). I would do the following.

ssh xf23id1-srv1
# Create and activate a new conda environment.
conda create -n test-logger python=3.6
conda activate test-logger
# Install the packages we need with pip, including a development version of nslsii.
pip install bluesky ophyd caproto ipython
git clone https://github.com/ke-zhang-rd/nslsii
cd nslsii
git checkout logger
pip install -e .
cd ..
# Start IPython.
ipython

Then in IPython:

import nslsii.loggers
from bluesky import RunEngine
RE = RunEngine({})
RE([])  # generates log messages that should go to Elasticsearch

Notice that I have not used bsui or any IPython profiles, so this will not interfere with any beamline operations.

Next steps:

  • We should POST the logs from bluesky, ophyd, and caproto to separate indexes in elastic, replacing /test/ in the URL with specific names as outlined by Stuart (see Add loggers aimed at Elasticsearch #48 (comment)). This means we'll need a separate ElasticHandler instance for each.
  • We should think about what log level we want for each library. The DEBUG logs from caproto, for instance, are probably too much information. They would require a lot of storage, and writing them imposes a significant performance overhead. But the DEBUG logs from bluesky are not quite so heavy and may be worth keeping.
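The first step above can be sketched roughly as follows. This is a sketch only: the real handler lives in nslsii/loggers.py and uses requests, while this stand-alone version uses only the standard library; the payload fields, the `/_doc` path, and the per-library index names are placeholders pending the naming agreed in #48 (only the host elasticsearch.cs.nsls2.local and the `ElasticHandler(url, level)` signature come from this thread).

```python
import json
import logging
import urllib.request


class ElasticHandler(logging.Handler):
    """Sketch of a handler that POSTs each log record to an Elastic index."""

    def __init__(self, url, level=logging.INFO):
        super().__init__(level=level)
        self.url = url

    def build_payload(self, record):
        # Fields chosen for illustration; the real schema may differ.
        return {'message': record.getMessage(),
                'logger': record.name,
                'level': record.levelname,
                'created': record.created}

    def emit(self, record):
        data = json.dumps(self.build_payload(record)).encode()
        request = urllib.request.Request(
            self.url, data=data,
            headers={'Content-Type': 'application/json'})
        try:
            # POST (not PUT) so Elastic assigns the unique document ID.
            urllib.request.urlopen(request)
        except Exception:
            self.handleError(record)


# One handler instance per library, each POST-ing to its own index
# (index names here are placeholders, not the names from #48).
base = 'http://elasticsearch.cs.nsls2.local:9200'
for name in ('bluesky', 'ophyd', 'caproto'):
    logging.getLogger(name).addHandler(ElasticHandler(f'{base}/{name}/_doc'))
```

Attaching a distinct instance per logger is what lets each library carry its own URL and, later, its own level.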

@danielballan
Contributor

Also, we wondered during our Slack call about the difference between PUT and POST. If you use PUT, you must specify a unique ID for that log entry. Thus, PUT is idempotent, because PUT-ing a second time with the same ID is redundant. If you use POST, you leave it up to Elastic to generate a unique ID for you. Thus, POST is not idempotent, because POST-ing a second time will create a second entry with a new, separate unique ID.

I learned that from this documentation page: https://www.elastic.co/guide/en/elasticsearch/guide/current/index-doc.html

It suggests that you should use PUT if and only if there is a "natural", obvious unique ID. Otherwise, you use POST and let Elastic generate an internal unique ID for you. For these log messages*, I think there is no natural unique ID, so we should use POST. Do you agree?

*There is one possible exception to this statement. The log messages from bluesky that say, "A Document has been emitted with uid=..." could use the document's uid as part of the unique ID for the log message. I am not sure that's the right thing to do here...maybe something to come back to later.
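The idempotency point can be made concrete with a toy in-memory model of how Elastic assigns IDs (the function names here are illustrative, not the Elastic API):

```python
import uuid

store = {}  # stands in for an Elastic index: doc_id -> document


def put_doc(doc_id, doc):
    # PUT: caller supplies the ID; repeating it overwrites, never duplicates.
    store[doc_id] = doc


def post_doc(doc):
    # POST: the server mints a fresh ID, so every call adds a new document.
    doc_id = uuid.uuid4().hex
    store[doc_id] = doc
    return doc_id


put_doc('abc', {'msg': 'hello'})
put_doc('abc', {'msg': 'hello'})
assert len(store) == 1  # PUT twice with the same ID: still one document

post_doc({'msg': 'hello'})
post_doc({'msg': 'hello'})
assert len(store) == 3  # POST twice: two additional documents
```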

def __init__(self, url, level=logging.INFO):
    self.url = url
    self.level = level
    self.response = None
Contributor

This is superfluous now.

Contributor Author

I think url and level are still useful, right?

Contributor

Yes.

Member

@mrakitin mrakitin left a comment

I'd rather separate the style/pep8 checking from the loggers feature, so that irrelevant files are not touched. But since there are only a few of them, they can be kept. Please note a critical problem: 1000**3 was changed to 1000 * 3, which should be fixed. See my comment below.

Review comment on nslsii/ad33.py (outdated, resolved)
@mrakitin mrakitin changed the title MNT: Create loggers.py MNT: Create loggers.py and add Elasticsearch handler Jan 11, 2019
Review comment on .flake8 (outdated, resolved)
Member

@mrakitin mrakitin left a comment

caproto/_headers.py,caproto/tests/test_pyepics_compat.py,caproto/_version.py are not relevant. Please remove them.

[flake8]
ignore = E402,W504
exclude = .git,__pycache__,doc/source/conf.py,build,dist,versioneer.py,caproto/_headers.py,caproto/tests/test_pyepics_compat.py,caproto/_version.py
max-line-length = 115
Member

My concern is that the line length here is 115, but in .travis.yml it's set to 79 via a command-line arg (which most likely takes priority). I think we should have a single source of truth, so that one configuration controls the settings both in development environments and in the CI systems.

Contributor

I wonder why it was set to 79 in .travis.yml since that is the default anyway. @ke-zhang-rd, I think the right path here is to remove --max-line-length=79 from this line as part of this PR.

Member

I agree with the plan.

@danielballan
Contributor

The handler itself looks done to me. Next steps:

  1. A little thing: we should add requests to requirements.txt.
  2. We should get the index name (currently 'test') set to something real. I suggest moving all the code below the ElasticHandler definition into a function (configure_elastic, perhaps, to follow the pattern set by configure_base and configure_olog), and make that function expect a beamline (like 'csx') as a string. Then the index can be csx.bluesky as planned in the Kafka topics document.
  3. We need to set the levels of each of our handlers. I think bluesky should be set to DEBUG, and ophyd and caproto should be set to INFO (because DEBUG generates more information than we can afford to store, and because POST-ing all of that data to Elastic would cause data acquisition to slow down significantly).
  4. We should then set the levels of all the default handlers, such as bluesky.current_handler to 'WARNING'.
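Steps 2 and 3 above could be sketched as follows. Assumptions are flagged in comments: the `ElasticHandler` body is a stub standing in for the one in this PR, and the base URL and `/_doc` path are illustrative; only the `configure_elastic` name, the `'csx'` example, the `beamline.library` index pattern, and the per-library levels come from this comment.

```python
import logging


class ElasticHandler(logging.Handler):
    """Minimal stand-in for the handler defined in nslsii/loggers.py."""

    def __init__(self, url, level=logging.INFO):
        super().__init__(level=level)
        self.url = url

    def emit(self, record):
        ...  # POST the record to self.url, as in the real handler


# Per-library levels from the discussion: DEBUG for bluesky,
# INFO for ophyd and caproto.
LEVELS = {'bluesky': logging.DEBUG,
          'ophyd': logging.INFO,
          'caproto': logging.INFO}


def configure_elastic(beamline,
                      base='http://elasticsearch.cs.nsls2.local:9200'):
    """Attach one Elastic handler per library, using indexes
    like 'csx.bluesky' keyed on the beamline name."""
    for name, level in LEVELS.items():
        url = f'{base}/{beamline}.{name}/_doc'
        logging.getLogger(name).addHandler(ElasticHandler(url, level=level))


configure_elastic('csx')  # 'csx' is the example beamline from the comment
```

This follows the pattern set by configure_base and configure_olog: one function call per beamline, with the beamline name passed as a string.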

Contributor

@danielballan danielballan left a comment

This looks good for now. Let's let it sit (i.e. do not merge) until the caproto PR is merged, in case details change, such as the attributes on record.

@mrakitin
Member

@jklynch, can it be useful and complementary to your recent work in #80?
@ke-zhang-rd, what was the blocking issue here?

@jklynch
Contributor

jklynch commented Mar 19, 2020

We decided this is out of scope for #80. I do not want to do any further logging development until we have had time to evaluate the new state of logging on the floor.

@mrakitin
Member

OK, I see.

5 participants