resource uids are not unique #112

Open
gwbischof opened this issue Sep 27, 2019 · 2 comments

gwbischof commented Sep 27, 2019


```
In [2]: oldclient = pymongo.MongoClient("mongodb://rsoxs-ca:27017/")
   ...: old_assets_db = oldclient["rsoxs-assets-store"]
   ...: old_meta_db = oldclient["rsoxs-metadata-store"]

In [4]: list(old_assets_db.resource.find({'uid': '1c43af30-27db-437e-83c0-38b1cc528f06'}))
Out[4]:
[{'_id': ObjectId('5d540d4cfb414042494a3a0e'),
  'spec': 'AD_TIFF',
  'resource_path': 'data/2019/08/14',
  'root': '/DATA/images',
  'resource_kwargs': {'template': '%s%s_%6.6d.tiff',
   'filename': 'e1f402f1-10bd-4685-a4d4',
   'frame_per_point': 1},
  'path_semantics': 'posix',
  'uid': '1c43af30-27db-437e-83c0-38b1cc528f06',
  'run_start': '4fa42813-0b56-4eee-a577-2125cd9f6c24'},
 {'_id': ObjectId('5d540d7bfb414042494a3a5e'),
  'spec': 'AD_TIFF',
  'resource_path': 'data/2019/08/14',
  'root': '/DATA/images',
  'resource_kwargs': {'template': '%s%s_%6.6d.tiff',
   'filename': 'e1f402f1-10bd-4685-a4d4',
   'frame_per_point': 1},
  'path_semantics': 'posix',
  'uid': '1c43af30-27db-437e-83c0-38b1cc528f06',
  'run_start': '4fa42813-0b56-4eee-a577-2125cd9f6c24'},
 {'_id': ObjectId('5d540db9fb414042494a3aa2'),
  'spec': 'AD_TIFF',
  'resource_path': 'data/2019/08/14',
  'root': '/DATA/images',
  'resource_kwargs': {'template': '%s%s_%6.6d.tiff',
   'filename': 'e1f402f1-10bd-4685-a4d4',
   'frame_per_point': 1},
  'path_semantics': 'posix',
  'uid': '1c43af30-27db-437e-83c0-38b1cc528f06',
  'run_start': 'd1fc0b67-0fcc-4239-b1c5-cdbc15a27e2b'}]
```
@gwbischof

Multiple runs can refer to resources with the same uid. This means that these runs also share datum documents, because datums are looked up by resource uid. The name uid implies that it is unique.
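To see how widespread this is, one option is to group the Resource documents by uid and list every uid that appears more than once, together with the runs that reference it. The following is a minimal sketch over plain dictionaries shaped like the output above; `find_duplicate_resource_uids` is a hypothetical helper, not part of databroker, and the `hypothetical-unique-uid` entry is made up for illustration.

```python
from collections import defaultdict


def find_duplicate_resource_uids(resources):
    """Map each repeated resource uid to the run_start values of its documents."""
    runs_by_uid = defaultdict(list)
    for doc in resources:
        runs_by_uid[doc["uid"]].append(doc.get("run_start"))
    # Keep only the uids that occur in more than one Resource document.
    return {uid: runs for uid, runs in runs_by_uid.items() if len(runs) > 1}


# The uid and run_start values are copied from the query output above.
SHARED_UID = "1c43af30-27db-437e-83c0-38b1cc528f06"
RUN_A = "4fa42813-0b56-4eee-a577-2125cd9f6c24"
RUN_B = "d1fc0b67-0fcc-4239-b1c5-cdbc15a27e2b"

docs = [
    {"uid": SHARED_UID, "run_start": RUN_A},
    {"uid": SHARED_UID, "run_start": RUN_A},
    {"uid": SHARED_UID, "run_start": RUN_B},
    {"uid": "hypothetical-unique-uid", "run_start": RUN_B},  # illustration only
]

duplicates = find_duplicate_resource_uids(docs)
print(duplicates)
```

Against a live database, the same function should accept the cursor from `old_assets_db.resource.find({}, {"uid": 1, "run_start": 1})`, since it only iterates over the documents once.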

@tacaswell

But if there is not an index enforcing it... I think there is an index enforcing that the datum_ids in the Datum collection are unique. Events from multiple runs are allowed to refer to the same datum if they really do share data (the primary driver here is backgrounds).
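For reference, a uniqueness constraint of this kind would be declared roughly as in the sketch below. `create_index` with `unique=True` is the standard pymongo call; whether the deployed Datum collection actually carries such an index is not confirmed in this thread, and `ensure_unique_datum_ids` is a hypothetical helper name.

```python
def ensure_unique_datum_ids(datum_collection):
    """Ask MongoDB to reject a second Datum document with the same datum_id.

    `datum_collection` is assumed to be a pymongo Collection.
    Returns the name of the index, as reported by pymongo.
    """
    return datum_collection.create_index("datum_id", unique=True)
```

Declaring the same kind of index on `uid` in the resource collection would reject the duplicated Resource documents shown above, which is presumably not what is wanted if runs are meant to share resources.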

This is a bit of an edge case because we want:

  • runs to be able to share references to the same data on disk (primarily for background / calibration reasons)
  • all foreign keys to be included in the event stream if you jump in at a Start document (so a consumer never has to know which oracle to go to in order to resolve them)
  • the Resource and Datum documents to be primarily generated by Ophyd objects (as that is where the knowledge about the file writing lives)

Previously, ophyd objects inserted the resource/datum documents directly into data broker, which satisfied points 1 and 3 but not point 2. To fix this we changed how those documents get routed and added collect_asset_docs, which fixed point 2 and allowed us to tack the run_start onto the resource documents to make it easier to collect them later, without affecting points 1 or 3.
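The effect of tacking the run_start onto the Resource documents can be sketched as follows. This is an illustration only, not the actual bluesky or ophyd code: `tag_asset_docs` is a hypothetical function, and the input is assumed to be an iterable of `(name, doc)` pairs.

```python
def tag_asset_docs(asset_docs, run_start_uid):
    """Yield (name, doc) pairs, adding run_start to every resource document."""
    for name, doc in asset_docs:
        if name == "resource":
            # Copy the document so the caller's original is left untouched.
            doc = dict(doc, run_start=run_start_uid)
        yield name, doc


# One resource, shared by two runs (uids copied from the query output above).
shared_resource = {"uid": "1c43af30-27db-437e-83c0-38b1cc528f06", "spec": "AD_TIFF"}

run_a = list(tag_asset_docs([("resource", shared_resource)],
                            "4fa42813-0b56-4eee-a577-2125cd9f6c24"))
run_b = list(tag_asset_docs([("resource", shared_resource)],
                            "d1fc0b67-0fcc-4239-b1c5-cdbc15a27e2b"))
```

Under these assumptions, each run ends up with its own Resource document: the `uid` is identical and only `run_start` differs, which matches the first and third documents in the query output.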

There is something nice about having the duplicated Resource documents share a resource uid, as that is the key we use to determine whether we need another handler, and these should all be able to share said handler.
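The handler-sharing argument can be sketched with a cache keyed on the resource uid. This is a simplified illustration, not databroker's implementation: `HandlerCache` and `make_handler` are hypothetical names, and a plain `object()` stands in for a real handler.

```python
class HandlerCache:
    """Create one handler per resource uid and reuse it for duplicate documents."""

    def __init__(self, handler_factory):
        self._factory = handler_factory
        self._handlers = {}

    def get(self, resource):
        uid = resource["uid"]
        if uid not in self._handlers:
            # First time this uid is seen: build the handler from the document.
            self._handlers[uid] = self._factory(resource)
        return self._handlers[uid]


created = []


def make_handler(resource):
    """Stand-in factory that records how many handlers were built."""
    handler = object()
    created.append(handler)
    return handler


cache = HandlerCache(make_handler)
uid = "1c43af30-27db-437e-83c0-38b1cc528f06"
first = cache.get({"uid": uid, "run_start": "4fa42813-0b56-4eee-a577-2125cd9f6c24"})
second = cache.get({"uid": uid, "run_start": "d1fc0b67-0fcc-4239-b1c5-cdbc15a27e2b"})
```

Both lookups return the same handler even though the two Resource documents belong to different runs, so only one handler is built. If the duplicates were given distinct uids, a cache like this would build one handler per run for the same files.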
