Design decision: JSON object design #4

Open
avirshup opened this issue Sep 16, 2016 · 4 comments


@avirshup
Contributor

This is the big question: how is data laid out inside the JSON structure?

The best working example currently comes from @jchodera on Slack:

{name:'molecule name',
 type:'molecule',
 provenance:{
    rcsbid:'A123'
 },
 topology:{
   [TOPOLOGY BLOCK containing atoms, bonds, residues, chains, etc.]
 },
 forcefield:{
  [OPTIONAL FORCEFIELD BLOCK]
 },
 states:{
  [ONE OR MORE DYNAMICAL STATE BLOCKS WITH PROPERTIES ATTACHED]
 }
}
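
To make the layout above concrete, here is a minimal Python sketch that builds it as a plain dict and round-trips it through JSON. Only the top-level keys come from the example; everything inside `topology` and `states`, the choice of which keys count as required, and the helper `missing_keys` are assumptions made for illustration.

```python
import json

# Hypothetical block contents: the thread fixes only the top-level keys.
molecule = {
    "name": "molecule name",
    "type": "molecule",
    "provenance": {"rcsbid": "A123"},
    "topology": {
        "atoms": [{"index": 0, "element": "O"}, {"index": 1, "element": "H"}],
        "bonds": [[0, 1]],
    },
    # "forcefield" is optional in the proposal, so it is left out here.
    "states": {
        "state0": {"positions": [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]},
    },
}

# Assumption: these four keys are treated as mandatory in this sketch.
REQUIRED_KEYS = ("name", "type", "topology", "states")


def missing_keys(doc):
    """Return the required top-level keys that are absent from doc."""
    return [key for key in REQUIRED_KEYS if key not in doc]


# Serialize to strict JSON (quoted keys, double-quoted strings) and read it back.
text = json.dumps(molecule, indent=2)
restored = json.loads(text)
```

Note that strict JSON needs quoted keys and double-quoted strings, so the Slack example would have to be tightened up along these lines before a parser accepts it.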
@avirshup
Contributor Author

From @jchodera on Slack:

I think we need to figure out what top-level structures "everybody" can agree on are important.

The list you provide (topology, properties, forcefields, wavefunctions, geometry/dynamics) sounds like it is a bit too heterogeneous in the level of abstraction.
Instead, something simpler, like

  • topology (anything static)
  • state information (anything dynamical)
  • tool definitions and input parameters (which could include forcefields, QM levels of theory)
  • tool-computed input properties for specific states (which could include wavefunctions as well); could even be contained within a state definition, since it is associated with a given state
  • tool-computed input properties for the topology (which would include cheminformatics stuff, or stuff that doesn't depend on a specific state)
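
One way the five categories above could map onto a single document, sketched in Python. This is only an illustration: every field name and value (`properties`, `tools`, `s0`, the `tool` back-reference, and the two helpers) is a placeholder, not something agreed on in this thread. Per-state properties are nested inside the state they belong to, and state-independent ones hang off the topology.

```python
import json

# Every field name and value below is a placeholder for illustration.
document = {
    # Static information, plus properties that do not depend on a state.
    "topology": {
        "atoms": ["O", "H", "H"],
        "bonds": [[0, 1], [0, 2]],
        "properties": {"formula": {"value": "H2O", "tool": "cheminf"}},
    },
    # Dynamical information, with per-state properties nested inside.
    "states": {
        "s0": {
            "positions": [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
            "properties": {"energy": {"value": 0.0, "tool": "qm"}},
        }
    },
    # Tool definitions and their input parameters.
    "tools": {
        "cheminf": {"kind": "cheminformatics", "parameters": {}},
        "qm": {"kind": "quantum chemistry", "parameters": {"theory": "placeholder"}},
    },
}


def properties_for(doc, state_id=None):
    """Return topology-level properties, or those of one state if state_id is given."""
    owner = doc["topology"] if state_id is None else doc["states"][state_id]
    return owner.get("properties", {})


def tool_for(doc, prop):
    """Look up the tool definition that a computed property refers to."""
    return doc["tools"][prop["tool"]]


print(json.dumps(properties_for(document, "s0"), indent=2))
```

Having each property name the tool that produced it keeps the tool definitions in one place while still recording which forcefield or level of theory a given value came from.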

@egonw

egonw commented Sep 17, 2016

You can take advantage of standardization that has been done in the past. What about a JSON format based on the Chemical Markup Language specification? If you go JSON-LD then you have full semantics and full interoperability.
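
As a rough sketch of what the JSON-LD route could look like, the Python below attaches an `@context` to a molecule document so that each key resolves to an IRI. The vocabulary URL `https://example.org/molecule-json#` and the `Molecule` type are made up for illustration; a real design would point at an agreed vocabulary such as one derived from CML. The helper `term_iri` mimics only the simplest part of term resolution (an explicit string mapping, else the `@vocab` prefix) and is not a JSON-LD processor.

```python
import json

doc = {
    "@context": {
        # Hypothetical default vocabulary for keys without an explicit mapping.
        "@vocab": "https://example.org/molecule-json#",
        # Explicit mapping of one key to an existing schema.org term.
        "name": "http://schema.org/name",
    },
    "@type": "Molecule",
    "name": "molecule name",
    "topology": {"atoms": ["O", "H", "H"]},
}


def term_iri(context, term):
    """Resolve a key to an IRI: explicit string mapping first, else @vocab + term."""
    mapping = context.get(term)
    if isinstance(mapping, str):
        return mapping
    return context["@vocab"] + term


# The document is still ordinary JSON, so any JSON tool can read it.
text = json.dumps(doc, indent=2)
```

Consumers that ignore `@context` see a normal JSON object, while JSON-LD-aware tools can use the same file as linked data.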

@avirshup
Contributor Author

I like the idea of incorporating parts of CML. For instance, the CompChem dictionary has a lot of good descriptive fields for QM computed properties.

@egonw - Thanks for pointing out JSON-LD, that actually seems like the solution to a problem that we haven't created an issue for yet.

Also, would you mind pointing to some use cases for CML? I've been aware of it for a while, but haven't ever really done anything with it - it would be great to get a feel for the current use cases.

@egonw

egonw commented Sep 24, 2016

It's used in Bioclipse as it is the most verbose (explicit) file format, allowing us to store information we cannot store in other formats (like atom type info, which may be particularly useful when using custom/new force fields!).

Because the original CML is XML, it is also really easy to use inside other XML documents (using the XML namespace standards), e.g. with CMLRSS (10.1021/ci034244p, green OA version), although this never really picked up momentum.

2 participants