Requirements on a common data representation

As part of making the Validator, FhirPath and CQL engines usable and performant across POCO- and ITypedElement-based data, we are investigating new or better ways of getting data into these engines. Currently, all data must be in ITypedElement form for the validator and FhirPath, and in POCO form for the CQL engine. It is not immediately clear from the definitions of POCOs and ITypedElement which features are essential for the engines to function, so we discuss these below as input to a possible redesign. Note that neither ITypedElement nor the POCOs actually support all of these, hence we currently have sub-optimal workarounds (aka hacks) in place to make this work at all.

Requirement: Navigate down a tree


Why	To get to all data in a resource, we need to be able to traverse the tree
How	Via `GetElementPairs()` we can currently traverse down the properties, which contain either other complex data, lists of data or (at the leaves) atomic .NET data types.
Used	Everywhere, essential.
Remarks	There are several ways to get the children currently, but all of them can be based on `GetElementPairs()`.

Requirement: Navigate to the parent in the tree


Why	When passed an element, we need to be able to find its parent.
How	NEW - A `Parent` property on `Base`
Used	To construct the Location of a node within the tree, to find nearest resource, to find containers to resolve internal references.
Remarks	Keeping the Parent property useful is hard, since it must stay up to date under changes. This means that getters/setters need to maintain it, and so does adding to or removing from a list. We may even need the List itself to be a parent (to be able to derive an index for an element), which means the List type in the generated POCO needs to change.
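
A minimal sketch of how a `Parent` reference could be maintained, under stated assumptions: `TreeNode` is a hypothetical stand-in for `Base`, and `ChildList<T>`, `Leaf` and `Branch` are invented for illustration. Only `System.Collections.ObjectModel.Collection<T>` and its overridable `InsertItem`/`SetItem`/`RemoveItem`/`ClearItems` hooks are existing .NET API. The list is itself the parent of its items, so an index can be derived with `IndexOf`.

```csharp
#nullable enable
using System.Collections.ObjectModel;

// Hypothetical base type, standing in for the SDK's Base (assumption).
public abstract class TreeNode
{
    // Maintained by the owning setter or list, not by user code.
    public object? Parent { get; internal set; }
}

// Hypothetical list type that acts as the parent of its items.
public sealed class ChildList<T> : Collection<T> where T : TreeNode
{
    public ChildList(TreeNode owner) { Owner = owner; }

    // The node that holds this list; lets callers continue navigating upwards.
    public TreeNode Owner { get; }

    protected override void InsertItem(int index, T item)
    {
        base.InsertItem(index, item);
        item.Parent = this;
    }

    protected override void SetItem(int index, T item)
    {
        this[index].Parent = null;   // detach the item being replaced
        base.SetItem(index, item);
        item.Parent = this;
    }

    protected override void RemoveItem(int index)
    {
        this[index].Parent = null;
        base.RemoveItem(index);
    }

    protected override void ClearItems()
    {
        foreach (var item in this) item.Parent = null;
        base.ClearItems();
    }
}

public sealed class Leaf : TreeNode { }

public sealed class Branch : TreeNode
{
    private Leaf? _single;

    public Branch() { Items = new ChildList<Leaf>(this); }

    // Repeating element: the list maintains Parent for its items.
    public ChildList<Leaf> Items { get; }

    // Single element: the setter maintains Parent.
    public Leaf? Single
    {
        get => _single;
        set
        {
            if (_single != null) _single.Parent = null;
            _single = value;
            if (value != null) value.Parent = this;
        }
    }
}
```

This sketch does not handle a node being attached in two places at once; a real design would have to decide whether that is an error or an implicit move.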

Requirement: Convert values to Cql types


Why	Comparisons and math should be done on the Cql types.
How	NEW - Implement `ICqlConvertible` on FHIR primitives, FHIR.Quantity
Used	To carry out math and comparisons in FhirPath (and in the future, maybe CQL).
Remarks	Currently, this logic is duplicated: the POCO types have comparisons on FHIR primitives (which are not used by the FhirPath engine), and the same logic is also present on the CQL types. Preferably, the operators on the FHIR primitives should delegate to the CQL operators on the applicable types.
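
The delegation idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the generic shape of `ICqlConvertible<TCql>`, the `CqlIntegerSketch` value type and the `FhirIntegerSketch` primitive are hypothetical and do not reflect existing SDK types. The point is only that the FHIR primitive holds the raw input and leaves the actual comparison to the CQL-side type.

```csharp
#nullable enable
using System;
using System.Globalization;

// Hypothetical stand-in for a CQL/System integer (assumption, not an SDK type).
public readonly struct CqlIntegerSketch : IComparable<CqlIntegerSketch>
{
    public CqlIntegerSketch(int value) { Value = value; }
    public int Value { get; }
    public int CompareTo(CqlIntegerSketch other) => Value.CompareTo(other.Value);
}

// Hypothetical shape of the proposed interface.
public interface ICqlConvertible<TCql>
{
    // Returns false when the stored data cannot be represented as a CQL value.
    bool TryConvertToCql(out TCql value);
}

// Hypothetical FHIR primitive that keeps the raw input string.
public sealed class FhirIntegerSketch : ICqlConvertible<CqlIntegerSketch>
{
    public FhirIntegerSketch(string? rawValue) { RawValue = rawValue; }

    public string? RawValue { get; }

    public bool TryConvertToCql(out CqlIntegerSketch value)
    {
        if (int.TryParse(RawValue, NumberStyles.Integer, CultureInfo.InvariantCulture, out var parsed))
        {
            value = new CqlIntegerSketch(parsed);
            return true;
        }
        value = default;
        return false;
    }

    // Delegates to the CQL type; null means "no result" for unconvertible input.
    public static bool? LessThan(FhirIntegerSketch left, FhirIntegerSketch right)
    {
        if (left.TryConvertToCql(out var l) && right.TryConvertToCql(out var r))
            return l.CompareTo(r) < 0;
        return null;
    }
}
```

With this arrangement the comparison logic lives in one place, and incorrect input (such as a non-numeric string) stays stored on the primitive without taking part in the math.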

Requirement: Know the element name of a node


Why	Sometimes, logic depends on the name of the node.
How	A POCO does not know its name, but when listing the children, their names are listed with the actual children, so known at that point.
Used	To generate a Location, to filter elements in summaries, for general debug purposes, to relate definitions to instances, etcetera.
Remarks	The fact that a node itself does not know its name (nor position in the list) means we may have to derive it by looking back up at the parent and then finding ourselves within its children (where the name is known). This would be acceptable (but slow) if this is only required for diagnostic messages, which I think it is, but we need to confirm.
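
The "look back up at the parent" approach described above can be sketched as follows. `NamedChildNode` and its members are hypothetical; the sketch assumes a node has a parent reference and that names are stored only alongside the children, as is the case when listing children today.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical node: it knows its parent, but not its own element name.
public sealed class NamedChildNode
{
    private readonly List<KeyValuePair<string, NamedChildNode>> _children =
        new List<KeyValuePair<string, NamedChildNode>>();

    public NamedChildNode? Parent { get; private set; }

    // Names are only known here, while listing the children of a node.
    public IReadOnlyList<KeyValuePair<string, NamedChildNode>> Children => _children;

    public NamedChildNode Add(string name, NamedChildNode child)
    {
        child.Parent = this;
        _children.Add(new KeyValuePair<string, NamedChildNode>(name, child));
        return child;
    }

    // Linear scan over the parent's children: fine for diagnostics, slow in hot paths.
    public string? FindName()
    {
        if (Parent is null) return null;
        foreach (var pair in Parent._children)
            if (ReferenceEquals(pair.Value, this)) return pair.Key;
        return null;
    }

    // Builds a location by walking up; the index counts earlier siblings with the same name.
    public string BuildLocation(string rootName)
    {
        if (Parent is null) return rootName;
        var name = FindName();
        var index = 0;
        foreach (var pair in Parent._children)
        {
            if (ReferenceEquals(pair.Value, this)) break;
            if (pair.Key == name) index++;
        }
        return $"{Parent.BuildLocation(rootName)}.{name}[{index}]";
    }
}
```

Each step up the tree costs a scan of the parent's children, which is why this is only reasonable if names are needed rarely, for example in diagnostic messages.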

Requirement: Store incorrect data


Why	It is important to be able to capture the parsed data as it was sent to us, even incorrect parts, to make sure we do not lose data and to reason about it.
How	A POCO has limited flexibility to store incorrect data, although the FHIR primitives have an `ObjectValue` that captures the raw, unparsed input string. We can add specific resources and datatypes called `DynamicResource` and `DynamicDataType` that do not have fixed properties but use dictionaries. Of course, to participate in the ecosystem, they will have to implement all interfaces to meet the other requirements formulated here.
Used	Roundtripping, reporting errors during validation, going "as far as we can" with incorrect input.
Remarks	Instead of these new resource types, we might introduce `IResource` and `IDataType` and let that be implemented in our existing `ElementNode` (and add an `ElementDataTypeNode`) that would implement both `ITypedElement` and those new interfaces.
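
As a rough illustration of the dictionary-based idea, the sketch below stores arbitrary element names and values, including raw strings that failed to parse. `DynamicDataTypeSketch` and all of its members are hypothetical; `ElementPairs()` is only loosely modelled on the role of `GetElementPairs()` and its signature is an assumption.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical dictionary-backed datatype without fixed properties.
public sealed class DynamicDataTypeSketch
{
    private readonly Dictionary<string, List<object>> _elements =
        new Dictionary<string, List<object>>();

    public DynamicDataTypeSketch(string typeName) { TypeName = typeName; }

    // String-based type name, as required elsewhere in this document.
    public string TypeName { get; }

    // Accepts any element name and any value, including raw unparsed strings.
    public void Add(string elementName, object value)
    {
        if (!_elements.TryGetValue(elementName, out var values))
        {
            values = new List<object>();
            _elements.Add(elementName, values);
        }
        values.Add(value);
    }

    // Returns null (not an empty list) when the element is absent.
    public IReadOnlyList<object>? Get(string elementName) =>
        _elements.TryGetValue(elementName, out var values) ? values : null;

    // Lists the children together with their names.
    public IEnumerable<KeyValuePair<string, object>> ElementPairs()
    {
        foreach (var entry in _elements)
            yield return new KeyValuePair<string, object>(entry.Key, entry.Value);
    }
}
```

A plain `Dictionary` does not promise an enumeration order, so a version meant for lossless round-tripping would need an order-preserving structure instead.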

Requirement: Represent collection elements


Why	The model has both single and repeating elements. These need to be distinguished, and repeating elements are best handled using the familiar .NET collections.
How	Properties for repeating elements must be lists.
Used	Serialization, navigation through the tree, indexing, cardinality validation, fhirpath map/select etc.
Remarks	Experience with `ITypedElement` (which mimics the XML) shows that it is useful to keep lists of stuff as lists.

Requirement: Indicate null/no data


Why	We need to pick a value to use when an element is defined in the model, but is not present in the representation or has no data.
How	Use null
Used	Everywhere
Remarks	We would now prefer to use `null` over an empty collection for repeating elements.

Requirement: Need to know the instance type in the (FHIR) model


Why	Processing logic may depend on the type of (FHIR) data, especially on choice types
How	Each node should carry a (string based) typename
Used	Serialization (choice types), FhirPath `ofType()`, validation
Remarks	These must be runtime types, so should not be abstract types (as sometimes found in the StructureDefinitions). The POCOs have a naming convention for backbone types, which we could stick to. CQL primitives may be named by their url. Based on current practice, names that are not canonicals should be considered FHIR types, so anything else is from another model (e.g. CDA, if that's ever going to be applicable). It is unclear what to use if deserialization cannot determine an actual type, but it is probably better to pick a sentinel name for it than to leave it as null.

Requirement: Locate the nearest parent resource


Why	Find the container of an element.
How	Navigate up in the tree and then check if a node is a Resource/`IResource`
Used	Resolution of contained resources, `%resource` in FhirPath, summary generation
Remarks
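
The upward walk can be sketched as below, assuming nodes expose a parent reference. `IResourceMarker` is a hypothetical stand-in for the proposed `IResource`, and `ContainerNode`, `ResourceNode` and `ResourceLookup` are invented for this example. The sketch treats a node that is itself a resource as its own nearest resource, which is a design choice rather than something this document settles.

```csharp
#nullable enable

// Hypothetical marker, standing in for the proposed IResource (assumption).
public interface IResourceMarker { }

// Hypothetical node with a parent reference.
public class ContainerNode
{
    public ContainerNode(string name, ContainerNode? parent = null)
    {
        Name = name;
        Parent = parent;
    }

    public string Name { get; }
    public ContainerNode? Parent { get; }
}

public sealed class ResourceNode : ContainerNode, IResourceMarker
{
    public ResourceNode(string name, ContainerNode? parent = null) : base(name, parent) { }
}

public static class ResourceLookup
{
    // Walks up from the node (inclusive) until a resource is found; null if there is none.
    public static ContainerNode? NearestResource(ContainerNode node)
    {
        for (ContainerNode? current = node; current != null; current = current.Parent)
        {
            if (current is IResourceMarker) return current;
        }
        return null;
    }
}
```

For example, for a `name` element inside a Patient that sits in a Bundle entry, the walk stops at the Patient and never reaches the Bundle.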

Requirement: Resolve internal references


Why	FHIR offers references between resources in the same resource/bundle
How	Navigate up in the tree and then check if a node is a Resource/`IResource`. Special handling is needed for contained resources and Bundles.
Used	Implement `resolve()`, implement `Resolve()` on a FHIR reference datatype.
Remarks	It would be nice to have this functionality in the POCOs; currently it is only present in the ScopedNode.

Requirement: Determine Equality


Why	Need to know whether two instances are "the same"
How	Since there are different notions of what it means to be "the same" we might need several implementations of `IEqualityComparer<T>`, which would probably need children, names, types etc to determine equality.
Used	Comparisons in FhirPath, equality in set operators etc.
Remarks
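
One of the possible notions of "the same" is strict structural equality, sketched below as an `IEqualityComparer<T>` over a hypothetical `ValueNode` type. The node type and the exact rules (same type name, same value, same child names in the same order) are assumptions for illustration; other notions, such as order-insensitive or equivalence-style comparison, would be separate comparers.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical node carrying a type name, an optional value and named children.
public sealed class ValueNode
{
    public ValueNode(string typeName, object? value = null)
    {
        TypeName = typeName;
        Value = value;
    }

    public string TypeName { get; }
    public object? Value { get; }

    public List<KeyValuePair<string, ValueNode>> Children { get; } =
        new List<KeyValuePair<string, ValueNode>>();
}

// Strict structural equality: types, values, child names and child order must all match.
public sealed class StructuralComparer : IEqualityComparer<ValueNode>
{
    public bool Equals(ValueNode? x, ValueNode? y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        if (x.TypeName != y.TypeName) return false;
        if (!object.Equals(x.Value, y.Value)) return false;
        if (x.Children.Count != y.Children.Count) return false;

        for (var i = 0; i < x.Children.Count; i++)
        {
            if (x.Children[i].Key != y.Children[i].Key) return false;
            if (!this.Equals(x.Children[i].Value, y.Children[i].Value)) return false;
        }
        return true;
    }

    // Shallow hash that stays consistent with Equals: equal nodes share all three inputs.
    public int GetHashCode(ValueNode node) =>
        HashCode.Combine(node.TypeName, node.Value, node.Children.Count);
}
```

Because the comparer is passed in rather than baked into the node, set operators can pick the notion of equality they need, for instance `new HashSet<ValueNode>(new StructuralComparer())`.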

Requirement: Deep copy


Why	Need to make duplicates that can be modified independently
How	The POCOs currently have functions for making deep copies through the `IDeepCloneable` interface. This might be done using IDictionary too, which would require less boilerplate in the POCOs.
Used	Snapshot generator, presumably user code.
Remarks

Requirement: Make annotations


Why	Useful to add user-definable annotations to each node of a tree for processing or informative purposes.
How	We currently have the interfaces `IAnnotated` and `IAnnotatable`.
Used	User code, TypedElement stack
Remarks	Unclear why `IAnnotatable` is not a derived interface from `IAnnotated`.

Requirement: Provide binding facilities


Why	Some datatypes can be used in bindings, so we need a uniform way to extract the code from them.
How	There is an `ICoded<T>` interface which may be useful.
Used	Validator, CQL
Remarks	CQL actually requires every resource to be able to return its "code", which is often one of the coded elements that classifies the resource. So this is different from being able to extract a code from a bindable datatype. But maybe there is overlap.

Sketchpad

For FhirPath

Need to be able to navigate through the tree of elements
Need to be able to get the value of a node as CQL/System type
Need to know the element name of a node
Need to be able to identify lists, and enumerate the elements. Preferably performant access based on index.
Need to detect null/empty values
Need to know the type of data to implement as() and ofType() and check the root node's type.
Need to be able to refer to the %resource, %rootResource and %context
Need to be able to resolve contained resources and bundled resources by id, starting from %rootResource (or %context?)
Need to be able to convert data from FHIR Quantity types to System.Quantity
Might need to be able to obtain full reflection type info (to implement https://build.fhir.org/ig/HL7/FHIRPath/#reflection)
Might need equality and comparison operators on non-system types.
Might need general conversion operators from non-system types to other types.
Might need to be able to read annotations
Might need to know the location of the node for a trace() message.

For the Validator

Need to be able to navigate through the tree of elements
Need to be able to get to the value of a node as CQL/System type, although a serialized form is acceptable too
Need to know the element name of a node, although a suffixed ([x]) form is acceptable too
Need to be able to identify lists and enumerate the elements
Need to detect null/empty values
Need to know the type of data only when this is not known from the definition (e.g. at contained, at root or a choice type)
Need to be able to resolve an internal reference
Need to know the location (instance path) of an element for use in diagnostic messages
Need to know the definition path (including slice) of an element for use in diagnostic messages
Need to know that data is bindable or orderable
Need to be able to convert data to FHIR code/coding/codeableconcept for use with the terminology service
Need to be able to represent the data as a string for debug purposes
Need to be able to represent persistent, serializable values (for use in Fixed/Patterns)
Might need to be serializable to fhir
Might need to be able to set annotations

CQL Engine

The CQL engine really depends on the POCOs currently. It is not easy to switch to another abstraction, since Linq Expressions and code generation all depend on POCOs being present. To replace this, we'd need to fall back to e.g. the dynamic runtime and generate code against a DynamicMetaObject. Possible, but ambitious.

Others

Need to know the nearest parent resource in MaskingNode
StructureDefinition information (or ISDSummary) for element model and serialization
Generic "resolve" function currently uses id, ContainedResources, BundledResources, lots of ScopedNode members.
ScopedNode is public, and there are dependencies on its methods in other public parts of the API, so ScopedNode (as a wrapper of ITypedElement) will be around for a while, whatever new representation we might choose.
Simplifier uses ITypedElement extensively, and so does FS (though it uses ISourceNode more) from what I understood, so using the validator and FhirPath with ITypedElement should remain possible. This is probably also true for a lot of other non-Firely users.
Parsers need to store incorrect data, preferably enabling lossless round-tripping.
Attribute validation?
Summary serialization needs "in summary", min cardinality/mandatory and "is modifier".
XML serialization needs the absolute order of an element.
Serialization needs to know that an element is a choice element.
