Add support for procedure annotations to the assembler #1434

bobbinth · 2024-08-07T07:21:27Z

There are several use cases for adding support for procedure annotations in MASM. For example:

Directives to the assembler on how to assembler a program.
- One example of this is inline annotation which could inform the assembler that the annotated procedure should be inlined.
- Another example is an annotation indicating whether a given procedure is expected to be call-ed, syscall-ed, or exec-ed.
Information about procedure signature which could be useful for the compiler (and maybe debugger).
User defined annotation. One example of these is the storage_offset annotation which we need in miden base to define storage offsets for account procedures.

I think @bitwalker had some ideas about how annotations could work. Let's discuss these in this issue.

The text was updated successfully, but these errors were encountered:

bitwalker · 2024-08-07T19:07:09Z

In #1436, I proposed some syntax changes in Miden Assembly that I think would solve for our issues with kernel procedure visibilities, and defining/maintaining important procedure metadata like the type signature.

However, there are a number of other things, such as inlining for which annotations would be preferable to dedicated syntax:

Inlining
Interacting with external tooling, such as linter/formatter
Decorators, such as storage offsets, automatically emitting debug events when a given procedure is called, etc.

The following is my proposed syntax for annotations:

# Grammar Rules
#
# ANNO := "@" "[" <META> "]" => <>
#
# META := <name:ID> => Annotation::from(name)
#       | <name:ID> "(" <tokens:Token> ")" => Annotation::from((name, tokens))
#       | <name:ID> "=" <value:META_EXPR> => Annotation::from((name, value))
#
#  META_EXPR := <name:ID> => Meta::Path(name.into())
#             | <string:String> => Meta::String(string)
#             | <int:u64> => Meta::Int(int)

# Following is an example of how this would be used to solve various things from our list:

# Inline this procedure if we're looking for inline opportunities
@[inline]
proc.add_unchecked
  ...
end

# Force this procedure to be inlined
@[inline(always)]
proc.intrinsic
   ...
end

# Mark this procedure as deprecated, and tell the linter to ignore any warnings generated by its body
@[allow(warnings::all)]
@[deprecated(since = "0.10.0")]
proc.deprecated
   ...
end

# Specify a storage offset
@[storage_offset = 1024]
proc.get_item
   ...
end

I think this syntax provides us with enough flexibility to support essentially any type of attribute we might feasibly need, and supports arbitrary extension for user-defined attributes.

I think we should require registering callbacks with the parser for user-defined attributes, dispatched based on the attribute name. Then, if we parse a module which contains unhandled attributes, we can raise a diagnostic. The user-provided callback would be given the opportunity to parse the attribute value, and return that value or a diagnostic if the value is invalid. This makes the system very extensible, while still keeping the API quite simple.

It would be important to store attributes in the procedure metadata of compiled libraries as well, so that this information can be used downstream. For attributes which only apply during parsing/compilation, they can be stripped, but some of these cannot: inline, deprecated at least, maybe storage_offset?. In any case, user-defined attributes would certainly need to be preserved, as we cannot know how they are meant to be used (or at the least, we provide a mechanism to indicate whether to preserve them or not as part of "registering" an attribute).

I think that about sums it up for now. I'm using "attribute" and "annotation" interchangeably here, but they are the same thing IMO.

Fumuran · 2024-09-02T17:35:01Z

Probably we also need to add an annotation for a dead code, which should be compiled anyway.

We came across this use case in the 0xPolygonMiden/miden-base#803 PR: almost all procedures in the kernel will be called only by their hash, so it makes sense to change them from export to proc, but in that case we cannot get the hashes of these procedures, since they were not compiled and were not added to the MastForest.

As @bobbinth mentioned in his comment, we should add an annotation for such procedures, which should be compiled even if they are considered a dead code.

bobbinth · 2024-09-02T17:52:57Z

The above is essentially for procedures which could be invoked dynamically but we don't want to export them from a give module. So, maybe "dead code" is not the right term here. Maybe we could use something like dynamic attribute like so:

@[dynamic]
proc.foo
    ...
end

The assembler would treat such a procedure as if it was an export procedure but it would not add it to the list of exports for the module.

Another option is to use a visibility modifier as discussed in #1436 (comment) and the related thread. Maybe something like:

proc(dynamic).foo
    ...
end

This commit introduces the concept of attributes/annotations to Miden Assembly procedures. These can be used to represent interesting/useful bits of metadata associated with specific procedures in three different forms: * Marker attributes, e.g. `#[inline]`, a name with no associated data * List attributes, e.g. `#[inline(always)]`, i.e. a parameterized marker; multiple values can be provided as comma-delimited values. * Key-value attributes, e.g. `#[key = <value>]`, where `<value>` can be a "meta expression" of three possible types: identifier, string, or integer (in either decimal or hexadecimal format). Attributes will provide the foundation for upcoming changes that will rely on being able to attach metadata to procedures. For now, attributes may _only_ be attached to procedure definitions, not re-exports or any other syntactic construct. NOTE: This does not yet act on any attributes, nor store them anywhere when assembling to MAST. For now, they are simply parsed, made available in the AST, and ignored. Future PRs will introduce these as needed. Closes #1434

This commit introduces the concept of attributes/annotations to Miden Assembly procedures. These can be used to represent interesting/useful bits of metadata associated with specific procedures in three different forms: * Marker attributes, e.g. `@inline`, a name with no associated data * List attributes, e.g. `@inline(always)`, i.e. a parameterized marker; multiple values can be provided as comma-delimited metadata expressions. * Key-value attributes, e.g. `@props(<key> = <value>, ...)`, where `<key>` must be a valid identifier, and `<value>` must be a valid metadata expression. Multiple key-value attributes can be set at once, or you can specify the same key-value attribute multiple times, so long as each instance does not have any keys that conflict with previous instances. Metadata expressions come in three possible types: * bare identifier, e.g. `foo` * quoted string, e.g. `"some text"` * integer value, either decimal or hexadecimal format, e.g. `1` or `0x01` Attributes will provide the foundation for upcoming changes that will rely on being able to attach metadata to procedures. For now, attributes may _only_ be attached to procedure definitions, not re-exports or any other syntactic construct. NOTE: This does not yet act on any attributes, nor store them anywhere when assembling to MAST. For now, they are simply parsed, made available in the AST, and ignored. Future PRs will introduce these as needed. Closes #1434

bobbinth · 2024-09-27T05:37:25Z

#1510 added parsing of annotations to the assembler. I think what remains propagating this information through to MastForest. Should we create a different issue for this?

bitwalker · 2024-09-27T05:46:51Z

Probably we also need to add an annotation for a dead code, which should be compiled anyway.

We came across this use case in the 0xPolygonMiden/miden-base#803 PR: almost all procedures in the kernel will be called only by their hash, so it makes sense to change them from export to proc, but in that case we cannot get the hashes of these procedures, since they were not compiled and were not added to the MastForest.

The assembler would treat such a procedure as if it was an export procedure but it would not add it to the list of exports for the module.

I would argue that any procedure that is callable from outside the object produced by the assembler, should be marked export, regardless of whether it is meant to be called indirectly or not. These visibility modifiers are the only meaningful way to convey information about whether something is semantically exported or not.

However, I do think we need to ensure that indirect calls to library-internal procedures is supported. In that case, the indirect call isn't to an "exported" procedure, and more like capturing the address of a function and calling it. To do so, we can track which procedures have had their "address taken" via procref, and ensure that those procedures are indirectly callable. We'll need to do the tracking part any way in order to support dead code elimination - which is sort of what you are getting at, except for I don't think we need an annotation for it: procedures can either be marked export (which will prevent them from being DCE'd), or have their address taken via procref (which will indicate that they are used, and thus not candidates for DCE).

IMO, the more interesting option for an annotation, would be something like @derive(dyncall), which would instruct the assembler to generate two variants of the procedure, one which is a dyncall-able wrapper around the original procedure that handles the inconsistency introduced by the hash argument remaining on the operand stack. Call sites would then be modified to call the appropriate procedure based on what invoke instruction is used (i.e. exec gets foo, while dynexec gets foo_dyn). Ideally the dyn* instructions could be modified so as to not require this bifurcation of callees, but as a workaround, generating the boilerplate for them would be a major convenience IMO.

bitwalker · 2024-09-27T06:05:09Z

#1510 added parsing of annotations to the assembler. I think what remains propagating this information through to MastForest. Should we create a different issue for this?

Probably a good idea - there are a few open questions around how this information should be handled that I think we'll want to pin down:

The whole "can we support annotating re-exported procedures" question. I think an easy "yes" can be given for some annotations. For example, if we add support for lints/deprecations, and wanted to mark a re-export as @deprecated("0.2.0", "use bar instead"), that's fine, as the re-export being deprecated when the original definition is not has value; similarly, they could both be marked deprecated differently, and resolving that conflict is obvious, as the one closest to the call site takes precedence. It's a completely different situation for annotations that aren't just "soft" metadata though, and may imply something about the behavior of the callee, or worse, change the behavior of either the callee or the caller. Propagating attributes in such a way that we can "do the right thing" at a call site for a given procedure can be a non-trivial question, if every re-export can potentially add new annotations that may override/conflict with those of the actual definition. We don't necessarily need to answer these questions now, but the answer will affect how we propagate this information in the assembler.
We probably should reserve some annotations that we're likely to need, and raise diagnostics if they are used (until we actually implement handling for them). For example, I'd probably reserve right out of the gate: warn, allow, forbid, deprecated, error, warning, test, derive, inline, type, doc, storage, feature, cfg, global, const, constant, dynamic, and whatever else is vaguely keyword like, or that we already anticipate using in the future or think we might want to use in the future.
We might want to flag the annotations we need with the highest priority, ideally with different categories of semantics, and look at how we'd support those, and use that process to make sure whatever implementation we come up with for propagating annotation information is solid enough to support all of them without too many contortions.

bobbinth · 2024-09-28T01:01:53Z

#1510 added parsing of annotations to the assembler. I think what remains propagating this information through to MastForest. Should we create a different issue for this?

Probably a good idea -

Created #1517 for this.

bobbinth · 2024-09-28T01:14:03Z

I would argue that any procedure that is callable from outside the object produced by the assembler, should be marked export, regardless of whether it is meant to be called indirectly or not. These visibility modifiers are the only meaningful way to convey information about whether something is semantically exported or not.

Generally, I agree, though there are a couple of nuances in this specific case.

These procedures are not meant to be called directly - they need a wrapper and without this wrapper they won't function properly. So, there is a publicly exposed wrapper and this wrapper then calls these procedures dynamically.
These procedures are exported from the kernel module. So, since they are marked as export, they get included into the Kernel struct. This shouldn't happen and we either need to post-process the struct to remove them (which would be quite hacky). Though, maybe this will get resolved with procedure visibility modifiers discussed in #1436.

bobbinth added the assembly Related to Miden assembly label Aug 7, 2024

bobbinth added this to the v0.11.0 milestone Aug 7, 2024

bobbinth mentioned this issue Aug 7, 2024

Allow multi-module kernels #1436

Open

bobbinth assigned bitwalker Aug 18, 2024

bobbinth mentioned this issue Sep 2, 2024

Dynamic kernel procedure invocation 0xPolygonMiden/miden-base#803

Merged

2 tasks

Fumuran mentioned this issue Sep 2, 2024

Change kernel procedures from export to proc 0xPolygonMiden/miden-base#852

Open

bitwalker mentioned this issue Sep 5, 2024

Binding against Miden procedures from Rust 0xPolygonMiden/compiler#304

Open

bobbinth mentioned this issue Sep 14, 2024

Develop a way to create and a package format for smart contracts 0xPolygonMiden/miden-base#550

Open

bitwalker mentioned this issue Sep 23, 2024

feat: implement procedure annotation syntax #1510

Merged

bobbinth mentioned this issue Sep 28, 2024

Propagate procedure attributes from AST to MastForest #1517

Open

bobbinth closed this as completed Sep 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for procedure annotations to the assembler #1434

Add support for procedure annotations to the assembler #1434

bobbinth commented Aug 7, 2024

bitwalker commented Aug 7, 2024

Fumuran commented Sep 2, 2024

bobbinth commented Sep 2, 2024

bobbinth commented Sep 27, 2024

bitwalker commented Sep 27, 2024 •

edited

Loading

bitwalker commented Sep 27, 2024

bobbinth commented Sep 28, 2024

bobbinth commented Sep 28, 2024

Add support for procedure annotations to the assembler #1434

Add support for procedure annotations to the assembler #1434

Comments

bobbinth commented Aug 7, 2024

bitwalker commented Aug 7, 2024

Fumuran commented Sep 2, 2024

bobbinth commented Sep 2, 2024

bobbinth commented Sep 27, 2024

bitwalker commented Sep 27, 2024 • edited Loading

bitwalker commented Sep 27, 2024

bobbinth commented Sep 28, 2024

bobbinth commented Sep 28, 2024

bitwalker commented Sep 27, 2024 •

edited

Loading