Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New opcode: modexp #6139

Open
mangoplane opened this issue Sep 19, 2024 · 12 comments · May be fixed by #6140
Open

New opcode: modexp #6139

mangoplane opened this issue Sep 19, 2024 · 12 comments · May be fixed by #6140
Assignees
Labels
new-feature-request Feature request that needs triage

Comments

@mangoplane
Copy link

mangoplane commented Sep 19, 2024

Problem

I believe modexp (modular exponentiation) should be available as an opcode, as it seems like an important crypto primitive. It is also present in many other chains like EVM. While I have implemented its functionality in Puya BigNumber, I'm afraid its cost is prohibitively expensive for most real-world applications like RSA signature verification. This is why I propose it be made an opcode.

This opcode would pave the way for several novel use cases, such as supporting RSA, which is still a popular cryptographic function seen in many places like JWT verification and DKIM in email.

Solution

I offer to make a PR adding support for the opcode, updating several source files in data/transactions/logic to integrate it. Before creating the PR, I thought it best to raise an issue to provide context and see whether it's something others would like to see added. The API design would be something typical, likely modexp(base: Bytes, exponent: Bytes, modulus: Bytes). The opcode cost would be a function of the input size, similar to base64_decode.

Dependencies

There are no known dependencies.

Urgency

I am aware of several developers, including myself, who wish to support RSA in some form, such as for DKIM in email or privacy-preserving JWT verification. In that sense, it seems like an important opcode.

@mangoplane mangoplane added the new-feature-request Feature request that needs triage label Sep 19, 2024
@emg110
Copy link
Contributor

emg110 commented Sep 19, 2024

As a builder in ecosystem, I humbly think that will be a very useful OpCode. I humbly suggest if you can make that PR and do it, go for it. Demand for such contributions is very high right now.

@jannotti
Copy link
Contributor

jannotti commented Sep 19, 2024

I'd support that, I believe it is fairly trivial, since Go's big.Int supports it.

I'm not completely sure about the cost. Can the computational cost be expressed by multiplying the lengths of the inputs by some factors and summing?

It should probably be called bmodexp to fit the pattern of other opcodes that operate on bignums/bytes.

@jannotti
Copy link
Contributor

Separately, I would also support an rsa_verify though my impression is that RSA has more competing formats than ecdsa or ed25519. Anyway, if someone wants RSA, it might make just as much sense to get the entire operation into a single opcode.

@FrankSzendzielarz
Copy link

FrankSzendzielarz commented Sep 19, 2024

Separately, I would also support an rsa_verify though my impression is that RSA has more competing formats than ecdsa or ed25519. Anyway, if someone wants RSA, it might make just as much sense to get the entire operation into a single opcode.

It's probably worth doing as more interest across chains in auth seems to be happening, and passkey related solutions still have to contend with Windows Hello as a platform authenticator, and that uses RS256 (cose algo -257). AFAIK no other chain is dealing with passkey delegated smart account control using anything other than ecdsa or ed25519 type stuff, so an RS256 Windows Hello verifier might give you guys some kind of edge

@mangoplane
Copy link
Author

mangoplane commented Sep 19, 2024

Separately, I would also support an rsa_verify though my impression is that RSA has more competing formats than ecdsa or ed25519. Anyway, if someone wants RSA, it might make just as much sense to get the entire operation into a single opcode.

I'm not opposed to the idea, and see value in implementing this. However as you touched on, rsa_verify comes in many flavors that may introduce too much complexity to support universally, with the current standard being defined by RFC8017.

What might be better is to expect developers implement it themselves with the modexp opcode. In Puya-RSA, I have already implemented RSASSA-PKCS1-V1_5-VERIFY from RFC8017 , which supports RS256. The upcoming modexp opcode would be a drop-in replacement for the TEAL version I currently have.

If it were to be added as an opcode, it'd make sense to support the widely used RSASSA-PKCS1-V1_5-VERIFY function. This signature verification algorithm can verify RS256 signatures if SHA256 is used to create the message digest.

@mangoplane
Copy link
Author

I'd support that, I believe it is fairly trivial, since Go's big.Int supports it.

I'm not completely sure about the cost. Can the computational cost be expressed by multiplying the lengths of the inputs by some factors and summing?

It should probably be called bmodexp to fit the pattern of other opcodes that operate on bignums/bytes.

Thanks for the suggestions. I'll do some research into the cost, and propose something in the PR which we can review together.

@mangoplane
Copy link
Author

Preliminary work into figuring out a suitable function to estimate the cost is complete. Inspired by EIP-2565, it appears that fitting a line to x = exponent_length * max(base_length, modulus_length)**2, with y as the average ns/op (after scaling by the average Algorand gas cost per ns as estimated with b%), yields a good fit. My test data is a search grid over the range of 8-128 bytes, in increments of 8 bytes made with Golang bench similar to other benchmarks.

Note the data looks bimodal, as I think my computer didn't maintain constant speed on this task due to sleep or multitasking. It should be a good starting point to refine.

Jupyter Notebook of Results: https://colab.research.google.com/drive/1f9evrZLmLFPg6YgbkLHptZZGiNm_G6KX?usp=sharing

For comparison, a constant or linear function is a highly inaccurate estimate. Suggesting it makes sense to implement a custom cost function implementing the function estimated from the notebook in data/transactions/logic/opcodes.go. Interested to hear everyone's input.

@mangoplane
Copy link
Author

mangoplane commented Sep 20, 2024

In the meantime, I'll push what what I have as a PR for review and leave the cost as a placeholder with TODO comment, until I hear back about next steps for the cost.

@mangoplane mangoplane linked a pull request Sep 21, 2024 that will close this issue
@mangoplane
Copy link
Author

mangoplane commented Sep 22, 2024

Regarding the cost model for bmodexp, @jannotti:

I have conducted a more in-depth analysis, comparing three classes of models with automated model selection. It appears that the log-polynomial model is by far the most accurate, while the linear model is unfortunately highly inaccurate. The cost for the maximum input length is too high to set the cost as a constant. The relative performances of the models are assessed both in-sample and out-of-sample, using the known cost for the extreme input case as the out-of-sample test point. The models have fixed intercept through 0, because for zero length input the cost should be zero (although we will add a minimum cost to that). The coefficient for the model parameters are forced to be positive, based on the intuition that any increase in the input sizes should increase the cost, and to ensure there's no weird negative case to deal with.

Colab: https://colab.research.google.com/drive/1f9evrZLmLFPg6YgbkLHptZZGiNm_G6KX
PDF: https://drive.google.com/file/d/1Z6HjmV8scdztfz7ZM6f2VUNRP8nkysYd/view?usp=sharing

The data may not be available in the Colab, as it isn't permanently uploaded. Should you have issues viewing the results, please see the PDF.

@mangoplane
Copy link
Author

mangoplane commented Sep 22, 2024

Based on the analysis, it appears that the log-polynomial model is the most suitable choice. This means that a custom function for calculating the cost should be added to data/transactions/logic/opcodes.go, specifically for bmodexp. However, I need clarification regarding the data-derived specification you mentioned in your PR feedback, as I'm unsure how to proceed with that aspect. Other than that, I should be ready to implement the log-polynomial model, depending on your thoughts regarding the above points.

@johnalanwoods
Copy link

@mangoplane - nice work, can we set up a quick call to chat?

John

CTO - Algorand Foundation

@mangoplane
Copy link
Author

Thanks @johnalanwoods! I’d be happy to set up a call. I'm available from tomorrow onward, and since I'm based in Sydney, Australia, any time between 8 AM and 9 PM AEST works well for me. Please feel free to send the invite to my email: [email protected]. I look forward to our conversation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new-feature-request Feature request that needs triage
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants
@jannotti @emg110 @johnalanwoods @FrankSzendzielarz @mangoplane and others