draft: LSP21 Metadata Discovery (Zero Data Key) #194

Draft pull request: 4 commits into base: main
123 changes: 123 additions & 0 deletions LSPs/LSP-21-Metadata-Discovery.md
@@ -0,0 +1,123 @@
---
lip: 21
title: Metadata Discovery
author: Jean Cavallera, Samuel Videau, Hugo Masclet, Callum Grindle
discussions-to: shorturl.at/fnwKS
status: Draft
type: LSP
created: 2023-03-17
requires: ERC725Y, LSP2
---

<!--You can leave these HTML comments in your merged LIP and delete the visible duplicate text guides, they will not appear and may be helpful to refer to if you edit it again. This is the suggested template for new LIPs. Note that an LIP number will be assigned by an editor. When opening a pull request to submit your LIP, please use an abbreviated title in the filename, `lip-draft_title_abbrev.md`. The title should be 44 characters or less.-->

## Simple Summary
<!--"If you can't explain it simply, you don't understand it well enough." Provide a simplified and layman-accessible explanation of the LIP.-->
This standard defines the **zero data key** `0x0000000000000000000000000000000000000000000000000000000000000000` as an entry point for an ERC725Y smart to make its metadata publicly discoverable and retrievable.
Contributor

Suggested change
This standard defines the **zero data key** `0x0000000000000000000000000000000000000000000000000000000000000000` as an entry point for an ERC725Y smart to make its metadata publicly discoverable and retrievable.
This standard defines the **zero data key** `0x0000000000000000000000000000000000000000000000000000000000000000` as an entry point for an ERC725Y smart contract to make its metadata publicly discoverable and retrievable.



## Abstract
<!--A short (~200 word) description of the technical issue being addressed.-->
This standard addresses the issue of making the different schemas used by an ERC725Y contract discoverable for users or applications that interact with the contract for the first time. This is useful for applications that have no prior knowledge of the different JSON schemas used for the metadata, and that do not know where this schema can be obtained off-chain.
Contributor

Suggested change
This standard addresses the issue of making the different schemas used by an ERC725Y contract discoverable for users or applications that interact with the contract for the first time. This is useful for applications that have no prior knowledge of the different JSON schemas used for the metadata, and that do not know where this schema can be obtained off-chain.
This standard addresses the issue of making the different schemas used by an ERC725Y contract discoverable for users or applications that interact with the contract for the first time. This is useful for applications that have no prior knowledge of the different JSON schemas used for the metadata, and do not know where these schemas can be obtained off-chain.


## Motivation
<!--The motivation is critical for LIPs that want to change the Lukso protocol. It should clearly explain why the existing protocol specification is inadequate to address the problem that the LIP solves. LIP submissions without sufficient motivation may be rejected outright.-->
The LSP2 standard provides a schema that enables to read and interpret the metadata of an ERC725Y smart contract in a human friendly. This is also useful for tools to automate encoding and decoding of standard entries in the storage of a ERC725Y smart contract.
Member

In the first sentence.
in a human friendly. sounds strange.
Maybe it should be as human friendly or even:

The LSP2 standard provides a JSON schema that makes the metadata of an ERC725Y smart contract human-readable. 
  • Changed schema to JSON schema as the first one is too broad of a term.
  • Changed human friendly to human-readable. I think it's more appropriate as friendly sounds like the metadata was previously aggressive 😄

In the second sentence.
(1)
... useful for tools to automate encoding and decoding ...
->
... useful for automation of encoding and decoding ...

(2)

  • What do you mean by standard entries?

(3)
a ERC725Y -> an ERC725Y

Contributor

Actually I would simply get rid of human friendly as it's not ahah, it just allows programs to interpret the data. So I would change to:
The LSP2 standard introduces a schema format that facilitates the definition, reading, and interpretation of data stored on an [ERC725Y](https://github.com/ERC725Alliance/ERC725/blob/develop/implementations/contracts/ERC725YCore.sol) smart contract.


### Current Problem

Despite the benefits that LSP2 provides, a problem around metadata remains:

> _how does someone that does not know the set of ERC725Y JSON schemas used by a smart contract can read the data from the contract storage in the first place?_
Contributor

Suggested change
> _how does someone that does not know the set of ERC725Y JSON schemas used by a smart contract can read the data from the contract storage in the first place?_
> _how does someone that does not know the set of ERC725Y JSON schemas used by a smart contract read the data from the contract storage in the first place?_


With no prior knowledge of the schemas, the contract metadata cannot be fetched as the schema helps to construct the `bytes32` data key, so that the contract can be queried to fetch data from it.
Member

(1)
.. the contract metadata .. -> .. the contract's metadata ..

(2)
.. the schema helps to construct .. -> .. the schema is required to construct ..

(3)
.. the bytes32 data key, so that the contract can be queried to fetch data from it.
->
.. the bytes32 data key used to fetch the data.
or
.. the bytes32 data key used to fetch the data from the contract.

The final result looks like this:

With no prior knowledge of the schemas, the contract's metadata cannot be fetched as the schema is required to construct the bytes32 data key used to fetch the data.


### Existing Solutions

Currently, the JSON schemas can be obtained through various ways, including public/private Github repositories, documentation websites, README, packages or Gist. There is no standard "rules" or recommendations on where and how these schemas should be shared, which leads to a need for a "LSP2 JSON sharing" model, a way to store the link of the JSON Metadat where the schemas can be retrieved from.
Contributor

Suggested change
Currently, the JSON schemas can be obtained through various ways, including public/private Github repositories, documentation websites, README, packages or Gist. There is no standard "rules" or recommendations on where and how these schemas should be shared, which leads to a need for a "LSP2 JSON sharing" model, a way to store the link of the JSON Metadat where the schemas can be retrieved from.
Currently, the JSON schemas can be obtained through various ways, including public/private Github repositories, documentation websites, README, packages or Gist. There is no standard "rules" or recommendations on where and how these schemas should be shared, which leads to a need for a "LSP2 JSON sharing" model, a way to store the link of the JSON Metadata where the schemas can be retrieved from.

Member

There is no standard "rules" -> change is to are.

Member

"LSP2 JSON sharing" -> "LSP2 JSON schema sharing"

Because we are sharing schemas not just JSON (though, technically it's just JSON).


The only way to be able to read all the data of an ERC725Y contract without prior knowledge of it is to be aware of all the schemas available. In the previous "link sharing model", users and participants are aware of the schemas through third party services, where the schemas are hosted and published. To accomplish this without a trusted party, the schemas must be publicly discoverable, and we need a system for participants to agree on a single method to retrieve these schemas and the metadata.
Member

... and we need a system for participants to agree on a single method to retrieve these schemas and the metadata.

... and we need a system for participants to agree on that provides a single method to retrieve schemas and metadata.


One approach can be to store the external URL inside the smart contract on-chain.

A common solution is to introduce a state variable inside the smart contract that can be publicly queried. However, using this method creates several limitations and inconsistencies:
Comment on lines +41 to +43
Member

Since you are talking here about Existing solutions I suggest the following rephrasing:

One of the approaches is to store the external URL inside the smart contract on-chain.
A common implementation is the introduction of a state variable inside the smart contract that can be publicly queried. ...


1. different smart contract implementations can use different variable names or getter functions.
2. different smart contract implementations can store the Metadata Schema URL at different slots in the storage.

This lead to a non-standard way to retrieve this important information, as there is no "standard rule" for retrieving these schemas.
Member

This lead .. -> This leads ...

Member

I'd even suggest replacing this sentence with the following:

All of the above just emphasizes the lack of standardization and none of the solutions provides a consistent way of retrieving schemas used by a smart contract.


We need a way for users and tools:
- to know **how to retrieve** the JSON Schemas of the publicly available metadata.
Member

I'm not sure it really fits here but instead of how to retrieve I want to say how to consistently retrieve.

Member

JSON Schemas -> schemas
Or at least JSON schemas.

And on the next line schemas should be updated then to JSON schemas.

- to know **how to remember** how the schemas can be retrieved.

### Proposed Solution

For our purpose, we use a single unique and easy to remember `bytes32` data key: the **zero data key**: `0x0000000000000000000000000000000000000000000000000000000000000000`.
Contributor

@samuel-videau, Mar 24, 2023

Suggested change
For our purpose, we use a single unique and easy to remember `bytes32` data key: the **zero data key**: `0x0000000000000000000000000000000000000000000000000000000000000000`.
For our purpose, we use a standardized, unique and easy to remember `bytes32` data key: the **zero data key**: `0x0000000000000000000000000000000000000000000000000000000000000000`.


The advantage of the **zero data key** over other data keys is that it is unique and easy to remember, while the hash of the data key requires to remember the hash of the data that was hashed to obtain the `bytes32` data key.
Member

... the hash of the data key requires to remember the hash of the data that was hashed ...

Probably you meant: ... the hash of the data key requires remembering the data that was hashed ...


## Specification
<!--The technical specification should describe the syntax and semantics of any new feature. The specification should be detailed enough to allow competing, interoperable implementations for any of the current Ethereum platforms (go-ethereum, parity, cpp-ethereum, ethereumj, ethereumjs, and [others](https://github.com/ethereum/wiki/wiki/Clients)).-->

The following schema defines the **zero data key** to make schemas and metadata of an ERC725Y smart contract publicly discoverable.

```json
{
    "name": "LSP21MetadataDiscovery",
    "key": "0x0000000000000000000000000000000000000000000000000000000000000000",
    "keyType": "Singleton",
    "valueType": "string",
    "valueContent": "<JSON|JSONURL>"
}
```
Comment on lines +65 to +73
Member

Thinking about the requirement for the case when we store a link to JSON file gave me an idea that it could be useful to have actually 2 keys:

  • 1 for storing raw JSON;
  • 1 for storing link to a JSONURL.

You may change the external file frequently and post once in a while updates to the first key that holds JSON on-chain.
The standardised JSON file may have the following format that is used by both on and off-chain files:

{
    "date": 1679149644213,
    "schemas": [
        ...
    ]
}

The date key will allow you to guess which one is more relevant and pick schemas from the more recent one.

Member

As an alternative we can write a suggestion that says the following:

To ensure the best experience and reduce dependency on 3rd parties, you should replace JSONURL with raw JSON once you are certain it is no longer going to change or won't change anytime soon. This way you'll make sure that others will be able to read your contract's metadata.

But I'm not sure yet how that will be done by non-tech people like some future users of UP.


The data stored under the **zero data key** can be one of the following two options:
- **on-chain**: a `JSON` file as utf8 encoded string.
Contributor

Really like this idea of on-chain encoded JSON containing the schema. Could be a bit expensive, though it's for a full contract, so might be worth it in some use cases. Even though I'm not a big fan of that, a lot of people are "on-chain maximalists" ahah, and would probably appreciate this "feature". On Ethereum, I saw a lot of on-chain NFT projects, where even the NFT visuals were stored on-chain (either pixels stored on chain or even vectors)

Member Author

@CJ42, Mar 18, 2023

At the moment, JSON does not exist as a valueContent in LSP2.

@frozeman

I think we should add JSON for valueContent in the LSP2 standard because of this proposal + @samuel-videau points.

With the requirements that JSON valueContent is a stringified JSON data.

Member

About adding new value content: since serialized JSON is just a JSON string represented as UTF-8 bytes I think we can add both JSON and String. The former will just hint that the content can be decoded as a JSON object.

- **off-chain**: a `JSONURL` linking to
Member

Just a suggestion: ... linking to a JSON file.


_Requirements_

Whether the Schemas are stored on or off-chains, the JSON data MUST adhere to the following requirements:
Member

I guess it's a typo Schemas. I think it should be uncapitalized.

- MUST be an array of Metadata JSON schemas that comply with the [LSP2 JSON Schema object format](./LSP-2-ERC725YJSONSchema.md#specification).
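
The requirement above could be checked along the following lines. This is only a minimal sketch: the function name `parse_schema_list` is hypothetical, and the list of required fields is an assumption taken from the five fields used in the schema object shown in this section (`name`, `key`, `keyType`, `valueType`, `valueContent`), not a full LSP2 validator.

```python
import json
from typing import Any, Dict, List

# Fields assumed to be required on every schema entry (mirrors the
# schema object shown in this specification; not a complete LSP2 check).
REQUIRED_FIELDS = ("name", "key", "keyType", "valueType", "valueContent")


def parse_schema_list(raw: str) -> List[Dict[str, Any]]:
    """Parse a JSON string and check that it is an array of schema objects."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("top-level JSON value must be an array of schemas")
    for index, entry in enumerate(data):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index} is not a JSON object")
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            raise ValueError(f"entry {index} is missing fields: {missing}")
    return data
```

A consumer would typically run such a check on the decoded value before using any of the schemas it contains.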

### When the Schemas are stored on-chain

- _What are the requirements_
Contributor

Suggested change
- _What are the requirements_
- Requirements: The JSON data should be stored as a utf8 encoded string within the smart contract itself, ensuring the data is permanently available on-chain.

- _Put an example here_
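
As a rough illustration of the on-chain option, the sketch below serialises an array of schemas to a UTF-8 string and shows the hex bytes that would be written under the zero data key, then decodes them back. The helper names `encode_schemas` and `decode_schemas` are hypothetical, and the sketch assumes the value is stored as the raw UTF-8 bytes of the stringified JSON, which is what the review discussion proposes rather than something LSP2 currently defines.

```python
import json
from typing import Any, Dict, List


def encode_schemas(schemas: List[Dict[str, Any]]) -> str:
    """Stringify the schema array compactly and return it as 0x-prefixed hex."""
    text = json.dumps(schemas, separators=(",", ":"))
    return "0x" + text.encode("utf-8").hex()


def decode_schemas(value_hex: str) -> List[Dict[str, Any]]:
    """Reverse of encode_schemas: hex -> UTF-8 string -> JSON array."""
    stripped = value_hex[2:] if value_hex.startswith("0x") else value_hex
    return json.loads(bytes.fromhex(stripped).decode("utf-8"))
```

Compact separators keep the stored string short, which matters because every byte written on-chain costs gas.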


### When the Schemas are stored off-chain

- _What are the additional requirements_
Contributor

Suggested change
- _What are the additional requirements_
- Requirements: The smart contract should store a `JSONURL` pointing to the off-chain location where the JSON data can be found. This off-chain location should be accessible and reliable.

- _Put an example here_
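
For the off-chain option, a reader first has to split the stored `JSONURL` value into its parts. The sketch below assumes the `JSONURL` byte layout described in LSP2 (a 4-byte hash function identifier, a 32-byte hash of the JSON file, then the UTF-8 encoded URL). The names `JsonUrl` and `decode_json_url` are hypothetical, and the sketch does not fetch the file or verify the hash.

```python
from typing import NamedTuple


class JsonUrl(NamedTuple):
    hash_function: str  # 4-byte identifier of the hash function, as hex
    content_hash: str   # 32-byte hash of the JSON file, as hex
    url: str            # location of the JSON file


def decode_json_url(value_hex: str) -> JsonUrl:
    """Split a JSONURL value into hash function id, content hash and URL."""
    stripped = value_hex[2:] if value_hex.startswith("0x") else value_hex
    raw = bytes.fromhex(stripped)
    if len(raw) < 36:
        raise ValueError("value too short to be a JSONURL")
    return JsonUrl(
        hash_function="0x" + raw[:4].hex(),
        content_hash="0x" + raw[4:36].hex(),
        url=raw[36:].decode("utf-8"),
    )
```

After downloading the file from `url`, a client would be expected to hash its content and compare the result with `content_hash` before trusting the schemas.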
Comment on lines +92 to +93
Member

Probably obvious but still I'll write it down:

This method is cheaper but less secure. Recommended for cases when a smart contract changes very frequently.

Your responsibility is:

  • To make sure the link is publicly available;
  • To make sure the link leads to a JSON file, not a directory of JSON files or anything else;
  • To find the best hosting service you can that will store this file;
  • To regularly check if the file is still available. Some automation might help here.



## Rationale
<!--The rationale fleshes out the specification by describing what motivated the design and why particular design decisions were made. It should describe alternate designs that were considered and related work, e.g. how the feature is supported in other languages. The rationale may also provide evidence of consensus within the community, and should discuss important objections or concerns raised during discussion.-->
The rationale fleshes out the specification by describing what motivated the design and why particular design decisions were made. It should describe alternate designs that were considered and related work, e.g. how the feature is supported in other languages. The rationale may also provide evidence of consensus within the community, and should discuss important objections or concerns raised during discussion.-->
Comment on lines +97 to +98
Member

The comment is duplicated.


The link to the schema can point to either a decentralised storage network like IPFS or a centralised server such as a private Google Drive, according to the preference of the user.

So when a user, a tool or another smart contract wants to read and interact with the ERC725Y metadata of a contract that it does not know, it can fetch the JSON schemas from the zero data key to discover the publicly available metadata and then extract the data from the contract's storage.
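
The discovery flow described above can be sketched as follows. The `get_data` callable is a stand-in for a call to the contract's ERC725Y `getData(bytes32)` function, and the function names are hypothetical. The sketch assumes the on-chain variant (stringified JSON stored directly under the zero data key) and only reads `Singleton` entries, since other key types need extra key construction as defined in LSP2.

```python
import json
from typing import Any, Callable, Dict, List

ZERO_DATA_KEY = "0x" + "00" * 32

# Stand-in for ERC725Y getData(bytes32): takes a data key, returns raw bytes.
GetData = Callable[[str], bytes]


def discover_schemas(get_data: GetData) -> List[Dict[str, Any]]:
    """Read the zero data key and decode the schema array stored there."""
    raw = get_data(ZERO_DATA_KEY)
    if not raw:
        return []  # nothing published under the zero data key
    return json.loads(raw.decode("utf-8"))


def read_singletons(get_data: GetData) -> Dict[str, bytes]:
    """Fetch the raw value of every discovered Singleton data key."""
    return {
        schema["name"]: get_data(schema["key"])
        for schema in discover_schemas(get_data)
        if schema.get("keyType") == "Singleton"
    }
```

With a dictionary standing in for contract storage, `read_singletons(lambda key: storage.get(key, b""))` returns a mapping from schema names to their still-encoded values, which can then be decoded according to each schema's `valueType` and `valueContent`.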

Making the metadata publicly discoverable through the **zero data key** has the following benefits:

- if you interact with an unknown ERC725Y contract, you’ll still be able to fetch its data
- if a project gets abandoned, the ERC725Y JSON schemas will still be there and will not be lost

The **zero data key** also enables custom data keys (_e.g.: `CollectionDescription`, `CollectionImage`_) that are specific to a particular user (e.g.: a Universal Profile) to be publicly discoverable.

One objection to the **zero data key** is that it makes metadata more publicly visible and accessible. This is open to debate, as users or smart contracts might not want all their metadata to be publicly known and discoverable, and some might not want anyone to be able to look up certain metadata. For instance, a user might want to attach specific data to their Universal Profile but keep it hidden from the public, so that only the Universal Profile owner (the actual user) knows how to access it because they know the data key. Such users might therefore choose not to include the schemas of the metadata they want to keep private in the JSON linked to the **zero data key**.

## Backwards Compatibility
<!--All LIPs that introduce backwards incompatibilities must include a section describing these incompatibilities and their severity. The LIP must explain how the author proposes to deal with these incompatibilities. LIP submissions without a sufficient backwards compatibility treatise may be rejected outright.-->

## Test Cases
<!--Test cases for an implementation are mandatory for LIPs that are affecting consensus changes. Other LIPs can choose to include links to test cases if applicable.-->

## Implementation
<!--The implementations must be completed before any LIP is given status "Final", but it need not be completed before the LIP is accepted. While there is merit to the approach of reaching consensus on the specification and rationale before writing code, the principle of "rough consensus and running code" is still useful when it comes to resolving many discussions of API details.-->

## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).