[question] How to check whether two SyntaxNodes are the same? #154

Open
FFdhorkin opened this issue Jul 2, 2023 · 4 comments

FFdhorkin commented Jul 2, 2023

I'm attempting to extract attributes of a certain type from a webpage, then reconstruct a JS object showing where they appeared in the original tree.

For example, I want to go from this input:

<div data-test='a'>
    <div data-test='b'></div>
    <div data-test='c'>
        <div data-test='d'></div>
    </div>
</div>

to

{
    a: {
        b: null,
        c: {
            d: null
        }
    }
}

The language in question is https://github.com/phoenixframework/tree-sitter-heex and I'm using this query:

const query = new Parser.Query(PhoenixHeexParser, `(
    (attribute
        (attribute_name) @attribute-name
        (quoted_attribute_value
            (attribute_value) @attribute-value
        )
    )
    (#eq? @attribute-name "data-test")
)`);
// captures() returns one entry per capture; keep only the attribute values.
const captures = query.captures(tree.rootNode)
  .filter((capture) => capture.name === 'attribute-value');
console.log(captures.map((c) => tree.getText(c.node)));

This correctly outputs ['a', 'b', 'c', 'd'], but there doesn't appear to be any obvious way to recover their positions relative to each other.

What I've been trying to do is take each matching capture's node, walk up through .parent repeatedly, and look for an ancestor that is the node from another match. But I can't figure out how to verify that I've found the right node, and I suspect there's a better way. Documentation on tree-sitter is very sparse, especially for this Node implementation. Comparing the text doesn't work, since the node text contains extraneous information and isn't guaranteed to be unique.

It might even be possible to do this with the original query itself, but I didn't see anything in the documentation that suggested this would be possible.
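
One way to avoid node comparison altogether is to nest the matches by source range. The sketch below is only an illustration: `nestByRange` and the `{ name, startIndex, endIndex }` records are hypothetical, and it assumes each record carries the range of the element that encloses the attribute (for example, obtained by walking `.parent` up from the captured value to the element node, whose type name depends on the grammar).

```javascript
// Hypothetical helper: build a nested object from records of the form
// { name, startIndex, endIndex }, where the indices describe the element
// that encloses each matched attribute (an assumption of this sketch).
function nestByRange(items) {
  // Sort by start ascending, then end descending, so outer ranges come first.
  const sorted = [...items].sort(
    (a, b) => a.startIndex - b.startIndex || b.endIndex - a.endIndex
  );
  const wrapper = { root: null };
  // Each stack entry points at the slot (container[key]) of an open range.
  const stack = [{ endIndex: Infinity, container: wrapper, key: 'root' }];
  for (const item of sorted) {
    // Close every range that ends before this item starts.
    while (stack[stack.length - 1].endIndex <= item.startIndex) stack.pop();
    const top = stack[stack.length - 1];
    // A leaf stays null until it receives its first child.
    if (top.container[top.key] === null) top.container[top.key] = {};
    const children = top.container[top.key];
    children[item.name] = null;
    stack.push({ endIndex: item.endIndex, container: children, key: item.name });
  }
  return wrapper.root;
}

// Made-up ranges that mirror the nesting of the example input.
const nested = nestByRange([
  { name: 'a', startIndex: 0, endIndex: 100 },
  { name: 'b', startIndex: 10, endIndex: 20 },
  { name: 'c', startIndex: 30, endIndex: 90 },
  { name: 'd', startIndex: 40, endIndex: 50 },
]);
console.log(JSON.stringify(nested));
```

Note that sibling elements with the same attribute value would overwrite each other in this shape, so a real version might need arrays instead of plain keys.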

sogaiu commented Jul 2, 2023

Regarding:

whether two SyntaxNodes are the same

Just some ideas:

  1. At least one of the test files for this repository appears to directly compare nodes for equality. Not sure if that can be relied on.

  2. Does comparing the pair of startPosition and endPosition of a node with those of another node work?

    Assuming the root node is not under consideration, is it possible for two nodes to have the same startPosition and endPosition pair?

    startPosition: Point;
    endPosition: Point;


On a side note, I think the underlying C library has a way to directly compare two nodes for equality, but I didn't find any mention of ts_node_eq in the node-tree-sitter codebase.
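
A position-based comparison along the lines of idea 2 could look like the sketch below. It assumes each node exposes `type`, `startIndex`, and `endIndex`; `sameNode` is a hypothetical helper, not part of any tree-sitter binding. Two distinct nodes can share a range (a parent whose only child spans all of it), so the sketch also compares `type`, which reduces collisions but still does not guarantee identity.

```javascript
// Hypothetical helper: heuristic equality for nodes that expose
// `type`, `startIndex` and `endIndex`. Not a true identity check.
function sameNode(a, b) {
  return (
    a.type === b.type &&
    a.startIndex === b.startIndex &&
    a.endIndex === b.endIndex
  );
}

// Plain objects standing in for syntax nodes (made-up type names).
const wrapperNode = { type: 'wrapper', startIndex: 0, endIndex: 10 };
const onlyChild = { type: 'child', startIndex: 0, endIndex: 10 };
console.log(sameNode(wrapperNode, { ...wrapperNode })); // same type and range
console.log(sameNode(wrapperNode, onlyChild)); // same range, different type
```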

FFdhorkin commented Jul 7, 2023

Thanks, @sogaiu! I switched to web-tree-sitter, which apparently is not ideal for Node, but it exposes the .equals method you mentioned, plus some other stuff missing from node-tree-sitter.
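
With an equality check available, the parent walk described in the original question might be sketched roughly as follows. `nearestCapturedAncestor` is a hypothetical helper; it assumes nodes expose `.parent` and an `equals(other)` method as web-tree-sitter nodes do, and the comparison can be swapped out through the `isSame` parameter.

```javascript
// Hypothetical helper: walk up from `node` and return the first candidate
// that matches one of its ancestors, or null if none does.
// `isSame` defaults to the nodes' own `equals` method (an assumption
// based on web-tree-sitter); pass another comparator if needed.
function nearestCapturedAncestor(node, candidates, isSame = (a, b) => a.equals(b)) {
  for (let cur = node.parent; cur; cur = cur.parent) {
    const hit = candidates.find((c) => isSame(cur, c));
    if (hit) return hit;
  }
  return null;
}

// Mock nodes: separate wrapper objects may describe the same underlying
// node, which is why `===` is not used for the comparison.
const mockNode = (id, parent = null) => ({
  id,
  parent,
  equals(other) { return other.id === this.id; },
});
const outer = mockNode(1);
const middle = mockNode(3, outer);
const inner = mockNode(4, middle);
const captured = [mockNode(1), mockNode(3)];
console.log(nearestCapturedAncestor(inner, captured).id);
```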

ahlinc added the question label Jul 8, 2023


RedCMD commented Jan 3, 2024

You can pass start and end points to captures() to gather all nodes that intersect the range:

query.captures(tree.rootNode, startPosition, endPosition);

I'm interested in what you came up with. Did you manage to find a clean design?

sogaiu commented Aug 9, 2024

@liaodalin19903 AFAIU, ordinary participants (me included) are not given the ability to label an issue (except when we create the issue, I think some labels may apply automatically).

P.S. ahlinc no longer participates in this repository, may be better not to ping / tag him.
