
[Proposal] Add an alternative, more explicit way of using the library. #95

Open
MinekPo1 opened this issue Mar 19, 2022 · 5 comments

MinekPo1 commented Mar 19, 2022

Motivation

  1. Readability

The abstract meta-programming features require the reader to be acquainted with the library. While acceptable in most environments, in some this is a downside.

  2. Third-party tools

While this is somewhat an extension of readability, I feel it's also important to mention. In some environments, third-party tools can be required, and long comments disabling parts of these tools are not only looked down on but can also prevent the tools from checking the code the developer wrote.

  3. Personal preference

For some, more explicit aliases can be preferable, like how some people prefer tabs over spaces (haha).

Potential solution

sly.explicit

This new optional sub-module would include explicit aliases for the existing meta-programming syntax.

  • sly.explicit.TokenType()

Replaces:

class SomeLexer(Lexer):
    ...
    ABC_TOKEN = r'[abc]'
    ABC_TOKEN['a'] = 'A_TOKEN'

With:

class SomeLexer(Lexer):
    ...
    ABC_TOKEN = TokenType(r'[abc]')
    ABC_TOKEN['a'] = TokenType(name='A_TOKEN')
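
To make the idea concrete, here is a minimal sketch of what such a `TokenType` could look like. The class, its attributes, and the remap layout are assumptions for illustration only; none of this is part of SLY's real API.

```python
# Hypothetical sketch of a sly.explicit.TokenType (assumed names throughout).

class TokenType:
    """Explicit stand-in for SLY's string-based token declarations."""

    def __init__(self, pattern=None, *, name=None):
        self.pattern = pattern   # regex for the token, e.g. r'[abc]'
        self.name = name         # token name used when remapping
        self.remap = {}          # matched text -> remapped token name

    def __setitem__(self, matched, target):
        # ABC_TOKEN['a'] = TokenType(name='A_TOKEN') records a remap,
        # mirroring SLY's ABC_TOKEN['a'] = 'A_TOKEN' syntax.
        self.remap[matched] = target.name


ABC_TOKEN = TokenType(r'[abc]')
ABC_TOKEN['a'] = TokenType(name='A_TOKEN')
```

A Lexer metaclass could then collect these objects instead of inspecting raw strings.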
  • sly.explicit.add_action(), sly.explicit.add_rule()

Aliases for _.

Replaces:

class SomeLexer(Lexer):
    ...
    @_(r'\d')
    def NUMBER(self, t):
        ...
...
class SomeParser(Parser):
    ...
    @_("A_TOKEN ABC_TOKEN")
    def rule(self, p):
        ...

With:

class SomeLexer(Lexer):
    ...
    @add_action(r'\d')
    def NUMBER(self, t):
        ...
...
class SomeParser(Parser):
    ...
    @add_rule("A_TOKEN ABC_TOKEN")
    def rule(self, p):
        ...

The main difference between the two is whether the positional argument is of type Token or YaccProduction.
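
A rough sketch of how `add_action` and `add_rule` could be implemented as ordinary decorators follows. The attribute name `_rules` is an assumption; SLY's real `_` relies on metaclass bookkeeping rather than function attributes.

```python
# Hypothetical decorator aliases; '_rules' is an assumed attribute name.

def _attach(*patterns):
    def decorate(func):
        # Accumulate the pattern/rule strings on the function object so a
        # Lexer/Parser metaclass could pick them up later.
        func._rules = getattr(func, '_rules', ()) + patterns
        return func
    return decorate

# One implementation serves both names; at runtime they would differ only
# in whether the handler receives a Token or a YaccProduction.
add_action = _attach
add_rule = _attach

@add_action(r'\d')
def NUMBER(self, t):
    return t

@add_rule("A_TOKEN ABC_TOKEN")
def rule(self, p):
    return p
```
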

Extensive type annotations for interface functions

This would apply not only to the new explicit interface but also to the existing one, improving self-documentation.

Aside from the members of sly.explicit, this would also cover:

  • Parser.parse() and Lexer.tokenize()
  • Parser.error()
  • Members such as Parser.tokens, Token.value etc.
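
For illustration, here are stub signatures showing how these members could be annotated. These are assumptions about a possible typed interface, not SLY's actual definitions.

```python
# Illustrative annotation stubs only (assumed signatures, not SLY's API).
from typing import Any, Iterable, Iterator, Optional


class Token:
    type: str
    value: Any
    lineno: int
    index: int


class Lexer:
    def tokenize(self, text: str) -> Iterator[Token]:
        ...


class Parser:
    tokens: "set[str]"

    def parse(self, tokens: Iterable[Token]) -> Any:
        ...

    def error(self, token: Optional[Token]) -> None:
        ...
```
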

Final thoughts

I understand that meta-programming is in this library's spirit and that the problems I laid out in the motivation section are known and considered low (or even lower) priority. I think, however, that this proposed alternative interface would let those for whom these problems matter solve them.

Potentially, the note outlined in Contributing.md could be reworded to allow changes within this sub-module.

I look forward to any suggestions and hopefully the go-ahead for me to implement this.


MinekPo1 commented Mar 19, 2022

Also, an afterthought:

Possibly

SOME_TOKEN[r'pattern'] = TokenType(name='OTHER_TOKEN')

could be simplified to

SOME_TOKEN[TokenType(r'pattern',name='OTHER_TOKEN')]

This would, however, require modifying the underlying meta-syntaxes.
If instead it were implemented with a slice, it would add this syntax:

SOME_TOKEN[r'pattern':'OTHER_TOKEN']

Since TokenStr does not define its own __getitem__ method, this will not cause any collisions.
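
A sketch of the proposed slice syntax, assuming a `TokenStr`-like `str` subclass. The class and its `remap` attribute are illustrative assumptions, not SLY's actual internals.

```python
# Hypothetical TOKEN[r'pattern':'NAME'] slice syntax on a str subclass.

class TokenStr(str):
    def __new__(cls, value):
        self = super().__new__(cls, value)
        self.remap = {}   # matched pattern -> remapped token name
        return self

    def __getitem__(self, key):
        if isinstance(key, slice) and isinstance(key.start, str):
            # TOKEN[r'pattern':'NAME'] records a remap; returning self lets
            # the expression stand alone as a statement.
            self.remap[key.start] = key.stop
            return self
        # Fall back to normal string indexing for ints and int slices.
        return super().__getitem__(key)


ABC_TOKEN = TokenStr(r'[abc]')
ABC_TOKEN[r'a':'A_TOKEN']
```

Note that slices accept arbitrary objects as their start/stop, so string-valued slices are legal Python and can be distinguished from ordinary indexing at runtime.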

Edit:

Also, implementing __getitem__ could allow the user to drop the del keyword from

del SOME_TOKEN['KEYWORD']

@jpsnyder

I have no opinion on the proposed solution, but I do think having an option to do things more explicitly, if the need arises, would be nice. An explicit API could also help with other things, like inheriting classes or improving customization.

Although I imagine that would be a huge undertaking, so I wouldn't hold my breath.

@MinekPo1

> Although I imagine that would be a huge undertaking, so I wouldn't hold my breath.

I would guess that, with the implementation I have in my head, it could be done in under 30 lines plus small changes to existing code.

The hard part of those changes (figuring out the types) I have done in #96.

I'll try publishing an example implementation as a gist later today.


dabeaz commented Mar 25, 2022

I've been thinking about this. Bottom line: I don't want to provide an alternate API on SLY. The whole point of the project was to create a DSL for specifying parsers using sneaky metaprogramming features. I acknowledge that this sort of thing isn't for everyone. However, there are numerous other Python parsing tools that can solve the same problem as SLY using a variety of different APIs.

This said, I HAVE been thinking about a refactoring of SLY that more cleanly isolates the LALR(1) parsing engine from the top-level user interface. I might also break the parsing engine out into its own library that could be shared between SLY and PLY. So, perhaps one could (eventually) code something on top of that.

I'm also not opposed to someone taking SLY, modifying it to have a different interface, and releasing it as a different package. People did this kind of thing with the PLY project and it doesn't bother me at all. I'd just ask that you send me a link so that I could tell people about it on the SLY README file.

@MinekPo1

Thinking about it, TokenType is not really necessary, as TokenStr could just be used (granted, the name kwarg is not supported).

Maybe the decorators could be exposed, allowing them to be used directly instead of _, replacing the proposed aliases?

If this is not something you would find acceptable, I might create a separate module with the aliases I described.

Also, what do you think about the TOKEN["pattern":"name"] and TOKEN["keyword"] syntaxes?
