This repository has been archived by the owner on Jun 17, 2020. It is now read-only.

Rholang Lexer/Parser with Diagnostic API and informative errors #1015

Open
golovach-ivan opened this issue Oct 31, 2018 · 8 comments

golovach-ivan commented Oct 31, 2018

RhoLP - RhoLang Lexer/Parser

Current state: the interpreter and Web-Compiler use a front-end (lexer, parser) generated automatically by BNFC; it has no diagnostic API and often produces uninformative errors.

Idea: do NOT replace the CUP/JFlex interpreter front-end with a hand-written one. Instead, when the CUP/JFlex front-end reports an error, additionally run a hand-written lexer/parser (front-end only, not a full interpreter) to produce informative errors.
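The fallback idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the RhoLP or RChain API: `FallbackSketch`, `GeneratedFrontEnd`, `DiagnosticFrontEnd` and `Result` are hypothetical names, and the generated front-end is stood in for by a lambda.

```java
import java.util.Collections;
import java.util.List;

public class FallbackSketch {

    /** Hypothetical stand-in for the generated (BNFC/CUP/JFlex) front-end; throws on bad input. */
    public interface GeneratedFrontEnd {
        Object parse(String source) throws Exception;
    }

    /** Hypothetical stand-in for the hand-written diagnostic front-end. */
    public interface DiagnosticFrontEnd {
        List<String> diagnose(String source);
    }

    /** Either an AST from the generated front-end or diagnostics from the fallback. */
    public static final class Result {
        public final Object ast;
        public final List<String> diagnostics;

        Result(Object ast, List<String> diagnostics) {
            this.ast = ast;
            this.diagnostics = diagnostics;
        }

        public boolean ok() {
            return ast != null;
        }
    }

    public static Result compile(String source, GeneratedFrontEnd primary, DiagnosticFrontEnd fallback) {
        try {
            // Happy path: the generated front-end stays the only parser that is run.
            return new Result(primary.parse(source), Collections.<String>emptyList());
        } catch (Exception e) {
            // Error path: run the hand-written front-end only to explain the failure.
            List<String> diagnostics = fallback.diagnose(source);
            if (diagnostics.isEmpty()) {
                // Nothing better to say, so keep the original message.
                diagnostics = Collections.singletonList(String.valueOf(e.getMessage()));
            }
            return new Result(null, diagnostics);
        }
    }

    public static void main(String[] args) {
        // Toy front-ends for demonstration only.
        GeneratedFrontEnd primary = src -> {
            if (src.contains("#")) {
                throw new IllegalArgumentException("syntax error");
            }
            return src;
        };
        DiagnosticFrontEnd fallback = src -> src.contains("#")
                ? Collections.singletonList("there is no operator '#'")
                : Collections.<String>emptyList();

        System.out.println(compile("Nil", primary, fallback).ok());
        System.out.println(compile("a # b", primary, fallback).diagnostics);
    }
}
```

With this shape the hand-written front-end adds no cost for valid programs, since it runs only after the generated one has already failed.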

This bounty issue was created for the development epic RHOL-1027 = RHOL-1029 + RHOL-1030 + RHOL-1031.

Project RhoLP sources.

Part I: Lexer (36 codepoints)

  • Lexer skeleton: Diagnostics API (12 codepoints)
    • Standard error format, error codes
    • Error/warning message database
    • One scan, multiple diagnostic messages
  • Handling of non-existent literals (12 codepoints)
    • Int problems: integer literals that are too big, absent hex/binary formats ('0xFF', '0b1010')
    • Floating-point literals: '42.42e-42f'
    • Char literals: 'A', '\uFFFF'
  • Non-existent token types (12 codepoints)
    • Absent operators: '->', '%', '&', '&&', '^', etc.
    • Absent keywords: 'do', 'int', 'this', etc.
    • Absent UTF support
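A few of the lexer items above (uniform diagnostics with error codes, one scan reporting several problems, hex/binary and oversized integer literals) could look roughly like this. It is a minimal sketch, not the RhoLP implementation: the class and field names are hypothetical, the `lexer.err.example.*` codes are made up for illustration, and it assumes integer literals must fit a signed 64-bit value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LexerDiagnosticsSketch {

    public enum Severity { NOTE, WARN, ERROR }

    /** One diagnostic in a uniform format (hypothetical shape). */
    public static final class Diagnostic {
        public final Severity severity;
        public final String code;    // stable, machine-readable error code
        public final String message; // human-readable text
        public final int line;       // 1-based
        public final int column;     // 1-based, counted in Java chars

        public Diagnostic(Severity severity, String code, String message, int line, int column) {
            this.severity = severity;
            this.code = code;
            this.message = message;
            this.line = line;
            this.column = column;
        }

        @Override
        public String toString() {
            return severity + " " + code + " [" + line + ", " + column + "]: " + message;
        }
    }

    // Hex literal, binary literal, or plain decimal run; \b skips digits inside identifiers.
    private static final Pattern NUMBER =
            Pattern.compile("\\b(?:0[xX][0-9a-fA-F]+|0[bB][01]+|[0-9]+)");

    /** Scans the whole input once and collects every problem instead of stopping at the first. */
    public static List<Diagnostic> scan(String source) {
        List<Diagnostic> found = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            Matcher m = NUMBER.matcher(lines[i]);
            while (m.find()) {
                String text = m.group();
                int column = m.start() + 1;
                char second = text.length() > 1 ? text.charAt(1) : ' ';
                if (second == 'x' || second == 'X') {
                    found.add(new Diagnostic(Severity.ERROR, "lexer.err.example.hex-literal",
                            "there is no hex integer format: '" + text + "'", i + 1, column));
                } else if (second == 'b' || second == 'B') {
                    found.add(new Diagnostic(Severity.ERROR, "lexer.err.example.binary-literal",
                            "there is no binary integer format: '" + text + "'", i + 1, column));
                } else if (!fitsInLong(text)) {
                    found.add(new Diagnostic(Severity.ERROR, "lexer.err.example.int-too-big",
                            "integer literal is too big: '" + text + "'", i + 1, column));
                }
            }
        }
        return found;
    }

    /** Assumption: an integer literal is valid only if it fits a signed 64-bit value. */
    static boolean fitsInLong(String digits) {
        try {
            Long.parseLong(digits);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String source = "x!(0xFF) | y!(0b1010)\nz!(99999999999999999999)";
        for (Diagnostic d : scan(source)) {
            System.out.println(d);
        }
    }
}
```

Returning a list rather than throwing on the first problem is what lets a single scan report several messages; a listener callback would serve the same purpose.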

Part II: Parser

TBD

Benefit to RChain

1. The interpreter and Web-Compiler will be more user-friendly when errors occur
2. This hand-written lexer/parser can resolve the following issues:

  • Confusing error message around ellipsis: RHOL-501
  • Rholang interpreter errors should have a uniform structure: RHOL-488
  • better diagnostics for large integers please: RHOL-575
  • Error message for incorrect usage of % vs %% is not helpful: RHOL-592
  • Need consistent error messages around method invocation: RHOL-497
  • "Errors received during evaluation" not useful: RHOL-662
  • Parser does not understand floating point numbers: RHOL-256
  • compiler error message usability: RHOL-301

Example/Demo

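import net.golovach.rholp.*;
import net.golovach.rholp.log.*;
import java.util.List;

public class Demo {
    public static void main(String[] args) {
        // Scala-like input, not valid Rholang: used here to trigger diagnostics
        String content =
                "type T = Functor[({ type λ[α] = Map[Int, α] })#λ]";
        // The listener receives the diagnostics emitted during the scan
        DiagnosticListener listener = new DiagnosticCollapsedPrinter();
        RhoLexer lexer = new RhoLexer(content, listener);
        // One scan of the whole input; see the printed diagnostics below
        List<RhoTokenType> tokens = lexer.scanAll();
    }
}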
NOTE
  Error code: lexer.note.identifier-like-absent-keyword
  Message: identifier 'type' like absent keyword, may cause confusion
  Line/Column: [1, 1]
  ----------
  type T = Functor[({ type λ[α] = Map[Int, α] })#λ]
  ^^^^

ERROR
  Error code: lexer.err.non-existent.unicode.identifiers
  Messages:
    there is no Unicode support: 'λ', codepoint = 955, char[] = '\u03BB'
    there is no Unicode support: 'α', codepoint = 945, char[] = '\u03B1'
  Line/Column: [1, 26], [1, 28], [1, 42], [1, 48]
  ----------
  type T = Functor[({ type λ[α] = Map[Int, α] })#λ]
                           ^ ^             ^     ^ 
ERROR
  Error code: lexer.err.non-existent.operator
  Message:    there is no operator '#'
  Line/Column: [1, 47]
  ----------
  type T = Functor[({ type λ[α] = Map[Int, α] })#λ]
                                                ^ 
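The layout of the output above, with the source line followed by carets under each reported column, can be produced by a small helper. The sketch below is an assumption about how such rendering might work, not RhoLP code: `CaretSketch` and `markerLine` are hypothetical names, and columns are taken to be 1-based and counted in Java chars.

```java
import java.util.Arrays;

public class CaretSketch {

    /**
     * Builds the marker line for one source line: `length` carets starting at
     * each 1-based column, spaces elsewhere, trailing spaces trimmed.
     */
    public static String markerLine(String sourceLine, int length, int... columns) {
        char[] marks = new char[sourceLine.length()];
        Arrays.fill(marks, ' ');
        for (int column : columns) {
            int start = column - 1; // convert to a 0-based index
            for (int i = start; i < start + length && i < marks.length; i++) {
                if (i >= 0) {
                    marks[i] = '^';
                }
            }
        }
        int end = marks.length;
        while (end > 0 && marks[end - 1] == ' ') {
            end--;
        }
        return new String(marks, 0, end);
    }

    public static void main(String[] args) {
        String line = "x!(0xFF) | y!(0b1010)";
        // Mark the first character of each offending literal (columns 4 and 15).
        System.out.println("  " + line);
        System.out.println("  " + markerLine(line, 1, 4, 15));
        // Underline a whole four-character token starting at column 4.
        System.out.println("  " + line);
        System.out.println("  " + markerLine(line, 4, 4));
    }
}
```

Grouping several columns under one message, as in the Unicode error above, then amounts to passing all of them to a single call.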

Budget and Objective

Estimated Budget of Task: $[5400] for Part I (Lexer)
Estimated Timeline Required to Complete the Task: [3 weeks]
How will we measure completion? [example: committed library ready to integrate with Interpreter+Web-Compiler]

@golovach-ivan golovach-ivan self-assigned this Oct 31, 2018
@golovach-ivan golovach-ivan added the core-dev guide: @dckc, @JoshyOrndorff with Medha; see #273 label Oct 31, 2018

Barkov-F commented Nov 2, 2018 via email

@golovach-ivan

@Barkov-F
Can you be the code reviewer for this issue?

dckc commented Nov 7, 2018

This looks like it could be useful stuff, but not without detailed peer review.

I have asked many times that you find collaborators, at least as far back as August 3: #836 (comment)
Again Sep 19 and Sep 28 #945 (comment)
@JoshOrndorff reached out Oct 3 #991 (comment)

Integrating it with rchain.cloud looks interesting, but that wouldn't be core-dev. I think @tschoffelen would be the main point of contact there.

Also, the core-dev label is reserved for Bounties for Development work selected by Medha for the core dev team. The measure of completion is that a PR is accepted in https://github.com/rchain/rchain . (see #273 and Bounty Task Guides)

As for a budget, I don't see how to do that in the current climate; see #1012.

dckc commented Nov 7, 2018

Oh... @allancto tells me that @KellyatPyrofex is trying to get a relevant PR reviewed. That would qualify it for the core-dev label. Normally the PR has to get merged during the pay period, and October is over. But maybe it could work out.

@JoshOrndorff

I'd really love to learn how this works. Please LMK when you can give a tour. I've tried to build it on my own and posted the problems I encountered on Discord.

@glenbraun

I think it is interesting to write a parser. It is certainly useful to have a way to get the RhoTypes protobufs for any given Rholang code (assuming that's the data model this parser would use).
I would like to point out a way that you can use RChain itself to get the protobufs for any valid Rholang.
@"parser"!(
{
  new c, stdout(`rho:io:stdout`) in {
    contract c(x) = {
      stdout!(*x)
    }
  }
})
Using a client we can listenForDataAtName "parser" and receive the protobufs for the Rholang in the curly brackets. That is, just wrap any valid Rholang in curly brackets, send it on a name and then listen for that using a client, you'll get the protobufs graph of the Rholang. For example, the code above looks like this:
{ "news": [ { "bindCount": 2, "p": { "receives": [ { "binds": [ { "patterns": [ { "exprs": [ { "eVarBody": { "v": { "freeVar": 0 } } } ], "connectiveUsed": true } ], "source": { "exprs": [ { "eVarBody": { "v": { "boundVar": 1 } } } ], "locallyFree": "Ag==" }, "freeCount": 1 } ], "body": { "sends": [ { "chan": { "exprs": [ { "eVarBody": { "v": { "boundVar": 1 } } } ], "locallyFree": "Ag==" }, "data": [ { "exprs": [ { "eVarBody": { "v": { "boundVar": 0 } } } ], "locallyFree": "AQ==" } ], "locallyFree": "Aw==" } ], "locallyFree": "Aw==" }, "persistent": true, "bindCount": 1, "locallyFree": "Aw==" } ], "locallyFree": "Aw==" }, "uri": [ "rho:io:stdout" ] } ] }
I know we won't be able to send on public names in the future and will have to use a private name, but the concept is the same.


allancto commented Nov 8, 2018

@dckc @KellyatPyrofex where in rchain would work best? I'm suggesting: github.com/rchain/rchain/rholang-parser. @glenbraun , @JoshOrndorff any opinions?

dckc commented Nov 22, 2018

@golovach-ivan when last we chatted, I got the impression you were going to

Now I see this was submitted as rchain/rchain#1898 . That PR cites a JIRA ticket, but not one that is part of the core dev team's plans. I don't expect the core dev team to expand their scope of work without lots of clear customer demand. Perhaps you could use rchain-community as a mechanism to explore the level of customer demand?

Until I see confirmation from @KellyatPyrofex I'm taking the core-dev label off.

cc @ArturGajowy @KentShikama

@dckc dckc removed the core-dev guide: @dckc, @JoshyOrndorff with Medha; see #273 label Nov 22, 2018