XMPP → Telegram quote processing #153

ForNeVeR opened this issue Apr 23, 2022 · 1 comment

XMPP users are used to quoting the messages they reply to.

Unfortunately, nobody I know seems to use XEP-0201 for that purpose, so the quotes they make aren't easy to recognize properly. Still, we have to do something.

XMPP users are fairly flexible in how they quote messages. They make two kinds of quotation:

  1. Full quotation: an XMPP user quotes a Telegram message completely (prefixing every quoted line with a special mark, such as >> or »), and adds their response after the quoted message. For example:
    <xmpp_user> >> @telegram_user
    >> full text
    >> of a (possibly multiline)
    >> message
    
    I second that!
    
  2. Partial quotation: an XMPP user quotes parts of another user's message, optionally inserting their reply after or before each part.

Due to the nature of Telegram replies (a reply points to one whole message), I propose we only recognize full quotes and not partial ones. However, we should properly recognize quotes of XMPP users' messages as well as of Telegram ones (e.g. when an XMPP user quotes another XMPP user).
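
To make the "full quotes only" rule concrete, here is a minimal sketch of how a message body could be split into a leading quotation and a reply. Everything in it is an assumption for illustration: the names `split_full_quote` and `FullQuote`, the set of quote marks (only >> and » from the example above), and the shape of the author line (`@name` or `<name>`). It is not taken from the existing code base.

```python
import re
from typing import NamedTuple, Optional

# Quote marks named in the issue (">>" and "»"); the exact set is an assumption.
QUOTE_MARK = re.compile(r"^\s*(?:>>|»)\s?")
# Hypothetical shape of an author line: "@name" or "<name>".
AUTHOR_LINE = re.compile(r"^(?:@\S+|<[^<>]+>)$")


class FullQuote(NamedTuple):
    author_hint: Optional[str]  # e.g. "@telegram_user", or None if absent
    quoted_text: str            # quoted lines with one level of marks removed
    reply_text: str             # whatever follows the quotation


def split_full_quote(body: str) -> Optional[FullQuote]:
    """Return the leading quotation and the reply, or None if there is no full quote."""
    lines = body.splitlines()
    quoted = []
    index = 0
    # Collect the uninterrupted run of quoted lines at the start of the message.
    while index < len(lines) and QUOTE_MARK.match(lines[index]):
        quoted.append(QUOTE_MARK.sub("", lines[index], count=1).rstrip())
        index += 1
    if not quoted:
        return None  # the message does not start with a quotation
    rest = lines[index:]
    if any(QUOTE_MARK.match(line) for line in rest):
        return None  # quotes interleaved with replies: treated as a partial quotation
    author_hint = None
    if len(quoted) > 1 and AUTHOR_LINE.match(quoted[0].strip()):
        author_hint = quoted.pop(0).strip()
    return FullQuote(author_hint, "\n".join(quoted), "\n".join(rest).strip())
```

Only one level of quote marks is removed per line, so nested quotes stay visible in `quoted_text`. A message with a second quoted block after the first reply line is rejected as partial, which is a crude heuristic and may need tuning.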

Special care (but not too much, because the case is rare) should be taken when an XMPP user quotes a message that itself already contains a quote: we should avoid (if possible) misattributing it as a reply to the message quoted at the second level. For example:

<xmpp user 1> my text
<xmpp user 2> >> my text
I second that!
<xmpp user 3> >> <xmpp user 1>
>> >> my text
>> I second that!
I triplicate

Here, a message from xmpp user 3 shouldn't be misattributed as a reply to a message from xmpp user 1.
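
One possible way to get this right, sketched below under assumptions: strip exactly one level of quote marks from the incoming message, drop the author line, and compare the result with the complete bodies of earlier messages (including any quote marks those bodies contain). The helper names and the `(sender, body)` history format are hypothetical.

```python
import re
from typing import List, Optional, Tuple

QUOTE_MARK = re.compile(r"^\s*(?:>>|»)\s?")        # assumed set of quote marks
AUTHOR_LINE = re.compile(r"^(?:@\S+|<[^<>]+>)$")   # assumed shape of an author line


def normalize(text: str) -> str:
    """Drop all whitespace so that line wrapping and spacing do not matter."""
    return "".join(text.split())


def outer_quote(body: str) -> str:
    """Return the leading quotation with only its outermost quote level removed."""
    quoted = []
    for line in body.splitlines():
        if not QUOTE_MARK.match(line):
            break
        quoted.append(QUOTE_MARK.sub("", line, count=1))
    if len(quoted) > 1 and AUTHOR_LINE.match(quoted[0].strip()):
        quoted.pop(0)  # the author line is only a hint, not part of the quoted text
    return "\n".join(quoted)


def quoted_sender(body: str, history: List[Tuple[str, str]]) -> Optional[str]:
    """history holds (sender, body) pairs, oldest first; the newest match wins."""
    wanted = normalize(outer_quote(body))
    if not wanted:
        return None
    for sender, text in reversed(history):
        if normalize(text) == wanted:
            return sender
    return None


history = [
    ("xmpp user 1", "my text"),
    ("xmpp user 2", ">> my text\nI second that!"),
]
incoming = ">> <xmpp user 1>\n>> >> my text\n>> I second that!\nI triplicate"
print(quoted_sender(incoming, history))  # xmpp user 2
```

Because the comparison uses whole message bodies, the inner `>> my text` line never matches the message from xmpp user 1 on its own.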

(I don't think we should support a sequence of "reply → partial reply" here; let the replying folks sort it out themselves.)

So, I propose the following.

  1. This will partially obsolete our work on Parse and clean up quotes from XMPP #116, but let's first do #116 since it's easier, and then improve on that work.
  2. Let's keep all the Telegram messages (both the incoming and the outgoing ones, i.e. the bot messages) from the last 48 hours somewhere (either in a database or in memory).
  3. When a new XMPP message arrives, parse it and check whether it starts with a quotation. If it does, strip the user name from the quotation, if present (but keep it as a hint that the quoted author may be a Telegram user), and then search our database for a message with the same text. The search should mostly ignore whitespace and be lenient enough for practical cases (it may be tuned later if required, so don't put too much thought into that "leniency").
  4. If a message contains a full quotation of any cached message, then mark it as a reply to that message on the Telegram side, and completely strip the quotation from its body.
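
Steps 2 to 4 could look roughly like the in-memory sketch below. It is only an illustration under assumptions: `CachedMessage`, `MessageCache` and `find_quoted` are hypothetical names, messages are assumed to be added in chronological order, and "mostly ignore the whitespace" is implemented in the simplest way (removing all whitespace before comparing).

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Deque, Optional

RETENTION = timedelta(hours=48)  # retention window proposed in step 2


@dataclass(frozen=True)
class CachedMessage:
    message_id: int        # identifier to reply to on the Telegram side (assumed)
    sender: str
    text: str
    received_at: datetime


def normalize(text: str) -> str:
    """Remove all whitespace; a deliberately simple notion of a lenient match."""
    return "".join(text.split())


class MessageCache:
    def __init__(self, retention: timedelta = RETENTION) -> None:
        self._retention = retention
        self._messages: Deque[CachedMessage] = deque()

    def add(self, message: CachedMessage) -> None:
        # Assumes messages arrive in chronological order.
        self._messages.append(message)
        self._evict(message.received_at)

    def _evict(self, now: datetime) -> None:
        while self._messages and now - self._messages[0].received_at > self._retention:
            self._messages.popleft()

    def find_quoted(self, quoted_text: str, now: datetime,
                    sender_hint: Optional[str] = None) -> Optional[CachedMessage]:
        """Return the newest cached message fully matching the quotation, if any."""
        self._evict(now)
        wanted = normalize(quoted_text)
        if not wanted:
            return None
        matches = [m for m in reversed(self._messages) if normalize(m.text) == wanted]
        if sender_hint is not None:
            # The hint only breaks ties between identical texts; it never blocks a match.
            for m in matches:
                if m.sender == sender_hint.lstrip("@"):
                    return m
        return matches[0] if matches else None
```

If `find_quoted` returns a message, the bot would send only the reply text to Telegram, marked as a reply to `message_id`; otherwise the XMPP message would go through unchanged. A database-backed store could expose the same two operations.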

jt3k commented Nov 15, 2022

❤️
