XMPP → Telegram quote processing #153

ForNeVeR · 2022-04-23T15:45:22Z

Depends on Parse and clean up quotes from XMPP #116, since we'll require quotation detection code before starting any work on this one.

XMPP users are used to quoting the messages they reply to.

Unfortunately, nobody I know seems to use XEP-0201 for that purpose, so the quotes they make aren't easy to recognize properly. But still, we have to do something.

XMPP users are more flexible in the ways they quote messages. There are two types of quotations they make:

Full quotation: when an XMPP user quotes a Telegram message completely (prefixing every quoted line with a special mark, such as >> or »), and adds their response after the quoted message. For example:
```
<xmpp_user> >> @telegram_user
>> full text
>> of a (possibly multiline)
>> message

I second that!
```
Partial quotation: when an XMPP user quotes parts of another user's message, optionally inserting their reply before or after each part.

Due to the nature of Telegram replies, I propose we only recognize full quotes and not partial ones. However, we should properly recognize quotes of XMPP users' messages as well as Telegram ones (e.g. when an XMPP user quotes another XMPP user).
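A minimal sketch of how the leading quote block could be separated from the reply. The names `split_full_quote` and `ParsedMessage` are hypothetical, and the set of quote marks (`>>`, `»`, `>`) is an assumption based on the marks mentioned above. A partial quotation that happens to start with a quoted line is not told apart here; the later lookup against known messages would have to reject it.

```python
import re
from typing import NamedTuple, Optional

# Assumption: these are the quote marks worth recognizing.
QUOTE_PREFIX = re.compile(r"^\s*(?:>>|»|>)\s?")


class ParsedMessage(NamedTuple):
    quoted: Optional[str]  # leading quote with one level of marks removed
    reply: str             # the text the user added after the quote


def split_full_quote(body: str) -> ParsedMessage:
    """Split a message into its leading quote block and the reply after it."""
    lines = body.splitlines()
    quoted = []
    index = 0
    while index < len(lines) and QUOTE_PREFIX.match(lines[index]):
        # Strip exactly one quote mark, so nested quotes remain visible.
        quoted.append(QUOTE_PREFIX.sub("", lines[index], count=1))
        index += 1
    if not quoted:
        return ParsedMessage(None, body)
    reply = "\n".join(lines[index:]).strip()
    return ParsedMessage("\n".join(quoted), reply)
```

Note that the first quoted line may still carry the author name (`@telegram_user` in the example above); removing it is left to a separate step.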

Special care (but not too much, because the case is rare) should be taken when an XMPP user quotes a message that already contains a quote: we should avoid, if possible, misattributing it as a reply to the message quoted at the second level. For example:

<xmpp user 1> my text
<xmpp user 2> >> my text
I second that!
<xmpp user 3> >> <xmpp user 1>
>> >> my text
>> I second that!
I triplicate

Here, a message from xmpp user 3 shouldn't be misattributed as a reply to a message from xmpp user 1.
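One way this could work is to strip only a single quote level and compare the whole remaining block against known messages, treating the author line purely as a hint. The sketch below is an assumption-laden illustration: `quoted_block`, `find_quoted_author`, the author-line formats and the in-memory `history` list are all hypothetical.

```python
import re

# Assumptions: quote marks and author-line formats are illustrative only.
QUOTE_PREFIX = re.compile(r"^\s*(?:>>|»|>)\s?")
AUTHOR_LINE = re.compile(r"^(?:@\w+|<[^>]+>)$")


def quoted_block(body):
    """Return the leading quote with one quote level and the author line removed."""
    lines = []
    for line in body.splitlines():
        if not QUOTE_PREFIX.match(line):
            break
        lines.append(QUOTE_PREFIX.sub("", line, count=1))
    if lines and AUTHOR_LINE.match(lines[0].strip()):
        lines = lines[1:]  # the author name is only a hint, not part of the text
    return "\n".join(lines)


def normalize(text):
    """Drop all whitespace so line wrapping does not affect the comparison."""
    return "".join(text.split())


def find_quoted_author(body, history):
    """Return the author of the newest message whose full text matches the quote."""
    needle = normalize(quoted_block(body))
    if not needle:
        return None
    for author, text in reversed(history):
        if normalize(text) == needle:
            return author
    return None


history = [
    ("xmpp user 1", "my text"),
    ("xmpp user 2", ">> my text\nI second that!"),
]
third = ">> <xmpp user 1>\n>> >> my text\n>> I second that!\nI triplicate"
print(find_quoted_author(third, history))  # xmpp user 2
```

Because the nested `>> my text` line stays inside the compared block, the quote only matches the message from xmpp user 2, not the original one from xmpp user 1.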

(I don't think we should support a sequence of "reply → partial reply" here; let the replying folks sort it out themselves.)

So, I propose the following.

1. This will partially obsolete our work on Parse and clean up quotes from XMPP #116, but let's first do #116 since it's easier, and then improve on that work.
2. Let's keep all the Telegram messages (both incoming and outgoing ones, i.e. the bot messages) from the last 48 hours somewhere (either in a database or in memory).
3. When a new XMPP message arrives, parse it and check if it starts with a quotation. For a quotation, strip the user name from it, if available (but keep it as a hint that the quoted author was possibly a Telegram user), and then search our database for the same message. The search should mostly ignore whitespace and be fuzzy enough for practical cases (it may be tuned later if required, so don't put too much thought into that "fuzziness").
4. If a message contains a full quotation of any cached message, then mark it as a reply to that message on the Telegram side, and completely strip the quotation from its actual body.
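The cache and lookup steps could look roughly like the sketch below. It is an in-memory variant under stated assumptions: `TelegramMessageCache`, `CachedMessage` and `normalize` are hypothetical names, messages are assumed to be added in chronological order, and "fuzzy" matching is reduced to ignoring whitespace.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Deque, Optional

RETENTION = timedelta(hours=48)  # retention period proposed in the issue


@dataclass(frozen=True)
class CachedMessage:
    message_id: int    # Telegram message identifier to reply to
    text: str
    sent_at: datetime


def normalize(text: str) -> str:
    """Remove all whitespace so wrapping and indentation are ignored."""
    return "".join(text.split())


class TelegramMessageCache:
    """Keeps recent messages; assumes they arrive in chronological order."""

    def __init__(self, retention: timedelta = RETENTION) -> None:
        self._retention = retention
        self._messages: Deque[CachedMessage] = deque()

    def add(self, message: CachedMessage) -> None:
        self._messages.append(message)
        self._prune(message.sent_at)

    def _prune(self, now: datetime) -> None:
        # Drop messages older than the retention period from the front.
        while self._messages and now - self._messages[0].sent_at > self._retention:
            self._messages.popleft()

    def find(self, quoted_text: str, now: datetime) -> Optional[CachedMessage]:
        """Return the newest cached message whose text equals the quote."""
        self._prune(now)
        needle = normalize(quoted_text)
        if not needle:
            return None
        for message in reversed(self._messages):
            if normalize(message.text) == needle:
                return message
        return None
```

When `find` returns a message, its `message_id` would be used to mark the forwarded message as a reply, and only the text after the quotation would be sent as the body. A database-backed store could expose the same two operations.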

jt3k · 2022-11-15T22:39:42Z

❤️

ForNeVeR added kind:feature status:blocked labels Apr 23, 2022
