Commit
…anges in the other Python modules
Showing 8 changed files with 52 additions and 5 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,7 +3,7 @@ | |
|
||
# Author: Leonel Figueiredo de Alencar - Federal University of Ceará | ||
# [email protected] | ||
- # Date: June 27, 2018 | ||
+ # Date: July 2, 2018 | ||
""" | ||
This module annotates enclitic or mesoclitic pronouns in entries in the MBR format | ||
|
@@ -21,7 +21,7 @@ | |
degustares-lhe degustar+V.ele.DAT.3.SG+SBJF+2+SG | ||
Tag conversion is performed by the AnnotateClitic function from | ||
- the module ConvertDELAF.py. Ambiguity of clitic "nos" is also handled. | ||
+ module ConvertDELAF.py. Ambiguity of clitic "nos" is also handled. | ||
For more details, see the respective module documentation. | ||
""" | ||
import sys | ||
|
@@ -34,8 +34,9 @@ def main(): | |
if HasClitic(entry): | ||
parts=ParseEntry(entry,r"\t|\+") | ||
word,lemma,cat,feats=parts[0],parts[1],parts[2],parts[4:] | ||
- print AnnotateClitic(word,lemma,cat,feats).encode("utf-8") | ||
+ sys.stdout.write("%s\n" % AnnotateClitic(word,lemma,cat,feats).encode("utf-8")) | ||
else: | ||
- print entry.encode("utf-8") | ||
+ sys.stdout.write("%s\n" % entry.encode("utf-8")) | ||
|
||
if __name__ == '__main__': | ||
main() |
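The hunk above swaps the Python 2 `print` statement for `sys.stdout.write` while still encoding each entry to UTF-8 by hand. As a rough illustration only, the same output step could be written so that it behaves identically under Python 2.7 and Python 3. The helper name `write_line` is hypothetical and does not appear in the commit; the sketch assumes the entries are Unicode strings.

```python
# -*- coding: utf-8 -*-
import io
import sys


def write_line(text, stream=None):
    """Write `text` plus a newline to `stream` as UTF-8 bytes.

    Hypothetical helper for illustration; not part of the commit.
    """
    if stream is None:
        stream = sys.stdout
    # Python 3 text streams expose their byte stream as `.buffer`;
    # Python 2's sys.stdout and io.BytesIO accept bytes directly.
    out = getattr(stream, "buffer", stream)
    out.write(text.encode("utf-8") + b"\n")


# Demonstration against an in-memory byte stream.
buf = io.BytesIO()
write_line(u"degustares-lhe", buf)
write_line(u"Ceará", buf)
```

Writing bytes explicitly avoids depending on the terminal's locale, which is presumably why the module encodes by hand in the first place.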
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
#! /usr/bin/env python2.7 | ||
# -*- coding: utf-8 -*- | ||
|
||
# Author: Leonel Figueiredo de Alencar - Federal University of Ceará | ||
# [email protected] | ||
# Date: July 2, 2018 | ||
|
||
""" | ||
This module corrects DELAF entries with the V+PRO tag from standard input | ||
by inserting the missing hyphen separating the clitic pronoun from | ||
the verb form in entries like the following: | ||
abluirlhe,abluir.V+PRO:U1s | ||
The output consists of corrected entries, e.g.: | ||
abluir-lhe,abluir.V+PRO:U1s | ||
Usage: cat INFILE | SeparateHyphen.py > OUTFILE | ||
The module uses the SeparateClitic function from module ConvertDELAF. | ||
Clitic separation is performed using PATTERN1, which presupposes that the | ||
input entries contain the V+PRO tag. | ||
""" | ||
|
||
import sys | ||
from ConvertDELAF import * | ||
|
||
|
||
def main(): | ||
entries=ExtractEntries(sys.stdin) | ||
for entry in entries: | ||
sys.stdout.write("%s\n" % SeparateClitic(entry).encode("utf-8")) | ||
|
||
if __name__ == '__main__': | ||
main() |
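The new module's docstring describes inserting the missing hyphen between verb form and clitic in V+PRO entries. The diff does not show `SeparateClitic` or `PATTERN1`, so the following is only a minimal sketch of that idea under stated assumptions: the regular expression `CLITIC_SKETCH`, its short clitic list, and the function `separate_clitic_sketch` are all hypothetical, and the sketch handles only simple enclitic cases, not mesoclisis.

```python
# -*- coding: utf-8 -*-
import re

# Hypothetical pattern, NOT the PATTERN1 defined in ConvertDELAF.
# It assumes the entry layout "form,lemma.V+PRO:features" and a small,
# incomplete list of clitic pronouns chosen for illustration.
CLITIC_SKETCH = re.compile(
    r"^(?P<verb>[^,\-]+?)"                     # unhyphenated verb form
    r"(?P<clitic>lhes|lhe|nos|vos|me|te|se)"   # assumed clitic list
    r"(?P<rest>,[^,]+\.V\+PRO.*)$"             # lemma and V+PRO tag
)


def separate_clitic_sketch(entry):
    """Insert a hyphen before the clitic, or return the entry unchanged."""
    match = CLITIC_SKETCH.match(entry)
    if match is None:
        # Already hyphenated, not a V+PRO entry, or an unlisted clitic.
        return entry
    return "%s-%s%s" % (
        match.group("verb"),
        match.group("clitic"),
        match.group("rest"),
    )


fixed = separate_clitic_sketch("abluirlhe,abluir.V+PRO:U1s")
# fixed == "abluir-lhe,abluir.V+PRO:U1s"
```

Because the verb group excludes hyphens, an entry that is already correct does not match and passes through unchanged, so running the sketch twice is harmless.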
Four files renamed without changes.