I am doing something totally wrong #1820

Open
the42 opened this issue May 10, 2023 · 3 comments

the42 commented May 10, 2023

I define my index as follows:

germanTextFieldMapping := bleve.NewTextFieldMapping()
germanTextFieldMapping.Analyzer = de.AnalyzerName


err = index.AddCustomAnalyzer("thesaurus_german_wortlautmapping_2",
	map[string]interface{}{
		"type":      custom.Name,
		"tokenizer": unicode.Name,
		"token_filters": []interface{}{
			// de.NormalizeName,
			de.StopName,
			de.LightStemmerName,
			lowercase.Name,
		},
	})
if err != nil {
	panic(err)
}
thesausurs_german_wortlaut2FieldMapping := bleve.NewTextFieldMapping()
thesausurs_german_wortlaut2FieldMapping.Analyzer = "thesaurus_german_wortlautmapping_2"

wortlautmapping := bleve.NewDocumentMapping()
wortlautmapping.AddFieldMappingsAt("wortlaut", thesausurs_german_wortlaut2FieldMapping, germanTextFieldMapping /*, ngramTextFieldMapping*/)
index.AddDocumentMapping("wortlaut", wortlautmapping)

When I search like

qry := bleve.NewMatchQuery(wortlaut)
qry.SetField("wortlaut")
req := bleve.NewSearchRequest(qry)

// https://gist.github.com/mschoch/19767c439466bcbeb138008f6c3ac0b3
req.Size = hits
req.SortBy([]string{"-validfrom"})
req.Fields = []string{"*"}
result, err := model.thindex.Search(req)

I get somewhat relevant results. However, if I run the same query with the bleve command line tool, like

bleve query .

I get many more results.

What's the difference?

the42 commented May 11, 2023

Edit: It seems that my index definition extends the "standard" way of searching with bleve, and that a NewMatchQuery restricted to the field uses only this extended logic. How can I merge the results of this extended ("edge") logic with the "standard" results I seem to get from the command line tool?

Do I have to do this myself in business logic, by issuing two queries and merging the results, or does bleve offer this functionality through Conjunction Queries?

abhinavdangeti (Member) commented

  • Would you share your exact command line query? If you're not setting the field, the query goes to the composite field (_all), over which the index mapping's default analyzer is applied. You can adjust the outcome by setting the DefaultAnalyzer of your index mapping when you create it:
    DefaultAnalyzer string `json:"default_analyzer"`
    index.DefaultAnalyzer = "thesaurus_german_wortlautmapping_2"
  • Bleve does offer compound queries - boolean, conjunction and disjunction.

    bleve/query.go

    Lines 42 to 46 in ec1a82f

    // NewConjunctionQuery creates a new compound Query.
    // Result documents must satisfy all of the queries.
    func NewConjunctionQuery(conjuncts ...query.Query) *query.ConjunctionQuery {
    	return query.NewConjunctionQuery(conjuncts)
    }

    Here's an example conjunction query, via the query string syntax:
    bleve query path_to_index "+field1:x +field2:y"

the42 commented May 16, 2023

I use the command line like

bleve query -f wortlaut . Bäcker

but now I understand that the -f flag is ignored, because the default query type is query_string, which ignores it.

What are the possible values for the -t (type) flag?

I get consistent results between my program and the command line if I search like bleve query . "wortlaut:Bäcker".

In my program I would like to combine what bleve query . "wortlaut:Bäcker" and bleve query . "Bäcker" would yield, and this seems to be the right way:

edgeqry := bleve.NewMatchQuery(wortlaut)
edgeqry.SetField("wortlaut")

phraseqry := bleve.NewQueryStringQuery(wortlaut)

qry := bleve.NewDisjunctionQuery(
	edgeqry,
	phraseqry,
)

req := bleve.NewSearchRequest(qry)
req.Size = 5
req.SortBy([]string{"-validfrom"})
req.Fields = []string{"*"}
result, err := index.Search(req)

When doing so I get the following error from index.Search without any further information: parse error: runtime error: slice bounds out of range [1:0]

EDIT: When I change

phraseqry := bleve.NewQueryStringQuery(wortlaut)

to

phraseqry := bleve.NewMatchPhraseQuery(wortlaut)

the query doesn't error. The more I think about it, the more I suspect that my input contains an unmatched " or other unintended special characters, which bleve tries to interpret according to the https://blevesearch.com/docs/Query-String-Query/ documentation.
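
If raw user input does have to go through NewQueryStringQuery, one option is to backslash-escape the characters that the query string documentation lists as escapable before building the query. The sketch below is an assumption-laden illustration: escapeQueryString is a hypothetical helper, not part of bleve, and I have not verified that it avoids the specific slice bounds error above.

```go
package main

import (
	"fmt"
	"strings"
)

// queryStringSpecials holds the characters the query string documentation
// names as escapable. Note that the list includes the space character.
const queryStringSpecials = "+-=&|><!(){}[]^\"~*?:\\/ "

// escapeQueryString is a hypothetical helper (not a bleve API). It prefixes
// every special character with a backslash so that the query string parser
// treats it as literal text instead of syntax.
func escapeQueryString(s string) string {
	var b strings.Builder
	for _, r := range s {
		if strings.ContainsRune(queryStringSpecials, r) {
			b.WriteByte('\\')
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Println(escapeQueryString(`"Bäcker`))         // prints \"Bäcker
	fmt.Println(escapeQueryString(`wortlaut:Bäcker`)) // prints wortlaut\:Bäcker
}
```

Because spaces are escaped too, multi-word input ends up as a single term; drop the space from queryStringSpecials if the words should stay separate terms. A match query without SetField is another way to search the default field without any syntax parsing.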
