05 Jan 12:18

nan-wang

90d6782

🎉 Jina 0.9.0

Jina 0.9.0

We are excited to release Jina 0.9.0. Jina is the easier way to do neural search in the cloud. Highlights of this release include:

Support for delete/update operations
Add native AsyncIO support and unlock native support for running Jina in Jupyter notebooks and IPython
Add MultimodalDocument as a primitive type to support multimodal search in a Pythonic way
Refactor Pea and introduce Runtime to improve code readability and maintainability

Release 0.9.0

⬆️ Major Features and Improvements

Completeness

To support update and delete operations, we introduced update and delete methods for BaseKVIndexer and BaseVectorIndexer, along with update and delete APIs on the Flow. #1380, #1415, #1550, #1460
Refactor to native asyncio. This unlocks support for running Jina in Jupyter notebooks and IPython. AsyncClient and AsyncFlow were added so that users can manage the event loop themselves, which makes Jina more reliable in Jupyter notebooks and IPython. #1348, #1408, #1410, #1428, #1450, #1453, #1463, #1562

Click to see example

from jina import AsyncFlow
with AsyncFlow().add(uses='_logforward') as f:
    await f.index_lines(lines=['hello', 'jina'], on_done=print)
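
To see why native asyncio matters in notebooks: Jupyter and IPython already run an event loop, so code that tries to start a second blocking loop fails, while a coroutine can simply be awaited. The sketch below uses only the standard library and makes no Jina calls; `fake_index` and `notebook_cell` are hypothetical names standing in for an awaitable request and a notebook cell.

```python
import asyncio


async def fake_index(lines):
    """Hypothetical stand-in for an awaitable index call; not a Jina API."""
    await asyncio.sleep(0)  # yield control to the event loop once
    return [line.upper() for line in lines]


async def notebook_cell():
    """Simulates code that runs inside an already-running event loop."""
    # Awaiting works because it cooperates with the running loop.
    result = await fake_index(['hello', 'jina'])

    # Starting a second blocking loop from inside a running one does not.
    coro = fake_index(['blocked'])
    try:
        asyncio.run(coro)
        nested_ok = True
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        nested_ok = False
    return result, nested_ok


result, nested_ok = asyncio.run(notebook_cell())
print(result, nested_ok)
```

In a notebook the outer `asyncio.run` is not needed; the cell would just `await` the coroutine.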

Ease of Use

Add MultimodalDocument as a primitive type. This lets users build a multimodal search system in a Pythonic way. #1335, #1385, #1390, #1368, #1395, #1399, #1401

Click here for example code

import numpy as np
from jina import Document, MultimodalDocument

# one chunk per modality, each carrying its own embedding
chunk_img = Document(modality='dummy_image', embedding=np.random.rand(1, 4))
chunk_text = Document(modality='dummy_text', embedding=np.random.rand(1, 10))
multimodal_doc = MultimodalDocument(chunks=[chunk_img, chunk_text])

Introduce Runtime as a member of Pea, defined as "a procedure that blocks the main process once running, therefore must be put into a separated thread/process". The new architecture greatly improves the readability and maintainability of the code. #1426, #1473, #1487, #1539, #1577

⚠️ Breaking Changes

Introduce UniqueId, ChunkSet, DocumentSet, MatchSet; Remove add_chunk and add_match; Refactor Document with newly introduced classes. #1343

Click here for example code

0.8.0

0.9.0

from jina import Document
with Document() as d:
    c = Document(id=f'1:0>16')
    d.chunks.append(c)

with Document() as d:
     c = d.chunks.append()
     c.id = f'1:0>16'

from jina import Document
with Document() as d:
    c = d.chunks.add_chunks()
    c.id = f'1:0>16'

Refactor YAML file parsing backend from ruamel.yaml to pyyaml and introduce jina.jaml for parsing YAML files. The dependency on ruamel.yaml is deprecated. #1495, #1516, #1524, #1533, #1547, #1581
Add _merge_matches and _merge_chunks for merging messages in different ways. Remove _merge_all. #1406 #1418
PyClient renamed to Client for simplicity #1450

📗 Documentation

Update Korean Readme #1364 @doomdabo
Add code review guide #1397
Fix typos in helloworld.html. #1405 @harry-stark
Add documentation for recursive data structure. #1394
Fix redundant translation in Chinese Readme #1443 @smy0428
Fix missing CLI content. #1481
Fix typos in README.md. #1500 @Kavan72
Improve docs.jina.ai. #1513, #1514, #1586
Improve TorchDevice docstring #1499 @tadej-redstone
Fix typos in Russian Readme #1544, #1572 @git-webmaster
Fix typos in CLI interface #1578 @xinbinhuang
Add Spanish Readme #1579 @PabloRN

🐞 Bug Fixes and Other Changes

Flow

Fix issue terminating RemotePea #133
Refactor Pea closing logic #1379, #1398, #1457
Refactor peapods code base #1421
Add versioning for Flow YAML config files. Introduce method field for Flow YAML configurations. #1442
Add env field for Flow and Pod YAML configuration so that shared environment variables can be set. #1446, #1448
Rename Flow output argument to on_done. #1476
Fix client top_k malfunctioning bug. #1522
Add return_list option for Flow API and introduce Response as a new primitive type. When return_list=True, results are returned as a list of Response objects, which are easier to interpret. #1541
Fix CORS behavior bug for REST API #1568 @yk

Executors

Change default metric of NumpyIndexer to cosine #1393
Remove deprecated jina/executors/encoders/helper.py #1563 @tadejsv
Introduce batching_multi_input decorator to add batching support for rankers #1467 @deepampatel
Allow Indexers to have separate workspaces. #1383
Fix bug when shards are empty #1340, #1396

Drivers

Add op_name for Matches2DocRankDriver #1409
Add batch_size argument for EncodeDriver to enable batching on driver level #1483
Make DocIdCache capable of detecting collisions on content level #1510
Enable AggregateMatches2DocRankDriver for keeping chunks of matches #1494

Types

Add NamedScore as new primitive type. #1430
Support + and += operations for Document. #1555
Move extract_content() to DocumentSet. Instead of using docs = DocumentSet(random_docs(2)); extract_content(docs), docs.all_contents() makes it easier to get contents from a set of Documents. #1387
Refactor random_id and introduce content_hash field in Document. #1440

Tests

Improve unit tests for test_hello_world #1305
Refactor unit tests for queryset #1336
Refactor unit tests for evaluation #1339
Refactor unit tests for index remote #1346
Fix integration tests for jinad #1367, #1388, #1407
Refactor random_docs() in unit tests #1356
Add unit tests for convert functions in Document #1389
Fix callbacks in unit tests. Callback failures were not always captured by tests #1391
Fix integration tests for evaluation #1411
Refactor docstrings in unit tests of QueryLangSet #1417
Fix bug failing to capture errors of callbacks during unit tests. #1419, #1536
Refactor unit tests for types #1435
Refactor unit tests for request #1445
Add unit tests for corner cases in calculating similarity metrics #1434
Add evaluation option for hello-world #1465, #1488, #1508, #1501
Add test for loading customized drivers #1474
Refactor unit test for drivers #1452
Set default value of eval_at in PrecisionEvaluator and RecallEvaluator to None #1552
Fix unit tests of test_hub_usage when GITHUB_TOKEN is used. #1560
Refactor unit tests for drivers #1559
Refactor unit tests in hubio to use BuildTestLevel #1361
Fix naming for test_rankingevaluation_driver #1573

HubIO

Fix Jina Hub automated updates and add GA for updating Jina Hub images. Check out more details at hub-updater #1298, #1345, #1360, #1456
Redefine naming convention of Docker images in Jina Hub. Naming follows {repository}/{type}.{kind}.{name}:{version}-{jina_version} #1341
Avoid overwriting Docker image in Jina Hub when tag already exists. #1365
Clean up hubio imports. #1381
Fix hubio version checking and add --no-overwrite option for jina hub --push #1403
Fix hubio test levels #1361
Add --timeout-ready option for hubio #1525
Fix typo in error message #1531
Fix access to token credential file for jina hub push #1492
Switch to hubapi for retrieving Docker login information #1429, #1589
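
The naming convention {repository}/{type}.{kind}.{name}:{version}-{jina_version} listed above can be sketched as a pair of helpers. This is only an illustration: `hub_image_name` and `parse_hub_image_name` are hypothetical functions, not part of hubio, and the example values are made up.

```python
import re


def hub_image_name(repository, type_, kind, name, version, jina_version):
    """Format {repository}/{type}.{kind}.{name}:{version}-{jina_version}."""
    return f'{repository}/{type_}.{kind}.{name}:{version}-{jina_version}'


# Assumes no dots inside type/kind and no dash inside jina_version.
_IMAGE_RE = re.compile(
    r'^(?P<repository>[^/]+)/(?P<type>[^.]+)\.(?P<kind>[^.]+)\.(?P<name>[^:]+)'
    r':(?P<version>.+)-(?P<jina_version>[^-]+)$'
)


def parse_hub_image_name(image):
    """Split an image name back into its six components."""
    match = _IMAGE_RE.match(image)
    if match is None:
        raise ValueError(f'not a valid Hub image name: {image!r}')
    return match.groupdict()


# All values below are invented for illustration.
image = hub_image_name('jinahub', 'pod', 'encoder', 'myencoder', '0.0.1', '0.9.0')
parts = parse_hub_image_name(image)
print(image)
```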

Others

Adapt to new remote log APIs #1300
Adapt to Docker SDK 4.4.0 in ContainerPea #1334
Move log parser from jinad to core. #1342
Use load_config directly as a classmethod #1352, #1354
Fix bug during completing file path for errors #1353
Fix top-k setting bug #1359
Fix newlines for autocompletion in bash. #1425 @lsgrep
Fix latency check during CI #1437
Add client-side exception handlers #1458, #1462
Add GA for automated comments on lint failures. #1486, #1507, #1519
Introduce ArgNamespace in jina.helper to manage all namespace-related operations #1489
Introduce training. #1518
Introduce jina.jaml for parsing YAML files. #1533, #1547, #1581
Fix bug in parsing config source files #1583

🙏 Thanks to our Contributors

This release contains contributions from Amritpal Singh, Bithiah Yuan, CatStark, Deepam Patel, Deepankar Mahapatro, Florian Hönicke, Han Xiao, Harry Stark, Hidan, Joan Fontanals, Nan Wang, Pratik Bhavsar, Rutuja Surve, Sergey M, Siyuan Shi, Szymon Skorupinski, Tadej Svetina, Wang Bo, Yannic Kilcher, Yusup, cristian, florian-hoenicke

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you Jina couldn't do what we do. Your support means a lot to us.

🤝 Work with Jina

Want to work with Jina full-time? Check out our openings on our [website](https://jina.ai/...

03 Jan 23:26

github-actions

v0.8.22

3178fef

🎉 Release v0.8.22

Release Note (`0.8.21`)

Release time: 2021-01-03 16:37:08

🙇 We'd like to thank all contributors for this new release! In particular,
Han Xiao, 🙇

🏁 Unit Test and CICD

[a431537f] - fix github-push-action default branch (Han Xiao)
[b2430fbd] - fix tag release order (Han Xiao)

🍹 Other Improvements

[b2689933] - hotfix release (Han Xiao)

03 Jan 16:37

github-actions

v0.8.21

b268993

🎉 Release v0.8.21

Release Note (`0.8.18`)

Release time: 2021-01-01 15:44:56

🙇 We'd like to thank all contributors for this new release! In particular,
Han Xiao, Florian Hönicke, Jina Dev Bot, 🙇

🐞 Bug fixes

[4f50e802] - russian translation 101 (#1572) (Florian Hönicke)

🚧 Code Refactoring

[9b81559d] - redesign Runtime, Pea, Pod, Parser (#1539) (Han Xiao)

📗 Documentation

[04d7d9da] - versioning docs (Han Xiao)
[33590bd6] - fix parser module path (Han Xiao)

🍹 Other Improvements

[8b72e175] - hotfix release (Han Xiao)
[804a0a4c] - version: the next version will be 0.8.18 (Jina Dev Bot)

03 Jan 16:07

github-actions

v0.8.20

6109fa4

🎉 Release v0.8.20

Release Note (`0.8.18`)

Release time: 2021-01-01 15:44:56

🙇 We'd like to thank all contributors for this new release! In particular,
Han Xiao, Florian Hönicke, Jina Dev Bot, 🙇

🐞 Bug fixes

[4f50e802] - russian translation 101 (#1572) (Florian Hönicke)

🚧 Code Refactoring

[9b81559d] - redesign Runtime, Pea, Pod, Parser (#1539) (Han Xiao)

📗 Documentation

[04d7d9da] - versioning docs (Han Xiao)
[33590bd6] - fix parser module path (Han Xiao)

🍹 Other Improvements

[8b72e175] - hotfix release (Han Xiao)
[804a0a4c] - version: the next version will be 0.8.18 (Jina Dev Bot)

03 Jan 15:44

github-actions

v0.8.19

74cda5c

🎉 Release v0.8.19

Release Note (`0.8.18`)

Release time: 2021-01-01 15:44:56

🙇 We'd like to thank all contributors for this new release! In particular,
Han Xiao, Florian Hönicke, Jina Dev Bot, 🙇

🐞 Bug fixes

[4f50e802] - russian translation 101 (#1572) (Florian Hönicke)

🚧 Code Refactoring

[9b81559d] - redesign Runtime, Pea, Pod, Parser (#1539) (Han Xiao)

📗 Documentation

[04d7d9da] - versioning docs (Han Xiao)
[33590bd6] - fix parser module path (Han Xiao)

🍹 Other Improvements

[8b72e175] - hotfix release (Han Xiao)
[804a0a4c] - version: the next version will be 0.8.18 (Jina Dev Bot)

23 Nov 09:47

nan-wang

v0.8.0

7e52c60

🎉 Jina 0.8.0

We are excited to release Jina 0.8.0. Jina is an easier way to do neural search on the cloud. Highlights of this release include:

Introduce jinad to improve experience of using remote Flows/Pods/Peas
Add support for multimodal search SparseArray
Add jina.types module to offer Pythonic interface to access and manipulate protobuf objects.

Release 0.8.0

⬆️ Major Features and Improvements

Ease of Use

We introduce two new ways of using Jina Pods remotely:
- Create a remote Pod via SSH #1275
- Create a remote Pod via jinad. Jinad is a daemon process working together with jina on remote machines. Jinad makes it even easier to deploy Jina Flows/Pods/Peas on remote machines. Find out more details in the README #1182, #1203, #1254, #1297, #1299, #1307, #1312, #1324

Click here for example code

RemoteSSHPod	Jinad API
jina pod --host [email protected] --remote-access SSH	jina pod --host 11.22.33.44 --port-expose 8000 --remote-access JINAD

With jinad, you can create and use Pods directly from the Flow as well. First, start the Docker container equipped with jinad on the remote machine:

sudo docker run --rm -d --network host jinaai/jinad

Now you can directly create and use the remote pods from your local machine:

from jina.flow import Flow

# p1 runs locally; p2 is created on the remote machine through jinad
f = (Flow()
     .add(name='p1', uses='_logforward')
     .add(name='p2', host='10.11.22.33', port_expose='8000', uses='_logforward'))
with f:
    f.search_lines(lines=['jina', 'is', 'cute'], output_fn=print)

We've added jina.types module, which offers a Pythonic interface to access and manipulate protobuf objects. The main types include Request, QueryLang, NdArray, Message, and Document. With the help of Jina types, you can construct inputs to Jina much more easily than before. #1283, #1284, #1289, #1323

Click here for example code

Document (v0.7.0)

from jina.proto import jina_pb2
d = jina_pb2.DocumentProto()
d.text = 'hello world'

Document (v0.8.0)

from jina import Document
d = Document()
d.text = 'abc'

Request (v0.7.0)

from jina.proto import jina_pb2
r = jina_pb2.Request()
d = r.docs.add()

Request (v0.8.0)

from jina.types.request import Request
from jina.types.document import Document
r = Request()
d = Document()
r.add_document(d)

Message (v0.7.0)

from jina.proto import jina_pb2
r = jina_pb2.RequestProto.IndexRequestProto()
m = jina_pb2.MessageProto()
m.envelop = None
m.request = r

Message (v0.8.0)

from jina.types.message import Message
from jina.types.request import Request
r = Request()
m = Message(None, r)

QueryLang (v0.7.0)

from jina.proto import jina_pb2
ql = jina_pb2.QueryLangProto(name='SliceQL')
ql.parameters['start'] = 1
ql.parameters['end'] = 3

QueryLang (v0.8.0)

from jina.types.querylang import QueryLang
ql = QueryLang(SliceQL(start=1, end=3))

NdArray (v0.7.0)

from jina.proto import jina_pb2
from jina.drivers.helper import array2pb
a = jina_pb2.NdArrayProto()
a.CopyFrom(array2pb(np.ndarray([2, 17])))

NdArray (v0.8.0)

from jina.types.ndarray.generic import NdArray
a = NdArray()
a.value = np.ndarray([2, 17])

Completeness

To support multimodal search, we've introduced BaseMultiModalEncoder and MultimodalDriver. Check out how to search fashion items with text and images together at Jina examples. #1141, #1144, #1154, #1156
We've introduced Classifiers, a new type of executor designed to enrich Documents with tags. Check out more details at docs.jina.ai #1194

⚠️ Breaking Changes

Refactor drivers for evaluation from function-based to type-based. #1165
- Removed EncodeEvaluationDriver and CraftEvaluationDriver
- Added TextEvaluateDriver, NDArrayEvaluateDriver, and FieldEvaluateDriver
- RankingEvaluationDriver renamed to RankEvaluateDriver
Introduce SparseNdArray and provide generic interface for SparseNdArray and DenseNdArray #1190, #1283

Click here for example code

dense array (v0.7.0)

import numpy as np
from jina.proto import jina_pb2
from jina.drivers.helper import array2pb
a = jina_pb2.NdArrayProto()
a.CopyFrom(array2pb(np.ndarray([2, 17])))

dense array (v0.8.0)

import numpy as np
from jina.types.ndarray.generic import NdArray
a = NdArray()
a.value = np.ndarray([2, 17])

sparse array (v0.7.0)

not supported

sparse array (v0.8.0)

import numpy as np
from jina.types.ndarray.generic import NdArray
from jina.types.ndarray.sparse.scipy import SparseNdArray
from scipy.sparse import coo_matrix
row = np.array([20, 0])
col = np.array([0, 20])
data = np.array([2, 17])
a = NdArray(is_sparse=True, sparse_cls=SparseNdArray)
a.value = coo_matrix((data, (row, col)), shape=(21, 21))

Add callback_on and continue_on_error for the client. callback_on_body is removed. #1265

Click here for example code

v0.7.0

from jina.flow import Flow
f = (Flow().add(name='p1').add(name='p2'))

with f:
    f.search_lines(lines=['hello', 'jina'], callback_on_body=True)

v0.8.0

from jina.flow import Flow
f = (Flow().add(name='p1').add(name='p2'))

with f:
    f.search_lines(lines=['hello', 'jina'], callback_on='body')

Add ProtoMessage, LazyRequest to replace the original jina_pb2.Message and jina_pb2.Request so that the protobuf message is deserialized in a lazy way #1210, #1283

Click here for example code

v0.7.0

from jina.proto import jina_pb2
r = jina_pb2.RequestProto.IndexRequestProto()
m = jina_pb2.MessageProto()
m.envelop = None
m.request = r

v0.8.0

from jina.types.message import Message
from jina.types.request import Request
r = Request()
m = Message(None, r)
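
The idea behind lazy deserialization can be sketched without protobuf: keep the serialized bytes and parse them only when a field is first read. In the sketch below JSON stands in for protobuf, and `LazyPayload` with its `parse_count` counter is a hypothetical class for illustration, not Jina's LazyRequest.

```python
import json


class LazyPayload:
    """Keeps the serialized bytes and parses them only on first access.

    Hypothetical illustration; JSON stands in for protobuf here.
    """

    def __init__(self, raw: bytes):
        self._raw = raw
        self._parsed = None
        self.parse_count = 0  # only here to make the laziness observable

    @property
    def body(self):
        if self._parsed is None:
            self._parsed = json.loads(self._raw.decode('utf-8'))
            self.parse_count += 1
        return self._parsed


raw = json.dumps({'docs': [{'text': 'hello'}, {'text': 'jina'}]}).encode('utf-8')
payload = LazyPayload(raw)
untouched = payload.parse_count  # nothing has been parsed yet
first_text = payload.body['docs'][0]['text']  # triggers the one and only parse
_ = payload.body  # a second access reuses the parsed object
print(untouched, first_text, payload.parse_count)
```

A component that only routes the message never touches `body`, so it never pays the parsing cost.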

🐞 Bug Fixes and Other Changes

Flow

Fix argument overridden bug for Pod when passing arguments from Flow #1189
Refactor num_part logic #1247
Enable client to interpret dict of json-like str into parsed documents #1282
In addition to the callback function of the Flow API, three more hooks were added for post-processing requests: on_done, on_error, and on_always. #1303
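
One plausible reading of the three hooks named above is sketched here. The semantics are an assumption inferred from the names (on_done after success, on_error after failure, on_always in both cases); the release notes do not spell them out, and `dispatch` is a hypothetical helper rather than a Jina function.

```python
def dispatch(request_fn, on_done=None, on_error=None, on_always=None):
    """Run request_fn and route its outcome to the matching hooks.

    Assumed semantics, not taken from Jina's source code.
    """
    try:
        response = request_fn()
    except Exception as exc:
        if on_error:
            on_error(exc)
    else:
        if on_done:
            on_done(response)
    finally:
        if on_always:
            on_always()


events = []
hooks = dict(
    on_done=lambda resp: events.append(('done', resp)),
    on_error=lambda exc: events.append(('error', type(exc).__name__)),
    on_always=lambda: events.append(('always',)),
)


def failing_request():
    raise ValueError('boom')


dispatch(lambda: 'ok', **hooks)    # success path
dispatch(failing_request, **hooks)  # failure path
print(events)
```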

Protos

Use Docker container to generate protobuf files #1241, #1242

Drivers

Refactor over-reduce logic to BaseDriver. Move ReduceDriver function into BaseDriver. Merge PassDriver and RouteDriver into RouteDriver #1228
Adapt the Drivers to jina.types #1313

Tests

Remove pip cache from Docker images #1168
Refactor unit tests for ContainerPea to pytest #1179
Switch back to use S3 bucket instead of GitHub for accessing fashionmnist dataset #1183
Refactor unit tests for CompoundExecutors to pytest #1192
Refactor unit tests for hello-world to pytest #1263
Refactor unit tests for indexing to pytest. #1258, #1237
Add unit tests for southpark example #1218
Fix flaky test #1219
Remove legacy code #1291, #1314
Adapt unit tests to jina.types #1319, #1320, #1322

Usability

Add --repository option for jina hub cli so users can push Pod images to their own repository. #1175
Replace id_tag argument with field in RankEvaluateDriver so users can access all fields of matches #1176

Documentation

Overhaul README.md #1213, #1226, #1244, #1249, #1257, #1264, #1271, #1293
Add Korean translation for README.md #1191
Fix multiple typos at docs.jina.ai #1...

26 Oct 12:40

nan-wang

v0.7.0

2110f52

🎉 release v0.7.0

Jina v0.7.0

We are excited to release Jina v0.7.0. Jina is an easier way to do neural search on the cloud. Highlights of this release include:

Flow evaluation support
Support for preventing duplicate Documents in the index
Flow visualization support

Release v0.7.0

⬆️ Major Features and Improvements

Completeness

Evaluation is fully supported by Jina. jina.executors.evaluators and jina.drivers.evaluate have been introduced to make this happen. Now you can use different metrics to evaluate the Flow. No matter whether you want to evaluate the whole Flow or just part of it, the evaluation can be done smoothly without stopping the running Flow. #1043, #1086, #1087, #1090, #1092, #1099, #1100, #1102, #1114, #1134

Click here to see the example codes

code

from jina.flow import Flow
from jina.proto import jina_pb2
from jina.drivers.helper import array2pb
import numpy as np

def get_index_docs():
    doc0 = jina_pb2.Document()
    doc0.tags['id'] = '0'
    doc0.embedding.CopyFrom(array2pb(np.array([1, 1])))
    doc1 = jina_pb2.Document()
    doc1.tags['id'] = '1'
    doc1.embedding.CopyFrom(array2pb(np.array([1, -1])))
    return [doc0, doc1]

# indexed two docs
f_index = (Flow().add(uses='index-doc.yml'))
with f_index:
    f_index.index(input_fn=get_index_docs)

def get_eval_docs():
    doc = jina_pb2.Document()
    doc.embedding.CopyFrom(array2pb(np.array([1, 1])))
    groundtruth = jina_pb2.Document()
    match0 = groundtruth.matches.add()
    match0.tags['id'] = '0'
    match1 = groundtruth.matches.add()
    match1.tags['id'] = '2'
    return [(doc, groundtruth), ]

def validate(resp):
    # retrieved docs with id `0` and `1`
    # relevant docs with id `0` and `2`
    # Precision@2 = 0.5
    assert resp.docs[0].evaluations[0].value == 0.5

# evaluate Precision@2
f_eval = (Flow()
          .add(uses='index-doc.yml')
          .add(uses='eval.yml'))
with f_eval:
    f_eval.search(
        input_fn=get_eval_docs,
        output_fn=validate,
        callback_on_body=True)

index-doc.yml

!CompoundIndexer
components:
  - !NumpyIndexer
    metas:
      name: vecidx
  - !BinaryPbIndexer
    metas:
      name: docidx
requests:
  on:
    IndexRequest:
      - !VectorIndexDriver
        with:
          executor: vecidx
          traversal_paths: ['r']
      - !KVIndexDriver
        with:
          executor: docidx
          traversal_paths: ['r']
    SearchRequest:
      - !VectorSearchDriver
        with:
          executor: vecidx
          traversal_paths: ['r']
      - !KVSearchDriver
        with:
          executor: docidx
          traversal_paths: ['m']

eval.yml

!PrecisionEvaluator
with:
  eval_at: 2
  id_tag: 'id'
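
The Precision@2 = 0.5 figure asserted in the example can be checked by hand. The function below is a minimal, library-free sketch of precision at k; `precision_at_k` is a hypothetical helper, and PrecisionEvaluator itself may handle edge cases differently.

```python
def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved ids that are also relevant."""
    if k <= 0:
        raise ValueError('k must be positive')
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant)
    return hits / k


# Same ids as in the example: retrieved `0` and `1`, relevant `0` and `2`.
score = precision_at_k(['0', '1'], ['0', '2'], k=2)
print(score)
```

Only the document with id `0` is both retrieved and relevant, so one hit out of two gives 0.5.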

To prevent duplicates in the index, UniquePbIndexer and UniqueVectorIndexer are introduced together with the corresponding drivers in jina.drivers.cache. Please refer to docs.jina.ai for more details. #1064, #1081, #1147

Click here to see the example codes

from jina.flow import Flow
from jina.proto import jina_pb2

doc_0 = jina_pb2.Document()
doc_0.text = f'I am doc0'
doc_1 = jina_pb2.Document()
doc_1.text = f'I am doc1'

def assert_num_docs(rsp, num_docs):
    assert len(rsp.IndexRequest.docs) == num_docs

f = Flow().add(
    uses='NumpyIndexer', uses_before='_unique')

with f:
    f.index(
        [doc_0, doc_0, doc_1],
        output_fn=lambda rsp: assert_num_docs(rsp, num_docs=2))
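
The effect of the example (three Documents sent, two indexed) can be sketched with a plain "seen" set. Keying on a hash of the text is an assumption made for illustration; the actual cache drivers may key on something else, such as the Document id. `content_key` and `drop_duplicates` are hypothetical helpers.

```python
import hashlib


def content_key(text):
    """Stable key for a piece of text (illustrative choice of hash)."""
    return hashlib.sha256(text.encode('utf-8')).hexdigest()


def drop_duplicates(texts, seen=None):
    """Return texts in order, skipping any whose key was already seen."""
    seen = set() if seen is None else seen
    unique = []
    for text in texts:
        key = content_key(text)
        if key in seen:
            continue
        seen.add(key)
        unique.append(text)
    return unique


unique_docs = drop_duplicates(['I am doc0', 'I am doc0', 'I am doc1'])
print(unique_docs)
```

Passing the same `seen` set across calls would extend the check across several index requests.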

Usability

Add visualization for Flow. Calling plot() function of Flow gives a better view of how the Flow looks. #1002, #1116

⚠️ Breaking Changes

Document.id, Document.parent_id and Relevance.ref_id are now string types instead of int. Please refer to docs.jina.ai for more details. #1005, #1034, #1136. Accordingly, the following changes were made:
- SortQL.field now uses dunder_get syntax rather than . expansion (e.g. a.b.c -> a__b__c, score.value -> score__value) and now supports dict and list access.
- first_doc_id, random_doc_id and override_doc_id have been removed from CLI.
Refactor logger config into YAML. Add --log-config to jina pea CLI, by default it points to logging.default.yml. --log-sse, --log-profile, --log-with-own-name are deprecated. #1031

Click here to check how the loggers are mapped to different resource files:

Filename	Logger in the code
logging.default.yml	`default_logger` and any logger defined with `JinaLogger()`
logging.docker.yml	`logger` used in the `ContainerPea`
logging.profile.yml	`profile_logger`
logging.remote.yml	`logger` used in the `RemotePea`

Refactor the code for traversing recursive Documents. granularity_range, adjacency_range, recur_on and recursion_order are deprecated and replaced by traversal_paths. This allows us to specify exactly where the traversal should happen. #995, #998, #1001, #1003, #1006, #1007, #1027, #1036, #1044
Protobuf request_id is now string type. --first-request-id removed from client CLI. --query-uses and --index-uses from hello-world CLI now renamed to --uses-query and --uses-index. #1049
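
The dunder syntax mentioned for SortQL.field (a.b.c -> a__b__c, with dict and list access) can be illustrated with a small resolver. This is a simplified sketch, not Jina's dunder_get; the helper below and the sample objects are hypothetical.

```python
from types import SimpleNamespace


def dunder_get(obj, path):
    """Resolve 'a__b__0' by walking dict keys, list indexes and attributes."""
    current = obj
    for part in path.split('__'):
        if isinstance(current, dict):
            current = current[part]
        elif isinstance(current, (list, tuple)):
            current = current[int(part)]
        else:
            current = getattr(current, part)
    return current


matches = [
    SimpleNamespace(id='a', score=SimpleNamespace(value=0.2), tags={'price': [10, 20]}),
    SimpleNamespace(id='b', score=SimpleNamespace(value=0.9), tags={'price': [5, 7]}),
]

best_first = sorted(matches, key=lambda m: dunder_get(m, 'score__value'), reverse=True)
second_price = dunder_get(matches[0], 'tags__price__1')
print([m.id for m in best_first], second_price)
```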

🐞 Bug Fixes and Other Changes

Flow

Refactor log stream server with fluentd. Fluentd acts as a daemon collecting logs from different parts of Jina and forwarding them to a specific output. Check out more details at docs.jina.ai #1002, #999
Add ordinal_idx_arg for batching decorator to support passing ordinal index to indexers #1089
Refactor request_id to uuid #1049
Refactor logger wrapper #1029
Add ssh tunneling for Pod. You can specify ssh information #1018
Switch to hash function for generating ids #1005, #1034
Support to use --uses-before and --uses-after when --parallel=1. Both options only act on when parallel > 1. _pass and _forward are using RouteDriver by default. #1112
Rename replica_id to pea_id and fix the PeaRoleType #1015
Fix the bug in setting top_k #1133 #1138 #1145

Executors

Add checking for the existence of model paths #1077
Improve exception handling for the failure of loading pre-trained models #1065
Fix typing of indexers #1053
Fix the no attribute error for BaseOnnxEncoder #1107

Drivers

Fix bug in QueryDriver when passing dictionary argument. #1080

CLI

Improve the hubio module. jina hub login supports logging in with OAuth authentication. jina hub list lists the available Pods in jina-hub. jina hub push supports building and pushing Pod images via Hubapi deployed on AWS API Gateway #1022, #1041, #1118, #1120, #1135
Add the update checking for jina cli #1117

Tests & CICD

Refactor test for Python client #1095
Add tests for including examples during ci #1088
Fix dependency conflicts in ci by replacing [match-py-ver] with [cicd] #1101
Improve PR review process by adding CODEOWNERS #1108
Refactor to pytest in testing request #1045
Add unit test for helper #1046
Fix io test #1052
Fix test coverage #1054, #1056
Use pytest fixture to remove tmp files #1021
Refactor the unit tests to pytest style in test_protobuf #1121
Add docker helper test #1115
Add test in the ci for testing examples #1142
Add test in the ci for testing hello-world in docker with no devel installed #1139

Documentation

Add Portuguese translation for README #1097
Add Ukrainian translation for README.md #1124
Fix Russian README #1057
Fix broken links in README #1033, #1037, #105
Fix links in CHANGELOG and CONTRIBUTING #1032
Improve the docstring for rank drivers #1143

Others

Fix duplicate lines in cookiecutter #1063
Fix conflicts between copyright adding action and typing #1023
Move numpy importing inside function #1019
Rename jina_cli to cli #1017
Fix typing error in mypy #1009
Fix line spaces in code #1105

🙏 Thanks to our Contributors

This release contains contributions from Alex C-G, Alex McKenzie, CatStark, Christopher Lennan, Deepankar Mahapatro, Fernanda Kawasaki, Han Xiao, Joan Fontanals Martinez, Ján Jendrušák, Maximilian Werk, Nan Wang, Oleh Yaroshchuk, Pratik Bhavsar, RenrakuRunrat, Rutuja Surve, Sai Sandeep Mutyala, Sergei Averkiev, Susana Guzman, Wang Bo, jancijen, pswu11

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you, Jina couldn't do what we do. Your support means a lot to us.

🤝 Work with Jina

Want to work with Jina full-time? Check out our openings on our website.

04 Oct 18:14

nan-wang

0.6.0

df7a7f4

🎉 v0.6.0

Jina v0.6.0

We are excited to release Jina 0.6.0. Jina is the easier way to do neural search on the cloud. Highlights of this release include:

Improve the memory footprint for the Indexer.
Add an example for building a cross-modal search system with Jina.
Add support for indexing .pdf files.

Release 0.6.0

⬆️ Major Features and Improvements

Scalability

Improve the memory footprint for the Indexer. Instead of keeping the index in memory in query mode, both NumpyIndexer and BinaryPbIndexer now use memory mapping, which scales better to large datasets. To further reduce the memory footprint of the vector index, ZarrIndexer, based on Zarr, has been added to Jina Hub. #950, #984
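
As a rough illustration of memory mapping, the sketch below stores vectors in a flat binary file and decodes a single row straight from the mapping, so the whole file never has to be loaded. It uses only the standard library; the file layout, `DIM` and `read_row` are assumptions for the example and do not reflect how NumpyIndexer or BinaryPbIndexer store data.

```python
import mmap
import os
import struct
import tempfile

DIM = 4              # vector dimensionality (illustrative)
ROW_BYTES = DIM * 4  # each value is a 4-byte float32

# Write 1000 small vectors to a flat binary file.
path = os.path.join(tempfile.mkdtemp(), 'vectors.bin')
with open(path, 'wb') as fp:
    for i in range(1000):
        fp.write(struct.pack(f'<{DIM}f', *[float(i + j) for j in range(DIM)]))


def read_row(mm, row):
    """Decode one vector directly from the mapped file."""
    return list(struct.unpack_from(f'<{DIM}f', mm, row * ROW_BYTES))


with open(path, 'rb') as fp:
    with mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        num_rows = len(mm) // ROW_BYTES
        row_42 = read_row(mm, 42)  # only the pages holding this row are touched

print(num_rows, row_42)
```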

Universal

Add an example for building a cross-modal search system with Jina. #978
Add support for indexing .pdf files. PdfExtractor has been added to Jina Hub. #981

⚠️ Breaking Changes

For details of all breaking changes, please refer to #885

Improve the way of traversing recursive document structure. #944, #933, #923, #893, #889,
Rename --yaml-path to --uses in Flow CLI #925, #922
Rename --uses-reducing to --uses-after and add --uses-before. This change enables us to customize the executors' behaviors before sending them to and after receiving from all parallels/shards. #925

🐞 Bug Fixes and Other Changes

Flow

Improve context management of Flow and Pod with ExitStack. #901,
Improve shut-down logic for log server #935, #958
Fix shut-down logic for Peas and Pods #907, #956
Refactor de-/serialization logic #988, #991

Executors

Add a meta variable force_register for executors in order to force Jina to use local version of executor. #883
Fix a bug in reducing functions for encoders. #900
Fix default behavior of CompoundIndexer #939
Fix bug in overwriting metas using Python client. #980

Drivers

Add CollectMatches2DocRankDriver for calculating matches with granularity=k-1 from Matches at granularity=k. #851
Add Matches2DocRankDriver for calculating new scores of matches from original scores #919
Add VectorFillDriver for filling embeddings of Document #909, #913
Add support for using tags with QueryLangDrivers #938
Add support for traversing recursive Documents via explicit tree path definition. #983, #979, #994, #993
Enable BaseSegmenter to change mime_type. #981
Add NdArray2PngURI and Blob2PngURI for convert numpy arrays into data URI. #982

CLI

Add --test-uses option for jina hub build CLI for skipping failed-start peas when building Docker file. #902, #965
Add is_build_success field for checking results of jina hub build. #903
Add --type app option for jina hub new CLI for creating a new Jina app. #917
Add --push option for jina hub build CLI for building and pushing local executors to Jina Hub. #937
Improve jina hub list CLI. #985
Improve speed of CLI autocompletion. #992

Tests

Add more unit tests for reducing functions #898
Move dependencies for unit tests into extra-requirements.txt #906
Add unit tests for sleeping executors #918
Add more unit tests for checking Peas #921
Add more unit tests for decorators of executors. #930
Add more unit tests for overriding Flow arguments. #926, #927
Fix name conflicts in test when running unit tests on Github. #961
Add more unit tests for support of Documents with chunks of different mime_type. #968

Documentation

Improve documentation for drivers #886, #888, #990
Improve README #894
Fix typos in documentation. #904, #912, #940, #978

Others

Improve helper functions. #948, #972, #974
Improve type of annotation. #962, #966, #967
Improve module importing logic for classes from jina-hub. #967
Improve authentication for jina hub #977
Jina ❤️ Hacktoberfest. #986

🙏 Thanks to our Contributors

This release contains contributions from Alasdair Tran, Alex C-G, David Sanwald, Deepankar Mahapatro, Han Xiao, JamesTang-jinaai, Joan Fontanals Martinez, Maximilian Werk, Nan Wang, Rutuja Surve, Sreerag-ibtl, Susana Guzman, Yue Liu, pswu11, rameshwara

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you, Jina couldn't do what we do. Your support means a lot to us.

🤝 Work with Jina

Want to work with Jina full-time? Check out our openings on our website.

30 Aug 13:12

hanxiao

v0.5.0

47cfb89

🎉 v0.5.0

Jina 0.5.0 Release

We are excited to release Jina 0.5.0. Jina is the easier way to do neural search on the cloud. Highlights of this release include:

Recursive Document structure
Native data querying capabilities
Migration of Executors to Jina Hub
Support for Mindspore

⬆️ Major Features and Improvements

Completeness

Introduce recursive Document structure. In short, the protobuf definition of Document and Chunk are unified. In this new representation, Document has a recursive structure and the deprecated Chunk is now a nested Document one level deeper. This new proto enables cleaner driver design, yields more consistent low-level APIs, and provides great extensibility on future features. #652, #684, #700, #709 #729 #726

This is a breaking change. If you started using Jina before 0.4.1, we highly suggest you read our migration guide.
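
The recursive structure can be pictured with a small stand-in class: a Document holds a list of chunks, and each chunk is itself a Document one level deeper. `Doc` and `iter_at_depth` below are hypothetical names for illustration, not the protobuf definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Doc:
    """Toy Document: chunks are just Docs one level deeper."""
    text: str
    chunks: List['Doc'] = field(default_factory=list)


def iter_at_depth(doc, depth):
    """Yield every Doc exactly `depth` levels below `doc`."""
    if depth == 0:
        yield doc
        return
    for chunk in doc.chunks:
        yield from iter_at_depth(chunk, depth - 1)


root = Doc('a b. c d.', chunks=[
    Doc('a b.', chunks=[Doc('a'), Doc('b.')]),
    Doc('c d.', chunks=[Doc('c'), Doc('d.')]),
])

sentences = [d.text for d in iter_at_depth(root, 1)]
words = [d.text for d in iter_at_depth(root, 2)]
print(sentences, words)
```

Because every level has the same type, one traversal routine serves documents, chunks, and chunks of chunks alike.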

Add native data querying capabilities. With the new family of Drivers based on BaseQueryLangDriver, you can perform standard query operations on the Document. Here is a list of the new drivers:

Name	Description	Counterpart in other query languages
`FilterQL`	Filter the Document/Chunk by its attributes	`filter`/`where`
`SelectQL`, `SelectRegQL`, `ExcludeQL`, `ExcludeRegQL`	Select attributes	`select`/`exclude`
`SliceQL`	Take the first k doc/chunk	`limit`/`take`/`slicing`
`SortQL`	Sort a list of `Document`s	`sort`/`order_by`
`ReverseQL`	Reverse the list of collections	`reverse`

Check more details at New Query Language Driver.
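
For readers who think in terms of the counterpart column, here are plain-Python analogues of the operations in the table. They run on a list of dicts and make no Jina calls; the sample data and field names are invented.

```python
docs = [
    {'id': 'a', 'score': 0.2, 'lang': 'en'},
    {'id': 'b', 'score': 0.9, 'lang': 'de'},
    {'id': 'c', 'score': 0.5, 'lang': 'en'},
]

# FilterQL ~ filter/where: keep Documents whose attribute matches
filtered = [d for d in docs if d['lang'] == 'en']

# SelectQL ~ select: keep only the chosen attributes
selected = [{'id': d['id']} for d in docs]

# ExcludeQL ~ exclude: drop the chosen attributes
excluded = [{k: v for k, v in d.items() if k != 'lang'} for d in docs]

# SliceQL ~ limit/take: take the first k
sliced = docs[:2]

# SortQL ~ sort/order_by: order by an attribute
ranked = sorted(docs, key=lambda d: d['score'], reverse=True)

# ReverseQL ~ reverse: flip the order
flipped = list(reversed(docs))

print([d['id'] for d in ranked])
```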

Usability

Migrate executors to Jina Hub. Jina Hub is an open registry for hosting Jina executors via container images. It enables users to ship and exchange reusable components across various Jina search applications. Jina Hub is referenced as a Git submodule in Jina. The Jina team will maintain the executors on Jina Hub. You can build your own executors as well. #852, #842, #848, #855, #857, #861, #860, #871, #872, #879, #880, #854

Check more details at Jina Hub.

Universal

Add support for Mindspore. #836

⚠️ Breaking Changes

Unify yaml_file and image with uses. You can use: a YAML file path, a supported Executor's class name, the content of a YAML config, or a Docker image. Check more details by running jina pod --help or in the Jina docs #684

v0.4.0

f = (Flow()
     .add(name='from_class', yaml_file='_pass')
     .add(name='from_yaml', yaml_file='mwu.yml')
     .add(name='from_str', yaml_file='!OneHotTextEncoder')
     .add(name='from_docker', image='jinaai/hub.examples.mwu_encoder'))

v0.5.0

f = (Flow()
     .add(name='from_class', uses='_pass')
     .add(name='from_yaml', uses='mwu.yml')
     .add(name='from_str', uses='!OneHotTextEncoder')
     .add(name='from_docker', uses='jinaai/hub.examples.mwu_encoder'))

Replace the replicas argument with parallel to avoid misunderstanding. parallel indicates how many Peas are running in parallel. #700

v0.4.0

!Flow
pods:
  encode:
    uses: helloworld.encoder.yml
    replicas: 2

v0.5.0

!Flow
pods:
  encode:
    uses: helloworld.encoder.yml
    parallel: 2

Replace join with needs to improve readability. #762

v0.4.0

v0.5.0

f = (Flow()
     .add(name='p1', uses='_pass')
     .add(name='p2', uses='_pass', needs='p1')
     .add(name='p3', uses='_pass', needs='p1')
     .needs(['p2', 'p3']))

f = (Flow()
     .add(name='p1', uses='_pass')
     .add(name='p2', uses='_pass', needs='p1')
     .add(name='p3', uses='_pass', needs='p1')
     .join(needs=['p2', 'p3']))

Introduce recursive Document structure. This affects a wide range of drivers and executors. Please refer to the full list at #702

🐞 Bug Fixes and Other Changes

Flow

Refactor and improve the code for building the Flow. #685
Fix export_api. #695
Fix the Pea name. #698
Fix the bug of two join operations in the same Flow. #730
Add an alias _pass for _forward; add an argument, name, for Flow.join() so that one can customize the name of the Pods; add an argument, uses, for Flow.join(), which unifies the usage of yaml_path and images. #748
Improve URL regex pattern matching #780

Executors

Add FeatureAgglomeration, TSNEEncoder, RandomSparseEncoder, RandomGaussianEncoder in the numeric encoders. #567, #838
Fix multiple bugs in MilvusIndexer #677 #679
Support full range of models from 🤗Transformers. #701
Fix the type bug in NgtIndexer. #742
Refactor the image crafter. #759
Refactor the framework-based executors to make it easier to build executors from various DL frameworks. #771, #800
Add ImageFlipper. #777
Fix cached_property. #785
Add TorchObjectDetectionSegmenter in the crafters for object detection. #770, #784, #788
Fix the bug in cropping the image. #769
Add a query_by_id function for BaseVectorIndexer so that we can query by Document id. #827
Refactor FaissIndexer #825
Fix a bug in serialization of the indexer. #874

Drivers

Fix the slicing bug in the QueryLang and improve the documents. #696, #714, #822
Add ConcateEmbedDriver for concatenating vectors. #748
Fix the default value issue of the level_depth. #817

Documentation

Add a shortcut for search in the docs. You can start searching by hitting the / key. #683
Add section on common practices. #812
Add a wall of contributors. For our awesome contributors, we've now put your profiles on our README. Thanks to all of you! #832, #835
Add more explanations for commit messages to make it easier to contribute. #826
Rephrase and fix typos #722, #731, #740, #768, #818, #820, #821, #837, #849
Improve visualization and fix cluttered TOC. #801

Protos

Refactor tags from map to Struct. #719

Tests

Add unit test for QueryLang. #710
Add tests for VectorSearchDriver and KVSearchDriver. #733
Add tests for EncodeDriver. #734
Add tests for CraftDriver. #737
Add tests for SegmentDriver. #738
Add tests for SliceQL. #782
Add tests for Chunk2DocRankDriver. #813
Improve the unit tests for indexers and add type checking. #838, #844

Others

Add tests and coverage report in CI. Jina's current test coverage is 76.52% #713 #682
Add typing to Jina. #761
Fix the broken labeling action. #787
Support ignoring packages on the dependency list. #859
Add missing Pillow dependency. #858

🙏 Thanks to our Contributors

This release contains contributions from Alex C-G, Andrey Vasnetsov, Anish Pawar, BingHo1013, Emmanuel Adesile, Eric Shen, Han Xiao, JamesTang616, Joan Fontanals Martinez, Kavan72, Maanav Shah, Morry Wang, Nan Wang, Rohan Chaudhari, Shivam Ra, Shivam Raj, Yue Liu, Zenahr Barzani, coolmian, dima, fhaase2, hanxiao, joanna350, roccia, shivam-raj.

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you Jina couldn't do what we do. Your support means a lot to us.

🤝 Work with Jina

Want to work with Jina full-time? Check out our openings on our website.

30 Jul 02:43

nan-wang

v0.4.0

eab2b4c

🎉 v0.4.0

Jina 0.4.0

We are excited to release Jina 0.4.0. Jina is the easier way to do neural search on the cloud. Highlights of this release include fallbacks if GPU is unavailable, FaissIndexer on GPU, and switching indexers during querying.

Release 0.4.0

⬆️ Major Features and Improvements

Usability

Add a new value for the on_gpu field. Setting on_gpu: auto in the YAML config first checks whether a GPU device is available and falls back to CPU when no GPU is found. #617
Improve the accessibility of jina helloworld. We add a CLI argument to enable downloading via a proxy. If you are using a proxy to speed up your internet connection, try jina helloworld --download-proxy http://127.0.0.1:1087. Just replace the IP and port with your proxy settings. #595
Support switching between different Indexers during querying. A new argument, ref_indexer, is added for this purpose. With the following YAML config of the Indexer, NumpyIndexer is used for indexing and AnnoyIndexer is used for querying. The supported Indexers include FaissIndexer, AnnoyIndexer, NGTIndexer, NmslibIndexer, SptagIndexer, and NumpyIndexer.
```
!AnnoyIndexer
with:
    ref_indexer:
        !NumpyIndexer
        with:
            index_filename: wrap-npidx
```
#599 #589
Add a new parameter skip-on-error for the Pods. This argument is used to set up on which level you want jina to skip the errors. Check out more details at jina docs #570
```
 !ImageReader
 with:
     skip-on-error: 'EXECUTOR'
```

Scalability

Multiple improvements have been made to performance.
- Improve the performance of NumpyIndexer. The argsort function is replaced by argpartition, which avoids unnecessary full sorting and speeds up the querying process. #641
- Switch to zmqstream for the default message handler, which improves the performance of networking. #618
- Use uvloop from tornado to improve the event handling speed in the Pods. #615
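The argsort-to-argpartition change in the first item can be illustrated with NumPy alone. The snippet below is a sketch of the general technique, not the NumpyIndexer code; the array `distances` and the value of `k` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
distances = rng.random(1000)  # hypothetical query-to-vector distances
k = 5

# Full sort: orders all 1000 entries just to keep the first k.
topk_sorted = np.argsort(distances)[:k]

# Partial selection: argpartition only guarantees that the k smallest
# entries occupy the first k positions, in no particular order.
candidates = np.argpartition(distances, k)[:k]

# Sort only those k candidates to recover the final ranking.
topk_partial = candidates[np.argsort(distances[candidates])]
```

Both approaches return the same k indices; the second one sorts k items instead of the whole array, which is where the saving comes from when k is much smaller than the index size.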

New Executors

Add NGTIndexer. NGT provides high-speed ANN searches for a large volume of data in high dimensional vector data space. #533
```
 !NGTIndexer
 with:
     index_filename: index.gz
     num_threads: 2
     metric: 'l2'
     epsilon: 0.1
```
Add support for running FaissIndexer on GPU and a new argument nprobe for FaissIndexer. Check out more details of the usage in our examples. #636 #638
```
!FaissIndexer
with:
    index_filename: index.gz
    index_key: 'IVF10,PQ4'
    train_filepath: train.gz
    distance: 'l2'
    nprobe: 1
```
Add support for Milvus as a new Indexer. Now you can do indexing and querying with MilvusIndexer. [W.I.P] #651
Add CustomKerasImageEncoder so that you can use your customized Keras model to encode images in Jina. The following YAML config loads the model from path/to/your/model and uses the output of the layer named awesome/encoding/layer as the embedding. #563
```
!CustomKerasImageEncoder
with:
    model_path: path/to/your/model
    layer_name: awesome/encoding/layer
```

Add an argument search_k for AnnoyIndexer. #642

!AnnoyIndexer
with:
    index_filename: index.gz
    metric: 'euclidean'
    n_trees: 10
    search_k: -1

Add FastICAEncoder for encoding. #590

!FastICAEncoder
with:
    output_dim: 32
    num_features: 128
    whiten: False

Documentation

Welcome our evangelist @alexcg1 from New Zealand! He has been working hard on improving document readability, Jina 101, contribution guidelines and README retouches. A new document has been added to guide new contributors. #566, #564, #558, #545

Unit tests

Add the coverage testing. Proudly, Jina's current test coverage is 73.04%. #659

⚠️ Breaking Changes

Rename port_grpc to port_expose. Now that we support both gRPC and RESTful APIs, port_grpc no longer lives up to its name. port_grpc will be deprecated in a future version. #598
Refactor ImageReader to inherit from BaseDocCrafter rather than DocSegmenter. If you are using ImageReader, check out our examples for more details. #627
Refactor Ranker. The TopKFilterDriver is now used to filter out the chunks that do not belong to the top k documents. This driver is attached to Ranker by default. For DocPbIndexer and DataURIPbIndexer, TopKFilterDriver is removed from the default attachment. With k shards, this leads to n * k results returned from the indexer when querying. #574
Remove the password_stdin argument for the jina hub CLI. #569
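The effect of the Ranker refactor on sharded queries can be sketched in plain Python. This is a simplified illustration under assumed data, not the TopKFilterDriver implementation: `search_shard`, `merge_and_filter`, the `(score, doc_id)` tuples, and `top_n` are all hypothetical.

```python
import heapq


def search_shard(shard, top_n):
    """Return the top_n (score, doc_id) pairs from one shard, best first."""
    return heapq.nlargest(top_n, shard)


def merge_and_filter(shard_results, top_n):
    """Merge per-shard hits and keep only the overall top_n."""
    merged = [hit for hits in shard_results for hit in hits]
    return heapq.nlargest(top_n, merged)


# Two shards, each holding some scored documents (made-up values).
shards = [
    [(0.91, "a"), (0.15, "b"), (0.60, "c")],
    [(0.75, "d"), (0.88, "e"), (0.10, "f")],
]
top_n = 2

per_shard = [search_shard(s, top_n) for s in shards]

# Without a filtering step, the caller sees top_n hits from every shard.
unfiltered_count = sum(len(hits) for hits in per_shard)

# A top-k filter applied after merging brings the result back to top_n.
final = merge_and_filter(per_shard, top_n)
```

With two shards and `top_n = 2`, four hits come back from the shards and the merge step reduces them to the two best overall.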

🐞 Bug Fixes and Other Changes

Flow

Fix the search_lines API for the Flow #606

Executors

Add a new argument truncation_strategy in BaseTransformerEncoder to adapt to the latest Huggingface Transformers v3.0.0. #623

!TransformerTorchEncoder
with:
    pooling_strategy: cls
    model_name: distilbert-base-cased
    max_length: 96
    truncation_strategy: longest_first

Add size property for the indexers. #581

Drivers

Add a new driver UnaryEncoderDriver dedicated for testing and debugging. #635
Fix the problem of PublishDriver. PublishDriver is used to modify num_parts when the pod is connected to another by a PUB-SUB connection. However, PublishDriver overwrote the original driver of the pod. #569
Remove the if clauses from the Drivers. #646

Protos

Add tags field in the Chunk and Document proto. The tags field is a map of strings and is designed to store the values of other fields that will be used for filtering. #574
Add location field for the Chunks. location is a list of integers. It can be used to mark the position in a string, the coordinates in an image, or the timestamp of an audio clip. #578

Tests

Improve and fix the unit tests. #609 #612 #579 #628

🙏 Thanks to our Contributors

This release contains contributions from hanxiao, JoanFM, nan-wang, fhaase2, anish2197, alexcg1, BingHo1013, shivam-raj, Morriaty-The-Murderer, festeh, generall, emmaadesile, coolmian, JamesTang616, and YueLiu-jina.

🙏 Thanks to our Community

And thanks to all of you out there as well! Without you Jina couldn't do what we do. Your support means a lot to us.

🤝 Work with Jina

Want to work with Jina full-time? Check out our openings on our website.
