bug: Intermittent httpx.RemoteProtocolError: Server disconnected without sending a response. #4971

Open
shihgianlee opened this issue Sep 11, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@shihgianlee

shihgianlee commented Sep 11, 2024

Describe the bug

The error occurs intermittently, approximately once every 5,000 requests. It appears to be a well-known httpx issue.

We increased the timeout from 15 to 30 seconds but still see the error. From what we observe, the timeout configured in BentoML does not seem to be propagated to the underlying httpx client. A user reported that increasing the timeout on the service and client appeared to resolve the issue.

See stacktrace below:

httpx.RemoteProtocolError: Server disconnected without sending a response.
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
    yield
  File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 377, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
    raise exc from None
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
    response = await connection.handle_async_request(
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection.py", line 101, in handle_async_request
    return await self._connection.handle_async_request(request)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/http11.py", line 143, in handle_async_request
    raise exc
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/http11.py", line 113, in handle_async_request
    ) = await self._receive_response_headers(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/http11.py", line 186, in _receive_response_headers
    event = await self._receive_event(timeout=timeout)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/http11.py", line 238, in _receive_event
    raise RemoteProtocolError(msg)
httpcore.RemoteProtocolError: Server disconnected without sending a response.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/ddtrace/contrib/httpx/patch.py", line 142, in _wrapped_async_send
    resp = await wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1674, in send
    response = await self._send_handling_auth(
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1702, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1739, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1776, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 377, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/local/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.RemoteProtocolError: Server disconnected without sending a response.

To reproduce

See httpx error reproduction.
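
For context, here is a minimal stdlib-only sketch of the condition behind this message: the peer closes the TCP connection after receiving a request but before writing any response bytes, so the client reads EOF. The handler and function names are made up for illustration, and this is not confirmed to be what happens in our deployment. One commonly cited trigger, stated here as an assumption, is a server closing an idle keep-alive connection just as the client reuses it.

```python
import asyncio


async def close_without_response(reader, writer):
    # Consume the full request head, then close without writing a response.
    await reader.readuntil(b"\r\n\r\n")
    writer.close()
    await writer.wait_closed()


async def request_once(host, port):
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(b"GET / HTTP/1.1\r\nHost: example\r\n\r\n")
    await writer.drain()
    data = await reader.read(1024)  # empty bytes means EOF before any response
    writer.close()
    await writer.wait_closed()
    return data


async def main():
    # Port 0 lets the OS pick a free port for this throwaway server.
    server = await asyncio.start_server(close_without_response, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        return await request_once("127.0.0.1", port)
    finally:
        server.close()
        await server.wait_closed()


response = asyncio.run(main())
print(repr(response))
```

An HTTP client that sees EOF at this point has nothing to parse, which is the situation the `_receive_event` frame in the traceback reports.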

Expected behavior

No response

Environment

Environment variable

BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=True
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=5000
BENTOML_HOST=''
BENTOML_API_WORKERS=''

System information

bentoml: 1.3.2
python: 3.9.20
platform: Linux-6.1.85+-x86_64-with-glibc2.36
uid_gid: 1034:1034

pip_packages
aiohappyeyeballs==2.4.0
aiohttp==3.10.5
aiosignal==1.3.1
aiosqlite==0.20.0
annotated-types==0.7.0
anyio==4.4.0
appdirs==1.4.4
asgiref==3.8.1
async-timeout==4.0.3
attrs==24.2.0
bentoml==1.3.2
bowler==0.9.0
bytecode==0.15.1
cachetools==5.5.0
cattrs==23.1.2
certifi==2024.8.30
charset-normalizer==3.3.2
circus==0.18.0
click==8.1.7
click-option-group==0.5.6
cloudpickle==3.0.0
colorama==0.4.6
contextlib2==21.6.0
dask==2024.6.2
db-dtypes==1.3.0
ddsketch==3.0.1
ddtrace==1.20.15
deepmerge==2.0
deprecated==1.2.14
dill==0.3.8
envier==0.5.2
exceptiongroup==1.2.2
fastapi==0.114.0
fastavro==1.9.7
feast==0.36.0
fissix==24.4.24
frozenlist==1.4.1
fs==2.4.16
fsspec==2023.12.2
google-api-core==2.19.2
google-auth==2.34.0
google-cloud-bigquery==3.12.0
google-cloud-bigquery-storage==2.26.0
google-cloud-bigtable==2.26.0
google-cloud-core==2.4.1
google-cloud-datastore==2.20.1
google-cloud-pubsub==2.21.0
google-cloud-secret-manager==2.16.1
google-cloud-storage==2.18.2
google-crc32c==1.6.0
google-resumable-media==2.7.2
googleapis-common-protos==1.65.0
greenlet==3.1.0
grpc-google-iam-v1==0.13.1
grpcio==1.66.1
grpcio-health-checking==1.62.3
grpcio-reflection==1.62.3
grpcio-status==1.62.3
grpcio-tools==1.62.3
gunicorn==23.0.0
h11==0.14.0
httpcore==1.0.5
httptools==0.6.1
httpx==0.27.2
httpx-ws==0.6.0
idna==3.8
importlib-metadata==6.11.0
importlib-resources==6.4.5
inflection==0.5.1
iniconfig==2.0.0
inquirerpy==0.3.4
jinja2==3.1.4
joblib==1.4.2
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
limits==3.13.0
locket==1.0.0
markdown-it-py==3.0.0
markupsafe==2.1.5
mdurl==0.1.2
mmh3==4.1.0
moreorless==0.4.0
multidict==6.1.0
mypy==1.11.2
mypy-extensions==1.0.0
mypy-protobuf==3.1.0
numpy==1.24.4
nvidia-ml-py==11.525.150
opentelemetry-api==1.20.0
opentelemetry-instrumentation==0.41b0
opentelemetry-instrumentation-aiohttp-client==0.41b0
opentelemetry-instrumentation-asgi==0.41b0
opentelemetry-sdk==1.20.0
opentelemetry-semantic-conventions==0.41b0
opentelemetry-util-http==0.41b0
packaging==24.1
pandas==1.5.3
pandavro==1.5.2
partd==1.4.2
pathspec==0.12.1
pfzy==0.3.4
pip==23.0.1
pip-requirements-parser==32.0.1
pluggy==1.5.0
prometheus-client==0.20.0
prompt-toolkit==3.0.47
proto-plus==1.24.0
protobuf==4.23.3
psutil==6.0.0
py==1.11.0
pyarrow==17.0.0
pyasn1==0.6.0
pyasn1-modules==0.4.0
pydantic==2.8.2
pydantic-core==2.20.1
pygments==2.18.0
pylogbeat==2.0.1
pyparsing==3.1.4
pytest==7.1.3
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-json-logger==2.0.7
python-logstash-async==2.6.0
python-multipart==0.0.9
python-service-common==0.3.1
pytz==2024.1
pyyaml==6.0.2
pyzmq==26.2.0
referencing==0.35.1
requests==2.32.3
rich==13.8.1
rpds-py==0.20.0
rsa==4.9
schema==0.7.5
scikit-learn==0.24.2
scipy==1.13.1
setuptools==74.1.2
simple-di==0.1.5
six==1.16.0
sniffio==1.3.1
sqlalchemy==1.4.54
sqlalchemy2-stubs==0.0.2a38
starlette==0.38.5
tabulate==0.9.0
tenacity==8.5.0
threadpoolctl==3.5.0
toml==0.10.2
tomli==2.0.1
tomli-w==1.0.0
toolz==0.12.1
tornado==6.4.1
tqdm==4.66.5
typeguard==4.3.0
types-protobuf==5.27.0.20240907
typing-extensions==4.12.2
urllib3==2.2.2
uv==0.4.8
uvicorn==0.30.6
uvloop==0.20.0
volatile==2.1.0
watchfiles==0.24.0
wcwidth==0.2.13
websockets==13.0.1
wheel==0.44.0
wrapt==1.16.0
wsproto==1.2.0
xgboost==1.6.1
xmltodict==0.13.0
yarl==1.11.1
zipp==3.20.1
@shihgianlee shihgianlee added the bug Something isn't working label Sep 11, 2024
@shihgianlee shihgianlee changed the title bug: Intermittent httpcore.RemoteProtocolError: Server disconnected without sending a response. bug: Intermittent httpx.RemoteProtocolError: Server disconnected without sending a response. Sep 11, 2024
@parano
Member

parano commented Sep 12, 2024

Hi @shihgianlee, thanks for reporting the issue. Could you share more about your service code structure and where the timeout value was set?

@shihgianlee
Author

> Hi @shihgianlee, thanks for reporting the issue. Could you share more about your service code structure and where the timeout value was set?

Sure. We have two services, where service A depends on service B. Service B has adaptive batching turned on, and the service A endpoint is async. When service A calls service B in our production Kubernetes cluster, we intermittently get "Server disconnected without sending a response" errors.

@bentoml.service(
    traffic={"timeout": 30},
    workers=1,
    logging={
        "access": {
            "enabled": False,
        }
    }
)
class AService:
    b_service = bentoml.depends(BService)

    ...

    @bentoml.api(input_spec=InputFeaturesV2, route="/v2/predict")
    async def predict_v2(self, ctx: bentoml.Context, **params: t.Any):
        input_data = InputFeaturesV2(**params)
        try:
            results = await self.b_service.to_async.predict([features_processed])
            ...

@bentoml.service(
    traffic={"timeout": 30},
    workers=1,
    logging={
        "access": {
            "enabled": False,
        }
    }
)
class BService:

    ...

    @bentoml.api(batchable=True)
    def predict(self, input_list: list[Features]):
        features = pd.DataFrame(input_list)
        try:
            predictions = self.predict(features)
            return predictions
        except Exception:
            logger.exception("Unexpected exception caught")
            raise
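
As a possible client-side mitigation while the root cause is open, the call from service A to service B could be wrapped in a small retry helper. The sketch below is stdlib-only and is not BentoML or httpx API: `call_with_retry` and `FlakyBackend` are hypothetical names, and `ConnectionError` stands in for `httpx.RemoteProtocolError` so the example runs without extra packages.

```python
import asyncio
import random


async def call_with_retry(make_call, retry_on, attempts=3, base_delay=0.05):
    """Await make_call(), retrying when it raises one of the retry_on exceptions."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff with a little jitter before the next try.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)


class FlakyBackend:
    """Hypothetical stand-in for service B: fails on the first call only."""

    def __init__(self):
        self.calls = 0

    async def predict(self, features):
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("Server disconnected without sending a response.")
        return [len(features)]


backend = FlakyBackend()
result = asyncio.run(
    call_with_retry(lambda: backend.predict([1, 2, 3]), retry_on=ConnectionError)
)
print(result, backend.calls)
```

In service A, `make_call` would presumably be something like `lambda: self.b_service.to_async.predict([features_processed])` with `retry_on=httpx.RemoteProtocolError`. Retrying is only safe if the prediction call is idempotent, and it hides the disconnect rather than explaining it.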
