Insights: triton-inference-server/server
Overview
4 Pull requests merged by 4 people
- build: Update README and versions for 2.47.0 / 24.06 (#7334, merged Jun 12, 2024)
- test: Add test for sequence flags in ensemble streaming inference (#7344, merged Jun 12, 2024)
- test: Python models filtering outputs based on requested outputs (#7338, merged Jun 12, 2024)
- test: Fix the test to expect updated error messages (#7340, merged Jun 12, 2024)
4 Pull requests opened by 3 people
- revert: "Change TensorRT-LLM (#7143)" (#7341, opened Jun 12, 2024)
- test: Remove AWS bucket on test failure (#7342, opened Jun 12, 2024)
- test: Python models filtering outputs based on requested outputs (#7338) (#7348, opened Jun 12, 2024)
- Add tests for infer_request.cc byte_size check (#7351, opened Jun 13, 2024)
5 Issues closed by 5 people
- unexpected datatype TYPE_INT64 for inference input, expecting TYPE_INT32 (#7307, closed Jun 14, 2024)
- Segmentation fault when sending multiple requests to triton-vllm (#7332, closed Jun 13, 2024)
- How does Triton implement one instance to handle multiple requests simultaneously? (#7295, closed Jun 12, 2024)
- Configurable Parallel Model Loading (Python backend) (#7094, closed Jun 11, 2024)
9 Issues opened by 9 people
- Add torch.set_float32_matmul_precision setting in Libtorch backend (#7352, opened Jun 14, 2024)
- Triton/vllm_backend launches model on incorrect GPU (#7349, opened Jun 13, 2024)
- Regression from 23.07 to 24.05 on model count lifecycle/restarts (#7347, opened Jun 12, 2024)
- The TensorRT-LLM container does not have the other backends (#7346, opened Jun 12, 2024)
- tritonserver log problem (#7345, opened Jun 12, 2024)
- Large latency when using `tritonclient.http.aio.infer` (#7343, opened Jun 12, 2024)
- Could you give some examples of ragged input config for the TensorRT backend? (#7339, opened Jun 11, 2024)
- Triton server crash when running a large model with an ONNX/CPU backend (#7337, opened Jun 10, 2024)
- Triton TensorRT-LLM 24.04 and 24.05 are very large (#7335, opened Jun 8, 2024)
23 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- TensorRT model low throughput (with CUDA shmem or system shmem) (#6978, commented on Jun 13, 2024 • 4 new comments)
- Does ensemble model release CUDA cache? (#5237, commented on Jun 12, 2024 • 4 new comments)
- Uneven QPS leads to low throughput and high latency as well as low GPU utilization (#7318, commented on Jun 12, 2024 • 4 new comments)
- Tritonserver physical RAM grows over time (#6781, commented on Jun 13, 2024 • 3 new comments)
- Segmentation fault (core dumped) - Server version 2.46.0 (#7330, commented on Jun 11, 2024 • 3 new comments)
- Does Triton Server support dynamic request batching for models that have sparse tensors as inputs? (#7333, commented on Jun 11, 2024 • 3 new comments)
- ci: Add INT64 Datatype Support for Shape Tensors in TensorRT Backend (#7329, commented on Jun 14, 2024 • 2 new comments)
- Triton Server 24.05 can't initialize CUDA drivers if the host system has Nvidia driver 555.85 installed (#7319, commented on Jun 12, 2024 • 2 new comments)
- triton malloc fail (#7308, commented on Jun 12, 2024 • 2 new comments)
- Single docker layer is too large (#7314, commented on Jun 12, 2024 • 2 new comments)
- Improve L0_io to test for peer access (#3893, commented on Jun 10, 2024 • 1 new comment)
- How to send binary data (audio file) in perf_analyzer? (#6701, commented on Jun 14, 2024 • 1 new comment)
- Unable to use the PyTorch library with the libtorch backend when using the Triton Inference Server in-process Python API (#7222, commented on Jun 13, 2024 • 1 new comment)
- CUDA runtime API error raised when using only CPU on Mac M3 (#7324, commented on Jun 11, 2024 • 1 new comment)
- Unfixed bug (issue #5783): inaccurate request handling when configuring queue policy (#6796, commented on Jun 11, 2024 • 1 new comment)
- Triton Server crash with signal (11) (#6720, commented on Jun 11, 2024 • 1 new comment)
- Why is my model in an ensemble receiving out-of-order input? (#7303, commented on Jun 11, 2024 • 1 new comment)
- Add TT-Metalium as a backend (#7305, commented on Jun 11, 2024 • 1 new comment)
- Low QPS with momentary traffic surges causes significant increases in inference TP99 latency (#7313, commented on Jun 11, 2024 • 1 new comment)
- Memory over 100% with decoupled DALI video model (#7315, commented on Jun 11, 2024 • 1 new comment)
- When the request is large, the Triton server has a very high TTFT (#7316, commented on Jun 11, 2024 • 1 new comment)
- Building and developing with libtritonserver.so (#7320, commented on Jun 11, 2024 • 1 new comment)
- fix: Fix version for setuptools (#7331, commented on Jun 13, 2024 • 0 new comments)