Hello and thanks for providing this excellent implementation of htsget!
We are using it together with MinIO S3 storage, and so far it has worked well. However, we noticed that when requesting a specific region, the /variants endpoint returns all records regardless. Based on the server logs I believe this is a bug in htsget-rs, but I may also have misinterpreted the logs or misused htsget-rs. Do you have any advice or suggestions on where the issue might be?
Environment:
- rust version: 1.75
- htsget-actix version: v0.6.1
- our htsget-rs config file
Steps to reproduce:
- Download example file https://github.com/vcflib/vcflib/blob/master/samples/sample.vcf
- Process file into bcf:
bgzip sample.vcf
bcftools index sample.vcf.gz
bcftools convert -Ob -o abc.bcf sample.vcf.gz
bcftools index abc.bcf
bcftools index -s abc.bcf.csi
19 . 2
20 . 6
X . 1
- Upload bcf + csi index into s3 bucket
- Send a GET request to htsget for contig "19"
- Send a GET request to htsget for contig "20"
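For reference, the two requests can be reproduced with a short script along these lines. This is a sketch rather than the exact client code we ran: the host `htsget:8080`, the id `ex/abc`, and the query parameters are taken from the logs below, while the helper names `build_variants_url`, `ticket_ranges`, `fetch_ticket`, and `main` are made up for illustration. It assumes the server answers with a standard htsget JSON ticket (`htsget.urls[]`, each entry with an optional `Range` header).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_variants_url(base, file_id, reference_name, start, end, fmt="BCF"):
    """Build the htsget /variants URL for one region (illustrative helper)."""
    query = urlencode({
        "referenceName": reference_name,
        "start": start,
        "end": end,
        "format": fmt,
    })
    return f"{base}/variants/{file_id}?{query}"


def ticket_ranges(ticket):
    """List (url, Range header or None) pairs from an htsget JSON ticket."""
    return [
        (entry["url"], entry.get("headers", {}).get("Range"))
        for entry in ticket["htsget"]["urls"]
    ]


def fetch_ticket(url):
    """GET the JSON ticket; this needs a reachable htsget server."""
    with urlopen(url) as response:
        return json.load(response)


def main():
    # Compare the byte ranges handed out for the two contigs.
    for contig in ("19", "20"):
        url = build_variants_url("http://htsget:8080", "ex/abc", contig, 1000, 5000)
        print(contig, ticket_ranges(fetch_ticket(url)))
```

Calling `main()` against the running server prints the ranges per contig, which makes it easy to see whether the tickets differ at all between the two queries.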
Observed behaviour: Both queries returned all variant records from all contigs.
Expected behaviour: Only variants of the requested chromosome are returned.
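To make the expectation concrete, the per-contig counts printed by `bcftools index -s` above give an upper bound on how many records a single-contig query should return (the `start`/`end` filter can only lower it). A small sketch for turning that output into numbers to compare against; `parse_index_stats` is an illustrative helper name, and the input string is the stats output from the steps above with tabs as separators:

```python
def parse_index_stats(text):
    """Parse `bcftools index -s` output into {contig: record_count}."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        contig, _length, n_records = fields
        counts[contig] = int(n_records)
    return counts


stats = parse_index_stats("19\t.\t2\n20\t.\t6\nX\t.\t1\n")
print(stats)  # {'19': 2, '20': 6, 'X': 1}
```

A response for contig 19 that contains more than 2 records, or one for contig 20 that contains more than 6, would therefore include records from other contigs.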
Observations:
The log from the contig 19 request shows that the query was parsed properly, and that segments (10,16) were requested.
2024-05-29T11:42:02.765895Z INFO HTTP request{http.method=GET http.route=/variants/{id:.+} http.flavor=1.0 http.scheme=http http.host=htsget:8080 http.client_ip=172.23.0.5 http.user_agent=python-requests/2.31.0 http.target=/variants/ex/abc?referenceName=19&start=1000&end=5000&format=BCF otel.name=HTTP GET /variants/{id:.+} otel.kind="server" request_id=a9d73300-be43-4965-b972-ebfbef74041c}:variants{request=Query({"end": "5000", "referenceName": "19", "format": "BCF", "start": "1000"}) path=Path("ex/abc") http_request=
HttpRequest HTTP/1.0 GET:/variants/ex/abc
query: ?"referenceName=19&start=1000&end=5000&format=BCF"
params: Path { path: Url { uri: /variants/ex/abc?referenceName=19&start=1000&end=5000&format=BCF, path: None }, skip: 16, segments: [("id", Segment(10, 16))] }
headers:
"host": "htsget:8080"
"connection": "close"
"accept": "*/*"
"accept-encoding": "gzip, deflate, br"
"user-agent": "python-requests/2.31.0"
}: htsget_actix::handlers::get: variants endpoint GET request request=Request { path: "ex/abc", query: {"end": "5000", "referenceName": "19", "format": "BCF", "start": "1000"}, headers: {"host": "htsget:8080", "connection": "close", "accept": "*/*", "accept-encoding": "gzip, deflate, br", "user-agent": "python-requests/2.31.0"} }
The logs from the contig 20 query show that the same segments (10,16) were requested, although the query is different. I am not sure whether I am interpreting this correctly.
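One possible reading, which I have not checked against the actix-router source, is that `Segment(10, 16)` is a character range into the request path marking where the `id` path parameter sits, rather than anything about the genomic region. The sketch below only tests that assumption against the path from the logs; `segment_value` is a made-up helper name.

```python
def segment_value(path, begin, end):
    """Return the slice of `path` covered by a (begin, end) segment.

    Assumption: the segment is a half-open character range into the path.
    """
    return path[begin:end]


path = "/variants/ex/abc"
print(segment_value(path, 10, 16))  # ex/abc
print(len(path))                    # 16, the same value as `skip: 16` in the logs
```

If that reading holds, the identical segments simply reflect the identical id `ex/abc` in both requests and say nothing about which region was queried. The full log for the contig 20 request follows.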
2024-05-29T11:56:32.734937Z INFO HTTP request{http.method=GET http.route=/variants/{id:.+} http.flavor=1.0 http.scheme=http http.host=htsget:8080 http.client_ip=172.23.0.5 http.user_agent=python-requests/2.31.0 http.target=/variants/ex/abc?referenceName=20&start=1000&end=5000&format=BCF otel.name=HTTP GET /variants/{id:.+} otel.kind="server" request_id=baa2d588-3098-4ff5-acc7-decb9490403b}:variants{request=Query({"referenceName": "20", "start": "1000", "format": "BCF", "end": "5000"}) path=Path("ex/abc") http_request=
HttpRequest HTTP/1.0 GET:/variants/ex/abc
query: ?"referenceName=20&start=1000&end=5000&format=BCF"
params: Path { path: Url { uri: /variants/ex/abc?referenceName=20&start=1000&end=5000&format=BCF, path: None }, skip: 16, segments: [("id", Segment(10, 16))] }
headers:
"host": "htsget:8080"
"user-agent": "python-requests/2.31.0"
"accept": "*/*"
"accept-encoding": "gzip, deflate, br"
"connection": "close"
}: htsget_actix::handlers::get: variants endpoint GET request request=Request { path: "ex/abc", query: {"referenceName": "20", "start": "1000", "format": "BCF", "end": "5000"}, headers: {"host": "htsget:8080", "user-agent": "python-requests/2.31.0", "accept": "*/*", "accept-encoding": "gzip, deflate, br", "connection": "close"} }