In my opinion this is not the same issue, since there is no way to know the size of the data associated with a raw file descriptor that is not bound to a "physical" resource. A demuxer that relies on stream_Size will fail there because we are opening a file descriptor, not because the underlying implementation invoked by stream_Size is unreliable.
There are issues associated with file descriptors that do not apply to the http accessor, such as not being able to seek. I do not think the comparison you are making is relevant to the case in question.
Imagine that the call to stream_Peek is done in the open-callback of one demuxer that later realizes that it is not a viable option for this type of resource. Now pretend that the stream_Size that follows happens in the demuxer that is to be tried next.
The call to stream_Size will fail because the http accessor has reached end-of-stream, and when it then issues another request based on file->offset, it will get back status-code 416.
The accessor relies on the presence of the Content-Range header (optional in the case of 416) to get the size of a resource, and many httpd implementations do not send it on 416.
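To make the dependency on Content-Range concrete, here is a minimal sketch of pulling the complete length out of that header. It is not the accessor's actual code: `content_range_total` is a hypothetical helper, and it assumes the header value has already been trimmed of surrounding whitespace. A 416 reply that does include the header would typically carry the `bytes */<length>` form.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper, not part of VLC: extract the complete length from a
// Content-Range value. Returns 0 and stores the length on success, or -1 if
// the header is missing, malformed, or reports an unknown length ("*").
static int content_range_total(const char *value, uint64_t *total)
{
    assert(total != NULL);

    if (value == NULL || strncmp(value, "bytes ", 6) != 0)
        return -1; // no header at all, or a range unit other than bytes

    const char *slash = strchr(value + 6, '/');
    if (slash == NULL)
        return -1;

    const char *len = slash + 1;
    if (*len < '0' || *len > '9')
        return -1; // unknown length or garbage after the slash

    char *end;
    errno = 0;
    unsigned long long n = strtoull(len, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1; // overflow or trailing characters

    *total = n;
    return 0;
}
```

When the server omits the header on 416, the caller passes NULL and gets -1 back, which is the situation described above: there is nothing left in the reply to derive the size from.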
Hacking around the issue
Given that the http accessor will fail if we have reached end-of-stream, the below can circumvent the issue for that particular accessor (which in my book further demonstrates the problem with the current implementation).
uint64_t i_prev_pos = stream_Tell( p_src );
stream_Read( p_src, NULL, 1 ); /* hack to invalidate peek */
stream_Seek( p_src, 0 ); /* seek to zero */
stream_Size( p_src ); /* get size */
stream_Seek( p_src, i_prev_pos ); /* restore position */
Proposed solution
To get the size of a resource over HTTP, one could rely on the Content-Range header if it is available in the current reply, and otherwise fall back on issuing a HEAD request and looking at the Content-Length in the response.
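The proposal could be sketched roughly as follows. This is only an illustration of the decision order, not a patch: `struct http_reply`, `head_fn`, `resource_size` and `stub_head` are all hypothetical names that do not exist in VLC, and the HEAD request itself is replaced by a stub.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical model of what the accessor knows after a response.
struct http_reply {
    bool     has_range_total; // Content-Range carried a complete length
    uint64_t range_total;     // valid only if has_range_total is true
};

// Hypothetical callback that performs a HEAD request and reports
// Content-Length; returns 0 on success, -1 on failure.
typedef int (*head_fn)(void *opaque, uint64_t *length);

// Prefer the size already present in the current reply; only fall back on
// HEAD when the reply did not carry it.
static int resource_size(const struct http_reply *reply,
                         head_fn head, void *opaque, uint64_t *size)
{
    assert(reply != NULL && size != NULL);

    if (reply->has_range_total) {
        *size = reply->range_total;
        return 0; // no extra request needed
    }
    if (head == NULL)
        return -1;
    return head(opaque, size);
}

// Stub standing in for a real HEAD request: counts calls through opaque
// and reports a fixed length of 42 bytes.
static int stub_head(void *opaque, uint64_t *length)
{
    ++*(unsigned *)opaque;
    *length = 42;
    return 0;
}
```

With a reply to an accepted Range request, `resource_size` returns the Content-Range total and never calls `head`; the extra round trip only happens when the reply lacks the header.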
Establishing one TCP connection, one TLS session, sending one HEAD request and waiting for the response every time VLC calls GET_SIZE is pure insanity. Do not count on me to explain that insanity to server admins.
There are absolutely no excuses for the image demuxer to fail to handle small files, regardless of known size and seek support. This problem is in no way specific to bad server implementations of the HTTP 416 error, or to HTTP for that matter.
The error strongly suggests that either some demuxers are peeking more data than is reasonable during probe, that the cache block is poorly implemented, or both.
Establishing one TCP connection, one TLS session, sending one HEAD request and waiting for the response every time VLC calls GET_SIZE is pure insanity.
As stated, rely on Content-Range if such a header is available in the response (it is required to be present for an accepted Range request), otherwise fall back on a HEAD request.
I have never said that we should issue HEAD all the time, only when the information we want is not present in the original response.
Replying to [comment:7 courmisch]:
There are absolutely no excuses for the image demuxer to fail to handle small files, regardless of known size and seek support.
The root cause is not only relevant to the image demuxer and small files; anything that can hit end-of-stream before calling stream_Size is affected, and this can happen for a number of different reasons.
Of course, one could:
"cache" the size in the client code, or
implement a workaround such as what was posted earlier,
but neither alternative seems reasonable.
Replying to [comment:7 courmisch]:
This problem is in no way specific to bad server implementations of the HTTP 416 error, or to HTTP for that matter.
Considering implementations that do not send Content-Range on HTTP 416 to be "bad" sure is subjective:
http://atch.se/robots.txt missing Content-Range = Apache/2.4.20
https://www.google.com/robots.txt missing Content-Range = sffe
http://www.bbc.com/robots.txt missing Content-Range = Apache
http://git.videolan.org/robots.txt ok = nginx/1.11.2
https://github.com/robots.txt ok = GitHub.com
Surely, the probing could be done in a different manner (for the above I manually checked that 416 is returned, but a more extensive study would be more robust).
Establishing one TCP connection, one TLS session, sending one HEAD request and waiting for the response every time VLC calls GET_SIZE is pure insanity.
As stated, rely on Content-Range if such a header is available in the response (it is required to be present for an accepted Range request), otherwise fall back on a HEAD request.
Yeah and that is just as insane. It will add N request(s) per HTTP stream for zero added value. Stating the obvious here, but the stream size brings absolutely no information if the stream is already at its end.
Not to mention that VLC code tends to assume that stream_GetSize() is free/fast.
There are absolutely no excuses for the image demuxer to fail to handle small files, regardless of known size and seek support.
The root cause is not only relevant to the image demuxer and small files; anything that can hit end-of-stream before calling stream_Size is affected, and this can happen for a number of different reasons.
As a matter of fact, you are wrong. As ALREADY noted in this bug report, we have plenty of demuxers that deal with nonseekable and/or unsized input streams just fine. The image demuxer and the stream API are in disagreement (probably stream_Block).