- Fix for not sending ranges which exceed the content-length which will result
in an error.
- Fix for byte range being set to 1 byte too large, it should be the end index,
not the size as described in
https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests
Don't assume that the reason why we didn't find enough slashes in a
URL is because the user didn't specify the slash at the end of the
host name, unless we did find the first two slashes.
Add some curly braces around an if block to make it clear to people
and the compiler which statement an `else` applies to. The logic was
correct before but the indentation was wrong, making it especially
confusing.
The check for the HTTP response code from the curl library was written
incorrectly and guaranteed to always fail. I've fixed the logic to
reflect what I believe was intended.
There's a small block of code that calls strnlen on a string, saves
the esult in a variable, conditionally decrements the variable, and
then does nothing with it, making the entire block of code a no-op.
I don't want to just remove it entirely since it's possible that there
was intended to be some sort of check here that was inadvertently
omitted. So to make the compiler stop complaining I've commented out
the code, but I've left a comment above it explaining why it was
commented out and pointing out that maybe something different needs to
be done with it.
Some sites put unencoded characters in their href attributes that
really should be encoded, most notably spaces. Curl won't accept a URL
with a space in it, and perhaps other such characters as well. Address
this by properly encoding characters in URLs before feeding them to
Curl.
On some sites, the link to each subfolder is an absolute link rather
than a relative one. To accommodate this, convert the links from
absolute to relative before storing them in the link table.
Some web sites will return 404 if you fetch a directory without the
final slash. For example, https://archive.mozilla.org/pub/ works,
https://archive.mozilla.org/pub does not. We need to do two things to
accommodate this:
* When processing the root URL of the filesystem, instead of stripping
off the final slash, just set the offset to ignore it.
* In the link structure, store the actual URL tail of the link
separately from its name, final slash and all if there is one, and
append that instead of the name when constructing the URL for curl.
Refresh a directory's contents when fs_readdir is called
if it has been more than the number of seconds specified by
--refresh_timeout since the directory was last indexed.
It's entirely possible that `ts.data` is `NULL` on an error path, so
handing it to `FREE()`, which bails on a `NULL` argument, is not ideal.
Just pass it to `free()` instead, which is required to no-op if given
`NULL`.