- Fix for not sending ranges which exceed the content-length which will result
in an error.
- Fix for byte range being set to 1 byte too large, it should be the end index,
not the size as described in
https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests
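Both fixes can be illustrated by how a range string for curl might be built; this is a minimal sketch with hypothetical names, not httpdirfs's actual code:

```c
#include <stdio.h>
#include <string.h>

/* Build an HTTP Range value (as used with CURLOPT_RANGE). Per the Range
 * request spec the value is "start-end" where end is the INCLUSIVE index
 * of the last byte, so a request for `size` bytes at `offset` must end at
 * offset + size - 1, not offset + size. The end is also clamped so the
 * range never exceeds the content length. (Illustrative helper only.) */
static void build_range(char *buf, size_t buflen, long offset, long size,
                        long content_length)
{
    long end = offset + size - 1;       /* inclusive index of last byte */
    if (end > content_length - 1)
        end = content_length - 1;       /* don't run past the resource */
    snprintf(buf, buflen, "%ld-%ld", offset, end);
}
```

For example, the first 1024 bytes of a 4096-byte resource are indices 0 through 1023, so the header value is "0-1023".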
This commit converts the build process from a hand-written Makefile
that works on Linux, FreeBSD, and macOS, to an automatically generated
Makefile managed by the autotools toolset.
This includes:
* Add the compile, config.guess, config.sub, depcomp, install-sh, and
missing helper scripts that autotools requires to be shipped with
the package in order for configure to work.
* Rename Makefile to Makefile.am and restructure it for compatibility
  with autotools, and specifically with the checks and substitutions in
  our configure script.
* Create the configure.ac source file which is turned into the
configure script.
* Rename Doxyfile to Doxyfile.in so that the source directories can be
substituted into it at configure time.
* Tweak .gitignore to ignore temporary and output files related to
autotools.
* Generate Makefile.in, aclocal.m4, and configure using `autoreconf`
and include them as checked-in source files.
While I can't fully document how autotools works here, the basic
workflow is that when you need to make changes to the build, you
update Makefile.am and/or configure.ac as needed, run `autoreconf`,
and commit the changes you made as well as any resulting changes to
Makefile.in, aclocal.m4, and configure. Makefile should _not_ be
committed into the source tree; it should always be generated using
configure on the system where the build is being run.
Don't assume that a URL with too few slashes means the user omitted the
trailing slash after the host name, unless we actually found the first
two slashes (the ones following the scheme).
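The intended guard can be sketched roughly as follows; the function name and logic here are illustrative assumptions, not the project's actual parser:

```c
#include <string.h>

/* Illustrative sketch: only conclude that a URL is missing its trailing
 * slash after the host name if the scheme's "://" was actually found.
 * Otherwise the URL is malformed in some other way and we shouldn't
 * guess. (Hypothetical helper, not httpdirfs's actual code.) */
static int needs_trailing_slash(const char *url)
{
    const char *p = strstr(url, "://");
    if (!p)
        return 0;                       /* no scheme: don't assume */
    return strchr(p + 3, '/') == NULL;  /* no slash after host name */
}
```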
Add some curly braces around an if block to make it clear to people
and the compiler which statement an `else` applies to. The logic was
correct before but the indentation was wrong, making it especially
confusing.
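The dangling-else hazard being guarded against can be shown with a minimal standalone example (not the actual httpdirfs code):

```c
/* Without braces, an `else` binds to the NEAREST unmatched `if`,
 * regardless of indentation. The two functions below look similar
 * but behave differently for (a=1, b=0) and (a=0, b=0). */
static int without_braces(int a, int b)
{
    int r = 0;
    if (a)
        if (b)
            r = 1;
    else            /* actually pairs with `if (b)`, not `if (a)` */
        r = 2;
    return r;
}

static int with_braces(int a, int b)
{
    int r = 0;
    if (a) {
        if (b)
            r = 1;
    } else {        /* braces make the pairing explicit */
        r = 2;
    }
    return r;
}
```

With a = 1 and b = 0, the unbraced version takes the `else` (returning 2) while the braced version returns 0; with a = 0 and b = 0 it's the other way around.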
The check for the HTTP response code from the curl library was written
incorrectly and guaranteed to always fail. I've fixed the logic to
reflect what I believe was intended.
There's a small block of code that calls strnlen on a string, saves
the result in a variable, conditionally decrements the variable, and
then does nothing with it, making the entire block of code a no-op.
I don't want to just remove it entirely since it's possible that there
was intended to be some sort of check here that was inadvertently
omitted. So to make the compiler stop complaining I've commented out
the code, but I've left a comment above it explaining why it was
commented out and pointing out that maybe something different needs to
be done with it.
Some sites put unencoded characters in their href attributes that
really should be encoded, most notably spaces. Curl won't accept a URL
containing a space (and possibly other such characters). Address
this by properly encoding characters in URLs before feeding them to
Curl.
I believe an appropriate expectation is that if the user enables
debugging with a command-line flag, then messages designated as debug
messages in the code should also be printed.
On some sites, the link to each subfolder is an absolute link rather
than a relative one. To accommodate this, convert the links from
absolute to relative before storing them in the link table.
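The conversion amounts to stripping the base URL prefix when it matches; a minimal sketch with hypothetical names:

```c
#include <string.h>

/* Illustrative sketch: if a link is absolute and begins with the base
 * URL of the directory being indexed, drop that prefix so the stored
 * link is relative. (Hypothetical helper, not the actual code.) */
static const char *make_relative(const char *link, const char *base)
{
    size_t n = strlen(base);
    if (!strncmp(link, base, n))
        return link + n;    /* the tail after the base URL */
    return link;            /* already relative, keep as-is */
}
```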
Some web sites will return 404 if you fetch a directory without the
final slash. For example, https://archive.mozilla.org/pub/ works, but
https://archive.mozilla.org/pub does not. We need to do two things to
accommodate this:
* When processing the root URL of the filesystem, instead of stripping
off the final slash, just set the offset to ignore it.
* In the link structure, store the actual URL tail of the link
separately from its name, final slash and all if there is one, and
append that instead of the name when constructing the URL for curl.
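The second point can be sketched like this; the struct layout and names are hypothetical, not the project's actual link table:

```c
#include <stdio.h>
#include <string.h>

/* Sketch: keep the display name and the raw URL tail separately, so a
 * trailing slash in the tail survives into the URL handed to curl. */
typedef struct {
    char name[256];     /* "pub"  - shown to the user               */
    char linkpath[256]; /* "pub/" - exact tail taken from the href  */
} Link;

static void link_url(char *buf, size_t buflen, const char *base,
                     const Link *l)
{
    /* Append the stored tail, not the name, so the final slash
     * (if the server sent one) is preserved. */
    snprintf(buf, buflen, "%s%s", base, l->linkpath);
}
```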
Refresh a directory's contents when fs_readdir is called
if it has been more than the number of seconds specified by
--refresh_timeout since the directory was last indexed.
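The staleness test itself is simple; a sketch with hypothetical names, where `refresh_timeout` holds the value of the --refresh_timeout option:

```c
#include <time.h>

/* Illustrative sketch of the check run in fs_readdir: re-index the
 * directory when more than refresh_timeout seconds have elapsed since
 * it was last indexed. */
static int needs_refresh(time_t last_indexed, time_t now,
                         long refresh_timeout)
{
    return now - last_indexed > refresh_timeout;
}
```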
It's entirely possible that `ts.data` is `NULL` on an error path, so
handing it to `FREE()`, which bails on a `NULL` argument, is not ideal.
Just pass it to `free()` instead, which is required to no-op if given
`NULL`.
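The C standard guarantees that `free(NULL)` does nothing, so no guard is needed on this path; a trivial demonstration:

```c
#include <stdlib.h>

/* free(NULL) is required to be a no-op (C11 7.22.3.3), so passing a
 * possibly-NULL pointer straight to free() is always safe. */
static int free_null_ok(void)
{
    char *p = NULL;   /* e.g. ts.data on an error path */
    free(p);          /* well-defined: no action occurs */
    return 1;
}
```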
`FREE()` checks for a `NULL` pointer, but generally httpdirfs does not
`NULL` out pointers it attempts to `FREE()` (or `free()`). As such, the
error message is misleading; make it less so in a trivial way.
Possibly a better, more invasive, change would be for `FREE()` to take a
`void **pp`, check that `*pp != NULL`, `free(*pp)`, and then set
`*pp = NULL;`.
Were that done, then there would be some plausibility to the current
diagnostic message.
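That more invasive variant could look roughly like the sketch below (names are illustrative, and the `void **` cast at call sites is a common but not strictly portable idiom):

```c
#include <stdlib.h>
#include <stdio.h>

/* Sketch: a FREE() that takes the ADDRESS of the pointer, frees it,
 * and NULLs it out. With this shape, a second FREE of the same pointer
 * really would see NULL, making the diagnostic below meaningful. */
static void FREE2(void **pp)
{
    if (*pp == NULL) {
        fprintf(stderr, "FREE2(): attempted to free a NULL pointer\n");
        return;
    }
    free(*pp);
    *pp = NULL;
}
```

Usage would be `FREE2((void **)&p);` after which `p` is reliably `NULL`.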
The use of EACCES leads to slightly confusing error messages in
downstream consumers, so prefer EROFS to better articulate what's
actually happening.
While here, use O_RDWR to mask the open flags while testing for
non-RO access. This is at least encouraged by POSIX with their
suggestion that "O_RDONLY | O_WRONLY == O_RDWR".
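The access-mode test might be sketched as below. Note this sketch masks with POSIX's `O_ACCMODE` (defined as exactly the access-mode bits) rather than `O_RDWR`, since `O_ACCMODE` is the portable way to extract the mode regardless of how an implementation encodes the three values:

```c
#include <fcntl.h>
#include <errno.h>

/* Sketch of an open() handler check for a read-only filesystem:
 * reject any request for write access with EROFS instead of EACCES,
 * so downstream consumers see "read-only file system" rather than a
 * generic permission error. */
static int check_open_flags(int flags)
{
    if ((flags & O_ACCMODE) != O_RDONLY)
        return -EROFS;
    return 0;
}
```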