Some sites put unencoded characters in their href attributes that
really should be encoded, most notably spaces. Curl won't accept a URL
with a space in it, and perhaps other such characters as well. Address
this by properly encoding characters in URLs before feeding them to
Curl.
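
As a rough illustration of the encoding step, here is a minimal sketch. The helper name `encode_url_sketch` and the choice of which characters to keep are assumptions for illustration, not the project's actual code. It percent-encodes every byte that is not an RFC 3986 reserved or unreserved character, and leaves an existing `%` alone so that already-encoded links are not encoded twice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the project's real function): percent-encode
 * every byte of `url` that is not an RFC 3986 reserved or unreserved
 * character. '%' is kept so existing escapes are not encoded again.
 * Returns a malloc()ed string the caller must free(), or NULL if the
 * allocation fails. */
char *encode_url_sketch(const char *url)
{
    static const char keep[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
        "-._~:/?#[]@!$&'()*+,;=%";
    static const char hex[] = "0123456789ABCDEF";

    assert(url != NULL);

    /* Worst case: every input byte expands to "%XX". */
    char *out = malloc(strlen(url) * 3 + 1);
    if (out == NULL) {
        return NULL;
    }

    char *p = out;
    for (const unsigned char *s = (const unsigned char *)url; *s; s++) {
        if (strchr(keep, *s)) {
            *p++ = (char)*s;            /* safe character, copy as-is */
        } else {
            *p++ = '%';                 /* e.g. ' ' becomes "%20" */
            *p++ = hex[*s >> 4];
            *p++ = hex[*s & 0x0F];
        }
    }
    *p = '\0';
    return out;
}
```

libcurl's `curl_easy_escape()` also percent-encodes, but it escapes `/` and `:` as well, so it suits a single path component rather than a whole URL.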
I believe an appropriate expectation is that if the user enables
debugging with a command-line flag, then that should also enable
messages designated as debug messages in the code to be printed.
On some sites, the link to each subfolder is an absolute link rather
than a relative one. To accommodate this, convert the links from
absolute to relative before storing them in the link table.
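
One way the conversion could look, as a sketch only: the function name `make_relative_sketch` is hypothetical, and it assumes the URL of the directory being indexed is known and ends in a slash. A link that starts with that URL is reduced to the remainder; anything else is passed through untouched.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: if `href` is an absolute URL that begins with
 * `base_url` (assumed to be the URL of the directory being indexed,
 * ending in '/'), return a pointer to the relative remainder inside
 * `href`. Otherwise return `href` unchanged. No memory is allocated. */
const char *make_relative_sketch(const char *base_url, const char *href)
{
    assert(base_url != NULL && href != NULL);

    size_t base_len = strlen(base_url);
    if (base_len > 0 && strncmp(href, base_url, base_len) == 0) {
        return href + base_len;   /* skip the shared prefix */
    }
    return href;                  /* already relative, or a foreign URL */
}
```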
Some web sites will return 404 if you fetch a directory without the
final slash. For example, https://archive.mozilla.org/pub/ works, but
https://archive.mozilla.org/pub does not. We need to do two things to
accommodate this:
* When processing the root URL of the filesystem, instead of stripping
off the final slash, just set the offset to ignore it.
* In the link structure, store the actual URL tail of the link
(including the final slash, if there is one) separately from its name,
and append that tail instead of the name when constructing the URL for
curl.
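
The second point can be sketched as follows. The struct and function names here are hypothetical and much simpler than the real link structure; the sketch only shows a name with the trailing slash removed sitting next to a tail that keeps the link exactly as it appeared, with the tail being what gets appended to the parent URL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_MAX_SKETCH 255

/* Hypothetical, simplified link record; field names are assumptions. */
struct link_sketch {
    char name[NAME_MAX_SKETCH + 1];  /* file name shown to the user, no trailing '/' */
    char tail[NAME_MAX_SKETCH + 2];  /* URL tail as linked, trailing '/' preserved */
};

/* Fill both fields from the last component of an href. */
void link_set_sketch(struct link_sketch *l, const char *href_tail)
{
    assert(l != NULL && href_tail != NULL);

    snprintf(l->tail, sizeof l->tail, "%s", href_tail);

    size_t len = strlen(href_tail);
    if (len > 0 && href_tail[len - 1] == '/') {
        len--;                        /* the name drops the final slash */
    }
    if (len > NAME_MAX_SKETCH) {
        len = NAME_MAX_SKETCH;        /* truncate overly long names */
    }
    memcpy(l->name, href_tail, len);
    l->name[len] = '\0';
}

/* Build the URL to fetch: parent URL plus the stored tail, not the name. */
int link_url_sketch(char *buf, size_t size, const char *parent_url,
                    const struct link_sketch *l)
{
    return snprintf(buf, size, "%s%s", parent_url, l->tail);
}
```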
Refresh a directory's contents when fs_readdir is called
if it has been more than the number of seconds specified by
--refresh_timeout since the directory was last indexed.
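
The staleness test itself is small. A sketch, assuming the time of the last index is recorded per directory; the function and parameter names are hypothetical, with `refresh_timeout` standing in for the value given to --refresh_timeout:

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: true when strictly more than `refresh_timeout`
 * seconds have passed between `last_indexed` and `now`. */
bool dir_needs_refresh_sketch(time_t now, time_t last_indexed,
                              long refresh_timeout)
{
    assert(refresh_timeout >= 0);
    return difftime(now, last_indexed) > (double)refresh_timeout;
}
```

A readdir handler would call this with `time(NULL)` as `now` and re-index the directory when it returns true.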
It's entirely possible that `ts.data` is `NULL` on an error path, so
handing it to `FREE()`, which bails on a `NULL` argument, is not ideal.
Just pass it to `free()` instead, which is required to no-op if given
`NULL`.
`FREE()` checks for a `NULL` pointer, but generally httpdirfs does not
`NULL` out pointers it attempts to `FREE()` (or `free()`). As such, the
error message is misleading; make it less so in a trivial way.
Possibly a better, more invasive, change would be for `FREE()` to take a
`void **pp`, check that `*pp != NULL`, `free(*pp)`, and then `*pp = NULL;`.
Were that done, then there would be some plausibility to the current
diagnostic message.
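
A sketch of what that more invasive variant might look like. The name `free_and_null_sketch` is hypothetical, and this version only reports an already-`NULL` pointer and returns, which may not match how the real `FREE()` reacts:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical replacement for FREE(): frees *pp and then clears it, so
 * a later call on the same pointer really does see NULL. */
void free_and_null_sketch(void **pp)
{
    assert(pp != NULL);

    if (*pp == NULL) {
        /* With pointers reliably cleared after freeing, this message
         * now describes a real double free or a never-set pointer. */
        fprintf(stderr, "free_and_null_sketch(): pointer is already NULL\n");
        return;
    }
    free(*pp);
    *pp = NULL;
}
```

Note that a `void **` parameter is only strictly portable when the caller's variable is itself a `void *`; for other pointer types a macro wrapper that frees and then assigns `NULL` would be the safer shape.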
The use of EACCES leads to slightly confusing error messages in
downstream consumers, so prefer EROFS to better articulate what's
actually happening.
While here, use O_RDWR to mask the open flags while testing for
non-RO access. This is at least encouraged by POSIX, which suggests
that implementations encode the flags so that
"O_RDONLY | O_WRONLY == O_RDWR".
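
For reference, a sketch of the kind of check being described. The function name is hypothetical. This version masks with `O_ACCMODE`, the access-mode mask POSIX defines, rather than with `O_RDWR`. That is a deliberate substitution: on platforms where `O_RDONLY` is 0, `O_WRONLY` is 1 and `O_RDWR` is 2 (Linux, for example), `O_WRONLY & O_RDWR` evaluates to 0 and would pass as read-only.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Hypothetical check for a read-only filesystem's open handler:
 * returns 0 for read-only opens and -EROFS for anything that asks
 * for write access. */
int check_open_flags_sketch(int flags)
{
    if ((flags & O_ACCMODE) != O_RDONLY) {
        return -EROFS;   /* "Read-only file system" instead of EACCES */
    }
    return 0;
}
```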