Commit Graph

1655 Commits

Author SHA1 Message Date
Michael Eischer e89fc2a29d
Merge pull request #3943 from MichaelEischer/find-match-only-valid-ids
ignore filenames which are not IDs when expanding a prefix
2022-09-27 20:56:48 +02:00
Michael Eischer 5d3c5b9e50 restic: ignore filenames which are not IDs when expanding a prefix
Some backends generate additional files for each existing file, e.g.

1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef
1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef.sha256

For some commands this leads to a "multiple IDs with prefix" error when
trying to reference a snapshot.
2022-09-27 20:30:40 +02:00
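The filtering described in this commit can be sketched as follows. This is a minimal illustration, not restic's actual code: the names isValidID and expandPrefix are hypothetical, and it assumes that an ID is exactly 64 hex characters (a hex-encoded SHA-256 hash).

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// idLen is the length of a hex-encoded SHA-256 hash (assumption of this sketch).
const idLen = 64

// isValidID reports whether name looks like a full hex-encoded ID.
// Hypothetical helper, not taken from the restic code base.
func isValidID(name string) bool {
	if len(name) != idLen {
		return false
	}
	_, err := hex.DecodeString(name)
	return err == nil
}

// expandPrefix returns all names that are valid IDs and start with prefix.
// Extra files such as "<id>.sha256" are skipped instead of being counted
// as a second match.
func expandPrefix(names []string, prefix string) []string {
	var matches []string
	for _, name := range names {
		if isValidID(name) && strings.HasPrefix(name, prefix) {
			matches = append(matches, name)
		}
	}
	return matches
}

func main() {
	id := strings.Repeat("1234567890abcdef", 4)
	names := []string{id, id + ".sha256"}
	fmt.Println(expandPrefix(names, "1234"))
}
```

With the two file names from the example above, only the first one is a valid ID, so the prefix no longer appears to be ambiguous.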
Leo R. Lundgren ebe9f2c969 rclone/sftp: Improve handling of ErrDot errors
Restic now yields a more informative error message when exec.ErrDot occurs.
2022-09-25 16:19:03 +02:00
Michael Eischer 34c1a83340 cache: Drop cache entry if it cannot be processed
Failing to process data requested from the cache usually indicates a
problem with the returned data. Assume that the cache entry is somehow
damaged and retry downloading it once.
2022-09-25 11:55:09 +02:00
Michael Eischer aa3b1925b4 cache: Simplify loadFromCacheOrDelegate 2022-09-25 11:35:35 +02:00
Michael Eischer 5c6b6edefe retry index, lock and snapshot loading on hash mismatch 2022-09-25 11:35:35 +02:00
Michael Eischer 822422ef03 retry key loading on hash mismatch 2022-09-25 11:35:35 +02:00
Michael Eischer 78d2312ee9
Merge pull request #3854 from MichaelEischer/sparsefiles
restore: Add support for sparse files
2022-09-24 22:04:02 +02:00
Michael Eischer 19afad8a09 restore: support sparse restores also on windows 2022-09-24 21:39:39 +02:00
Michael Eischer c147422ba5 repository: special case SaveBlob for all zero chunks
Sparse files contain large regions consisting only of zero bytes. Checking
that a blob only contains zeros is possible at over 100GB/s on modern
x86 CPUs. Calculating sha256 hashes is only possible at 500MB/s (or
2GB/s using hardware acceleration). Thus we can speed up the hash
calculation for all zero blobs (which always have length
chunker.MinSize) by checking for zero bytes and then using the
precomputed hash.

The all zeros check is only performed for blobs with the minimal chunk
size, and thus should add no overhead most of the time. For chunks which
are not all zero but have the minimal chunk size, the overhead will be
below 2% based on the above performance numbers.

This allows reading sparse sections of files as fast as the kernel can
return data to us. On my system using BTRFS this resulted in about
4GB/s.
2022-09-24 21:39:39 +02:00
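A rough sketch of the special case described above, under stated assumptions: minSize stands in for chunker.MinSize (the value used here is only illustrative), blobHash is a hypothetical name, and the zero check is the naive loop rather than an optimized one.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// minSize stands in for chunker.MinSize; the value is an assumption
// made for this sketch.
const minSize = 512 * 1024

// zeroHash is the SHA-256 of minSize zero bytes, computed once at startup.
var zeroHash = sha256.Sum256(make([]byte, minSize))

// allZero reports whether p contains only zero bytes (naive version).
func allZero(p []byte) bool {
	for _, b := range p {
		if b != 0 {
			return false
		}
	}
	return true
}

// blobHash returns the SHA-256 of p. The all-zero check only runs for
// blobs of exactly minSize bytes, so other blobs pay no extra cost.
func blobHash(p []byte) [sha256.Size]byte {
	if len(p) == minSize && allZero(p) {
		return zeroHash
	}
	return sha256.Sum256(p)
}

func main() {
	zeros := make([]byte, minSize)
	fmt.Printf("%x\n", blobHash(zeros))
}
```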
Michael Eischer 34fe1362da restorer: move zeroPrefixLen to restic package 2022-09-24 21:39:39 +02:00
Michael Eischer a5ebd5de4b restorer: Fix race condition in partialFile.WriteAt
The restorer can issue multiple calls to WriteAt in parallel. This can
result in unexpected orderings of the Truncate and WriteAt calls and
sometimes too short restored files.
2022-09-24 21:39:39 +02:00
Michael Eischer 5b6a77058a Enable sparseness only conditionally
We can either preallocate storage for a file or sparsify it. This
detects a pack file as sparse if it contains an all zero block or
consists of only one block. As the file sparsification is just an
approximation, hide it behind a `--sparse` parameter.
2022-09-24 21:20:00 +02:00
greatroar 5d4568d393 Write sparse files in restorer
This writes files by using (*os.File).Truncate, which resolves to the
truncate system call on Unix.

Compared to the naive loop,

	for _, b := range p {
		if b != 0 {
			return false
		}
	}

the optimized allZero is about 10× faster:

name       old time/op    new time/op     delta
AllZero-8    1.09ms ± 1%     0.09ms ± 1%    -92.10%  (p=0.000 n=10+10)

name       old speed      new speed       delta
AllZero-8  3.84GB/s ± 1%  48.59GB/s ± 1%  +1166.51%  (p=0.000 n=10+10)
2022-09-24 21:18:48 +02:00
Michael Eischer eb83402d39
Merge pull request #3935 from miles170/master
Only display the message if there were locks to be removed
2022-09-24 20:53:13 +02:00
Michael Eischer ef58ddd7b1
Merge pull request #3923 from MichaelEischer/fix-flaky-cache-test
cache: fix flaky TestFileSaveConcurrent on windows
2022-09-24 20:52:55 +02:00
Michael Eischer 7fc178aaf4 internal/cache: extend description of cache sharing test failure 2022-09-24 13:07:01 +02:00
Miles Liu 1acbda18f8
Only display the message if there were locks to be removed
`restic unlock` now only shows `successfully removed locks` if there were locks to be removed.
In addition, it also reports the number of the removed lock files.
2022-09-24 19:02:24 +08:00
Michael Eischer da1a359c8b
Merge pull request #3927 from MichaelEischer/faster-index-each
Speed up MasterIndex.Each
2022-09-24 12:35:23 +02:00
Michael Eischer 041a51512a
Merge pull request #3780 from jkmw/fix/2578
Remove existing path before restoring a symlink
2022-09-24 12:34:42 +02:00
Michael Eischer 1ebd57247a repository: optimize MasterIndex.Each
Sending data through a channel at very high frequency is extremely
inefficient. Thus use simple callbacks instead of channels.

> name                old time/op  new time/op  delta
> MasterIndexEach-16   6.68s ±24%   0.96s ± 2%  -85.64%  (p=0.008 n=5+5)
2022-09-24 12:21:59 +02:00
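The difference between the two iteration styles can be illustrated with a small sketch. The index type and both methods are hypothetical and much simpler than MasterIndex; the point is only that the callback variant avoids a goroutine and one channel operation per entry.

```go
package main

import "fmt"

// index is a hypothetical stand-in for an index holding many entries.
type index struct {
	blobs []int
}

// eachChan streams every entry through a channel. Each entry costs a
// channel send and receive plus goroutine scheduling.
func (idx *index) eachChan() <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		for _, b := range idx.blobs {
			ch <- b
		}
	}()
	return ch
}

// each calls fn for every entry. Each entry costs a plain function call.
func (idx *index) each(fn func(int)) {
	for _, b := range idx.blobs {
		fn(b)
	}
}

func main() {
	idx := &index{blobs: []int{1, 2, 3}}

	sumChan := 0
	for b := range idx.eachChan() {
		sumChan += b
	}

	sumCallback := 0
	idx.each(func(b int) { sumCallback += b })

	fmt.Println(sumChan, sumCallback)
}
```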
Michael Eischer 825b95e313 repository: add benchmark for MasterIndex.Each 2022-09-24 12:21:59 +02:00
greatroar 1220fe9650 internal/cache: Concurrent use of cache not working on Windows 2022-09-17 19:49:44 +02:00
Jerome Küttner ef618bdd3f use os.Remove if path already exists on symlink restore 2022-09-14 08:14:31 +02:00
Michael Eischer 8b9778d537
Merge pull request #3900 from MichaelEischer/b2-init-timeout
Add timeout for the initial connection to B2
2022-09-10 23:28:59 +02:00
Michael Eischer 17c27400f8
Merge pull request #3921 from MichaelEischer/filter-cleanup-error-handling
filter: deduplicate error handling for pattern validation
2022-09-10 23:24:50 +02:00
Michael Eischer be9ccc186e
Merge pull request #3875 from MichaelEischer/fix-fuse-context-cancel
mount: Fix input/output errors for canceled syscalls
2022-09-10 23:20:29 +02:00
Michael Eischer 8e0ca80547 filter: deduplicate error handling for pattern validation 2022-09-09 23:12:41 +02:00
Michael Eischer 8b4dd70013 migrate: Report why a migration cannot be applied
Just returning that `Migration upgrade cannot be applied: check failed`
is not too useful when running `migrate upgrade_repo_v2`.
2022-09-03 11:49:31 +02:00
Michael Eischer 6c69f08a7b
Merge pull request #3905 from DRON-666/haspaths-linear
Reduce quadratic time complexity of `Snapshot.HasPaths`
2022-08-30 20:35:56 +02:00
DRON-666 d0f1060df7 Fix quadratic time complexity of Snapshot.HasPaths 2022-08-30 04:38:17 +03:00
Michael Eischer e5b2c4d571 b2: sniff the error that caused init retry loops 2022-08-28 17:46:03 +02:00
Michael Eischer dc2db2de5e b2: cancel connection setup after a minute
If the connection to B2 fails, the library enters an endless loop.
2022-08-28 14:56:17 +02:00
Michael Eischer 7682149c9d repository: cleanup copy connection count check 2022-08-28 11:40:56 +02:00
Michael Eischer b03277ead5 repository: don't hang when copying using a single connection 2022-08-28 11:40:31 +02:00
Fred be6baaec12 Add success callback to the backend 2022-08-27 22:27:15 +02:00
Fred baf58fbaa8 Add unit tests 2022-08-27 22:21:06 +02:00
Fred d629333efe Add function to notify of success after retrying 2022-08-27 22:21:06 +02:00
Michael Eischer 908f7441fe
Merge pull request #3885 from MichaelEischer/delete-fixes
Improve reliability of upload retries and B2 file deletions
2022-08-26 22:30:50 +02:00
Michael Eischer 4c90d91d4d backend: Test that failed uploads are not removed for backends with atomic replace 2022-08-26 21:20:52 +02:00
Michael Eischer cf0a8d7758 sftp: Only connect once for repository creation
This is especially useful if ssh asks for a password or if closing the
initial connection could return an error due to a problematic server
implementation.
2022-08-26 20:50:40 +02:00
Michael Eischer dd7cd5b9b3 fuse: remove unused context parameter 2022-08-26 20:48:48 +02:00
Michael Eischer a0c1ae9f90 mount: Correctly return context.Canceled for interrupted syscalls
bazil/fuse expects us to return context.Canceled to signal that a
syscall was successfully interrupted. Returning a wrapped version of
that error however causes the fuse library to signal an EIO (input/output
error). Thus unwrap context.Canceled errors before returning them.
2022-08-26 20:48:48 +02:00
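The unwrapping described above could look roughly like this. The function names are hypothetical; only errors.Is and context.Canceled are standard library API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// canceledRead simulates an operation that fails because its context was
// canceled and returns the error wrapped with extra information.
func canceledRead() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return fmt.Errorf("read failed: %w", ctx.Err())
}

// unwrapCtxCanceled returns the bare context.Canceled value if err wraps
// it, so that a caller comparing with == still recognizes the error.
// All other errors are returned unchanged.
func unwrapCtxCanceled(err error) error {
	if errors.Is(err, context.Canceled) {
		return context.Canceled
	}
	return err
}

func main() {
	err := canceledRead()
	fmt.Println(err)
	fmt.Println(unwrapCtxCanceled(err))
}
```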
MichaelEischer f7808245aa
Merge pull request #3878 from MichaelEischer/cheaper-cache-load
cache: Just try to open cache entry without calling stat first
2022-08-26 20:33:36 +02:00
MichaelEischer bee15dd555
Merge pull request #3879 from MichaelEischer/mem-optimize
Some random (minor) memory-allocation optimizations
2022-08-26 20:33:02 +02:00
MichaelEischer be90a565cc
Merge pull request #3887 from MichaelEischer/rclone-permanent-error
rclone: Return a permanent error if rclone already exited
2022-08-24 21:19:00 +02:00
Michael Eischer 506d92e87c rclone: Return a permanent error if rclone already exited
rclone can exit early, for example when the connection to rclone is
relayed via ssh: `-o rclone.program='ssh user@example.org
forced-command'`
2022-08-23 22:05:04 +02:00
Michael Eischer 7681a63fdb restic: Cleanup xattr error handling for Solaris
Since version 0.4.8 (https://github.com/pkg/xattr/pull/68), xattr returns
ENOTSUP similar to Linux.
2022-08-23 21:25:15 +02:00
Michael Eischer 623556bab6 b2: Increase list size to maximum
Just request as many files as possible in one call to reduce the number
of network roundtrips.
2022-08-21 11:20:03 +02:00
Michael Eischer de0162ea76 backend/retry: Overwrite failed uploads instead of deleting them
For backends which are able to atomically replace files, we can just
overwrite the old copy, if it is necessary to retry an upload. This has
the benefit of issuing one operation less and might be beneficial if a
backend storage, due to bugs or similar, could mix up the order of the
upload and delete calls.
2022-08-21 11:14:53 +02:00
Michael Eischer fc506f8538 b2: Repeat deleting until all file versions are removed
When hard deleting the latest file version on B2, this uncovers earlier
versions. If an upload required retries, multiple versions might exist
for a file. Thus to reliably delete a file, we have to remove all
versions of it.
2022-08-21 11:11:00 +02:00
Michael Eischer cc4728d287 repository: Do not report ignored packs in EachByPack
Ignored packs were reported as an empty pack by EachByPack. The most
immediate effect of this is that the progress bar for rebuilding the
index reports processing more packs than actually exist.
2022-08-21 10:38:40 +02:00
Michael Eischer 7a992fc794 repository: Reduce buffer reallocations in ForAllIndexes
Previously the buffer was grown incrementally inside `repo.LoadUnpacked`.
But we can do better as we already know how large the index will be.
Allocate a bit more memory to increase the chance that the buffer can be
reused in the future.
2022-08-19 21:13:40 +02:00
Michael Eischer 77b1980d8e repository: MasterIndex.Packs: reduce allocations 2022-08-19 21:10:43 +02:00
Michael Eischer 6ff9517e45 repository: MasterIndex.ListPacks / Index.EachByPack allow earlier GC
Allow earlier garbage collection of some of the intermediate data
structures.
2022-08-19 21:06:33 +02:00
Michael Eischer ce902aac67 cache: Just try to open cache entry without calling stat first
Instead of first checking whether a file is in the repository cache and
then opening it, we can just open the file. This saves one stat call. If
the file is in the cache, everything is fine and otherwise the code
follows its normal fallback path.
2022-08-19 20:59:06 +02:00
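A minimal sketch of the idea, assuming a cache entry is a plain file on disk. The function load and its fallback parameter are hypothetical and do not mirror the real cache code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// load reads a cache entry by trying to read the file directly. There is
// no preceding stat call: a missing file shows up as os.ErrNotExist, in
// which case the fallback (for example a download) is used.
func load(path string, fallback func() ([]byte, error)) ([]byte, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return fallback()
	}
	return data, err
}

func main() {
	data, err := load("/nonexistent-cache-dir/entry", func() ([]byte, error) {
		return []byte("fetched from backend"), nil
	})
	fmt.Println(string(data), err)
}
```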
MichaelEischer 0d9ac78437
Merge pull request #3873 from MichaelEischer/gofmt-comments
gofmt comments
2022-08-19 19:54:30 +02:00
MichaelEischer 7e96a5af62
Merge pull request #3872 from MichaelEischer/fuse-fix
mount: Only remember successful snapshot refreshes
2022-08-19 19:21:29 +02:00
Michael Eischer f414db987d gofmt all files
Apparently the rules for comment formatting have changed with go 1.19.
2022-08-19 19:12:26 +02:00
Michael Eischer 522406b4f0 mount: Only remember successful snapshot refreshes
If the context provided by the fuse library is canceled before the index
was loaded this could lead to missing snapshots.
2022-08-19 19:07:07 +02:00
Michael Eischer af50fe9ac0 mount: Map slashes in tags to underscores
Suggested-by: greatroar <>
2022-08-19 18:17:57 +02:00
Michael Eischer 2ea6c82cf6 comment cleanup
gofmt reformatted the comment
2022-08-18 20:15:38 +02:00
Michael Eischer bb27f7408c forget: Fail test if duration parsing error is missing 2022-08-18 20:14:09 +02:00
Leo R. Lundgren 6f517858e8 forget: Error when invalid unit is given in duration policy 2022-08-10 13:37:26 +02:00
MichaelEischer 9ad3ad5972
Merge pull request #3850 from lbausch/go1.19
Update tests to Go 1.19
2022-08-07 14:56:17 +02:00
MichaelEischer 2930a102de
Merge pull request #3731 from metalsp0rk/feature/min-packsize-flag
Feature: min packsize flag
2022-08-07 14:54:45 +02:00
Michael Eischer f3fdc66b32
restic: Use stable sorting in snapshot policy
sort.Sort is not guaranteed to be stable. Go 1.19 has changed the
sorting algorithm which resulted in changes of the sort order. When
comparing snapshots with identical timestamp but different paths and
tags lists, there is no meaningful order among them. So just keep their
order stable.
2022-08-07 14:10:40 +02:00
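The effect of a stable sort can be shown with a small sketch. The snapshot type is heavily simplified (the timestamp is a plain integer) and sortByTime is a hypothetical name; sort.SliceStable is the standard library function that keeps equal elements in their original order.

```go
package main

import (
	"fmt"
	"sort"
)

// snapshot is a simplified stand-in; Time is a Unix timestamp here.
type snapshot struct {
	Time int64
	Path string
}

// sortByTime orders snapshots from newest to oldest. Snapshots with an
// identical timestamp keep their relative input order, independent of
// the sorting algorithm used by the Go version.
func sortByTime(list []snapshot) {
	sort.SliceStable(list, func(i, j int) bool {
		return list[i].Time > list[j].Time
	})
}

func main() {
	list := []snapshot{
		{Time: 1, Path: "a"},
		{Time: 2, Path: "b"},
		{Time: 1, Path: "c"},
		{Time: 2, Path: "d"},
	}
	sortByTime(list)
	fmt.Println(list)
}
```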
Michael Eischer 0b7291b8b2 mount: Fix parent inode used by snapshots dir 2022-08-07 13:03:32 +02:00
greatroar cfa80e2c6b mount: remove unused inode field from root node 2022-08-07 13:03:26 +02:00
MichaelEischer 74ae76036f
Merge pull request #2913 from aawsome/mount-snapshot-slashes
mount: Make snapshots dir structure customizable
2022-08-07 12:27:59 +02:00
Michael Eischer caa17988a3 fuse: Redesign snapshot dirstruct
Cleanly separate the directory presentation and the snapshot directory
structure. SnapshotsDir now translates the dirStruct into a format
usable by the fuse library and contains only minimal special case rules.
All decisions have moved into SnapshotsDirStructure which now creates a
fully preassembled tree data structure.
2022-08-07 12:13:06 +02:00
Michael Eischer 1ed775e3a8 debug: support roundtripper logging also for release builds
Unlike in debug builds, do not use the eofDetectRoundTripper if
logging is disabled.
2022-08-05 23:49:39 +02:00
Michael Eischer 38becfc436 debug: enable debug support for release builds 2022-08-05 23:49:39 +02:00
Michael Eischer 82c268c917 Remove unused hooks mechanism 2022-08-05 23:49:39 +02:00
Michael Eischer 7266f07c87 repository: StreamPack in parts if there are too large gaps
For large pack sizes we might be only interested in the first and last
blob of a pack file. Thus stream a pack file in multiple parts if the
gaps between requested blobs grow too large.
2022-08-05 23:48:36 +02:00
Michael Eischer 7f3b2be1e8 s3: Disable multipart uploads below 200MB 2022-08-05 23:48:36 +02:00
Michael Eischer 1b076cda97 rename option to --pack-size 2022-08-05 23:47:43 +02:00
Kyle Brennan 1e3f05c3f1 repository: prevent header overfill 2022-08-05 23:47:12 +02:00
Michael Eischer 0a6fa602c8 add option for setting min pack size 2022-08-05 23:47:12 +02:00
Michael Eischer 2db7733ee3 fuse: remove unused MetaDir 2022-08-05 23:46:46 +02:00
Michael Eischer f678f7cb04 fuse: cleanup test 2022-08-05 23:46:46 +02:00
Alexander Weiss 1751afae26 Make snapshots dirs in mount command customizable 2022-08-05 23:46:46 +02:00
Alexander Weiss 57f4003f2f Generalize fuse snapshot dirs implementation
+ allow "/" in tags and snapshot template
2022-08-05 23:46:46 +02:00
Alexander Weiss 696c18e031 Add possibility to set snapshot ID (used in test) 2022-08-05 23:46:46 +02:00
MichaelEischer 04a8ee80fb
Merge pull request #3829 from MichaelEischer/prune-refactor
Split prune into slightly smaller functions
2022-08-05 23:29:52 +02:00
greatroar ad6ac680af internal/restic: Handle EINVAL for xattr on Solaris
Also make the errors a bit less verbose by not prepending the operation,
since pkg/xattr already does that. Old errors looked like

    Listxattr: xattr.list /myfiles/.zfs/snapshot: invalid argument
2022-08-01 12:45:17 +02:00
Michael Eischer 73053674d9 repository: Test fallback to existing blobs 2022-07-30 17:37:07 +02:00
Michael Eischer 623770eebb repository: try to recover from invalid blob while repacking
If a blob that should be kept is invalid, Repack will now try to request
the blob using LoadBlob. Only return an error if that fails.
2022-07-30 17:37:07 +02:00
greatroar 2bdc40e612 Speed up restic init over slow SFTP links
pkg/sftp.Client.MkdirAll(d) does a Stat to determine if d exists and is
a directory, then a recursive call to create the parent, so the calls
for data/?? each take three round trips. Doing a Mkdir first should
eliminate two round trips for 255/256 data directories as well as all
but one of the top-level directories.

Also, we can do all of the calls concurrently. This may reintroduce some
of the Stat calls when multiple goroutines try to create the same
parent, but at the default number of connections, that should not be
much of a problem.
2022-07-30 13:09:08 +02:00
greatroar 23ebec717c Remove stale comments from backend/sftp
The preExec and postExec functions were removed in
0bdb131521 from 2018.
2022-07-30 13:07:25 +02:00
Michael Eischer 4a10ebed15 archiver: reduce memory usage for large files
FutureBlob now uses a Take() method as a more memory-efficient way to
retrieve the future's result. In addition, futures are now collected
while saving the file. As only a limited number of blobs can be queued
for uploading, for a large file nearly all FutureBlobs already have
their result ready, such that the FutureBlob object just consumes
memory.
2022-07-23 14:45:07 +02:00
Michael Eischer b817681a11 archiver: Incrementally serialize tree nodes
That way it is not necessary to keep both the Nodes forming a Tree and
the serialized JSON version in memory.
2022-07-23 14:45:07 +02:00
Michael Eischer c206a101a3 archiver: unify FutureTree/File into futureNode
There is no real difference between the FutureTree and FutureFile
structs. However, differentiating both increases the size of the
FutureNode struct.

The FutureNode struct is now only 16 bytes large on 64bit platforms.
That way it has a very low overhead if the corresponding file/directory
was not processed yet.

There is a special case for nodes that were reused from the parent
snapshot, as a go channel seems to have 96 bytes overhead which would
result in a memory usage regression.
2022-07-23 14:45:07 +02:00
Michael Eischer 32f4997733 archiver: remove unused fileInfo from progress callback 2022-07-23 14:16:23 +02:00
Michael Eischer dcb00fd2d1 archiver: cleanup Saver interface 2022-07-23 14:16:23 +02:00
Michael Eischer 79321a195c archiver: remove dead attribute from FutureNode 2022-07-23 14:16:23 +02:00
Michael Eischer 5a6f2f9fa0 Fix S3 legacy layout migration 2022-07-23 11:19:32 +02:00
Michael Eischer 04e49924fb checker: Fix S3 legacy layout detection 2022-07-23 11:19:32 +02:00
Michael Eischer fcb3ddf181 check: Complain about usage of s3 legacy layout 2022-07-23 11:19:32 +02:00
Michael Eischer 8b8bd4e8ac check: complain about mixed pack files 2022-07-23 11:19:32 +02:00
MichaelEischer 443cc49afd
Merge pull request #3830 from MichaelEischer/cleanup-repo
Extract Load/SaveTree/JSONUnpacked from repository
2022-07-23 10:46:13 +02:00
MichaelEischer 1f5369e072
Merge pull request #3831 from MichaelEischer/move-code
Move code out of the restic package and consolidate backend specific code
2022-07-23 10:33:05 +02:00
Michael Eischer 9729e6d7ef backend: extract readerat from restic package 2022-07-17 15:29:09 +02:00
Michael Eischer c44b21d366 restorer: extract hardlinks index from restic package 2022-07-17 13:45:42 +02:00
Michael Eischer 8c11fc3ec9 crypto: move crypto buffer helpers 2022-07-17 13:42:23 +02:00
Michael Eischer a0cef9f247 limiter: move to internal/backend 2022-07-17 13:40:15 +02:00
Michael Eischer 163ab9c025 mock: move to internal/backend 2022-07-17 13:40:06 +02:00
Michael Eischer 89d3ce852b repository: extract Load/StoreJSONUnpacked
A Load/Store method for each data type is much clearer. As a result the
repository no longer needs a method to load / store json.
2022-07-17 13:22:00 +02:00
Michael Eischer fbcbd5318c repository: extract LoadTree/SaveTree
The repository has no real idea what a Tree is. So these methods never
belonged there.
2022-07-17 13:11:28 +02:00
Michael Eischer 5639c41b6a azure: Strip ? prefix from sas token 2022-07-16 23:55:18 +02:00
Roger Gammans 64a7ec5341 azure: add SAS authentication option 2022-07-16 23:55:18 +02:00
Lorenz Bausch d6e3c7f28e
Wording: change repo to repository 2022-07-08 20:05:35 +02:00
Michael Eischer ce89018902 Fix data race in blob_saver
After the `BlobSaver` job is submitted, the buffer can be released and
reused by another `FileSaver` even before `BlobSaver.Save` returns. That
FileSaver will then change `buf.Data`, leading to wrong backup statistics.

Found by `go test -race ./...`:

WARNING: DATA RACE
Write at 0x00c0000784a0 by goroutine 41:
  github.com/restic/restic/internal/archiver.(*FileSaver).saveFile()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:176 +0x789
  github.com/restic/restic/internal/archiver.(*FileSaver).worker()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:242 +0x2af
  github.com/restic/restic/internal/archiver.NewFileSaver.func2()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:88 +0x5d
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      /home/michael/go/pkg/mod/golang.org/x/sync@v0.0.0-20210220032951-036812b2e83c/errgroup/errgroup.go:57 +0x91

Previous read at 0x00c0000784a0 by goroutine 29:
  github.com/restic/restic/internal/archiver.(*BlobSaver).Save()
      /home/michael/Projekte/restic/restic/internal/archiver/blob_saver.go:57 +0x1dd
  github.com/restic/restic/internal/archiver.(*BlobSaver).Save-fm()
      <autogenerated>:1 +0xac
  github.com/restic/restic/internal/archiver.(*FileSaver).saveFile()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:191 +0x855
  github.com/restic/restic/internal/archiver.(*FileSaver).worker()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:242 +0x2af
  github.com/restic/restic/internal/archiver.NewFileSaver.func2()
      /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:88 +0x5d
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      /home/michael/go/pkg/mod/golang.org/x/sync@v0.0.0-20210220032951-036812b2e83c/errgroup/errgroup.go:57 +0x91
2022-07-03 14:47:53 +02:00
Michael Eischer 6f53ecc1ae adapt workers based on whether an operation is CPU or IO-bound
Use runtime.GOMAXPROCS(0) as worker count for CPU-bound tasks,
repo.Connections() for IO-bound tasks and a combination if a task can be
both. Streaming packs is treated as IO-bound as adding more workers
cannot provide a speedup.

Typical IO-bound tasks are downloading / uploading / deleting files.
Decoding / Encoding / Verifying are usually CPU-bound. Several tasks are
a combination of both, e.g. for combined download and decode functions.
In the latter case add both limits together. As the backends have their
own concurrency limits restic still won't download more than
repo.Connections() files in parallel, but the additional workers can
decode already downloaded data in parallel.
2022-07-03 12:19:26 +02:00
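The rule for choosing the worker count can be sketched like this. The function workerCount and its parameters are hypothetical; connections stands in for the value returned by repo.Connections().

```go
package main

import (
	"fmt"
	"runtime"
)

// workerCount returns the number of workers for a task. CPU-bound work
// gets one worker per usable CPU, IO-bound work one per backend
// connection, and a task that is both gets the sum of the two limits.
func workerCount(cpuBound, ioBound bool, connections int) int {
	n := 0
	if cpuBound {
		n += runtime.GOMAXPROCS(0)
	}
	if ioBound {
		n += connections
	}
	return n
}

func main() {
	const connections = 5 // assumed backend connection limit
	fmt.Println("decode only:      ", workerCount(true, false, connections))
	fmt.Println("download only:    ", workerCount(false, true, connections))
	fmt.Println("download + decode:", workerCount(true, true, connections))
}
```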
Michael Eischer 753e56ee29 repository: Limit to a single pending pack file
Use only a single not completed pack file to keep the number of open and
active pack files low. The main change here is to defer hashing the pack
file to the upload step. This prevents the pack assembly step from becoming
a bottleneck as the only task is now to write data to the temporary pack
file.

The tests are cleaned up to no longer reimplement packer manager
functions.
2022-07-02 22:42:34 +02:00
Michael Eischer fa25d6118e archiver: Reduce tree saver concurrency
A large number of tree savers has no obvious benefit; however, they can
increase the amount of (potentially large) trees kept in memory.
2022-07-02 22:42:34 +02:00
Michael Eischer bba1e81719 archiver: Limit blob saver count to GOMAXPROCS
Now with the asynchronous uploaders there's no more benefit from using
more blob savers than we have CPUs. Thus use just one blob saver for
each CPU we are allowed to use.
2022-07-02 22:42:34 +02:00
Michael Eischer 120ccc8754 repository: Rework blob saving to use an async pack uploader
Previously, SaveAndEncrypt would assemble blobs into packs and either
return immediately if the pack is not yet full or upload the pack file
otherwise. The upload will block the current goroutine until it
finishes.

Now, the upload is done using separate goroutines. This requires changes
to the error handling. As uploads are no longer tied to a SaveAndEncrypt
call, failed uploads are signaled using an errgroup.

To count the uploaded amount of data, the pack header overhead is no
longer returned by `packer.Finalize` but rather by
`packer.HeaderOverhead`. This helper method is necessary to continue
returning the pack header overhead directly to the responsible call to
`repository.SaveBlob`. Without the method this would not be possible,
as packs are finalized asynchronously.
2022-07-02 22:42:34 +02:00
MichaelEischer 3e1de52e0a
Merge pull request #3805 from greatroar/global
cmd/restic, limiter: Move config knowledge to internal packages
2022-07-02 21:56:35 +02:00
MichaelEischer 621023a50b
Merge pull request #3772 from MichaelEischer/fix-mixed-index
rebuild-index: correctly rebuild index for mixed packs
2022-07-02 20:10:02 +02:00
MichaelEischer 90e9c5c4cc
Merge pull request #3729 from MichaelEischer/full-ids-in-check
Include full IDs in check output
2022-07-02 20:09:39 +02:00
Michael Eischer cdaf9b4f26 Don't crash if SecretString is uninitialized 2022-07-02 19:44:28 +02:00
Michael Eischer 5e0f1c3cef check: remove dead code 2022-07-02 19:28:57 +02:00
Michael Eischer 0df022fa6d check: Print full ids
The short ids are not always unique. In addition, recovering from
damages is easier when having the full ids as that makes it easier to
access the corresponding files.
2022-07-02 19:28:57 +02:00
Michael Eischer 04c23fa95d rebuild-index: correctly rebuild index for mixed packs
For mixed packs, data and tree blobs were stored in separate index
entries. This results in warnings from the check command and maybe other
problems.
2022-07-02 19:24:02 +02:00
MichaelEischer bb5f196b09
Merge pull request #3733 from restic/improve-stats
Improve stats
2022-07-02 19:07:31 +02:00
MichaelEischer c16f989d4a
Merge pull request #3470 from MichaelEischer/sanitize-debug-log
Sanitize debug log
2022-07-02 19:00:54 +02:00
Michael Eischer a6e9e08034 Account for pack header overhead at each entry
This will miss the pack header crypto overhead and the length field,
which only amount to a few bytes per pack file.
2022-07-02 18:55:58 +02:00
Alexander Neumann 6c4ceaf1e7 Print number of bytes added to the repo
This includes optional compression and crypto overhead.
2022-07-02 18:55:12 +02:00
Alexander Neumann 99634c0936 Return real size from SaveBlob 2022-07-02 18:55:12 +02:00
MichaelEischer fdc53a9d32
Merge pull request #3787 from MichaelEischer/refactor-repository
repository: (Mostly) index-related cleanups
2022-07-02 18:54:04 +02:00
Michael Eischer 6923353c43 redact swift auth token in debug output 2022-07-02 18:47:35 +02:00
Michael Eischer 5a11d14082 redacted keys/token in backend config debug log 2022-07-02 18:47:35 +02:00
Michael Eischer 0936d864a4 redact http authorization header in debug log output 2022-07-02 18:47:35 +02:00
Michael Eischer ec7c9ce88b drop unused repository.Loader interface 2022-07-02 18:39:59 +02:00
Michael Eischer 2cd7e90ad1 repository: cleanup 2022-07-02 18:39:59 +02:00
Michael Eischer c1a8fa4290 repository: remove unused packIDToIndex field 2022-07-02 18:39:59 +02:00
Michael Eischer e68c3a4e62 repository: simplify CreateIndexFromPacks 2022-07-02 18:39:59 +02:00
Michael Eischer 1974ad7ce2 repository: hide MasterIndex.FinalizeFullIndexes / FinalizeNotFinalIndexes 2022-07-02 18:39:59 +02:00
Michael Eischer ef53ca4a5a repository: remove MasterIndex.All() 2022-07-02 18:39:59 +02:00
Michael Eischer bf81bf0795 repository: Properly set id for finalized index
As MergeFinalIndex and index uploads can occur concurrently, it is
necessary for MergeFinalIndex to check whether the IDs for an index were
already set before merging it. Otherwise, we'd lose the ID of an index
which is set _after_ uploading it.
2022-07-02 18:39:59 +02:00
Michael Eischer e0a7852b8b repository: remove unused (Master)Index.Count 2022-07-02 18:39:58 +02:00
Michael Eischer 8ef2968f28 repository: remove unused index.ListPack 2022-07-02 18:39:12 +02:00
Michael Eischer e4f20dea61 repository: inline index.encode 2022-07-02 18:39:12 +02:00
Michael Eischer fe5a8e137a repository: remove unused index.Store 2022-07-02 18:39:12 +02:00
Michael Eischer 628ae799ca repository: make flushPacks private 2022-07-02 18:39:12 +02:00
Michael Eischer ed8aa15376 repository: add Save method to MasterIndex interface 2022-07-02 18:38:56 +02:00
Michael Eischer a77d5c4d11 repository: index saving belongs into the MasterIndex 2022-07-02 18:38:56 +02:00
greatroar a0fa9c6e9f Revert "restic prune: Merge three loops over the index"
This reverts commit 8bdfcf779f.
Should fix #3809. Also needed to make #3290 apply cleanly.
2022-06-30 15:27:34 +02:00
greatroar 90d2c0502b cmd/restic, limiter: Move config knowledge to internal packages
The GlobalOptions struct now embeds a backend.TransportOptions, so it
doesn't need to construct one in open and create. The upload and
download limits are similarly now a struct in internal/limiter that is
embedded in GlobalOptions.
2022-06-22 18:29:58 +02:00
MichaelEischer bc96879d41
Merge pull request #3785 from MichaelEischer/replace-tomb-usage
Remove usage of tomb package
2022-06-19 14:42:48 +02:00
MichaelEischer 307f14604f
Merge pull request #3795 from greatroar/sema
backend: Move semaphores to a dedicated package
2022-06-18 17:12:01 +02:00
MichaelEischer 19581dbc18
Merge pull request #3786 from greatroar/prune
restic prune: Merge three loops over the index
2022-06-18 16:54:50 +02:00
greatroar 8bdfcf779f restic prune: Merge three loops over the index
There were three loops over the index in restic prune, to find
duplicates, to determine sizes (in pack.Size) and to generate packInfos.
These three are now one loop. This way, prune doesn't need to construct
a set of duplicate blobs, pack.Size doesn't need to contain special
logic for prune's use case (the onlyHdr argument) and pack.Size doesn't
need to construct a map only to have it immediately transformed into a
different map.

Some quick testing on a 160GiB local repo doesn't show running time or
memory use of restic prune --dry-run changing significantly.
2022-06-18 10:40:33 +02:00
greatroar 910d917b71 backend: Move semaphores to a dedicated package
... called backend/sema. I resisted the temptation to call the main
type sema.Phore. Also, semaphores are now passed by value to skip a
level of indirection when using them.
2022-06-18 10:01:58 +02:00
MichaelEischer 2c893fe43c
Merge pull request #3798 from greatroar/errors
all: Move away from pkg/errors, easy cases
2022-06-17 19:01:40 +02:00
greatroar f92ecf13c9 all: Move away from pkg/errors, easy cases
github.com/pkg/errors is no longer getting updates, because Go 1.13
went with the more flexible errors.{As,Is} functions. Use those instead:
errors from pkg/errors already support the Unwrap interface used by 1.13
error handling. Also:

* check for io.EOF with a straight ==. That value should not be wrapped,
  and the chunker (whose error is checked in the cases changed) does not
  wrap it.
* Give custom Error methods pointer receivers, so there's no ambiguity
  when type-switching since the value type will no longer implement error.
* Make restic.ErrAlreadyLocked private, and rename it to
  alreadyLockedError to match the stdlib convention that error type
  names end in Error.
* Same with rest.ErrIsNotExist => rest.notExistError.
* Make s3.Backend.IsAccessDenied a private function.
2022-06-14 08:36:38 +02:00
Jayson Wang f144920ed5 fix handling of maxKeys in SearchKey 2022-06-12 14:19:06 +02:00
Alexander Neumann 1dd4b9b60e
Merge pull request #3788 from greatroar/sftp-posix-rename
backend/sftp: Support atomic rename
2022-06-06 19:39:48 +02:00
Alexander Neumann 07114ccb21
Merge pull request #3789 from greatroar/fix-loadblob
internal/repository: Fix LoadBlob + fuzz test
2022-06-06 19:33:42 +02:00
greatroar c9557b2822 internal/repository: Fix LoadBlob + fuzz test
When given a buf that is big enough for a compressed blob but not its
decompressed contents, the copy at the end of LoadBlob would skip the
last part of the contents.

Fixes #3783.
2022-06-06 17:02:28 +02:00
greatroar fa8f02292e backend/sftp: Support atomic rename
... if the server has posix-rename@openssh.com.
OpenSSH introduced this extension in 2008:
7c29661471
2022-06-06 13:40:42 +02:00
Michael Eischer e002b09d57 archiver: free workers once finished 2022-06-05 15:48:10 +02:00
Michael Eischer 408ac1a0c2 archiver: remove tomb usage 2022-06-05 15:47:52 +02:00
Michael Eischer 5eba1217e7 migrate: Cleanup option to request repository check 2022-06-04 23:45:00 +02:00
MichaelEischer 60ca6b1418
Merge pull request #3774 from greatroar/archiver-pool
archiver: Remove cleanup goroutine from BufferPool
2022-06-04 18:50:24 +02:00
Jerome Küttner 9adaa6e240 Delete existing path before restoring a symlink 2022-06-01 17:26:25 +02:00
greatroar 0db1d11b2e archiver: Remove cleanup goroutine from BufferPool
This isn't doing anything. Channels should get cleaned up by the GC when
the last reference to them disappears, just like all other data
structures. Also inlined BufferPool.Put in Buffer.Release, its only
caller.
2022-05-29 17:09:16 +02:00
Alexander Neumann d2c5843c68
Merge pull request #3704 from MichaelEischer/compression-migrations
Support migration to repository format with compression
2022-05-29 15:52:21 +02:00
Alexander Neumann 78a21bbccf
Merge pull request #3752 from MichaelEischer/fix-dir-sync-errors
local: Ignore additional errors for directory syncing
2022-05-29 12:54:51 +02:00
greatroar dde8e9e296 internal/restic: Custom ID.MarshalJSON
This skips an allocation. internal/archiver benchmarks, Linux/amd64:

name                     old time/op    new time/op    delta
ArchiverSaveFileSmall-8    3.94ms ± 6%    3.91ms ± 6%    ~     (p=0.947 n=20+20)
ArchiverSaveFileLarge-8     304ms ± 3%     301ms ± 4%    ~     (p=0.265 n=18+18)

name                     old speed      new speed      delta
ArchiverSaveFileSmall-8  1.04MB/s ± 6%  1.05MB/s ± 6%    ~     (p=0.803 n=20+20)
ArchiverSaveFileLarge-8   142MB/s ± 3%   143MB/s ± 4%    ~     (p=0.421 n=18+19)

name                     old alloc/op   new alloc/op   delta
ArchiverSaveFileSmall-8    17.9MB ± 0%    17.9MB ± 0%  -0.01%  (p=0.000 n=19+19)
ArchiverSaveFileLarge-8     382MB ± 2%     382MB ± 1%    ~     (p=0.687 n=20+19)

name                     old allocs/op  new allocs/op  delta
ArchiverSaveFileSmall-8       540 ± 1%       528 ± 0%  -2.19%  (p=0.000 n=19+19)
ArchiverSaveFileLarge-8     1.93k ± 3%     1.79k ± 4%  -7.06%  (p=0.000 n=20+20)
2022-05-27 12:26:37 +02:00
MichaelEischer 88a8701fb5
Merge pull request #3734 from lbausch/validate-patterns
Validate exclude patterns
2022-05-14 16:20:15 +02:00
MichaelEischer b2a2e5f727
Merge pull request #3753 from greatroar/indexmap-alloc
repository: Re-tune indexmap allocation strategy
2022-05-14 15:44:08 +02:00
MichaelEischer b52c631bd3
Merge pull request #3754 from greatroar/simplify-hashing
hashing: Fix up comments
2022-05-14 15:33:51 +02:00
Lorenz Bausch 36bd464e8c
Add tests for validating exclude patterns 2022-05-11 22:41:00 +02:00
greatroar 39a335e690 hashing: Fix up comments 2022-05-11 21:36:10 +02:00
greatroar 5141228e0c repository: Re-tune indexmap allocation strategy
fd05037e1a changed the allocation batch
size from 256 to 128 under the assumption that an indexEntry is 60 bytes
on amd64, but it's 64: structs are padded out to a multiple of 8 for
alignment reasons. That means we'd waste no space in malloc even without
the batch allocation, at least on 64-bit machines. While that strategy
cuts the overallocation down dramatically for many small indexes, it also
seems to slow allocation down (Go 1.18, Linux, amd64, -benchtime=2s):

    name                   old time/op    new time/op    delta
    DecodeIndex-8             4.67s ± 5%     4.60s ± 1%      ~     (p=0.953 n=10+5)
    DecodeIndexParallel-8     4.67s ± 3%     4.60s ± 1%      ~     (p=0.953 n=10+5)
    IndexHasUnknown-8        37.8ns ± 8%    36.5ns ±14%      ~     (p=0.841 n=5+5)
    IndexHasKnown-8          38.5ns ±12%    37.7ns ±10%      ~     (p=0.968 n=5+5)
    IndexAlloc-8              615ms ±18%     607ms ± 1%      ~     (p=1.000 n=10+5)
    IndexAllocParallel-8      245ms ±11%     285ms ± 6%   +16.40%  (p=0.001 n=10+5)
    MasterIndexAlloc-8        286ms ± 9%     275ms ± 2%      ~     (p=1.000 n=10+5)
    LoadIndex/v1-8           27.0ms ± 4%    26.8ms ± 1%      ~     (p=0.690 n=5+5)
    LoadIndex/v2-8           22.4ms ± 1%    22.8ms ± 2%    +1.48%  (p=0.016 n=5+5)

    name                   old alloc/op   new alloc/op   delta
    IndexAlloc-8              446MB ± 0%     446MB ± 0%    -0.00%  (p=0.000 n=8+4)
    IndexAllocParallel-8      446MB ± 0%     446MB ± 0%    -0.00%  (p=0.008 n=8+5)
    MasterIndexAlloc-8        213MB ± 0%     159MB ± 0%   -25.47%  (p=0.000 n=10+5)

    name                   old allocs/op  new allocs/op  delta
    IndexAlloc-8               913k ± 0%     2632k ± 0%  +188.19%  (p=0.008 n=5+5)
    IndexAllocParallel-8       913k ± 0%     2632k ± 0%  +188.21%  (p=0.008 n=5+5)
    MasterIndexAlloc-8         318k ± 0%     1172k ± 0%  +267.86%  (p=0.008 n=5+5)

Instead, this patch sets a batch size of 4, which means no space is
wasted by malloc on 64-bit and very little on 32-bit. It still gets very
close to the savings from not allocating in batches, without requiring
special code for bits.UintSize==64. Benchmark results, again for
Linux/amd64:

    name                   old time/op    new time/op    delta
    DecodeIndex-8             4.67s ± 5%     4.83s ± 9%     ~     (p=0.315 n=10+10)
    DecodeIndexParallel-8     4.67s ± 3%     4.68s ± 4%     ~     (p=0.315 n=10+10)
    IndexHasUnknown-8        37.8ns ± 8%    44.5ns ±19%     ~     (p=0.095 n=5+5)
    IndexHasKnown-8          38.5ns ±12%    36.9ns ± 8%     ~     (p=0.690 n=5+5)
    IndexAlloc-8              615ms ±18%     628ms ±18%     ~     (p=0.218 n=10+10)
    IndexAllocParallel-8      245ms ±11%     262ms ± 9%   +7.02%  (p=0.043 n=10+10)
    MasterIndexAlloc-8        286ms ± 9%     287ms ±13%     ~     (p=1.000 n=10+10)
    LoadIndex/v1-8           27.0ms ± 4%    26.8ms ± 0%     ~     (p=1.000 n=5+5)
    LoadIndex/v2-8           22.4ms ± 1%    22.5ms ± 0%     ~     (p=0.056 n=5+5)

    name                   old alloc/op   new alloc/op   delta
    IndexAlloc-8              446MB ± 0%     446MB ± 0%     ~     (p=1.000 n=8+10)
    IndexAllocParallel-8      446MB ± 0%     446MB ± 0%   -0.00%  (p=0.000 n=8+8)
    MasterIndexAlloc-8        213MB ± 0%     160MB ± 0%  -25.02%  (p=0.000 n=10+9)

    name                   old allocs/op  new allocs/op  delta
    IndexAlloc-8               913k ± 0%     1333k ± 0%  +45.94%  (p=0.000 n=8+10)
    IndexAllocParallel-8       913k ± 0%     1333k ± 0%  +45.94%  (p=0.000 n=8+8)
    MasterIndexAlloc-8         318k ± 0%      525k ± 0%  +64.99%  (p=0.000 n=10+10)

The allocation method indexmap.newEntry has also been rewritten in a
form that is a few instructions shorter.
2022-05-11 21:22:14 +02:00
Michael Eischer 48a0d83143 local: Ignore additional errors for directory syncing
Apparently SMB/CIFS on Linux/macOS returns somewhat random errnos when
trying to sync a windows share which does not support calling fsync for
a directory.
2022-05-11 20:37:59 +02:00
MichaelEischer ac36fda155
Merge pull request #3749 from greatroar/simplify-hashing
hashing: Remove io.WriterTo implementation
2022-05-11 20:03:43 +02:00
MichaelEischer df554e5f69
Merge pull request #3748 from greatroar/runworkers
repository: Remove RunWorkers, report ctx.Err()
2022-05-11 19:38:46 +02:00
greatroar 54b8337813 hashing: Remove io.WriterTo implementation
This functionality has gone unused since
4b3dc415ef changed hashing.Reader's only
client to use ioutil.ReadAll on a bufio.Reader wrapping the hashing
Reader.

Reverts bcb852a8d0.
2022-05-10 23:41:18 +02:00
greatroar 2e0f1f5113 repository: Remove RunWorkers, report ctx.Err()
This removes RunWorkers, which had become mere overhead by successive
refactors. It also ensures that each former user of that function
returns any context error that occurs, so failure to complete an
operation is always reported as an error.
2022-05-10 22:26:00 +02:00
MichaelEischer 47c56dea5c
Merge pull request #3746 from greatroar/cache-lstat
cache: Don't Lstat before creating CACHEDIR.TAG
2022-05-10 20:31:15 +02:00
greatroar 2da377c582 cache: Don't Lstat before creating the tag file
The tag file is opened with O_CREATE|O_EXCL and ErrExist is handled, so
we don't need to check for existence first.
2022-05-10 18:52:39 +02:00
Michael Eischer ae7e51382a Fix error on temp file deletion on windows
Apparently it can take a moment between closing a tempfile marked as
DELETE_ON_CLOSE and it actually being deleted. During that time the file
is inaccessible. Thus just skip deleting the temp file on windows.
2022-05-09 22:43:26 +02:00
Michael Eischer c1bbbcd0dc migrate: Allow migrations to request a check run
This is currently only used by upgrade_repo_v2.
2022-05-09 22:31:30 +02:00
Michael Eischer 5815f727ee checker: convert error type to use pointer-receivers 2022-05-09 22:31:30 +02:00
Michael Eischer e36a40db10 upgrade_repo_v2: Use atomic replace for supported backends 2022-05-09 22:31:30 +02:00
Michael Eischer 5406743102 prune: Automatically repack uncompressed trees for repo v2
Tree packs are cached locally at clients and thus benefit a lot from
being compressed. Ensure this be having prune always repack pack files
containing uncompressed trees.
2022-05-09 22:31:30 +02:00
Alexander Neumann c8c0d659ec Add migration to compress all data 2022-05-09 22:31:30 +02:00
Alexander Neumann 8c244214bf Add tests for upgrade migration 2022-05-09 22:31:30 +02:00
Alexander Neumann a5f1d318ac Try to make repo upgrade migration more failsafe 2022-05-09 22:31:30 +02:00
Alexander Neumann 82ed5a3a15 Add repo upgrade migration 2022-05-09 22:31:30 +02:00
Michael Eischer 92816fa966 init: Enable compression support by default 2022-05-09 22:31:30 +02:00
Lorenz Bausch 9fb81c4246
Validate exclude patterns 2022-05-07 21:12:47 +02:00
Lorenz Bausch e7fd200237
Keep original pattern for later use 2022-05-07 21:08:09 +02:00
Michael Eischer cf5cb673fb repository: Use existing method to collect pack ids 2022-04-30 19:14:21 +02:00
Michael Eischer b335cb6285 repository: Refactor index IDs collection 2022-04-30 19:14:21 +02:00
Daniel Gröber f31b4f29c1 Use config file modes to derive new dir/file modes
Fixes #2351
2022-04-30 15:59:51 +02:00
Michael Eischer 4b01b06f2f repository: Test compressed blobs in StreamPack 2022-04-30 11:34:10 +02:00
Michael Eischer bcab548617 pack: slightly expand testing of compressed blobs 2022-04-30 11:34:10 +02:00
Michael Eischer ec2b25565a repository: test uncompressedLength field and index example 2022-04-30 11:34:10 +02:00
Michael Eischer 9ffb8920f1 repository: run blackbox tests using old and new repo version 2022-04-30 11:34:10 +02:00
Michael Eischer abe5935693 repository: unify repository version-specific initialization
Mark the master index as compressed also when initializing a new
repository. This is only relevant for testing.
2022-04-30 11:34:10 +02:00
Alexander Neumann 8776031f96 Leave allocating slices to the decompress code 2022-04-30 11:34:10 +02:00
Alexander Neumann 5eb05a0afe Configure zstd encoder/decoder 2022-04-30 11:34:10 +02:00
Michael Eischer 2f36e044db Cleanup pack header check 2022-04-30 11:34:10 +02:00
Alexander Neumann 8b11b86383 Add option global --compression 2022-04-30 11:34:10 +02:00
Michael Eischer 7132df529e repository: Increase index size for repo version 2
A compressed index is only about one third the size of an uncompressed
one. Thus increase the number of entries in an index to avoid cluttering
the repository with small indexes.
2022-04-30 11:34:10 +02:00
Michael Eischer 66f9048bce repository: Alloc zstd encoder/decoder on demand 2022-04-30 11:34:10 +02:00
Michael Eischer fd05037e1a repository: recalibrate index batch allocation size 2022-04-30 11:34:10 +02:00
Michael Eischer 6fb408d90e repository: implement pack compression 2022-04-30 11:34:10 +02:00
Michael Eischer 362ab06023 init: Add flag to specify created repository version 2022-04-30 10:07:42 +02:00
Michael Eischer 4b957e7373 repository: Implement index/snapshot/lock compression
The config file is not compressed as it should remain readable by older
restic versions such that these can return a proper error.

As the old format for unpacked data does not include a version header,
make use of a trick: The old data is always encoded as JSON. Thus it can
only start with '{' or '['. For any other value the first byte indicates
a versioned format. The version is set to 2 for now. Then the zstd
compressed data follows.
2022-04-30 10:07:42 +02:00
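The first-byte trick from 4b957e7373 can be sketched as follows. The function name is hypothetical, reporting legacy JSON as "version 1" is an assumption made for this illustration, and the zstd decompression step itself is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// splitUnpacked inspects the first byte of unpacked data. Legacy data is
// JSON and therefore starts with '{' or '['; it is returned unchanged and
// reported as version 1 (an assumption of this sketch). Any other first
// byte is read as a format version; for version 2 the remaining bytes are
// the zstd-compressed payload, which this sketch does not decompress.
func splitUnpacked(data []byte) (version int, payload []byte, err error) {
	if len(data) == 0 {
		return 0, nil, errors.New("empty data")
	}
	switch data[0] {
	case '{', '[':
		return 1, data, nil
	case 2:
		return 2, data[1:], nil
	default:
		return 0, nil, fmt.Errorf("unsupported format version %d", data[0])
	}
}

func main() {
	v, payload, err := splitUnpacked([]byte(`{"time":"2022-04-30"}`))
	fmt.Println(v, len(payload), err)

	v, payload, err = splitUnpacked([]byte{2, 0x28, 0xb5, 0x2f, 0xfd})
	fmt.Println(v, len(payload), err)
}
```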
Michael Eischer e597b99b55 repository: Reduce repack workers to prevent deadlock
As repack streams packs these occupy one backend connection. Uploading a
new pack also requires a backend connection. To prevent a deadlock
during repack when reaching the backend connections limit, simply limit
the repackWorker count to always leave one connection for uploading.
2022-04-23 11:28:18 +02:00
Michael Eischer ee627cd832 backend/mem: Actually enforce connection limit
This will allow tests to detect deadlocks related to the connections
limit.
2022-04-23 11:22:00 +02:00
Michael Eischer 4f97492d28 Backend: Expose connections parameter 2022-04-23 11:13:08 +02:00
Alexander Neumann 89ee1cf9ee
Merge pull request #3610 from MichaelEischer/windows-temp-files
Improve handling of temporary files on windows
2022-04-10 20:29:04 +02:00
Alexander Neumann a059ef90f8
Merge pull request #3702 from MichaelEischer/extend-config-error
Print used key name if config fails to load
2022-04-10 20:25:24 +02:00
Michael Eischer 4077a81b34 Add simple test for fs.TempFile on windows 2022-04-09 23:37:58 +02:00
Michael Eischer 9a3f1a9703 Simplify and comment TempFile implementation for windows 2022-04-09 23:37:58 +02:00
Michael Eischer c2aabb2686 Print used key name if config fails to load 2022-04-09 22:38:18 +02:00
MichaelEischer c60a5f00c9
Merge pull request #3675 from ItsMattL/update
Refactor file handling for self-update.
2022-04-09 21:55:56 +02:00
Matt LaPlante 0ba9d4ced7 Refactor file handling for self-update.
* Write new file payload to a temp file before touching the original
binary. Minimizes the possibility of failing mid-write and corrupting
the binary.
* On Windows, move the original binary out to a temp file rather than
removing it as the running binary is locked. Fixes issue #2248.
2022-04-09 21:40:33 +02:00
Alexander Neumann 04e054465a
Merge pull request #3475 from MichaelEischer/local-sftp-conn-limit
Limit concurrent operations for local / sftp backend
2022-04-09 21:33:00 +02:00
Michael Eischer 7b9ae91e04 copy: Load snapshots before indexes 2022-04-09 12:27:25 +02:00
Michael Eischer 9e12159230 Fix O(n) backend list calls in FindFilteredSnapshots
When resolving snapshotIDs in FindFilteredSnapshots either
FindLatestSnapshot or FindSnapshot is called. Both operations issue a
list operation to the backend. When for example passing a long list of
snapshot ids to `forget` this could lead to a large number of list
operations.
2022-04-09 12:26:31 +02:00
Michael Eischer 3d29083e60 copy/find/ls/recover/stats: Memorize snapshot listing before index
These commands filter the snapshots according to some criteria which
essentially requires loading the index before filtering the snapshots.
Thus create a copy of the snapshots list beforehand and use it later on.
2022-04-09 12:26:30 +02:00
Michael Eischer ece06f125e sftp: Limit concurrent backend operations 2022-04-09 12:21:38 +02:00
Michael Eischer cd783358d3 local: Limit concurrent backend operations
Use a limit of 2 similar to the filereader concurrency in the archiver.
2022-04-09 12:21:38 +02:00
Michael Eischer 0b258cc054 backends: clean reader closing 2022-04-09 12:21:38 +02:00
Alex Duchesne 9e34c791c9 Better temp file cleanup on Windows. 2022-04-09 12:00:22 +02:00
Charlie Jiang d9c9415cfd Fix rclone (scoop shim) and sftp issue due to detached console on Windows 2022-04-03 17:53:17 +08:00
greatroar c23c0f7c14 Cast unix.Statfs_t.Type to int64 when checking for btrfs
Fixes #3687. Uses the cast suggested by @MichaelEischer, except that the
constant isn't cast along, because it's untyped and will be converted by
the compiler as necessary.
2022-03-31 22:30:45 +02:00
Alexander Neumann db8a958991
Merge pull request #3683 from MichaelEischer/fix-golangci-lint-warnings
Fix golangci lint warnings
2022-03-29 11:45:10 +02:00
Michael Eischer af31266b7d golangci-lint: replace deprecated golint with revive 2022-03-28 22:33:17 +02:00
Michael Eischer 2f81af6afa bloblru: Fix comment for New function 2022-03-28 22:25:25 +02:00
Michael Eischer 61e179ee78 switch to golang.org/x/term 2022-03-28 22:24:15 +02:00
Michael Eischer c60540b196 add go:build headers everywhere 2022-03-28 22:23:47 +02:00
Michael Eischer fefe9f5c0e pack: Hide more implementation details 2022-03-28 22:12:16 +02:00
Michael Eischer a773cb6527 pack: cleanup header size calculation 2022-03-28 22:09:49 +02:00
Michael Eischer 6408686973 repository: Simplify Blob equality check 2022-03-28 22:09:49 +02:00
Michael Eischer 243698680a crypto: Use helpers for size calculations 2022-03-28 22:09:49 +02:00
Michael Eischer d6db5a1fc2 archiver: Fix test
The test relied on an undocumented side effect of the LoadBlob implementation
2022-03-28 22:09:49 +02:00
Michael Eischer f78bd14e28 repository: Remove pack implementation details from MasterIndex 2022-03-28 22:09:49 +02:00
Michael Eischer dc3d77dacc repository: make saveAndEncrypt private 2022-03-28 22:09:49 +02:00
Michael Eischer 6877e7edbb repository: Rename LoadAndDecrypt to LoadUnpacked
The method is the complement for SaveUnpacked and not for
SaveAndEncrypt. The latter assembles blobs into pack files.
2022-03-28 22:09:49 +02:00
Michael Eischer 2e1613d4c6 errors: Ensure that errors.IsFatal(errors.Fatal("err")) == true
This fixes a few cases where restic output "Fatal: Fatal: [...]"
2022-03-28 22:09:49 +02:00
Michael Eischer 537b4c310a copy: Implement by reusing repack
The repack operation copies all selected blobs from a set of pack files
into new pack files. For prune the source and destination repositories
are identical. To implement copy, just use a different source and
destination repository.
2022-03-26 20:47:15 +01:00
Alexander Neumann 4d5db61bd0
Merge pull request #3484 from MichaelEischer/stream-check-repack
Stream packs in `check --read-data` and during repacking
2022-03-26 20:46:17 +01:00
Alexander Neumann e682f7c0d6 Add tests for StreamPack 2022-03-21 21:15:03 +01:00
Alexander Neumann 0b8b524f12
Merge pull request #3512 from MichaelEischer/cleaner-lock-refresh
Prevent lock refresh from leaving behind lots of stale locks
2022-03-21 11:10:37 +01:00
Alexander Neumann a350625554
Merge pull request #3524 from MichaelEischer/atomic-sftp
sftp: Implement atomic uploads
2022-03-21 11:08:22 +01:00
Alexander Neumann 8388f66c4c
Merge pull request #3668 from greatroar/symlink-size
Report symlink sizes from FUSE mount
2022-03-21 11:02:32 +01:00
Alexander Neumann 0937008648
Merge pull request #3654 from MichaelEischer/limit-huge-tree-streams
Limit number of large tree blobs loaded in parallel by StreamTrees
2022-03-21 11:01:04 +01:00
Michael Eischer cd190bee14 filter: short circuit if no negative patterns 2022-03-20 13:33:08 +01:00
Vincent Bernat 2ee07ded2b filter: ability to use negative patterns
This is quite similar to gitignore. If a pattern is prefixed by an
exclamation mark and matches a file that was previously matched by a
regular pattern, the match is cancelled. Notably, this can be used
with `--exclude-file` to cancel the exclusion of some files.

Like for gitignore, once a directory is excluded, it is not possible
to include files inside the directory. For example, a user wanting to
only keep `*.c` in some directory should not use:

    ~/work
    !~/work/*.c

But:

    ~/work/*
    !~/work/*.c

I didn't write documentation or changelog entry. I would like to get
feedback if this is the right approach for excluding/including files
at will for backups. I use something like this as an exclude file to
backup my home:

    $HOME/**/*
    !$HOME/Documents
    !$HOME/code
    !$HOME/.emacs.d
    !$HOME/games
    # [...]
    node_modules
    *~
    *.o
    *.lo
    *.pyc
    # [...]
    $HOME/code/linux/*
    !$HOME/code/linux/.git
    # [...]

There are some limitations for this change:

 - Patterns are not mixed across methods: patterns from file are
   handled first and if a file is excluded with this method, it's not
   possible to reinclude it with `--exclude !something`.

 - Patterns starting with `!` are now interpreted as a negative
   pattern. I don't think anyone was relying on that.

 - The whole list of patterns is walked for each match. We may
   optimize later by exiting early if we know no pattern is starting
   with `!`.

Fix #233
2022-03-20 13:33:08 +01:00
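The matching rule described in 2ee07ded2b can be sketched with the standard library. This is not restic's filter package: the function name is hypothetical and path.Match stands in for restic's own pattern syntax, so `**` and `$HOME` expansion are not supported here.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// excluded walks the whole pattern list in order. A regular pattern that
// matches marks the name as excluded; a later pattern starting with "!"
// that matches cancels that exclusion again.
func excluded(patterns []string, name string) (bool, error) {
	result := false
	for _, p := range patterns {
		negate := strings.HasPrefix(p, "!")
		p = strings.TrimPrefix(p, "!")

		ok, err := path.Match(p, name)
		if err != nil {
			return false, err
		}
		if ok {
			result = !negate
		}
	}
	return result, nil
}

func main() {
	patterns := []string{"work/*", "!work/*.c"}
	for _, name := range []string{"work/main.c", "work/main.o"} {
		ex, err := excluded(patterns, name)
		fmt.Println(name, "excluded:", ex, "err:", err)
	}
}
```

The sketch evaluates single names only. It does not model directory traversal, which is where the "an excluded directory cannot be re-included" limitation from the commit message comes from.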
Michael Eischer 12606b575f filter: Cleanup variable naming 2022-03-20 13:33:08 +01:00
Michael Eischer 5f145f0c7e filter: introduce pattern struct 2022-03-20 13:33:08 +01:00
Vincent Bernat 13c40d4199 filter: additional tests for filter.List() 2022-03-20 13:33:08 +01:00
Jason Lenz 2b1932a258 Report symlink sizes from FUSE mount for snapshot dir
Fixes #3667.
2022-03-17 22:21:47 -05:00
greatroar fdc738fb70 Report symlink sizes from FUSE mount
Fixes #3667.
2022-03-13 16:48:35 +01:00
Michael Eischer 254c8743fc Limit number of large tree blobs loaded in parallel by StreamTrees
Load tree blobs with more than 50MB only from a single goroutine. Very
large tree blobs with for example 400 MB size can otherwise require
roughly 1GB * streamTreeParallelism memory.
2022-02-19 12:26:09 +01:00
Michael Eischer 27524979e8 restorer: Remove dead code 2022-02-13 11:43:09 +01:00
gum3ng dd30083c2b [#issue 3127] Add xattr support for Solaris 2022-02-13 14:24:37 +05:30
MichaelEischer fb4c5af5c4
Merge pull request #3642 from gco/master
Fix test failures on Solaris
2022-02-12 22:07:14 +01:00
Michael Eischer 5ec312ca06 sftp: Implement atomic uploads
Create a temporary file with a sufficiently random name to essentially
avoid any chance of conflicts. Once the upload has finished remove the
temporary suffix. Interrupted uploads will thus be ignored by restic.
2022-02-12 20:30:49 +01:00
Michael Eischer c6fd13425b remember the refreshed lock file even if removal failed
This ensures that restic won't create lots of new lock files without
deleting them later on.

In some cases a Delete operation on a backend can return a "File does
not exist" error even though the Delete operation succeeded. This can
for example be caused by request retries. This caused restic to forget
about the new lock file and continue trying to remove the old (already
deleted) lock file.
2022-02-12 20:21:58 +01:00
Michael Eischer bba8ba7a5b repository: cancel streampack context after error 2022-02-12 20:18:25 +01:00
Michael Eischer 47554a3428 repository: Fix error handling in repack
When storing a blob fails, this is a fatal error which must not be
retried.
2022-02-12 20:18:25 +01:00
Michael Eischer 4b3dc415ef checker: cleanup header extraction 2022-02-12 20:18:25 +01:00
Michael Eischer 930a00ad54 checker: reuse bufio reader 2022-02-12 20:18:25 +01:00
Michael Eischer 34ebafb8b6 repository: don't crash if blob size is too short 2022-02-12 20:18:25 +01:00
Michael Eischer becebf5d88 repository: remove unused DownloadAndHash 2022-02-12 20:18:25 +01:00
Michael Eischer f1e58e7c7f checker: rewrite ReadData to stream packs 2022-02-12 20:18:25 +01:00
Michael Eischer f40abd92fa restorer: convert to use StreamPack 2022-02-12 20:18:25 +01:00
Michael Eischer f00f690658 repository: stream packs during repacking 2022-02-12 20:18:25 +01:00
Michael Eischer c4a2bfcb39 repository: Add StreamPacks function
The function supports efficiently loading a specified list of blobs from
a single pack in a streaming fashion. That is, there's no need for
temporary files independent of the pack size.
2022-02-12 20:18:25 +01:00
Michael Eischer 153e2ba859 repository: Implement listing blobs per pack file 2022-02-12 20:18:24 +01:00
MichaelEischer cc90f2ba6b
Merge pull request #2816 from greatroar/noatime
Set O_NOATIME flag on Linux
2022-02-07 21:38:31 +01:00
Michael Eischer 6b17a7110c backup: Set O_NOATIME in the right place
The archiver uses FS.OpenFile, where FS is an instance of the FS
interface. This is different from fs.OpenFile, which uses the OpenFile
method provided by the fs package.
2022-02-06 15:00:37 +01:00
greatroar 7080fed7ae Set O_NOATIME flag on Linux
Citing Kerrisk, The Linux Programming Interface:

    The O_NOATIME flag is intended for use by indexing and backup
    programs. Its use can significantly reduce the amount of disk
    activity, because repeated disk seeks back and forth across the
    disk are not required to read the contents of a file and to update
    the last access time in the file’s i-node[.]

restic used to do this, but the functionality was removed along with the
fadvise call in #670.
2022-02-06 15:00:34 +01:00
Michael Eischer 8ae4d86a84 rename snapshot timestamp filter variable 2022-02-05 22:42:38 +01:00
Greg 2e9180638e Fix test failures on Solaris
Add exceptions for symlinks, sticky bits, and device nodes in the same places where the BSDs and/or Darwin have them.
2022-01-25 18:05:56 -08:00
Aneesh Agrawal 058dfc20da Avoid choosing parent snapshot newer than time of current snapshot
Currently, `restic backup` (if a `--parent` is not provided)
will choose the most recent matching snapshot as the parent snapshot.
This makes sense in the usual case,
where we tag the snapshot-being-created with the current time.

However, this doesn't make sense if the user has passed `--time`
and is currently creating a snapshot older than the latest snapshot.
Instead, choose the most recent snapshot
which is not newer than the snapshot-being-created's timestamp,
to avoid any time travel.

Impetus for this change:
I'm using restic for the first time!
I have a number of existing BTRFS snapshots
I am backing up via restic to serve as my initial set of backups.
I initially `restic backup`'d the most recent snapshot to test,
then started backing up each of the other snapshots.
I noticed in `restic cat snapshot <id>` output
that all the remaining snapshots have the most recent as the parent.
2022-01-23 23:55:00 -05:00
rawtaz 7f6fc78f95
Merge pull request #3544 from restic/fix-b2-delete-retry
b2: Successful delete if file does not exist
2022-01-13 23:28:30 +01:00
Charlie Jiang a5b0e0bef4 fix: rclone receiving SIGINT prematurely on Windows causing restic to hang forever
Co-authored-by: greatroar <61184462+greatroar@users.noreply.github.com>
2021-12-28 13:14:46 +08:00
Michael Eischer 50053a85d3 rest: Adjust http2 missing eof test to golang >= 1.17.3, >= 1.16.10
The missing eof with http2 when a response included a content-length
header but no data, has been fixed in golang 1.17.3/1.16.10. Therefore
just drop the canary test and schedule it for removal once go 1.18 is
required as minimum version by restic.
2021-11-13 21:57:30 +01:00
MichaelEischer f1cfb97237
Merge pull request #3514 from phcreery/rclone_timeout
rclone: extend timeout from 60s to 240s
2021-11-07 18:10:34 +01:00
MichaelEischer cb81ee9396
Merge pull request #3474 from kitone/fix-issue-3382
Honor RESTIC_CACHE_DIR environment variable
2021-11-07 17:57:54 +01:00
Michael Eischer b0e64deb27 rclone: Fix timeout calculation 2021-11-07 17:49:33 +01:00
phcreery 43d173b042 rclone: add timeout option and documentation 2021-11-07 17:49:21 +01:00
Michael Eischer 15d7313387 local: Fix fd leak when encountering files directly inside data/ 2021-11-06 19:44:57 +01:00
MichaelEischer 78c7dd53ef
Merge pull request #3526 from greatroar/dump-refactor
Refactor internal/dump + concurrent load/write
2021-11-05 22:38:39 +01:00
kitone 1b23675f21 cache --cleanup should handle directories created by restic check.
Because there is no guarantee that a cleanup of these directories will occur
after the "restic check", we extend the behavior to detect and manage these
specific cache directories and allow their cleanup too.
2021-11-04 15:10:38 +01:00
greatroar c71729dfc4 Refactor internal/dump + concurrent load/write
Package internal/dump has been reworked so its API consists of a single
type Dumper that handles tar and zip formats. Tree loading and node
writing happen concurrently.
2021-11-01 23:01:55 +01:00
Alexander Neumann fb5d9345a7
Merge pull request #3510 from MichaelEischer/fix-archiver-early-on-abort
archiver: Fix TestArchiverAbortEarlyOnError test
2021-10-16 15:37:41 +02:00
kitone 95eb859b54 Honor RESTIC_CACHE_DIR environment variable
Fix #3382: restic check doesn't obey the RESTIC_CACHE_DIR environment variable
2021-10-10 16:00:02 +02:00
Michael Eischer 257740b0cc b2: Successful delete if file does not exist
When deleting a file, B2 sometimes returns a "500 Service Unavailable"
error but nevertheless correctly deletes the file. Due to retries in
the B2 library blazer, we sometimes also see a "400 File not present"
error. The retries of restic for the delete request then fail with
"404 File with such name does not exist.".

As we have to rely on request retries in a distributed system to handle
temporary errors, also consider a delete request to be successful if the
file is reported as not existing. This should be safe as B2 claims to
provide a strongly consistent bucket listing and thus a missing file
shouldn't mysteriously show up again later on.
2021-10-09 23:51:12 +02:00
MichaelEischer f2a3b3b4a1
Merge pull request #3537 from greatroar/dump-lru
Check cap instead of len in bloblru
2021-10-09 20:00:27 +02:00