path: root/src/recover.c
Age | Commit message | Author | Files
2023-01-22 | Update copyright years | Sergey Poznyakoff | 1
2022-01-06 | Speed up flushing the bucket cache on disk | Sergey Poznyakoff | 1
    The implementation of _gdbm_cache_flush becomes prohibitively
    inefficient during extensive updates of large databases. The bug
    was reported at https://github.com/Perl/perl5/issues/19306.

    To fix it, make sure that all changed cache entries are placed at
    the head of the cache_mru list, forming a contiguous sequence. This
    way a potentially long iteration over all cache entries can be cut
    off at the first entry with ca_changed == FALSE.

    This commit also gets rid of several superfluous fields in struct
    gdbm_file_info:

    - cache_entry
        Not needed, because the most recently used cache entry
        (cache_mru) is always the current one.
    - bucket_changed
        dbf->cache_mru->ca_changed reflects the status of the current
        bucket.
    - second_changed
        Not needed because _gdbm_cache_flush, which flushes all changed
        buckets, is now invoked unconditionally by _gdbm_end_update
        (and also whenever dbf->cache_mru changes).

    * src/gdbmdefs.h (struct gdbm_file_info): Remove cache_entry. The
      current cache entry is cache_mru. Remove bucket_changed, and
      second_changed. All uses changed.
    * src/proto.h (_gdbm_current_bucket_changed): New inline function.
    * src/bucket.c (_gdbm_cache_flush): Assume all changed elements
      form a contiguous sequence beginning with dbf->cache_mru.
      (set_cache_entry): Remove. All callers changed.
      (lru_link_elem,lru_unlink_elem): Update dbf->bucket as necessary.
      (cache_lookup): If the obtained bucket is not changed and is
      going to become current, flush all changed cache elements.
    * src/update.c (_gdbm_end_update): Call _gdbm_cache_flush
      unconditionally.
    * src/findkey.c: Use dbf->cache_mru instead of the removed
      dbf->cache_entry.
    * src/gdbmseq.c: Likewise.
    * tools/gdbmshell.c (_gdbm_print_bucket_cache): Likewise.
    * src/falloc.c: Use _gdbm_current_bucket_changed to mark the
      current bucket as changed.
    * src/gdbmstore.c: Likewise.
    * src/gdbmdelete.c: Likewise. Use _gdbm_current_bucket_changed.
    * tests/gtcacheopt.c: Fix typo.
    * tests/gtload.c: New option: -cachesize
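    The flush strategy described in this commit message can be
    illustrated with a minimal sketch. It is not the code from
    src/bucket.c: the struct layout and the helper names (cache_unlink,
    cache_link_head, cache_mark_changed, cache_flush) are hypothetical,
    and only the ca_changed flag and the MRU-ordered list are taken
    from the message. The sketch assumes that an entry is moved to the
    head of the list whenever it is marked as changed, so the changed
    entries always form a contiguous prefix and the flush loop can stop
    at the first unchanged one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified cache element: only the fields needed here. */
struct cache_elem
{
  int ca_changed;               /* nonzero if the bucket must be written */
  struct cache_elem *ca_prev;   /* neighbour towards the MRU end */
  struct cache_elem *ca_next;   /* neighbour towards the LRU end */
};

struct cache
{
  struct cache_elem *mru;       /* head: most recently used */
  struct cache_elem *lru;       /* tail: least recently used */
};

/* Remove ELEM from the list. */
static void
cache_unlink (struct cache *c, struct cache_elem *elem)
{
  if (elem->ca_prev)
    elem->ca_prev->ca_next = elem->ca_next;
  else
    c->mru = elem->ca_next;
  if (elem->ca_next)
    elem->ca_next->ca_prev = elem->ca_prev;
  else
    c->lru = elem->ca_prev;
  elem->ca_prev = elem->ca_next = NULL;
}

/* Insert ELEM at the MRU end of the list. */
static void
cache_link_head (struct cache *c, struct cache_elem *elem)
{
  elem->ca_prev = NULL;
  elem->ca_next = c->mru;
  if (c->mru)
    c->mru->ca_prev = elem;
  else
    c->lru = elem;
  c->mru = elem;
}

/* Mark ELEM as changed.  Moving it to the head first keeps all changed
   elements in one contiguous run starting at c->mru. */
static void
cache_mark_changed (struct cache *c, struct cache_elem *elem)
{
  assert (elem != NULL);
  if (c->mru != elem)
    {
      cache_unlink (c, elem);
      cache_link_head (c, elem);
    }
  elem->ca_changed = 1;
}

/* Flush changed elements, stopping at the first unchanged one.
   Returns the number of elements flushed. */
static int
cache_flush (struct cache *c)
{
  int written = 0;
  struct cache_elem *elem;

  for (elem = c->mru; elem && elem->ca_changed; elem = elem->ca_next)
    {
      /* A real implementation would write the bucket to disk here. */
      elem->ca_changed = 0;
      written++;
    }
  return written;
}
```

    With this ordering the cost of a flush is proportional to the
    number of changed entries rather than to the cache size.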
2022-01-02 | Update copyright years | Sergey Poznyakoff | 1
2021-11-14 | Switch to hash table cache implementation | Sergey Poznyakoff | 1
    * src/cachetree.c: Remove.
    * src/Makefile.am: Remove cachetree.c
    * doc/gdbm.texi: Document the changes.
    * src/bucket.c (cache_tab_lookup_slot)
      (cache_tab_resize): New functions.
      (cache_elem_new): Initialize ca_coll.
      (cache_elem_free, cache_lookup)
      (_gdbm_cache_init,_gdbm_cache_free): Rewrite with hash-based
      cache lookup.
      (_gdbm_fetch_data): Remove unused function.
    * src/gdbm.h.in (GDBM_GETDBFORMAT, GDBM_GETDIRDEPTH)
      (GDBM_GETBUCKETSIZE, GDBM_GETCACHEAUTO, GDBM_SETCACHEAUTO): New
      option codes.
    * src/gdbmdefs.h (cache_node): Remove.
      (cache_elem): Remove ca_node. Add ca_coll (collision resolution
      pointer).
      (gdbm_file_info): New members: cache_auto, cache_bits, cache.
    * src/gdbmopen.c (gdbm_fd_open): Change cache initialization.
    * src/gdbmsetopt.c (GDBM_GETDBFORMAT,GDBM_GETDIRDEPTH)
      (GDBM_GETBUCKETSIZE,GDBM_GETCACHEAUTO)
      (GDBM_SETCACHEAUTO): Implement new options.
      (setopt_gdbm_getflags): Reflect the state of GDBM_CLOEXEC and
      GDBM_NUMSYNC.
    * src/proto.h (_gdbm_fetch_data,_gdbm_cache_tree_alloc)
      (_gdbm_cache_tree_destroy,_gdbm_cache_tree_delete)
      (_gdbm_cache_tree_lookup): Remove protos.
    * src/recover.c (_gdbm_finish_transfer): Restore original cache
      settings.
    * tests/Makefile.am: Add new test.
    * tests/testsuite.at: Likewise.
    * tests/gtcacheopt.c: New file.
    * tests/setopt02.at: New test case.
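    A hash-based lookup with a per-element collision pointer, as the
    ca_coll member suggests, might look roughly like the sketch below.
    This is an illustration under assumptions, not the code from
    src/bucket.c: the hash function, the struct cache_tab type and the
    names adr_hash, cache_tab_lookup and cache_tab_insert are
    hypothetical, and int64_t stands in for off_t. Only ca_coll and the
    idea of a table sized by a number of bits (cache_bits) come from
    the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified cache element. */
struct cache_elem
{
  int64_t ca_adr;               /* bucket address in the file */
  struct cache_elem *ca_coll;   /* next element in the same hash slot */
};

/* Hypothetical table: 1 << bits slots, each heading a collision chain. */
struct cache_tab
{
  unsigned bits;
  struct cache_elem **slot;
};

/* Assumed hash: multiplicative hashing on the low 32 bits of the
   address, keeping the top BITS bits of the product. */
static size_t
adr_hash (int64_t adr, unsigned bits)
{
  assert (bits > 0 && bits < 32);
  return (size_t) (((uint32_t) adr * 2654435761u) >> (32 - bits));
}

/* Return the element caching the bucket at ADR, or NULL if absent. */
static struct cache_elem *
cache_tab_lookup (struct cache_tab *tab, int64_t adr)
{
  struct cache_elem *p = tab->slot[adr_hash (adr, tab->bits)];

  while (p && p->ca_adr != adr)
    p = p->ca_coll;
  return p;
}

/* Put ELEM at the front of the collision chain of its slot. */
static void
cache_tab_insert (struct cache_tab *tab, struct cache_elem *elem)
{
  struct cache_elem **head = &tab->slot[adr_hash (elem->ca_adr, tab->bits)];

  elem->ca_coll = *head;
  *head = elem;
}
```

    Chaining through a pointer stored in the element itself avoids a
    separate node allocation per cached bucket, which is one plausible
    reason for replacing ca_node with ca_coll.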
2021-10-18 | gdbm_recover does not disable crash tolerance | Sergey Poznyakoff | 1
    * src/recover.c (_gdbm_finish_transfer): Remove call to
      _gdbmsync_done.
    * doc/gdbm.texi: Reflect the changes.
2021-08-11 | Fix duplicated mmap in gdbm_recover | Sergey Poznyakoff | 1
    * src/recover.c (_gdbm_finish_transfer): Reuse memory mapping from
      the intermediate dbm structure.
2021-08-10 | Bugfixes | Sergey Poznyakoff | 1
    * src/gdbmdefs.h (SAVE_ERRNO): Preserve both gdbm_errno and errno.
    * src/recover.c (_gdbm_finish_transfer): Transfer all cache fields
      (cache_mru was missing).
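    The idea behind a macro that preserves two error variables around a
    piece of cleanup code can be sketched as follows. This is an
    assumption-laden illustration rather than the definition from
    src/gdbmdefs.h: db_errno is a hypothetical stand-in for gdbm_errno,
    and noisy_cleanup and failing_operation are invented for the
    example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the library's own error variable. */
static int db_errno;

/* Run CODE, then restore both error variables to the values they had
   before CODE was executed. */
#define SAVE_ERRNO(code)                \
  do                                    \
    {                                   \
      int saved_db = db_errno;          \
      int saved_sys = errno;            \
      code;                             \
      errno = saved_sys;                \
      db_errno = saved_db;              \
    }                                   \
  while (0)

/* A cleanup step that clobbers both error variables. */
static void
noisy_cleanup (void)
{
  errno = ENOENT;
  db_errno = 99;
}

/* Fail with EIO and library code 3, clean up, and report the failure
   without losing the original error codes. */
static int
failing_operation (void)
{
  errno = EIO;
  db_errno = 3;
  SAVE_ERRNO (noisy_cleanup ());
  assert (errno == EIO);
  return -1;
}
```

    Saving only one of the two variables would let the cleanup code
    overwrite the other, so the caller could see a mismatched pair of
    error codes.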
2021-08-02 | Fix gdbm_recover | Sergey Poznyakoff | 1
    * src/recover.c (_gdbm_finish_transfer): Close snapshot
      descriptors, if any. Restore xheader, avail, and avail_size
      members.
2021-03-17 | Follow-up fixes to fd5cf245ea. | Sergey Poznyakoff | 1
    These address https://puszcza.gnu.org.ua/bugs/?503

    * src/gdbmdefs.h (gdbm_avail_block_valid_p): Remove.
    * src/gdbmopen.c (gdbm_avail_block_validate): Use inline
      conditional instead of gdbm_avail_block_valid_p.
      (gdbm_fd_open): Revert to reading master avail_block in two
      passes (as was before fd5cf245ea).
      (validate_header): Add back master avail block consistency check.
    * src/gdbmtool.c (_gdbm_avail_list_size): Use
      _gdbm_avail_block_read.
    * src/recover.c (_gdbm_finish_transfer): Reset dbf->file_size.
2021-03-16 | Fix memory leak in gdbm_recover | Sergey Poznyakoff | 1
    * src/recover.c (_gdbm_finish_transfer): Free the cache.
2021-01-02 | Update copyright years | Sergey Poznyakoff | 1
2020-01-27 | Update copyright years | Sergey Poznyakoff | 1
2019-11-12 | Rewrite bucket cache | Sergey Poznyakoff | 1
    The new bucket cache uses the least recently used replacement
    policy (instead of the least recently read, implemented
    previously). It also allows for quick bucket lookups by the
    corresponding disk address. To this effect the cache entries form a
    red-black tree sorted by bucket address. Additionally, data buckets
    are also cached.

    * README: Describe the new branch.
    * src/bucket.c: Rewrite cache support.
    * src/cachetree.c: New file.
    * src/Makefile.am: Add new file.
    * src/findkey.c (_gdbm_read_entry): Use _gdbm_fetch_data. This
      ensures data pages are cached as well as buckets.
    * src/gdbm.h.in (GDBM_BUCKET_CACHE_CORRUPTED): New error code.
      (gdbm_cache_stat): New struct.
      (gdbm_get_cache_stats): New proto.
    * src/gdbmclose.c (gdbm_close): Call _gdbm_cache_free to dispose of
      the cache.
    * src/gdbmdefs.h (cache_elem_color): New data type.
      (cache_elem): New members: ca_left, ca_right, ca_node, and
      ca_hits.
      (cache_tree): New typedef.
      (gdbm_file_info): Remove bucket_cache and last_read. New fields:
      cache_num, cache_tree, cache_mru, cache_lru, cache_avail,
      cache_access_count.
    * src/gdbmerrno.c: Handle GDBM_BUCKET_CACHE_CORRUPTED.
    * src/gdbmopen.c (gdbm_fd_open): Change cache initialization.
      (_gdbm_init_cache, _gdbm_cache_entry_invalidate): Remove.
    * src/gdbmsetopt.c (setopt_gdbm_setcachesize): Cache can be
      re-initialized on the fly.
    * src/gdbmtool.c: Change bucket printing routines.
    * src/proto.h (_gdbm_read_bucket_at): Remove.
      (_gdbm_fetch_data,_gdbm_cache_init,_gdbm_cache_free)
      (_gdbm_cache_flush,_gdbm_cache_elem_new)
      (_gdbm_cache_tree_alloc,_gdbm_cache_tree_destroy)
      (_gdbm_cache_tree_delete,_gdbm_rbt_remove_node)
      (_gdbm_cache_tree_lookup): New protos.
      (_gdbm_init_cache,_gdbm_cache_entry_invalidate): Remove.
    * src/recover.c (_gdbm_finish_transfer): Adapt to the new cache
      structure.
    * src/update.c: Likewise.
    * tests/setopt00.at: Fix second GDBM_SETCACHESIZE test.
2019-04-08 | Update copyright years | Sergey Poznyakoff | 1
2019-04-08 | Preserve locking type during database reorganization | Sergey Poznyakoff | 1
    * src/recover.c (_gdbm_finish_transfer): Preserve locking type.
2018-10-18 | Attempt recovery in case of invalid next_block header field | Sergey Poznyakoff | 1
    * src/gdbmopen.c (validate_header): Return GDBM_NEED_RECOVERY if
      next_block is invalid.
      (_gdbm_validate_header): New function.
      (gdbm_fd_open): Set need_recovery depending on return from
      validate_header.
      (gdbm_open): Bail out on invalid value of GDBM_OPENMASK bits.
    * src/proto.h (_gdbm_validate_header): New proto.
    * src/recover.c (check_db): Re-validate the header.
    * src/gdbmtool.c (export_handler): Fix option processing.
2018-07-02 | Bugfixes | Sergey Poznyakoff | 1
    * src/recover.c (backup_name): Fix memory overwrite.
    * src/gdbmtool.c (recover_handler): New option "force".
2018-05-30 | Namespace cleanup | Sergey Poznyakoff | 1
    Rename:
      __read  to gdbm_file_read
      __write to gdbm_file_write
      __lseek to gdbm_file_seek
      __fsync to gdbm_file_sync
2018-05-30 | Fix memory leaks in handling history (gdbmtool) and in gdbm_recover | Sergey Poznyakoff | 1
2018-05-25 | More database consistency checks | Sergey Poznyakoff | 1
    * NEWS: Update.
    * THANKS: Update.
    * src/bucket.c (_gdbm_get_bucket): Check if directory entry is
      valid. Don't cache invalid buckets.
    * src/gdbm.h.in (GDBM_BAD_DIR_ENTRY): New error code.
    * src/gdbmerrno.c: Likewise.
    * src/gdbmopen.c (validate_header): Compute expected number of
      bucket elements based on the bucket size, not on the block size.
      (_gdbm_init_cache_entry): New function.
    * src/proto.h (_gdbm_init_cache_entry): New proto.
    * src/recover.c (gdbm_recover): Clear error state after return from
      check_db indicating failure.
2018-05-24 | More error checking; improve gdbm_recover | Sergey Poznyakoff | 1
    * Makefile.am (set-dist-date): New rule.
      (dist-hook): Catch FIXMEs in NEWS.
    * NEWS: Updated.
    * src/findkey.c (gdbm_bucket_element_valid_p): New function.
      (_gdbm_read_entry): Validate the retrieved bucket element.
    * src/gdbm.h.in (gdbm_recovery): New member: duplicate_keys.
      (GDBM_BAD_HASH_TABLE): New error code.
    * src/gdbmdefs.h (TYPE_WIDTH,SIGNED_TYPE_MAXIMUM)
      (OFF_T_MAX): New defines.
      (off_t_sum_ok): New function.
      (gdbm_bucket_element_valid_p): New prototype.
    * src/gdbmerrno.c: Support for GDBM_BAD_HASH_TABLE code.
    * src/gdbmtool.c (recover_handler): Fix argument counting. New
      argument 'summary' prints statistics summary at the end of the
      run.
      (export_handler,import_handler): Fix argument counting.
    * src/mmap.c (SUM_FILE_SIZE): Rewrite as inlined function. Add
      error checking.
      (_gdbm_mapped_remap): More error checking.
    * src/recover.c (run_recovery): Don't bail out on
      GDBM_CANNOT_REPLACE.
      (gdbm_recover): Initialize duplicate_keys.
    * src/systems.h: Include limits.h
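    An overflow-safe check for adding two file offsets, which is what
    the names TYPE_WIDTH, SIGNED_TYPE_MAXIMUM, OFF_T_MAX and
    off_t_sum_ok suggest, could be written along these lines. The macro
    and function bodies below are a sketch and may differ from those in
    src/gdbmdefs.h; off_t_add is a hypothetical helper added only to
    show how such a check would be used.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>

/* Number of bits in type T. */
#define TYPE_WIDTH(t) (sizeof (t) * CHAR_BIT)

/* Largest value of signed integer type T, computed without overflowing
   T itself: ((2^(w-2) - 1) * 2 + 1) == 2^(w-1) - 1. */
#define SIGNED_TYPE_MAXIMUM(t) \
  ((t) ((((t) 1 << (TYPE_WIDTH (t) - 2)) - 1) * 2 + 1))

#define OFF_T_MAX SIGNED_TYPE_MAXIMUM (off_t)

/* Return nonzero if A + B can be computed without overflow.  Both
   operands must be non-negative. */
static int
off_t_sum_ok (off_t a, off_t b)
{
  return a >= 0 && b >= 0 && OFF_T_MAX - a >= b;
}

/* Hypothetical helper: store A + B in *RES and return 0, or return -1
   without touching *RES if the sum would overflow. */
static int
off_t_add (off_t a, off_t b, off_t *res)
{
  assert (res != NULL);
  if (!off_t_sum_ok (a, b))
    return -1;
  *res = a + b;
  return 0;
}
```

    The comparison is rearranged as OFF_T_MAX - a >= b so that the test
    itself never performs the addition it is guarding, since signed
    overflow is undefined behavior in C.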
2018-01-01 | Happy GNU Year | Sergey Poznyakoff | 1
2017-01-02 | Happy GNU Year | Sergey Poznyakoff | 1
2016-07-26 | Fix remaining uses of gdbm_set_errno function. | Sergey Poznyakoff | 1
    Use the GDBM_SET_ERRNO and GDBM_SET_ERRNO2 macros to make sure the
    error gets reported in debug output.

    * src/fullio.c (_gdbm_full_read)
      (_gdbm_full_write): Return -1 and set gdbm_errno on error.
    * src/bucket.c: Use GDBM_SET_ERRNO(2?) or GDBM_DEBUG where
      necessary.
    * src/falloc.c: Likewise.
    * src/findkey.c: Likewise.
    * src/gdbmdefs.h: Likewise.
    * src/gdbmopen.c: Likewise.
    * src/gdbmstore.c: Likewise.
    * src/mmap.c: Likewise.
    * src/recover.c: Likewise.
    * src/update.c: Likewise.
2016-07-20 | Introduce debug hooks. | Sergey Poznyakoff | 1
    * configure.ac: New option --enable-debug. Print feature summary
      at the end of the run.
    * src/debug.c: New file.
    * src/Makefile.am [GDBM_COND_DEBUG_ENABLE]: Build debug.o. Define
      GDBM_DEBUG_ENABLE.
    * src/gdbmdefs.h [GDBM_DEBUG_ENABLE] (_gdbm_debug_hook_install)
      (_gdbm_debug_hook_remove,_gdbm_debug_hook_check)
      (_gdbm_debug_hook_val): New protos.
      (GDBM_DEBUG_HOOK, GDBM_DEBUG_OVERRIDE)
      (GDBM_DEBUG_ALLOC): New defines.
    * src/gdbm.h.in (GDBM_RCVR_FORCE): New flag.
    * src/recover.c (gdbm_recover): Check database before attempting
      recovery, unless GDBM_RCVR_FORCE flag is set.
    * doc/gdbm.texi: Document GDBM_RCVR_FORCE
    * src/gdbmreorg.c (gdbm_reorganize): Use GDBM_RCVR_FORCE.
    * src/gdbmtool.c (main): Always allocate file_name.
    * src/bucket.c: Put GDBM_DEBUG_OVERRIDE and GDBM_DEBUG_ALLOC in
      critical places.
    * src/falloc.c: Likewise.
    * src/findkey.c: Likewise.
    * src/gdbmopen.c: Likewise.
    * src/gdbmstore.c: Likewise.
    * src/update.c: Likewise.
    * tests/Makefile.am [GDBM_COND_DEBUG_ENABLE]: Define
      GDBM_DEBUG_ENABLE.
    * tests/gtload.c: New options -hook, -recover, -verbose, -backup,
      -max-failures, -max-failed-keys, and -max-failed-buckets.
      Attempt recovery after errors.
2016-07-19 | Implement gdbm_recover function | Sergey Poznyakoff | 1
    * configure.ac: Don't check for rename.
    * src/Makefile.am (libgdbm_la_SOURCES): Add recover.c
    * src/recover.c: New file.
    * src/bucket.c (_gdbm_get_bucket): Remove extra space before [
    * src/err.c (prerror): Take additional argument.
      (gdbm_perror): Print system errno if necessary.
    * src/gdbm.h.in (GDBM_CLOERROR): New flag.
      (gdbm_fd_open, gdbm_copy_meta): New proto.
      (gdbm_last_syserr,gdbm_db_strerror,gdbm_recover): New proto.
      (gdbm_syserr): New extern.
      (gdbm_recovery): New struct.
      (GDBM_RCVR_DEFAULT,GDBM_RCVR_ERRFUN)
      (GDBM_RCVR_MAX_FAILED_KEYS)
      (GDBM_RCVR_MAX_FAILED_BUCKETS)
      (GDBM_RCVR_MAX_FAILURES)
      (GDBM_RCVR_BACKUP): New flags.
      (GDBM_BACKUP_FAILED): New error code.
    * src/gdbmclose.c (gdbm_close): Work correctly if dbf->desc == -1.
    * src/gdbmcount.c (gdbm_count): Remove spurious sorting. Use
      _gdbm_next_bucket_dir for iterating over the buckets.
    * src/gdbmdefs.h (struct gdbm_file_info)<last_syserror>
      <last_errstr>: New members.
    * src/gdbmerrno.c (gdbm_set_errno): Set last_syserror as well.
      (gdbm_clear_error): Reset last_syserror.
      (gdbm_last_syserr): New function.
      (gdbm_errlist): New entry for GDBM_BACKUP_FAILED.
      (gdbm_db_strerror): New function.
      (gdbm_syserr): New global.
    * src/gdbmload.c (get_parms): Buffer can be NULL.
    * src/gdbmopen.c (gdbm_fd_open): New function.
      (gdbm_open): Rewrite as a wrapper over gdbm_fd_open.
    * src/gdbmreorg.c (gdbm_reorganize): Rewrite as a wrapper over
      gdbm_recover.
    * src/proto.h (_gdbm_next_bucket_dir): New proto.
    * src/gdbmtool.c: New command: recover.
    * tests/.gitignore: Add gtrecover
    * tests/gtrecover.c: New test program.
    * tests/Makefile.am: Build gtrecover
