The implementation of _gdbm_cache_flush becomes prohibitively
inefficient during extensive updates of large databases. The
bug was reported at https://github.com/Perl/perl5/issues/19306.
To fix it, make sure that all changed cache entries are placed at
the head of the cache_mru list, forming a contiguous sequence.
This way a potentially long iteration over all cache entries can be
cut off at the first entry with ca_changed == FALSE.
This commit also gets rid of several superfluous fields in
struct gdbm_file_info:
- cache_entry
Not needed, because the most recently used cache entry
(cache_mru) is always the current one.
- bucket_changed
dbf->cache_mru->ca_changed reflects the status of the current
bucket.
- second_changed
Not needed because _gdbm_cache_flush, which flushes all changed
buckets, is now invoked unconditionally by _gdbm_end_update (and
also whenever dbf->cache_mru changes).
* src/gdbmdefs.h (struct gdbm_file_info): Remove cache_entry. The
current cache entry is cache_mru.
Remove bucket_changed and second_changed.
All uses changed.
* src/proto.h (_gdbm_current_bucket_changed): New inline function.
* src/bucket.c (_gdbm_cache_flush): Assume all changed elements form
a contiguous sequence beginning with dbf->cache_mru.
(set_cache_entry): Remove. All callers changed.
(lru_link_elem,lru_unlink_elem): Update dbf->bucket as necessary.
(cache_lookup): If the obtained bucket is not changed and is going
to become current, flush all changed cache elements.
* src/update.c (_gdbm_end_update): Call _gdbm_cache_flush unconditionally.
* src/findkey.c: Use dbf->cache_mru instead of the removed dbf->cache_entry.
* src/gdbmseq.c: Likewise.
* tools/gdbmshell.c (_gdbm_print_bucket_cache): Likewise.
* src/falloc.c: Use _gdbm_current_bucket_changed to mark the current
bucket as changed.
* src/gdbmstore.c: Likewise.
* src/gdbmdelete.c: Likewise.
* tests/gtcacheopt.c: Fix typo.
* tests/gtload.c: New option: -cachesize
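
The idea behind the flush optimization can be sketched as follows. This is a minimal illustration, not the GDBM code: struct cache, struct cache_elem and all cache_* functions are hypothetical stand-ins, and the "write" is reduced to a counter. Changed elements are always moved to the head of the MRU list, so the flush loop can stop at the first unchanged element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified cache element (not the GDBM definition). */
struct cache_elem
{
  struct cache_elem *prev, *next;   /* MRU list links */
  bool changed;                     /* plays the role of ca_changed */
  int flush_count;                  /* stands in for the actual write */
};

struct cache
{
  struct cache_elem *mru;           /* head: most recently used */
  struct cache_elem *lru;           /* tail: least recently used */
};

/* Link ELEM, which must not be in the list, at the head. */
static void
cache_link_head (struct cache *c, struct cache_elem *elem)
{
  elem->prev = NULL;
  elem->next = c->mru;
  if (c->mru)
    c->mru->prev = elem;
  else
    c->lru = elem;
  c->mru = elem;
}

/* Detach ELEM, which must be in the list. */
static void
cache_unlink (struct cache *c, struct cache_elem *elem)
{
  if (elem->prev)
    elem->prev->next = elem->next;
  else
    c->mru = elem->next;
  if (elem->next)
    elem->next->prev = elem->prev;
  else
    c->lru = elem->prev;
  elem->prev = elem->next = NULL;
}

/* Flush the changed elements.  They form a contiguous run starting at
   the head, so the loop stops at the first unchanged one.  Returns the
   number of elements flushed. */
static int
cache_flush (struct cache *c)
{
  int n = 0;
  for (struct cache_elem *p = c->mru; p && p->changed; p = p->next)
    {
      p->changed = false;
      p->flush_count++;
      n++;
    }
  return n;
}

/* Add a new element.  Flushing first keeps the invariant even if ELEM
   is unchanged. */
static void
cache_add (struct cache *c, struct cache_elem *elem)
{
  cache_flush (c);
  cache_link_head (c, elem);
}

/* Mark ELEM changed and move it to the head of the list. */
static void
cache_mark_changed (struct cache *c, struct cache_elem *elem)
{
  assert (elem != NULL);
  elem->changed = true;
  cache_unlink (c, elem);
  cache_link_head (c, elem);
}

/* Make ELEM current.  An unchanged element may only become the head
   after all changed elements have been flushed. */
static void
cache_make_current (struct cache *c, struct cache_elem *elem)
{
  if (!elem->changed)
    cache_flush (c);
  cache_unlink (c, elem);
  cache_link_head (c, elem);
}
```

With this invariant the cost of a flush is proportional to the number of changed elements rather than to the cache size.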
|
|
* src/gdbmsync.c (gdbm_numsync_cmp): Properly handle unsigned overflow.
* tests/gtload.c: New option -numsync.
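
One common way to compare counters that may wrap around is shown below. It is only a sketch of the general technique and is not claimed to be what gdbm_numsync_cmp does: counter_cmp and counter_cmp_selfcheck are hypothetical names, and the code assumes 32-bit counters that are less than 2^31 steps apart.

```c
#include <assert.h>
#include <stdint.h>

/* Compare two wrap-around counters.  Returns 0 if A and B are equal,
   1 if A is ahead of B, and -1 if B is ahead of A.
   Assumption: the two values are less than 2^31 steps apart. */
static int
counter_cmp (uint32_t a, uint32_t b)
{
  uint32_t diff = a - b;   /* unsigned subtraction is modulo 2^32 */

  if (diff == 0)
    return 0;
  return diff < UINT32_C (0x80000000) ? 1 : -1;
}

/* A few checks that illustrate the wrap-around case. */
static void
counter_cmp_selfcheck (void)
{
  assert (counter_cmp (5, 3) == 1);
  assert (counter_cmp (3, 5) == -1);
  /* 0 is one step after UINT32_MAX, so it counts as newer. */
  assert (counter_cmp (0, UINT32_MAX) == 1);
  assert (counter_cmp (UINT32_MAX, 0) == -1);
}
```

A plain "a > b" comparison would get the last two cases wrong, which is the kind of error that handling unsigned overflow is meant to avoid.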
|
|
* src/gdbm.h.in (gdbm_close, gdbm_sync): Return int.
(GDBM_FILE_CLOSE_ERROR, GDBM_FILE_SYNC_ERROR): New error codes.
* src/gdbmclose.c (gdbm_close): Return 0 on success, -1 on failure.
Set gdbm_errno and errno.
* src/gdbmsync.c (gdbm_sync): Likewise.
* src/gdbmerrno.c: Handle new error codes.
* src/mmap.c (_gdbm_mapped_sync): Set gdbm_errno.
* src/proto.h (gdbm_file_sync): Set gdbm_errno.
* doc/gdbm.3: Document changes.
* doc/gdbm.texi: Document changes.
* NEWS: Document changes.
* configure.ac: Set patchlevel.
* tests/Makefile.am: Add new test.
* tests/testsuite.at: Add new test.
* tests/closerr.at: New test case.
* tests/closerr.c: New test program.
* tests/gtdel.c: Check gdbm_close return.
* tests/gtdump.c: Likewise.
* tests/gtfetch.c: Likewise.
* tests/gtload.c: Likewise.
* tests/gtopt.c: Likewise.
* tests/gtrecover.c: Likewise.
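
The return convention described above (0 on success, -1 on failure, with both a library error code and errno set) might look like this in outline. The sketch uses hypothetical stand-ins (struct db_file, db_errno, db_close, DB_FILE_CLOSE_ERROR) rather than the GDBM API, and assumes a POSIX system.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-ins; not part of the GDBM API. */
enum db_error { DB_NO_ERROR = 0, DB_FILE_CLOSE_ERROR = 1 };

static enum db_error db_errno;    /* plays the role of gdbm_errno */

struct db_file
{
  int fd;                         /* underlying file descriptor */
};

/* Close DBF and free it.  Return 0 on success, -1 on failure.
   On failure, db_errno is set and errno holds the value left by
   close(2). */
static int
db_close (struct db_file *dbf)
{
  int rc = 0;

  assert (dbf != NULL);
  if (close (dbf->fd) != 0)
    {
      int ec = errno;             /* save it: free() may change errno */
      db_errno = DB_FILE_CLOSE_ERROR;
      free (dbf);
      errno = ec;
      rc = -1;
    }
  else
    free (dbf);
  return rc;
}
```

A caller can then test the result instead of ignoring it, which is what the updated test programs do with gdbm_close.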
|
|
The hooks were introduced as a temporary tool in de7834e9. They have
done their job and are no longer necessary.
|
|
Verify that avail_table is sorted by size and that each element's
size falls within the allowed range.
* src/bucket.c (gdbm_get_bucket): Fix bucket validation. Validate
bucket_avail.
(_gdbm_split_bucket): Check return from _gdbm_free.
* src/falloc.c (adjust_bucket_avail,_gdbm_free): Return error code.
All uses updated.
(pop_avail_block): Fix potential memory leak. Use gdbm_avail_block_validate.
* src/gdbmdefs.h (gdbm_avail_table_valid_p): Change signature.
* src/gdbmopen.c (gdbm_avail_table_valid_p): Traverse the array verifying
address and size of each element.
(gdbm_avail_block_validate)
(gdbm_bucket_avail_table_validate): New functions.
(validate_header): Remove call to gdbm_avail_block_valid_p. Avail_block
is validated later, after it's been loaded.
Bail out if header->next_block does not equal the file size.
(gdbm_fd_open): Validate avail_block.
* src/gdbmstore.c (_gdbm_store): Check return from _gdbm_free. Avoid
endless loop in case of inconsistent h_table.
* src/gdbmtool.c (_gdbm_print_avail_list): Use gdbm_avail_block_validate.
* src/proto.h: Update.
* tests/gtload.c: Improve error diagnostics.
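
A check of the kind described above could be sketched as follows. This is an illustration under assumptions, not the GDBM implementation: avail_table_valid is a hypothetical function, the field names av_size and av_adr are assumed, and the limits min_adr and file_end are supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Simplified avail table element: one free region of the file.
   Field names are an assumption made for this sketch. */
struct avail_elem
{
  int av_size;      /* size of the free region, in bytes */
  off_t av_adr;     /* file offset of the region */
};

/* Return true if TAB (COUNT elements) is sorted by size in
   non-decreasing order and every region lies within
   [MIN_ADR, FILE_END]. */
static bool
avail_table_valid (const struct avail_elem *tab, size_t count,
                   off_t min_adr, off_t file_end)
{
  int prev_size = 0;

  assert (tab != NULL || count == 0);
  for (size_t i = 0; i < count; i++)
    {
      const struct avail_elem *p = &tab[i];

      /* Sizes must not decrease; this also rejects negative sizes. */
      if (p->av_size < prev_size)
        return false;
      /* The region must start inside the allowed range ... */
      if (p->av_adr < min_adr || p->av_adr > file_end)
        return false;
      /* ... and must not extend past the end of the file.  Written
         as a subtraction to avoid overflow in av_adr + av_size. */
      if (p->av_size > file_end - p->av_adr)
        return false;
      prev_size = p->av_size;
    }
  return true;
}
```

Rejecting a malformed table up front means later code can rely on the ordering instead of re-checking it on every allocation.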
|
|
* configure.ac: Fix description wording.
* src/Makefile.am [GDBM_COND_DEBUG_ENABLE]: Don't
define GDBM_DEBUG_ENABLE.
* tests/Makefile.am: Likewise.
* src/debug.c (gdbm_debug_printer)
(gdbm_debug_flags): New globals.
(gdbm_debug_token, gdbm_debug_parse_state)
(gdbm_debug_datum): New functions.
* src/gdbm.h.in [@GDBM_DEBUG_ENABLE@]: Define GDBM_DEBUG_ENABLE.
(gdbm_debug_printer_t): New typedef.
(gdbm_debug_printer, gdbm_debug_flags): New externs.
(GDBM_DEBUG_ERR,GDBM_DEBUG_OPEN)
(GDBM_DEBUG_READ,GDBM_DEBUG_STORE)
(GDBM_DEBUG_LOOKUP,GDBM_DEBUG_ALL): New defines.
(gdbm_debug_token,gdbm_debug_parse_state)
(gdbm_debug_datum): New protos.
* src/gdbmdefs.h (GDBM_DEBUG,GDBM_DEBUG_DATUM): New macros.
* src/findkey.c: Add debugging info.
* src/gdbmfetch.c: Likewise.
* src/gdbmopen.c: Likewise.
* src/gdbmseq.c: Likewise.
* src/gdbmstore.c: Likewise.
* src/gdbmtool.c (open_handler): Allow the use of ~/
(command) <repeat,variadic>: New members.
(run_command): Handle variadic functions.
(run_last_command): New command. In interactive mode,
repeats the last command if it was marked
with repeat=1 (currently, only "next").
New command: "debug".
(all functions): Use terror instead of fprintf(stderr,...).
* src/gdbmtool.h (handler_param) <vararg>: New member.
(run_last_command): New proto.
* src/gram.y: Call run_last_command on empty input.
* tests/gtload.c: New option: -debug=
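
The combination of a global printer function and a flag mask can be sketched like this. All names here (DBG_*, dbg_printer, dbg_flags, DBG, stderr_printer) are hypothetical, and the flag values are invented for the example; the actual GDBM_DEBUG_* constants and macros may be defined differently.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical flag values, chosen for this sketch only. */
#define DBG_ERR    0x01
#define DBG_OPEN   0x02
#define DBG_STORE  0x04
#define DBG_ALL    0xff

typedef void (*dbg_printer_t) (char const *fmt, ...);

static dbg_printer_t dbg_printer;   /* NULL disables all output */
static int dbg_flags;               /* bitmask of enabled categories */
static int dbg_lines;               /* lines printed, for demonstration */

/* Sample printer: format the message to stderr, one line per call. */
static void
stderr_printer (char const *fmt, ...)
{
  va_list ap;

  assert (fmt != NULL);
  va_start (ap, fmt);
  vfprintf (stderr, fmt, ap);
  va_end (ap);
  fputc ('\n', stderr);
  dbg_lines++;
}

/* Emit a message only if a printer is installed and at least one of
   the categories in MASK is enabled. */
#define DBG(mask, ...)                                  \
  do                                                    \
    {                                                   \
      if (dbg_printer && (dbg_flags & (mask)))          \
        dbg_printer (__VA_ARGS__);                      \
    }                                                   \
  while (0)
```

Keeping the test inside the macro means that disabled categories cost a single branch and their arguments are never evaluated.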
|
|
* configure.ac: New option --enable-debug.
Print feature summary at the end of the run.
* src/debug.c: New file.
* src/Makefile.am [GDBM_COND_DEBUG_ENABLE]: Build debug.o
Define GDBM_DEBUG_ENABLE.
* src/gdbmdefs.h [GDBM_DEBUG_ENABLE] (_gdbm_debug_hook_install)
(_gdbm_debug_hook_remove,_gdbm_debug_hook_check)
(_gdbm_debug_hook_val): New protos.
(GDBM_DEBUG_HOOK, GDBM_DEBUG_OVERRIDE)
(GDBM_DEBUG_ALLOC): New defines.
* src/gdbm.h.in (GDBM_RCVR_FORCE): New flag.
* src/recover.c (gdbm_recover): Check database before attempting
recovery, unless GDBM_RCVR_FORCE flag is set.
* doc/gdbm.texi: Document GDBM_RCVR_FORCE
* src/gdbmreorg.c (gdbm_reorganize): Use GDBM_RCVR_FORCE.
* src/gdbmtool.c (main): Always allocate file_name.
* src/bucket.c: Put GDBM_DEBUG_OVERRIDE and GDBM_DEBUG_ALLOC
in critical places.
* src/falloc.c: Likewise.
* src/findkey.c: Likewise.
* src/gdbmopen.c: Likewise.
* src/gdbmstore.c: Likewise.
* src/update.c: Likewise.
* tests/Makefile.am [GDBM_COND_DEBUG_ENABLE]: Define GDBM_DEBUG_ENABLE.
* tests/gtload.c: New options -hook, -recover, -verbose,
-backup, -max-failures, -max-failed-keys,
and -max-failed-buckets.
Attempt recovery after errors.
|
|
* src/gdbm.h.in (GDBM_GETBLOCKSIZE): New option.
* src/gdbmcount.c (gdbm_count): Fix memory leak on
error.
* src/gdbmsetopt.c (gdbm_setopt): Rewrite.
Handle GDBM_GETBLOCKSIZE.
* NEWS: Document GDBM_GETBLOCKSIZE
* doc/gdbm.texi: Likewise.
* tests/gtload.c: New options -bsexact and -verbose.
* tests/Makefile.am: Add new testcases.
* tests/testsuite.at: Likewise.
* tests/blocksize00.at: New testcase.
* tests/blocksize01.at: Likewise.
* tests/blocksize02.at: Likewise.
|
|
* src/datconv.c: New file.
* src/Makefile.am (gdbmtool_SOURCES): Add datconv.c.
* src/gdbmtool.h (slist, kvpair): New structures.
(gdbmarg): Keep various types of data depending on the
value of the type member.
(slist_new, slist_free)
(kvpair_string, kvpair_list): New protos.
(gdbmarg_new): Remove.
(gdbmarg_string, gdbmarg_datum)
(gdbmarg_kvpair, gdbmarg_free)
(gdbmarg_destroy): New protos.
(xd_expand, xd_store, datadef_locate): New protos.
(field, dsegm): New structs.
(dsegm_new, dsegm_new_field, dsegm_free_list): New protos.
* src/gdbmtool.c: Rewrite.
* src/gram.y: Change grammar to allow for defining key and
content structure and for supplying structured data as arguments
to fetch, store and similar functions.
* src/lex.l: Handle new token types.
* tests/dtload.c (main): Fix parser.
* tests/gtload.c: Likewise.
|
|
The new code is more flexible and performs better when
lots of inserts are being made (e.g. when populating the
database with new data).
* src/gdbm.h.in (GDBM_SETMAXMAPSIZE): New constant.
* src/gdbmconst.h (SIZE_T_MAX): New define.
* src/gdbmdefs.h (gdbm_file_info) <cache_size>: Change type
to size_t.
<mmap_inited,mapped_size_max>: New members.
<mapped_remap>: Remove.
* src/gdbmopen.c: Fix a typo.
(gdbm_open): Initialize new members.
(_gdbm_init_cache): Second argument is size_t.
* src/gdbmsetopt.c (gdbm_setopt): Optval argument is void*.
Handle GDBM_SETMAXMAPSIZE.
Improve error checking.
* src/mmap.c (_GDBM_IN_MAPPED_REGION_P): Fix comparison with
the lower bound.
(_GDBM_NEED_REMAP): Return true if mapped_region is NULL.
(SUM_FILE_SIZE): Rewrite.
(_gdbm_mapped_unmap): Don't call msync.
(_gdbm_internal_remap): Take 2 arguments, the second one
giving the new mapped size.
Unmap the region prior to remapping it.
Always pass NULL as the argument to mmap.
(_gdbm_mapped_remap): Rewrite the logic. Change semantics of the
third argument. All uses updated.
(_gdbm_mapped_init): Reflect the above changes.
(_gdbm_mapped_read,_gdbm_mapped_write): Use mmap_inited to decide
whether to use mmap, because mapped_region can be reset to zero
by other functions (namely, _gdbm_mapped_lseek).
Reset mmap_inited to FALSE if _gdbm_mapped_remap fails.
(_gdbm_mapped_lseek): Rewrite offset computations. Invalidate
the mapped region.
* src/proto.h (_gdbm_init_cache): Change prototype.
* src/update.c (write_header, _gdbm_end_update): Remove checks
for dbf->mapped_region.
* tests/gtload.c: Implement the -maxmap option (set maximum
mapped memory size).
* doc/gdbm.texinfo: Document GDBM_SETMAXMAPSIZE.
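
The kind of bounds test that _GDBM_IN_MAPPED_REGION_P performs can be illustrated as below. This is a sketch only: struct map_window and in_mapped_region are hypothetical, and the real macro operates on the gdbm_file_info fields, which may be laid out differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical description of the currently mapped part of a file. */
struct map_window
{
  off_t mapped_off;      /* file offset where the window starts */
  size_t mapped_size;    /* window length in bytes; 0 if unmapped */
};

/* Return true if file offset OFF lies inside the window W.
   Both bounds are checked: the lower bound is inclusive, the upper
   bound is exclusive. */
static bool
in_mapped_region (const struct map_window *w, off_t off)
{
  assert (w != NULL);
  return off >= w->mapped_off
         && (size_t) (off - w->mapped_off) < w->mapped_size;
}
```

Checking the lower bound first also makes the subtraction safe, since the difference is known to be non-negative before it is converted to an unsigned type.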