What's New in v20.1.1

May 26, 2020

This page lists additions and changes in v20.1.1 since v20.1.0.

Warning:

A denial-of-service (DoS) vulnerability is present in CockroachDB v20.1.0 - v20.1.10 due to a bug in protobuf. This is resolved in CockroachDB v20.1.11 and later releases. When upgrading is not an option, users should audit their network configuration to verify that the CockroachDB HTTP port is not available to untrusted clients. We recommend blocking the HTTP port behind a firewall.

For more information, including other affected versions, see Technical Advisory 58932.

Warning:

Cockroach Labs has discovered a bug relating to incremental backups, for CockroachDB v20.1.0 - v20.1.13. If a backup coincides with an in-progress index creation (backfill), RESTORE, or IMPORT, it is possible that a subsequent incremental backup will not include all of the indexed, restored or imported data.

Users are advised to upgrade to v20.1.15 or later, which includes resolutions.

For more information, including other affected versions, see Technical Advisory 63162.

Get future release notes emailed to you:

Downloads

Docker image

icon/buttons/copy
$ docker pull cockroachdb/cockroach:v20.1.1

Backward-incompatible changes

  • The copy of system and crdb_internal tables extracted by cockroach debug zip is now written using the TSV format (inside the .zip file), instead of an ASCII-art table. #48094
  • Updated the textual error and warning messages displayed by cockroach quit. #47692
  • cockroach quit now prints progress details on its standard error stream, even when --logtostderr is not specified. Scripts that wish to ignore this output can redirect the standard error stream. #47692
  • CockroachDB v20.1 introduced an experimental new rule for the --join flag causing it to prefer SRV records, if present in DNS, to look up the peer nodes to join. However, it is also found to cause disruption in in certain deployments. To reduce this disruption and UX surprise, the feature is now gated behind a new command-line flag --experimental-dns-srv which must now be explicitly passed to cockroach start to enable it. #49129
  • Added a new cluster setting, server.shutdown.lease_transfer_wait, that allows you to configure the server shutdown timeout period for transferring range leases to other nodes. Previously, the timeout period was not configurable and was set to 5 seconds, and the phase of server shutdown responsible for range lease transfers would give up after 10000 attempts of transferring replica leases away. The limit of 10000 attempts has been removed, so that now only the maximum duration server.shutdown.lease_transfer_wait applies. #47692

General changes

  • The statement diagnostics bundle now contains a new file, trace-jaeger.json, that can be manually imported in Jaeger for visualization. #47432

Enterprise edition changes

  • Fixed a bug where the job ID of a lagging changefeed would be omitted, and instead it would be reported as sinkless. #48562

SQL language changes

  • The pg_collation, pg_proc, pg_database, and pg_type tables in the pg_catalog database no longer require privileges on any database in order for the data to be visible. #48080, #48765
  • Histogram collection with CREATE STATISTICS is no longer supported on columns with type array. Only row count, distinct count, and null count are collected for array-type columns. #48343
  • ROLLBACK TO SAVEPOINT is no longer permitted after miscellaneous internal errors. #48305
  • Fixed an issue with optimizing subqueries involving set operations that can prevent queries from executing. #48680
  • CockroachDB now correctly reports the type length for the char type. #48642
  • The RowDescription message of the wire-level protocol now contains the table ID and column ID for each column in the result set. These values correspond to pg_attribute.attrelid and pg_attribute.attnum. If a result column does not refer to a simple table or view, these values will be zero, as they were before. The message also contains the type modifier for each column in the result set. This corresponds to pg_attribute.atttypmod. If it is not available, the value is -1, as it was before. #48748, #49087

Command-line changes

  • cockroach debug zip now tries multiple times to retrieve data using SQL if it encounters retry errors and skips over fully decommissioned nodes. It also supports two command-line parameters --nodes and --exclude-nodes. When specified, they control which nodes are inspected when gathering the data. This makes it possible to focus on a group of nodes of interest in a large cluster, or to exclude nodes that cockroach debug zip would have trouble reaching otherwise. Both flags accept a list of individual node IDs or ranges of node IDs, e.g., --nodes=1,10,13-15. #48094
  • Client commands such as cockroach init and cockroach quit now support the --cluster-name and --disable-cluster-name-verification flags in order to support running them on clusters that have been configured to use a cluster name. Previously it was impossible to run such commands against nodes configured with the --cluster-name flag. #48016
  • It is now possible to drain a node without shutting down the process, using the cockroach node drain command. This makes it easier to integrate with service managers and orchestration: it now becomes safe to issue cockroach node drain and then separately stop the service via a process manager or orchestrator. Without this new mode, there is a risk to misconfigure the service manager to auto-restart the node after it shuts down via quit, in a way that's surprising or unwanted. The new command node drain also recognizes the new --drain-wait flag. #47692
  • The time that cockroach quit waits client-side for the node to drain (remove existing clients and push range leases away) is now configurable via the command-line flag --drain-wait. Note that separate server-side timeouts also apply separately. Check the server.shutdown.* cluster settings for details. #47692
  • The commands cockroach quit and cockroach node drain now report a "work remaining" metric on their standard error stream. The value reduces until it reaches 0 to indicate that the graceful shutdown has completed server-side. An operator can now rely on cockroach node drain to obtain confidence of a graceful shutdown prior to terminating the server process. #47692
  • The default value of the parameter --drain-wait for cockroach quit has been increased from 1 minute to 10 minutes, to give more time for nodes with thousands of ranges to migrate their leases away. #47692
  • Added support for list cert with certificates which require --cert-principal-map to pass validation. #48177
  • Added support for the --cert-principal-map flag in the cockroach cert, cockroach sql, cockroach init, and cockroach quit commands. #48177
  • Made --storage-engine sticky (i.e., resolve to the last used engine type when unspecified) even when specified stores are encrypted at rest. #49073

Admin UI changes

  • Fixed a bug where Raft log too large was reported incorrectly for replicas for which the raft log size is not to be trusted. #48286
  • Fixed a bug where a multi-node cluster without localities defined wouldn't be able to render the Network Latency page. #49191
  • Fixed a bug where link to specific problem ranges had an incorrect path. Problem ranges are now linked correctly again. #49188

Bug fixes

  • Fixed a bug where vectorized queries on composite datatypes could sometimes return invalid data. #48463
  • Fixed a bug that could lead to data corruption or data loss if a replica was both the source of a snapshot and was being concurrently removed from the range and certain specific conditions exist inside RocksDB. This scenario is rare, but possible. #48321
  • Fixed a bug where the migration for ongoing schema change jobs would cause the node to panic with an "index out of bounds" error upon encountering a malformed table descriptor with no schema change mutation corresponding to the job to be migrated. #48838
  • Fixed an error where instead of returning a parsing error in queries with count(*) CockroachDB could incorrectly return no output (when the query was executed via row-by-row engine). #47485
  • Fixed a bug where CockroachDB was incorrectly releasing memory used by hash aggregation. #47518
  • cockroach debug zip can now successfully avoid out-of-memory errors when extracting very large system or crdb_internal tables. It will also report an error encountered while writing the end of the output ZIP file. #48094
  • Removed redundant metadata information for subqueries and postqueries in EXPLAIN (VERBOSE) output. #47975
  • TRUNCATE can now run on temporary tables, fixing a bug in v20.1 where temporary tables could not be truncated, resulting in an error unexpected value: nil. #48078
  • Fixed a bug in which (tuple).* was only expanded to the first column in the tuple and the remaining elements were dropped. #48290
  • Fixed case where PARTITION BY and ORDER BY columns in window specifications were losing qualifications when used inside views. #47715
  • CockroachDB will no longer display a severe internal error upon certain privilege check failures via pg_catalog built-in functions. #48242
  • Fixed a bug where a read operation in a transaction with a past savepoint rollback would give an internal error for exceeding the maximum count of results requested #48165
  • The distinction between delete jobs for columns and dependent jobs for deleting indices, views and sequences is now better defined. #48259
  • Fixed incorrect results that could occur when casting negative intervals or timestamps to type decimal. #48345
  • Fixed an error that occurred when statistics collection was explicitly requested on a column with type array. #48343
  • Fixed a nil pointer dereference in Pebble's block cache due to a rare "double free" of a block. #48346
  • Fixed Pebble to properly mark sstables for compaction which contain range tombstones. This matches the behavior when using RocksDB and ensures that space used for temporary storage is reclaimed quickly. #48346
  • Fixed a bug introduced in v20.1 that could cause multiple index GC jobs to be created for the same schema change in rare cases. #47818
  • Fixed a bug where CockroachDB could return an internal error when performing a query with CASE, AND, OR operators in some cases when it was executed via the vectorized engine. #48072
  • Fixed a rare bug where stats were not automatically generated for a new table. #48027
  • Fixed a panic that could occur when SHOW RANGES or SHOW RANGE FOR ROW was called with a virtual table. #48347
  • Made SRV resolution non-fatal for join list records to align with the standard and improve reliability of node startup. #48349
  • Fixed a rare bug causing a range to deadlock and all the writes to the respective range to timeout. #48303
  • Fixed a long-standing bug where HTTP requests would start to fail with error 503 "transport: authentication handshake failed: io: read/write on closed pipe" and never become possible again until the node is restarted. #48456
  • When processing --join, invalid SRV records with port number 0 are now properly ignored. #48527
  • Fixed a bug where SHOW STATISTICS USING JSON contained incorrect single quotes for strings with spaces inside histograms. #48544
  • Fixed a bug where the two settings kv.range_split.by_load_enabled and kv.range_split.load_qps_threshold were incorrectly marked as non-public in the output of SHOW CLUSTER SETTINGS. #48585
  • You can no longer drop databases that contain tables which are currently offline due to IMPORT or RESTORE. Previously dropping a database in this state could lead to a corrupted schema which prevented running backups. #48606
  • Fixed a bug preventing timestamps from being closed which could result in failed follower reads or failure to observe resolved timestamps in changefeeds. #48682
  • Fixed debug encryption-status and the Admin UI display of encryption status when using Pebble. #47995
  • CockroachDB now deletes the partially imported data after an IMPORT fails or is canceled. #48605
  • Fixed a bug where the SHOW CREATE statement would sometimes show a partitioning step for an index that has been dropped. #48768
  • Re-allowed diagnostics.forced_sql_stat_reset.interval, diagnostics.sql_stat_reset.interval and external.graphite.interval to set to their maximum values (24hr, 24hr and 15min respectively). #48760
  • Fixed a bug where CockroachDB could encounter an internal error when a query with LEFT SEMI or LEFT ANTI join was performed via the vectorized execution engine. This is likely to occur only with vectorize=on setting. #48751
  • Fixed a bug where running cockroach dump on tables with interleaved primary keys would erroneously include an extra CREATE UNIQUE INDEX "primary" ... INTERLEAVE IN PARENT statement in the dump output. This made it impossible to reimport dumped data without manual editing. #48776
  • Fixed a bug where running cockroach dump on a table with collated strings would omit the collation clause for the data insertion statements. #48832
  • CockroachDB now properly restores tables that were backed up while they were in the middle of a schema change. #48850
  • Manually writing a NULL value into the system.users table for the hashedPassword column will no longer cause a server crash during user authentication. #48836
  • Fixed a bug where in rare circumstances, CockroachDB may fail to open a store configured to use the Pebble storage engine. #49080
  • Fixed a bug where the storage engine, when configured to use the Pebble storage engine, would return duplicate keys, causing incorrect or inconsistent results. #49080
  • Fixed a bug where columns of a table could not be dropped after a primary key change. #49088
  • Fixed a bug which falsely indicated that kv.closed_timestamp.max_behind_nanos was almost always growing. #48716
  • Fixed a bug where changing the primary key of a table that had partitioned indexes could cause indexes to lose their zone configurations. In particular, the indexes rebuilt as part of a primary key change would keep their partitions but lose the zone configurations attached to those partitions. #48827
  • Fixed costing of lookup join with a limit on top, resulting in better plans in some cases. #49137
  • Fixed a bug where on dropping a database, it would not drop the entry for its public schema in the system.namespace table. #49139
  • SHOW BACKUP SCHEMAS no longer shows table comments as they may be inaccurate. #49130
  • Fixed a memory leak which can affect changefeeds performing scans of large tables. #49161
  • Prevented namespace orphans (manifesting as database "" not found errors) when migrating from v19.2. #49200
  • Fixed a bug that caused query failures when using arrays in window functions. #49238

Performance improvements

  • Disabled the Go runtime block profile by default which results in a small but measurable reduction in CPU usage. The block profile has diminished in utility with the advent of mutex contention profiles and is almost never used during performance investigations. #48153
  • The cleanup job which runs after a primary key change to remove old indexes, which blocks other schema changes from running, now starts immediately after the primary key swap is complete. This reduces the amount of waiting time before subsequent schema changes can run. #47818
  • Histograms used by the optimizer for query planning now have more accurate row counts per histogram bucket, particularly for columns that have many null values. The histograms also have improved cardinality estimates. This results in better plans in some cases. #48626, #48646
  • Fixed a bug that caused a simple schema change to take more than 30s. #48621
  • Queries run via the vectorized execution engine are now processed faster, with most noticeable gains on the queries that output many rows. #48732
  • Reduced time needed to run a backup command when it is built on a lot of previous incremental backups. #48772

Doc updates

Contributors

This release includes 94 merged PRs by 27 authors. We would like to thank the following contributors from the CockroachDB community:

  • Drew Kimball (first-time contributor)

Yes No

Yes No