Release Notes for Couchbase Server 7.2
Release 7.2.3 (November 2023)
Couchbase Server Release 7.2.3 was released in November 2023.
Fixed Security Vulnerabilities
For complete details of fixed security vulnerabilities in Couchbase Enterprise Server, see the Enterprise Security Alerts page.
This release contains:
- Fixes to issues
Fixed Issues
XDCR
| Description | Resolution |
|---|---|
| A DCP nozzle race could leave the gomemcached feed running and leak memory. | The feed is now correctly synchronized and closed. |
Release 7.2.2 (September 2023)
Couchbase Server 7.2.2 was released in September 2023.
Note: Couchbase Server release 7.2.1 is no longer available for download.
This release contains:
- Fixes to issues
Fixed Issues
Analytics Service
| Description | Resolution |
|---|---|
| Some DCP messages were unnecessarily processed multiple times. | DCP messages are now processed only once, regardless of the configuration. |
| Frames that contain only DCP data were unnecessarily processed by the complete ingestion pipeline, resulting in performance degradation. | The complete ingestion pipeline no longer processes frames that contain only DCP data. |
Search Service
| Description | Resolution |
|---|---|
| An issue occurred with node failover. This issue is only pertinent to the 7.2.1 release and does not affect older builds. | During rebalancing, a check is now made for missing partitions. |
Release 7.2.1 (September 2023)
Note: Couchbase Server release 7.2.1 is no longer available for download.
Couchbase Server 7.2.1 was released in September 2023.
This release contains:
- New features
- Enhancements
- A list of future reserved words in an upcoming Server release
- Newly supported platforms
- Fixes to issues
New Features
This release includes the following new features.
XDCR
The following XDCR features are new:
- XDCR replications specified by means of the REST API can now use the filterBinary flag, which specifies whether binary documents should be replicated. Detailed information on the filterBinary flag is provided on the REST reference page, Creating a Replication. A sketch follows this list.
- Using the REST API, node connectivity can now be checked prior to the creation of an XDCR reference. See Checking Connections.
- In Couchbase Server Version 7.2.1 and later, XDCR provides enhanced information on cluster-rebalance status. See Rebalance Information.
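For illustration, a replication that skips binary documents might be created as follows. This is a minimal sketch: the endpoint and common parameters follow the Creating a Replication REST reference, while the host, credentials, and cluster and bucket names are placeholders.

```sh
# Hedged sketch: create an XDCR replication that does not replicate
# binary documents. Host, credentials, and names are placeholders.
curl -X POST -u Administrator:password \
  http://localhost:8091/controller/createReplication \
  -d fromBucket=source-bucket \
  -d toCluster=remote-cluster \
  -d toBucket=target-bucket \
  -d replicationType=continuous \
  -d filterBinary=true
```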
Enhancements
This release includes the following enhancements:
- Bloom filters for the Index Service, previously disabled by default, are now enabled by default. Bloom filters are used by the index storage layer to reduce disk I/O and improve the overall efficiency of the Index Service; enabling them by default reduces disk lookups under insert-heavy workloads. You can disable bloom filters to opt out, as sketched after this list.
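A minimal sketch of opting out by disabling bloom filters on an Index Service node follows. The setting name and the indexer admin port are assumptions based on the usual indexer settings endpoint; verify both against the documentation for your release.

```sh
# Hedged sketch: disable Index Service bloom filters (opt out).
# Port 9102 and the setting name are assumptions; verify before use.
curl -X POST -u Administrator:password \
  http://localhost:9102/settings \
  -d '{"indexer.settings.enable_page_bloom_filter": false}'
```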
Future Reserved Words
To give you time to prepare, the following words will become reserved for features in an upcoming Couchbase Server release:
- SEQUENCE
- CACHE
- RESTART
- MAXVALUE
- MINVALUE
- NEXT
- PREV
- PREVIOUS
- NEXTVAL
- PREVVAL
- CYCLE
- RECURSIVE
- RESTRICT
No action is required for upgrading to Server 7.2.1. If you already use any of these words as identifiers, see the escaping sketch below.
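SQL++ lets you escape an identifier with backticks, so existing fields or keyspaces that share a name with a future reserved word remain usable. A minimal sketch, assuming a hypothetical keyspace named inventory with a field named sequence:

```sh
# Hedged sketch: query a field whose name will become a reserved word,
# escaped with backticks. Keyspace and field names are hypothetical.
curl -su Administrator:password http://localhost:8093/query/service \
  --data-urlencode 'statement=SELECT i.`sequence` FROM inventory AS i'
```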
New Supported Platforms
See Supported Platforms for the complete list of supported platforms.
This release adds support for the following new platforms:
- Alma Linux 9
- Rocky Linux 9
Fixed Issues
This release contains the following fixes.
Analytics Service
| Description | Resolution |
|---|---|
| External collections could not be created using Azure Managed Identity. | Azure dependencies have been updated to correct this issue. |
| Query results could be unnecessarily converted to JSON twice when documents were large. | The query result is now converted to JSON once for all documents. |
| When the Prometheus stats returned from Analytics exceeded four kilobytes, the status code was inadvertently set to 500 (Internal Error), resulting in a large number of warnings in the Analytics warning log. Couchbase Server discarded these statistics. | A 200 (OK) status code is now returned when the size of the Prometheus stats exceeds 4 KiB, allowing the stats to be recorded properly. The warning is no longer displayed. |
Data Service
| Description | Resolution |
|---|---|
| The last item in a replica checkpoint was not expelled. In scenarios such as a large average item size, a high number of replicas, or a low bucket quota, this could result in a Data Service node entering an unrecoverable out-of-memory state. | Item expelling has been enhanced to release all the items in a checkpoint when memory conditions allow. |
| A rollback loop affected legacy clients when collections were used and a tombstone newer than the last mutation in the default collection was purged. | The lastReadSeqno is now incremented when the client is not collection-aware. |
| In rare cases, after a failover or memcached restart, a replica rollback while under memory pressure might have caused a crash in the Data Service. | Memory-pressure recovery logic (item expelling) is now skipped while a replica rollback is in progress. |
| XDCR or restore-from-backup could enter an endless loop when a deleteWithMeta operation attempted to overwrite a document that had been deleted or had expired some time earlier. An unanticipated in-memory state increased CPU usage, and the connection became unusable for further operations. | deleteWithMeta is now resilient to temporarily non-existent values with the xattr datatype. |
| When a .NET SDK client on Windows 10 connected to Couchbase Server with client certificates enabled, the Data Service did not establish a connection, and client bootstrap failed with an OpenSSL "session id context uninitialized" error. | The Data Service has been updated to disable TLS session resumption. |
| GET_META requests for deleted items fetched metadata into memory that was not evicted in value-eviction buckets. | Metadata items are now cleaned up when the expiry pager runs. |
| DCP clients streaming out-of-sequence-order (OSO) backfill snapshots under Magma could observe duplicate documents in the disk snapshot. This happened when a stream was paused and resumed, and the resume point was wrongly set to a key already processed in the stream. | OSO backfill in Magma now sets the correct resume point after a pause. |
| Data Service rebalance duration could be significantly impacted when other DCP clients created a large number of streams that had to be read from disk, because there was no prioritization between rebalance and other DCP clients. | The number of backfills each DCP client can perform concurrently is now limited, allowing fairer allocation of resources. |
| Computing the items-remaining DCP/Checkpoint stats exposed to Prometheus was an O(N) operation, where N is the number of items in a checkpoint. This caused various performance issues, including Prometheus stats timeouts, when checkpoints accumulated a high number of items. | The computation has been optimized and is now O(1). |
| A spurious auto-failover could happen when Magma compaction visited a TTL'd document that was already deleted. | "Document not found" no longer increments the number of read failures. |
Index Service
| Description | Resolution |
|---|---|
| During scaling, a GSI indexer rebalance froze and did not complete successfully, because an index snapshot was not correctly deleted and recreated. | A flag now handles snapshots to ensure they are correctly deleted or recreated when indexes are updated during rebalancing. |
| When ALTER INDEX updated the replica count, new replicas were not built immediately if the original definition was {defer_build: true}; existing replicas were built, and new replicas were built only in the next processing iteration. | New replicas are now built when the replica count is updated for deferred indexes. The status of existing index instances is checked, and if they are ready, a build of the new instance is triggered. |
| When the indexer was unable to keep up with KV mutations and mutations queued within the indexer, the bookkeeping for the queued mutations incurred a large memory overhead. | The indexer now optimizes memory usage, reducing the bookkeeping overhead for queued mutations. |
Query Service
| Description | Resolution |
|---|---|
| Due to how nested dependencies were handled, a sudden rise in memory utilization of the Query Service on a node caused a memory alert, and the node did not recover correctly after a restart. | Nested dependencies are now handled appropriately in the ADVISE statement. |
| A query with multiple filters on an index key, one of which was a parameter, could produce incorrect results, because the exact index spans to support the query were composed incorrectly. | The way in which exact spans are set has been modified to correct this issue. |
| Covering FLATTEN_KEYS() on an array index generated incorrect results. A modified version of the ANY clause was applied after the index scan, which meant false positives were retained and distinct scan rows were eliminated. | The ANY filter is now applied in the index scan itself when covering an index scan with flatten keys. |
| Inter-service read timeout errors were not detected or handled accordingly, so user requests failed with timeout errors without retrying on a new connection. | The error handling and retry mechanism has been modified to handle these types of timeout issues and errors. |
| Under certain circumstances, a query with UNNEST used a covering index scan and returned incorrect results. The reference to the UNNEST expression should have prevented the covering index from being used, as the index did not contain the entire array. | The logic that determines covering UNNEST scans has been changed so that a covering index scan is not used for such queries. |
| When an index scan had multiple spans, index selectivity was incorrectly calculated. | Index selectivity for multiple index spans is now correctly calculated. |
| Incorrect results were returned for a non-IndexScan on a constant false condition, due to incorrect handling of a FALSE WHERE clause. | The FALSE WHERE clause is now correctly handled. |
| Querying system:functions_cache in a multi-query-node cluster returned incomplete results with warnings: the query result included entries from the local query node, but none from remote query nodes. This was due to a typographical error. | The typographical error has been corrected. |
| A panic in go_json.stateInString under parsed-value functions, caused by incorrect concurrent access, resulted in state being freed while still in use. | The concurrent access issue has been resolved. |
| A prepared statement might have produced an incorrect result in a multi-node environment, for example, a database with two query nodes. | Correlated subqueries from an encoded plan are now detected and marked, ensuring correct results. |
| When a WITH clause (common table expression, or CTE) was used inside a subquery, and the WITH clause definition referenced the parent query and was correlated, the query engine did not properly detect the correlation. This produced an incorrect result from the WITH clause evaluation because the result was not cached correctly. | Correlations inside WITH clause definitions are now properly detected. |
| cbq required a client authentication key file whenever a certificate authority file was used. | cbq now accepts a certificate authority file without a client key file, enabling use with username and password credentials. |
| When appropriate optimizer statistics were available for a query with ORDER BY, and multiple indexes were available for the query, the Cost-Based Optimizer (CBO) unconditionally favored an index that provided ordering. Such indexes were not always the best choice. | CBO now allows cost-based comparison of indexes. |
| An ADVISE statement with multiple levels of UNNEST caused a syntax error in the CREATE INDEX statement produced by the Index Advisor. | ADVISE has been improved for queries with multiple levels of UNNEST. |
Cluster Manager
| Description | Resolution |
|---|---|
| A Cluster Manager process crash meant the Delete Bucket memcached command was not always called before bucket files were deleted later in rebalance. This caused the memcached process to crash repeatedly, resulting in Data Service downtime. | The Delete Bucket command is now called on memcached before files are deleted during rebalance, ensuring memcached does not attempt to read them. |
Cross Datacenter Replication (XDCR)
| Description | Resolution |
|---|---|
| Data streamed from the Data Service over XDCR should always be streamed in order by mutation id. However, in some scenarios, for efficiency, the Data Service streamed records that were not ordered by mutation id. In certain situations, this out-of-sequence order (OSO) caused performance issues. | OSO mode is now available as a global override and can be switched off for any currently deployed replications to avoid performance issues. |
| XDCR did not process documents with a JSON array and Extended Attributes (XATTRs). When a document contained XATTRs, XDCR checked for transaction XATTRs; when transaction filters were enabled, XATTRs were not checked. | When documents contain arrays, the transaction XATTRs are now checked, and the document is no longer prevented from being parsed as an array. |
| Binary documents were replicated when an Advanced Filtering Expression was present. | A filter has been added which can be turned on to prevent all binary documents from being replicated. |
| XDCR appeared to have stalled, and no explanation was provided. For rebalances on the source or target, the XDCR pipeline should be restarted and data movement should continue; before the pipeline was restarted, there might have been less data movement because the rebalanced vBuckets were no longer streaming. | An ETA is now provided in the Server UI to show when the pipeline is due to be restarted. |
| Checkpoint Manager created checkpoint records out of sequence when many target nodes ran slowly. | Checkpoint Manager now creates checkpoints in sequence when target nodes are slow. |
| The bucket topology service caused a concurrent map iteration and map write panic in XDCR, which led to a fatal error. | Validation has been improved to prevent the panic from happening. |
| Prometheus stats did not include a pipeline's status. | The pipeline status is now provided as part of a Prometheus stat. |
| A Checkpoint Manager initialization error caused two types of memory leak: a backfill pipeline leak and a main pipeline leak. | The Pipeline Manager and backfill pipeline have both been modified to prevent the memory leaks. |
| XDCR Checkpoint Manager instances were not cleaned up under certain circumstances, due to timing and networking issues when contacting the target, or when an invalid backfill task was fed in as input. | Checkpoint Manager instances are now cleaned up, and a flag has been added to check for invalid backfill tasks. |
| When a replication spec change was made on a non-Data-Service node, delete replication hung and caused the node to return an incorrect replication configuration. | XDCR now checks whether the node is running the Data Service and handles it correctly. |
| XDCR could fail due to connection issues such as DNS problems or firewalls. With many nodes in the source and target databases, manually checking every node to locate the connection issue was difficult and required many connection checks. | A connection pre-check feature has been added to XDCR, which ensures all connections from source nodes to target nodes are valid. Credentials are now also checked. |
| Running in IPv6-only mode with a non-encrypted remote resulted in invalid IP addresses being returned, leading to connection issues. | A valid IP address is now returned. |
| StatsMgr could hang while stopping, due to watching for notifications, resulting in stranded goroutines. | Goroutines are now stopped correctly. |
| When IPv4-only mode was used, and full encryption had only an alternate address configured with the internal address unresolvable, XDCR produced an error when contacting the target data nodes. | This specific scenario has been fixed so that replication can now proceed. |
| The Prometheus endpoint did not expose any XDCR error metrics. | XDCR error metrics are now exposed via Prometheus. |
| A legacy race condition, in which the metadata store could cause a conflict, was exposed as part of the binary filter improvements. | The legacy race conditions have all been resolved. |
| Under certain circumstances, when rebalancing, the target cluster could return an EACCESS error code, which caused the source XDCR to pause the pipeline. | Instead of pausing the pipeline when rebalancing, XDCR now retries when an EACCESS error is encountered in XmemNozzle, and counts and logs this activity. |
| Checkpoint Manager could get stuck when stopping if it had not yet been started, resulting in a memory leak. | Checkpoint Manager can now be stopped correctly even when it has not been started. |
Metrics and Monitoring
| Description | Resolution |
|---|---|
| When the Cluster Manager instructed Prometheus to reload the configuration, the reload timeout impacted other requests. | The Cluster Manager has been improved to handle timeouts when instructing Prometheus to reload the configuration. |
| The Cluster Manager's computed utilization stats were inaccurate due to time-interval discrepancies between the components where data was collected. | The Cluster Manager now reports raw stats as Prometheus counters. |
Storage
| Description | Resolution |
|---|---|
| Inconsistencies were observed where a single Magma bucket in a database took a long time to warm up. | The seq-index scan has been optimized for tombstones of zero value size. The optimization covers lookup by key, sequence iteration, and key iteration. Documents of zero value size are placed in both the key index and the seq index. |
| Disk backfills hung permanently due to high memory consumption when large documents were streamed over many DCP streams concurrently. | Memory for a document read by a DCP stream is now released before switching to another stream. |
Known Issues
Search Service
| Description | Workaround |
|---|---|
| An issue occurs with node failover. If a replacement node is not brought in before the failover occurs, lost active or lost replica search index partitions on the replaced node are not rebuilt. This issue is only pertinent to the 7.2.1 release and does not affect older builds. | This situation can be prevented by bringing a replacement node into the cluster in place of the failed-over node before starting the rebalance operation. This problem should not affect online or offline rebalances. If lost indexes are encountered, a manual update to the affected search index definition(s) will trigger a rebuild of the affected indexes. |
Release 7.2
New Features and Enhancements
- The following new platforms are supported:
  - Red Hat Enterprise Linux 9
  - Oracle Linux 9
  - Ubuntu 20 LTS (ARM64)
  - Ubuntu 22 LTS (x86, ARM64)
  - Amazon Linux 2023 (x86, ARM64)
  - macOS 12 Apple Silicon
  See Supported Platforms for the complete list of supported platforms, and notes on deprecated platforms.
- Cost-Based Optimizer (CBO) for Analytics. The cost-based optimizer for Analytics chooses the optimal plan to execute an Analytics query. It gathers samples from Analytics collections, then queries those samples at query-planning time to estimate the cost of each operation. The Analytics Service introduces new syntax for managing samples, and provides parameters and hints to help specify the behavior of the cost-based optimizer. See Cost-Based Optimizer for Analytics.
- Time Series Queries. Time series data is any data that changes over time. It is usually collected frequently, at regular or irregular intervals, from a device or a process. The Query Service provides a standard format for time series data, which promotes compact storage and quick processing, and introduces a _TIMESERIES function to query time series data. See Time Series Data and the _TIMESERIES Function.
- Change History. A change history can be maintained for collections in a bucket. Changes to documents within the collections are included in the change history, and a maximum size for the change history can be specified in bytes or seconds. See Change History. For information on establishing change-history default settings at bucket-creation time, see Creating and Editing Buckets. For information on switching the change history on or off for a specific collection, see Creating and Editing a Collection. To examine the change-history status for each collection in a bucket, see the collections option for cbstats; a hedged example follows this list. To read the change history, use the Kafka 4.1 Connector.
- New alerts are provided for the change-history size threshold and the Index Service low-residence threshold. See Setting Alerts.
- You can now configure the block size for Magma storage when you create a bucket. See Creating and Editing Buckets.
- New metrics are provided for tracking XDCR conflict resolution on the target cluster. See Monitoring Conflict Resolution on the Target Cluster.
- Couchbase Server now checks node certificates to ensure a node-name is correctly identified with a Subject Alternative Name (SAN) when certificates are uploaded and when a node is added or joins a cluster. See Node-Certificate Validation.
- The Analytics Service now supports external datasets on Google Cloud Platform (GCP) storage. You can manage these datasets using the UI or the Analytics Links REST API. See Managing Links and Analytics Links REST API.
- When connecting from an external network, you can now use the network=external option to specify an alternate address when using cbbackupmgr, cbimport, and cbexport. See the Host Formats information for cbbackupmgr, cbimport, and cbexport; a hedged example follows this list.
- You can now download the cbbackupmgr, cbimport, and cbexport tools from a tools package. This enables developers and testers to use the tools on machines where Couchbase Server is not installed. See Server Tools Packages.
- Capella databases use Certificate Authorities (CAs) to establish secure connections. These CAs are now automatically trusted when you use Couchbase Web Console or the REST API to establish fully secure XDCR connections between Capella databases and Couchbase Enterprise Server 7.2+. See Capella Trusted CAs.
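As referenced in the Change History item above, per-collection change-history status can be examined with the cbstats collections option. A minimal sketch, assuming a bucket named travel-sample and placeholder credentials; verify the exact flags against the cbstats documentation for your release:

```sh
# Hedged sketch: list collection stats (including change-history status)
# for a bucket. Host, port, credentials, and bucket name are placeholders.
cbstats localhost:11210 -u Administrator -p password \
  -b travel-sample collections
```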
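Similarly, for the network=external item above, a backup taken from outside the cluster's internal network might look as follows. This is a sketch only: the archive path, repository name, host, and credentials are placeholders.

```sh
# Hedged sketch: back up over an alternate (external) address by adding
# the network=external option to the connection string.
cbbackupmgr backup -a /backups -r nightly \
  -c 'couchbase://203.0.113.7?network=external' \
  -u Administrator -p password
```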
Deprecated and Removed Features and Platforms
- The following operating systems are no longer supported:
  - SUSE Linux Enterprise Server 12 versions earlier than SP2
  - macOS 10.15 Catalina
  - RHEL 7
  - CentOS 7
  - Oracle Linux 7
  - Ubuntu 18 LTS
- macOS 11 Big Sur is deprecated.
- Debian 10 is deprecated.
See Supported Platforms for the complete list of supported platforms.
- TLS 1.0 and 1.1 are deprecated. See Establishing the Minimum TLS-Version.
Fixed Issues
This release contains the following fixes.
Cluster Manager
- Reporting wrong fragmentation and data size stats.
- Buckets page should load even if the browser machine is slow or bandwidth is low.
Cross Datacenter Replication (XDCR)
- Prolonged TMPFAIL or ENOMEM causes memory bloat.
- Inter-cluster XDCR failing in Server 7.1.2 and Capella.
- XDCR on a non-KV node freezes when replication settings are changed several times.
Query Service
- Display the number of uses for prepared statements accurately.
- [SQL++] INSERT does not trigger a memory-quota-exceeded error.
- Potential for a request stall if the stream operator fails to notify the request that it has terminated.
- Optimizer hints are not displayed in EXPLAIN statements for subqueries.
- Disable impersonate if the KV node does not support collections, to prevent Query Service errors when upgrading from 6.6.5.
- ORDER BY after UNION requires explicitly aliased terms.
- Covered FTS SEARCH() with memory_quota fails.
- Active requests and queued requests in SQL++ metrics are gauges, not counters.
- Memory exceeded quota error with ARRAY_AGG.
- OBJECT_ functions may return incorrect results.
Index Service
- When an index drop is immediately followed by a bucket delete, the indexer can deadlock when a rare race condition occurs.
- A scanning issue occurs when an index is on one node and the replica index is on a different node.
- Index build hangs in mixed mode when the projector skips transaction records.
- Index build stuck on "Check pending stream" during shard rebalance testing.
- Report aggregated node-level statistics information using Prometheus.
- Use streamId instead of index.Stream to determine stream catchup pending.
- cbindex did not execute the build index and the performance test is stuck.
- Change the log level when using watchers to connect to indexer services in a cluster.
- "FlushTs Not Snapshot Aligned" message incorrectly displayed in the log multiple times.
- A nil value in the Node table causes a panic issue in the logs.
- Orphaned watcher background thread in logs following a Server upgrade.
- The RedistributeIndexes flag should consider partitioned and non-partitioned indexes.
Storage
- Address Plasma RP Version 16-bit overflow, and recovery and data logs.
- Ensure rows that were previously compacted do not return when a crash and recovery occurs in the Magma storage engine.
Eventing Service
- Use "_txn:" so the Eventing Service detects and rejects transaction documents.
- Restoring Eventing Functions to a new scope in the same bucket upgrades and overwrites the admin or global Function Scope.
- Running "advancedGetOpWithCache" returns an incorrect meta.id on the second call when accessing a document twice.
- The FunctionOverload parser results in false positives when it incorrectly flags function names with reserved function names as their prefix.
- The Eventing bucket-backed cache in the Advanced Accessor couchbase.get() was not efficient for very large documents. The final implementation, when using {"cache": true}, returns small documents with a speed-up of 20-25X and large documents with a speed-up of 400-500X.
- Eventing writes to the wrong keyspace (_default._default) if the collection name is long (over 30 characters).
- As of 7.2, the curl() call no longer performs URL encoding. Older functions with a language compatibility of 6.6.2 still work. In some cases, an Eventing function created in 7.1 might require a parameter added to the curl() call, "url_encode_version": "7.1.0", as the 7.1 release changed the 6.6.2 URL-encoding method.
Known Issues
This release contains the following known issues.
Couchbase CLI
| Description | Workaround |
|---|---|
| In 7.1.0, couchbase-cli may authenticate using either a username and password or mTLS (a client certificate). The CLI argument validation does not handle cases where no authentication is required, for example, node-init. This results in a false positive where couchbase-cli requires authentication flags to be provided. | Before a node is initialized, use placeholders for the username and password; a sketch follows this table. After the node is initialized, a real username and password must be supplied. |
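A minimal sketch of the placeholder workaround for an uninitialized node; the hostname is hypothetical, and the placeholder strings themselves carry no meaning:

```sh
# Hedged sketch: initialize a node before authentication is configured,
# passing placeholder credentials to satisfy argument validation.
couchbase-cli node-init -c http://127.0.0.1:8091 \
  -u placeholder -p placeholder \
  --node-init-hostname node1.example.com
```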
Search Service
| Description | Workaround |
|---|---|
| Intermittent crashes and errors happen in Full Text Search at query time, in the term dictionary and postings list. This happens when attempting to access invalid addresses or out-of-bounds data, for example, when the term dictionary is an empty byte slice. | For the errors, retry logic in the application layer might help. However, there is no workaround for the intermittent crashes, which Couchbase is investigating. |
Analytics Service
| Description | Workaround |
|---|---|
| The first version of the cost-based optimizer (CBO) focuses on optimizing SPJ (select-project-join) queries, or multi-inner-join subgraphs of such queries. For more complex queries, for example queries involving outer joins or complex correlated subqueries, some parts of the query are handled by CBO and other parts are not. As a result, query plans displayed for such queries are missing the CBO-provided cost and cardinality estimates for those parts. | There is no workaround for this display issue. |