VMware Blockchain | 14 JUN 2022 | Build 266

Check for additions and updates to these release notes.

What's New

VMware Blockchain is an enterprise-grade blockchain platform that meets the needs of business-critical multi-party workflows. This patch release includes the following fixes:

Fixed a problem that caused the VMware Blockchain 1.6 nodes on vCenter Server 6.7 to become unresponsive

The problem caused deployed blockchain nodes to become unresponsive, SSH logins to the VMs to fail, and the VM web console on vCenter Server to display error messages such as watchdog: BUG: soft lockup during the initial boot.

Migration performance enhancement to utilize maximum CPU load

The performance improvement was achieved by parallelizing both the fetching of ACL batches from the Daml execution engine and the storing of those ACLs.
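The parallelization described above can be sketched with Python's standard concurrency tools; the function names and batch structure below are illustrative assumptions, not the actual VMware Blockchain migration API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the real calls; in the actual migration these
# fetch a batch of ACLs from the Daml execution engine and persist them.
def fetch_batch(batch_id):
    return [f"acl-{batch_id}-{i}" for i in range(3)]

def store_batch(acls):
    return len(acls)

def migrate(batch_ids, workers=4):
    # Fetch and store batches in parallel instead of one at a time,
    # which keeps the available CPU busy during migration.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = list(pool.map(fetch_batch, batch_ids))
        stored = pool.map(store_batch, batches)
        return sum(stored)
```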

Fixed a problem that caused a configuration error in the VMware Blockchain Orchestrator content library

The implemented fix modifies the internal content library file paths to synchronize with the VMware Blockchain Orchestrator appliance.

Component Versions

The supported domain versions include:

Domain | Version
VMware Blockchain Platform |
VMware Blockchain Orchestrator |
DAML SDK | 2.0.1

The VMware products and solutions discussed in this document are protected by U.S. and international copyright and intellectual property laws. VMware products are covered by one or more patents listed at http://www.vmware.com/go/patents. VMware is a registered trademark or trademark of VMware, Inc. and its subsidiaries in the United States and other jurisdictions. All other marks and names mentioned herein may be trademarks of their respective companies.

Upgrade Considerations

Implement the clone-based upgrade process only when upgrading from VMware Blockchain 1.5 to 1.6. See the Perform Clone-Based Upgrade on vSphere or Perform Clone-Based Upgrade on AWS instructions in the Using and Managing VMware Blockchain Guide.

Resolved Issues

  • In rare cases, VMware Blockchain deployments on vSphere 6.7 might fail

    On-premises VMware Blockchain Replica and Client node deployments on vSphere 6.7 can potentially enter a CPU lock during the initial boot.

  • A staggered restart of Replica nodes might cause some nodes to enter state transfer or, in rare cases, to become non-functional

    When Replica nodes are restarted one after another, some Replica nodes might enter state transfer that does not complete. In certain rare circumstances, the blockchain might become non-functional because the Replica nodes cannot agree on a single view.

  • When a deployed VM is created, the VM might not have a private static IP address due to a race condition

    In rare cases, a deployed VM might lack a private static IP address when it is created. The problem is caused by a race condition in the underlying Photon OS.

Known Issues

  • New - Daml index_db fails to start on some Client node VMs with less than 32GB of memory

    Under certain circumstances, when a blockchain deployment is scaled up to seven Replica nodes and restarted, Client nodes with less than 32GB of memory are not operational. This problem occurs because the Daml index_db container does not have sufficient memory and fails to start.

    Workaround: Allocate 32GB or more of memory on Client node VMs.

    If increasing memory is not possible, complete the following steps to restart containers on the Client node:

    1. Stop all the containers on the Client node.
      $ curl -X POST
    2. Start the daml_index_db container so that it receives its memory allocation first.
      $ sudo docker start daml_index_db
    3. Start the remaining containers on the Client node.
      $ curl -X POST
  • Concord container fails after a few days of running because batch requests cannot reach pre-processing consensus

    In rare cases, if one of the batch requests has not reached pre-execution consensus, the entire batch is canceled. However, a batch request that is already in the middle of pre-processing cannot be canceled, and users must wait until processing completes. This missing validation causes the Concord container process to fail.

    Workaround: None. The agent restarts the Concord container when this problem occurs to fix the error.

  • Small form factor for Client node groups in AWS deployment is not supported

    With the introduction of client services, a Client node requires 32GB minimum memory allocation. The small form factor uses the M4.xlarge instance type, which provides 16GB of memory. 

    Workaround: Update the clientNodeSpec parameter value to m4.2xlarge in the deployment descriptor for AWS deployments that require a smaller provision.
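    A minimal sketch of the corresponding deployment descriptor fragment; the surrounding structure and key names other than clientNodeSpec are assumptions based on the parameter named above, so adjust them to your actual descriptor schema.

```json
{
  "clientNodeSpec": {
    "instanceType": "m4.2xlarge"
  }
}
```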

  • State Transfer delay hinders Replica nodes from restarting and catching up with other Replica nodes

    The State Transfer process is slow due to a shortage of storage and CPU resources. As a result, a restarted Replica node cannot catch up with the other Replica nodes and fails.

    Workaround: Use the backup process to restore the failed Replica node.

  • Fluctuations in transactions per second (TPS) might destabilize some Client nodes

    In some cases, after the blockchain has run for several hours, fluctuations in TPS might be observed on some Client nodes. Afterward, the load stabilizes and continues with slight drops in TPS.

    Workaround: None.

  • Primary Concord container fails after a few hours or days of run

    In rare cases, the information required by the pre-processor thread gets modified by the consensus thread, which causes the Concord container process to fail.

    Workaround: None. The agent restarts the Concord container when this problem occurs to fix the error.

  • Due to a RocksDB size calculation error, the oldest database checkpoints are removed even when adequate disk space is available

    A known calculation error in the database checkpoints causes an over-estimation of the RocksDB size. As a result, the oldest database checkpoints are removed to ensure that RocksDB internal operations do not fail because of insufficient disk space, even though the removal is not required when adequate disk space is available.

    Workaround: Configure your system to allocate more disk space for the Concord container.

    For example, if your system is configured to retain two database checkpoints, the total disk space allotted for the Concord container must be greater than six times the RocksDB size. On the other hand, if your configuration retains one database checkpoint, the total disk space allotted for the Concord container must be greater than four times the RocksDB size.                             
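    The sizing rule above can be expressed as a quick check. The formula is inferred from the two examples given (retaining N checkpoints requires more than 2 × (N + 1) times the RocksDB size) and is an assumption, not an official VMware formula.

```python
def min_disk_factor(retained_checkpoints):
    """Multiple of the RocksDB size that the Concord container's disk
    allocation must exceed; inferred from the examples in these notes
    (2 retained checkpoints -> more than 6x, 1 -> more than 4x)."""
    return 2 * (retained_checkpoints + 1)

def has_adequate_disk(disk_bytes, rocksdb_bytes, retained_checkpoints):
    # True when the allotted disk space exceeds the required multiple.
    return disk_bytes > min_disk_factor(retained_checkpoints) * rocksdb_bytes
```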

  • Large state snapshot requests to create a checkpoint time out

    The Daml ledger API sends state snapshot requests to the database Checkpoint manager to create a checkpoint. The checkpointing takes 18 seconds or more for large databases, and this delay causes a timeout.

    Workaround: Restart the Daml ledger API after 30 seconds.

  • Assertion fails on State Transfer source while sending messages to a State Transfer destination

    On a rare occasion, when two destination Replica nodes request blocks with overlapping ranges, prefetch capability is enabled on the source.

    For example, when a destination Replica-node-1 requests blocks between 500 and 750, the source prefetches blocks 751-800. When another destination, Replica-node-2, requests blocks between 751 and 900, the source prefetch is considered valid, and the assertion fails while sending blocks to destination Replica-node-2.
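    The failing condition amounts to a range-overlap check, sketched below; this illustrates the scenario from the example and is not VMware Blockchain source code.

```python
def overlaps(a, b):
    """True when two inclusive block ranges (start, end) share any block."""
    return a[0] <= b[1] and b[0] <= a[1]

# Scenario from the example above:
requested_1 = (500, 750)  # blocks requested by destination Replica-node-1
prefetched = (751, 800)   # blocks the source prefetches after that request
requested_2 = (751, 900)  # blocks requested by destination Replica-node-2

# The prefetched range overlaps the second request; this is the case that
# makes the source treat its prefetch as valid and trips the assertion.
assert overlaps(prefetched, requested_2)
assert not overlaps(requested_1, prefetched)
```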

    Workaround: Use the backup and restore operation to avoid requesting and receiving overlapping block ranges for Replica nodes.

  • Client nodes cannot resynchronize data when the data size is greater than 5GB

    Client nodes cannot resynchronize data from the Replica nodes when the data is greater than 5GB, and the data folder is removed due to data corruption. Therefore, any .dar file uploads cause the Client node Daml Ledger API to fail.

    Workaround: Complete the applicable steps:

    • Periodically back up the data in the /mnt/data/db folder.
    • If multiple Client nodes are available, back up data from another Client node in the same Client node group. If there is a data loss, restore from the previously backed up data. See Client node backup.

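    The periodic backup step can be sketched as follows; the archive location and naming scheme are assumptions, while the /mnt/data/db path comes from the step above.

```python
import tarfile
from datetime import date
from pathlib import Path

def backup_data_folder(data_dir, backup_dir):
    """Archive a Client node data folder (for example /mnt/data/db) into a
    dated .tar.gz under backup_dir. The naming scheme is an assumption."""
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    archive = backup_dir / f"{data_dir.name}-{date.today().isoformat()}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(data_dir, arcname=data_dir.name)
    return archive
```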
  • Wavefront metrics do not appear after the network interface is deactivated and re-enabled on Replica nodes

    This problem was observed while executing a test case that explicitly deactivated the network interface on Replica nodes; it rarely manifests in a production environment.

    Workaround: Restart the telegraf container by running docker restart telegraf so that the metrics appear in Wavefront.
