To troubleshoot a peer-to-peer data sync issue, first determine its cause by
obtaining and analyzing the database error and warning messages captured in the
debug logs.
As demonstrated in the following snippet, begin gathering the database debug
logs before Ditto is initialized. This ensures that you capture any potential
issues with the file system access and authentication background tasks that run
during initialization.
For OnlinePlayground and onlineWithAuthentication identities, the first thing
that Ditto needs to do is authenticate to the Big Peer. Confirm that all devices share the same AppID by verifying that the app_id log line contains your AppID:
Once confirmed, verify that authentication was successful.
JSON
AuthClient: authentication request succeeded
Once authentication is successful, client re-authentication does not occur until
the device’s local certificate expires, as indicated by the following:
[DEBUG] 2023-08-03T17:36:35.046Z: failed to connect to peer; error = Connect failed because a TLS stream couldn't be established: invalid peer certificate contents: invalid peer certificate: UnknownIssuer; remote_peer = MulticastRemotePeer(id: 10, announce Q2RLCGEYdpLwjFditto)
If you see the above message, your device’s locally cached certificate is invalid. The device needs to call ditto.auth.logout() and reconnect to the Internet to get a new certificate. Alternatively, you can clear the local cache by reinstalling the mobile application or clearing the local persistence directory.
Error attempting authentication with token: Could not connect to authentication server
Shell
Error attempting authentication with token: Failed to authenticate: Internal server error; debug: Some("Ditto cloud failed to decode the response body. This is most likely due to malformed JSON in the response: error decoding response body: expected value at line 1 column 1"); client info: None
If you see either of the above messages, there is most likely a problem with your authentication webhook. A debug-level log line provides more information about the authentication webhook if you need it. To verify that the AWS-hosted Big Peer servers (located in U.S. regions) can successfully reach your authentication server, send a POST request with a JSON stringified body to your server.
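As a quick way to exercise the webhook outside of Ditto, you can send the POST request yourself. The sketch below uses Python's standard library; the URL and the token field are hypothetical placeholders — substitute your own webhook endpoint and whatever body shape your webhook expects.

```python
import json
import urllib.request

# Hypothetical endpoint -- substitute your own authentication webhook URL.
webhook_url = "https://example.com/auth-webhook"

# The body must be a JSON-stringified object; the token value here is an
# illustrative placeholder for whatever your webhook expects.
body = json.dumps({"token": "client-provided-token"}).encode("utf-8")

req = urllib.request.Request(
    webhook_url,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment to actually send the request and inspect the response:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.read().decode())
```

If the request fails or the response body is not valid JSON, that matches the "failed to decode the response body" error above.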
If Ditto is unable to sync and you see the log lines below, it's possible that your Ditto instance is being deallocated before it is able to synchronize with other peers. Ensure that the Ditto instance is initialized and stored in a long-lived global variable for the duration of the application.
authenticationExpiringSoon and authenticationRequired both need to be
implemented according to the sample code.
Since callback objects are only invoked when Ditto initializes and the client authentication certificate expires, do not create subscriptions inside callbacks.
Keep a strong reference to the AuthClient for the duration of the Ditto object, otherwise the auth handler will become garbage collected at an inappropriate time.
Verify that your webhook provider name is correctly copied in the Ditto portal.
The provider name given to the Ditto Client must match a provider name in the Portal (e.g., my-auth).
Ensure that you keep a reference to the AuthClient in scope for the duration that Ditto is active. You should attach the dittoAuth variable to the object so it does not get garbage collected. For example:
C#
namespace Sync {
    public class DittoClient {
        private Ditto ditto;
        private DittoAuthDelegate dittoAuthDelegate;

        public DittoClient(string appId, string id, string jwt) {
            dittoAuthDelegate = new DittoAuthDelegate(id, jwt);
            // Construct the identity from the delegate (previously undefined
            // in this snippet).
            var identity = DittoIdentity.OnlineWithAuthentication(appId, dittoAuthDelegate);
            ditto = new Ditto(identity);
        }
        // ...
    }
}
A Small Peer syncs with the Big Peer using a WebSocket connection. The debug logs show each step of creating a successful WebSocket connection to the Big Peer. First, the peer discovers the Big Peer, and you'll see a Discovered event.
If you see StreamFailed or ConnectionEnded or any errors related to WebSocket connection with the Big Peer, there is likely an error in the Big Peer subscription server. For troubleshooting help, contact Ditto support.
If your certificate chain is corrupted, you will see the following:
JSON
Connect failed because the underlying websocket transport reported an error: TLS error: webpki error: UnsupportedCriticalExtension
To fix this, try resetting your certificate chain. On macOS, open the Keychain Access application, navigate to Settings, and click "Reset Default Keychains…"
JSON
Keychain.app > Settings > Reset Default Keychains
You will then see that the underlying physical replication session has been
started with phy started. The pkSOME_BYTES identifier displays the public key of the Big Peer instance to which
a Small Peer WebSocket connection has temporary sync access. Note that the following snippet is merely an example; the pkSOME_BYTES identifier is not a guaranteed static value.
Once temporary remote access is authorized, the following debug log message
appears to indicate that the Small Peer (client) WebSocket connection to the Big
Peer (cloud server) is successful:
For the Small Peer to have an active subscription to read and write to other peers, it must first be authorized by a Big Peer. If the Small Peer does not have permission to access database replication, it cannot subscribe to its collections to read and write.
Local permissions refer to the current peer that is running on the device. Once the peer is authorized, it begins to print the active subscriptions. Ditto prints the active subscriptions every time the sync engine wakes up, which includes a local write to the database or a sync event from another peer writing to their local database. Verify that these permissions are correct and match what you expect; a peer cannot subscribe to a collection if it does not have read permission. The default Big Peer remote permission to access a connected Small Peer's local
database replication for read and write is as follows:
Now that default permissions are verified, confirm that the Big Peer is connected to a Small Peer and the local-to-global database replication task is running. Ditto prints a list of all the queries that the peer is currently subscribed to.
By default, each peer includes some internal subscriptions, which are denoted using a double underscore (__) before the collection name; for example, the following default internal __presence subscription:
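The double-underscore convention makes it easy to separate internal subscriptions from your own when scanning the logs. A minimal sketch (the non-internal collection names are hypothetical examples):

```python
# Sketch: distinguish Ditto's internal subscriptions from application-level
# ones by the double-underscore prefix described above.
def is_internal_subscription(collection_name: str) -> bool:
    return collection_name.startswith("__")

# "__presence" is the default internal subscription mentioned above;
# "tasks" and "orders" are hypothetical application collections.
subscriptions = ["__presence", "tasks", "orders"]
internal = [s for s in subscriptions if is_internal_subscription(s)]
application = [s for s in subscriptions if not is_internal_subscription(s)]
```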
You will also see a list of remote subscriptions. The Big Peer subscribes to everything, so you will see the following line, which references the Big Peer:
If there are any application-level subscriptions, they are listed by collection. For instance, if a Small Peer subscribes to all tasks that are not deleted, the following appears:
Application: notifying a local subscription change
If you have problems with subscriptions or permissions, you can retry the operations with global read and write permission to verify that sync succeeds in that case. You can do this by using an OnlinePlayground identity, which defaults to global read and write permissions. Check the following:
Do your permissions match your subscriptions? If you do not have permission to subscribe to data, it will not sync to the device.
Are you subscribing to what you expect to see?
Do you have more subscriptions than you expect to have?
ParseError can be printed in the debug logs when there is a problem with your query. For example, if you create a subscription in Swift with an empty query string, you will see a ParseError in the logs.
If a Write is made to the local database, the following debugging messages appear:
JSON
Write txn committed; txn_id = 40; originator = User
JSON
Notifying a database change; transaction = 40
Once a Write is made to the local database, Ditto re-prints the active subscriptions and permissions to the debug log, as demonstrated in the previous snippet. Next, the Small Peer creates the "update file." As annotated in the debug logs as follows, the update file provides the status of the data update and, using the pkA and pkB identifiers, indicates the two peers involved in the data exchange.
JSON
Creating a sending update, sending_update_path = "pkA/pkB/sending_update"
Once the local peer finishes creating the update file, the following appears to indicate that the update file is complete and the local peer is ready to send the new update to the remote peer. Note that at this time the local Write has yet to be synchronized across all connected peers.
If the local peer has received the update, all connected peers are synchronized, and the local database replication process is complete, the following appears:
JSON
No next update chunk to send - setting is_ready to false
No message to send
Do you have a firewall or proxy enabled that is blocking Ditto’s connection to the Big Peer?
A proxy may either block connections or cause errors in the log by substituting its own TLS certificate (invalid certificate: UnknownIssuer). If you see this log message, you will either need to get Ditto unblocked or add the CA certificate to the Small Peer's trusted certificate store. Verify that you can reach the following endpoints. You should see the output exactly as written below:
JSON
> nc -v MY_APP_ID.cloud.ditto.live 443
Connection to MY_APP_ID.cloud.ditto.live port 443 [tcp/https] succeeded!
^C
If this test passes, next check to see if WebSockets are blocked on your machine. Some corporate networks, firewalls, or proxies block the HTTP upgrade packet that tells the WebSocket server to keep the connection alive. Check with your IT administrator to see if your computer is configured to block WebSocket connections.
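One way to probe this yourself is to attempt the HTTP Upgrade handshake that WebSockets rely on. The sketch below uses Python's standard library; the host is whatever endpoint you are testing, and a 101 Switching Protocols response means the upgrade was accepted. A proxy that blocks WebSockets typically strips the Upgrade header or answers with a different status.

```python
import base64
import os
import http.client

def websocket_upgrade_headers(host: str) -> dict:
    # Standard headers a client sends to request a WebSocket upgrade
    # (per RFC 6455: a random 16-byte, base64-encoded key).
    return {
        "Host": host,
        "Upgrade": "websocket",
        "Connection": "Upgrade",
        "Sec-WebSocket-Key": base64.b64encode(os.urandom(16)).decode(),
        "Sec-WebSocket-Version": "13",
    }

def check_upgrade(host: str, path: str = "/") -> int:
    # Returns the HTTP status; 101 means the upgrade was accepted.
    conn = http.client.HTTPSConnection(host, 443, timeout=10)
    conn.request("GET", path, headers=websocket_upgrade_headers(host))
    status = conn.getresponse().status
    conn.close()
    return status

# Example (requires network access):
# check_upgrade("MY_APP_ID.cloud.ditto.live")
```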
Evict irrelevant data. For example, you can evict all irrelevant data once per day.
Turn off verbose logging. Verbose logging can slow down replication considerably, especially with thousands of documents. Sync may therefore appear to stall when, in fact, the logging mechanism is slowing Ditto down by writing too many lines.
Look at the size of your ditto directory. Is it very large? Large databases will be slower. Try to query less data.
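To check the directory size programmatically, a small sketch like the following works; the path to the Ditto persistence directory is a hypothetical placeholder that varies by platform and configuration.

```python
import os

def dir_size_bytes(path: str) -> int:
    """Total size in bytes of all files under `path`, walked recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.exists(file_path):
                total += os.path.getsize(file_path)
    return total

# Example (hypothetical path -- substitute your app's Ditto directory):
# print(dir_size_bytes("/path/to/your/app/ditto"))
```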
Use profiling tools for your platform to better understand where the memory leak
may be occurring.
Ensure you are not loading too much data into memory at once. Ditto is designed to work with large datasets, but you should only load the data you need at any given time.
A common issue we see in reactive apps is a failure to dispose of resources as
conditions change. Your app could accumulate an ever-growing set of publishers.
Every liveQuery and subscription in Ditto must be explicitly stopped using the
stop or cancel API. See syncing data for more information.
This section only discusses blocked transactions on native platforms (e.g. iOS, Android, Windows, Linux). Ditto in web browsers operates sufficiently differently and isn’t covered here.
Blocked write transactions will automatically retry until they succeed. A blocked write transaction will never crash. However, blocked write transactions are a common cause of poor database performance. Long-running blocks are generally bad, since they mean that nothing else can write to the database during that time. This could manifest itself as one of many problems:
Unresponsive UI: an interaction might try to update a document, but is blocked by an existing write transaction
Slow sync: new updates cannot be integrated into the store, since they’re blocked by another write transaction
A blocked write transaction can hint that something is wrong with the application code, or at a deeper level in Ditto. This page contains some tips & tricks to help understand the situation. The process of unblocking is automatic and you don't need to write any code to handle this. However, you can drastically reduce the chance of blocking transactions by making sure a device is only syncing the data it really needs.
At any given time, there can be only one write transaction. Any subsequent attempts to open another write transaction will become blocked until the existing write transaction finishes.
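The single-writer rule can be illustrated with a plain lock. This is an analogy in ordinary Python, not the Ditto SDK: the lock plays the role of the one allowed write transaction, so a second "write" blocks until the first finishes.

```python
import threading
import time

write_lock = threading.Lock()  # models the one-writer-at-a-time rule
order = []

def write_txn(name: str, work_seconds: float) -> None:
    with write_lock:              # only one "write transaction" may hold this
        order.append(f"{name} started")
        time.sleep(work_seconds)  # simulate a slow write
        order.append(f"{name} finished")

t1 = threading.Thread(target=write_txn, args=("slow write", 0.2))
t2 = threading.Thread(target=write_txn, args=("blocked write", 0.0))
t1.start()
time.sleep(0.05)  # let the slow write acquire the lock first
t2.start()        # this "transaction" now blocks on the lock
t1.join()
t2.join()
# The blocked write cannot even start until the slow write finishes.
```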
Read transactions operate differently from write transactions. Read transactions cannot be blocked and can run in parallel with write transactions. Read transactions don't block each other, don't block write transactions, and aren't blocked by write transactions. If a write transaction is blocked, Ditto will log a message with increasing severity every 10s.
Time (t) a transaction has been blocked    Log Level
t ≤ 30s                                    DEBUG
31s < t ≤ 120s                             WARN
120s < t                                   ERROR
To see these logs in the database, it's important to have a minimum log level set. Transactions that are blocked for over 2 minutes should always be visible in the logs. If INFO level is used, then INFO, WARN, and ERROR messages will all be included in the logs, which means any write transaction blocked for more than 30s should always be visible in the logs.
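The escalation rule in the table above can be sketched as a simple mapping from blocked duration to log level:

```python
def blocked_txn_log_level(seconds_blocked: float) -> str:
    """Log level for a write transaction blocked this long (per the table above)."""
    if seconds_blocked <= 30:
        return "DEBUG"
    elif seconds_blocked <= 120:
        return "WARN"
    else:
        return "ERROR"
```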
If a write transaction is blocked, the device logs will look something like the following. In this example we can see a write transaction was blocked for a total of 150s (or 2.5 minutes).
As time progressed, Ditto complained more and more loudly, starting with DEBUG logs before eventually logging at ERROR level. Eventually the existing transaction finished and the blocked transaction was able to proceed. The write transaction which was blocked was for a Ditto internal component, identified by "originator=Internal". The existing, long-running write transaction causing the block was a user call in the public SDK, identified by "blocked_by=User". So a user-level write transaction was blocking an internal workload. This is not necessarily a problem, as the internal system will catch up eventually.
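When scanning logs for these fields, a small parser helps. The exact log-line format below is illustrative (based on the originator and blocked_by fields described above), not a guaranteed Ditto log format:

```python
import re

def parse_blocked_txn(line: str) -> dict:
    """Extract originator and blocked_by from a blocked-transaction log line."""
    fields = dict(re.findall(r"(\w+)=(\w+)", line))
    return {
        "originator": fields.get("originator"),
        "blocked_by": fields.get("blocked_by"),
    }

# Illustrative log line, not verbatim Ditto output:
line = "[WARN] write txn blocked; originator=Internal; blocked_by=User"
info = parse_blocked_txn(line)
```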
An application might block its own write transactions by performing multiple writes at the same time in different places. If one is slow (perhaps it does too much work, or perhaps it reaches out to external APIs, etc.) then the other write transactions will block until it finishes.
Swift
// Thread/Queue 1 (starts first):
{
    // Somewhere in the app, a long-running write transaction exists
    ditto.store["people"].findByID(docID).update { mutableDoc in
        // Most update tasks are quick, but a developer might be
        // doing something slow within the update block:
        let apiData = getDataFromASlowExternalAPICall() // <-- !!!!
        mutableDoc?["age"] = apiData.age
        mutableDoc?["ownedCars"].set(DittoCounter())
        mutableDoc?["ownedCars"].counter?.increment(by: apiData.count)
    }
}

// Thread/Queue 2 (starts second):
{
    // Somewhere else in the app, concurrently (e.g., on a background thread
    // or queue), another write transaction tries to update a document.
    //
    // This will block until the "people" update block above completes.
    let docID = try! ditto.store["settings"].upsert([
        "_id": "abc123",
        "preference": 31,
    ])
}