# Troubleshooting conduwuit

> ## Docker users ⚠️
>
> Docker is extremely UX unfriendly. Because of this, many issues and support
> requests are actually about Docker, not conduwuit. We also cannot document the
> ever-growing list of Docker issues here.
>
> If you intend to ask for support and you are using Docker, **PLEASE**
> triple-check that your issues are **NOT** caused by a misconfiguration in
> your Docker setup.
>
> Compose file issues or Docker Hub image issues can still be reported, as long
> as they're something we can fix.

## General potential issues

#### Potential DNS issues when using Docker

Docker's default DNS setup may prevent DNS from working properly when running
conduwuit, resulting in federation issues. Symptoms include excessively long
room joins (30+ minutes) caused by very long DNS timeouts, log entries of
"mismatching responding nameservers", and/or partial or non-functional
inbound/outbound federation.

This is **not** a conduwuit issue, and is purely a Docker issue. Docker's
default DNS setup is not sustainable for heavy DNS activity, which is normal
for Matrix federation. The workarounds for this are:

- Use DNS over TCP via the config option `query_over_tcp_only = true`
- Don't use Docker's default DNS setup and instead allow the container to use
  and communicate with your host's DNS servers (host's `/etc/resolv.conf`)
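
The first workaround can be sketched as a config fragment. This assumes your options live under the `[global]` table of your conduwuit config file, which is not stated above; adjust it to match your own file.

```toml
[global]
# Send DNS queries over TCP only, as a workaround for Docker's default DNS setup
query_over_tcp_only = true
```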

## RocksDB / database issues

#### Direct IO

Some filesystems may not support RocksDB using [Direct
IO](https://github.com/facebook/rocksdb/wiki/Direct-IO). Direct IO is
non-buffered I/O, which improves conduwuit performance, but at least FUSE is
known to potentially have problems with it. See the [example
config](configuration/examples.md) for how to disable it if needed. Issues from
Direct IO on unsupported filesystems usually show up as startup errors.
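
A minimal sketch of disabling Direct IO follows. The option name `rocksdb_direct_io` and the `[global]` table are assumptions that are not stated above; confirm both against the example config before relying on them.

```toml
[global]
# Assumed option name; check the example config for the exact key
rocksdb_direct_io = false
```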

#### Database corruption

If your database is corrupted *and* is failing to start (e.g. checksum
mismatch), it may be recoverable, but careful steps must be taken and there is
no guarantee that recovery will succeed.

The first thing to try is launching conduwuit with the `rocksdb_repair` config
option set to true. This tells RocksDB to attempt to repair itself at launch.
If this does not work, disable the option and continue reading.
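
As a sketch, enabling the repair attempt might look like this, assuming your options live under the `[global]` table of your config file:

```toml
[global]
# Ask RocksDB to attempt a repair at launch.
# Disable again (remove the line or set it to false) once you are done.
rocksdb_repair = true
```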

RocksDB has the following recovery modes:

- `TolerateCorruptedTailRecords`
- `AbsoluteConsistency`
- `PointInTime`
- `SkipAnyCorruptedRecord`

By default, conduwuit uses `TolerateCorruptedTailRecords`, as corrupted tail
records may generally be due to bad federation and we can re-fetch the correct
data over federation. The RocksDB default is `PointInTime`, which attempts to
restore a "snapshot" of the data from when it was last known to be good. This
snapshot can be anywhere from a few seconds to multiple minutes old.
`PointInTime` may not be suitable for default usage because clients and servers
may not be able to handle sudden "backwards time travel", and
`AbsoluteConsistency` may be too strict.

`AbsoluteConsistency` will fail to start the database if any sign of corruption
is detected. `SkipAnyCorruptedRecord` will skip all forms of corruption unless
the corruption prevents the database from opening (e.g. it is too severe). Usage
of `SkipAnyCorruptedRecord` voids any support, as it may cause more damage
and/or leave your database in a permanently inconsistent state, but it may help
as a last-ditch effort if `PointInTime` does not work.

With this in mind:

- First start conduwuit with the `PointInTime` recovery method. See the [example
  config](configuration/examples.md) for how to do this using
  `rocksdb_recovery_mode`
- If your database successfully opens, clients are recommended to clear their
  client cache to account for the rollback
- Leave conduwuit running in `PointInTime` for at least 30-60 minutes so that as
  much corruption as possible is restored
- If all goes well, you should be able to switch back to
  `TolerateCorruptedTailRecords`, and you have successfully recovered your
  database
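
The first step above might look like the following fragment. It assumes that `rocksdb_recovery_mode` takes a numeric value and that `2` selects `PointInTime`; this mapping is not stated above, so verify it against the example config. The `[global]` table is likewise an assumption.

```toml
[global]
# Assumed mapping: 2 = PointInTime. Intended as a temporary setting for recovery.
rocksdb_recovery_mode = 2
```

Once recovery has finished, remove the line or set it back to the default so that conduwuit returns to `TolerateCorruptedTailRecords`.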

## Debugging

Note that users should not really need to debug things. If you do find yourself
debugging and find the issue, please let us know what it is and/or how we can
fix it. Various debug commands can be found in `!admin debug`.

#### Debug/Trace log level

conduwuit builds without debug or trace log levels by default, at least for
performance reasons. This may change in the future, and/or binaries with these
log levels enabled may be provided. If you need debug/trace log levels, you
will need to build without the `release_max_log_level` feature.

#### Changing log level dynamically

conduwuit supports changing the tracing log environment filter on-the-fly using
the admin command `!admin debug change-log-level`. This accepts a string
**without quotes** in the same format as the `log` config option.
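
For comparison, the `log` config option takes the filter as a quoted string. The value `info` and the `[global]` table below are illustrative assumptions, not recommended settings:

```toml
[global]
# Filter string for the tracing log environment filter (illustrative value)
log = "info"
```

Following the description above, the same filter would be passed to the admin command unquoted, for example `!admin debug change-log-level info`.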

#### Pinging servers

conduwuit can ping other servers using `!admin debug ping`. This takes a server
name, goes through the server discovery process, and queries
`/_matrix/federation/v1/version`. Any errors are printed in the output.

#### Allocator memory stats

When using jemalloc with jemallocator's `stats` feature, you can see conduwuit's
jemalloc memory stats by using `!admin debug memory-stats`.