Reliable PostgreSQL Backup & Restore

Overview

pgBackRest

Introduction

pgBackRest aims to be a reliable, easy-to-use backup and restore solution that can seamlessly scale up to the largest databases and workloads by utilizing algorithms that are optimized for database-specific requirements.

pgBackRest v2.37 is the current stable release. Release notes are on the Releases page.

Please find us on GitHub and give us a star if you like pgBackRest!

Features

Parallel Backup & Restore

Compression is usually the bottleneck during backup operations but, even with now ubiquitous multi-core servers, most database backup solutions are still single-process. pgBackRest solves the compression bottleneck with parallel processing.

Utilizing multiple cores for compression makes it possible to achieve 1TB/hr raw throughput even on a 1Gb/s link. More cores and a larger pipe lead to even higher throughput.
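
As a rough check on the 1TB/hr figure, the arithmetic below works out what compression ratio a 1Gb/s link would need to carry that much raw data. It assumes decimal units and ignores protocol overhead, so treat it as an estimate rather than a benchmark.

```python
# Back-of-the-envelope check: what compression ratio lets a 1 Gb/s link
# carry 1 TB/hr of raw (uncompressed) data?
# Assumptions: decimal units (1 Gb = 1e9 bits, 1 TB = 1e12 bytes) and
# no protocol overhead.
LINK_BITS_PER_SEC = 1e9       # 1 Gb/s link
RAW_BYTES_PER_HOUR = 1e12     # 1 TB/hr raw throughput target

# Bytes the link can physically move in one hour.
link_bytes_per_hour = LINK_BITS_PER_SEC / 8 * 3600

# Ratio of raw bytes to transmitted bytes needed to hit the target.
required_ratio = RAW_BYTES_PER_HOUR / link_bytes_per_hour

print(f"link capacity: {link_bytes_per_hour / 1e9:.0f} GB/hr")
print(f"required compression ratio: {required_ratio:.2f}:1")
```

Under these assumptions the link moves about 450 GB/hr, so the data would need to compress by a little over 2:1 before transmission.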

Local or Remote Operation

A custom protocol allows pgBackRest to back up, restore, and archive locally or remotely via TLS/SSH with minimal configuration. An interface to query PostgreSQL is also provided via the protocol layer so that remote access to PostgreSQL is never required, which enhances security.

Multiple Repositories

Multiple repositories allow, for example, a local repository with minimal retention for fast restores and a remote repository with a longer retention for redundancy and access across the enterprise.

Full, Incremental, & Differential Backups

Full, differential, and incremental backups are supported. pgBackRest is not susceptible to the time resolution issues of rsync, making differential and incremental backups completely safe.

Backup Rotation & Archive Expiration

Retention policies can be set for full and differential backups to create coverage for any timeframe. WAL archive can be maintained for all backups or strictly for the most recent backups. In the latter case, WAL required to make older backups consistent will be maintained in the archive.
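
The effect of a full-backup retention setting can be illustrated with a small sketch. This is a simplification, not pgBackRest's expire logic: the function name `expire_by_full_retention` is hypothetical, WAL expiry is not modeled, and it assumes labels where a full backup ends in `F` and a dependent backup is named `<full label>_<own label>`. All but the last sample label are made up for illustration.

```python
def expire_by_full_retention(labels: list[str], retention_full: int) -> list[str]:
    """Return the labels that fall outside the newest `retention_full` full backups.

    A dependent (differential/incremental) backup expires together with the
    full backup whose label prefixes its own.
    """
    # Full backup labels sort chronologically because they start with a timestamp.
    fulls = sorted(label for label in labels if label.endswith("F"))
    kept_fulls = set(fulls[-retention_full:]) if retention_full > 0 else set()

    expired = []
    for label in sorted(labels):
        base_full = label.split("_", 1)[0]  # the full backup this label depends on
        if base_full not in kept_fulls:
            expired.append(label)
    return expired


labels = [
    "20200126-110003F",                    # hypothetical
    "20200126-110003F_20200130-214800D",   # hypothetical
    "20200202-110003F",                    # hypothetical
    "20200209-110003F",
    "20200209-110003F_20200214-214800D",
]
expired = expire_by_full_retention(labels, retention_full=2)
print(expired)
```

With a retention of two full backups, only the oldest full backup and its differential are selected for expiry.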

Backup Integrity

Checksums are calculated for every file in the backup and rechecked during a restore. After a backup finishes copying files, it waits until every WAL segment required to make the backup consistent reaches the repository.

Backups in the repository are stored in the same format as a standard PostgreSQL cluster (including tablespaces). If compression is disabled and hard links are enabled it is possible to snapshot a backup in the repository and bring up a PostgreSQL cluster directly on the snapshot. This is advantageous for terabyte-scale databases that are time consuming to restore in the traditional way.

All operations utilize file and directory level fsync to ensure durability.

Page Checksums

PostgreSQL has supported page-level checksums since 9.3. If page checksums are enabled pgBackRest will validate the checksums for every file that is copied during a backup. All page checksums are validated during a full backup and checksums in files that have changed are validated during differential and incremental backups.

Validation failures do not stop the backup process, but warnings with details of exactly which pages have failed validation are output to the console and file log.

This feature allows page-level corruption to be detected early, before backups that contain valid copies of the data have expired.

Backup Resume

An aborted backup can be resumed from the point where it was stopped. Files that were already copied are compared with the checksums in the manifest to ensure integrity. Since this operation can take place entirely on the backup server, it reduces load on the database server and saves time since checksum calculation is faster than compressing and retransmitting data.
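
The resume check described above can be sketched as follows. This is only an illustration: the dict-shaped `manifest`, the helper names, and the use of SHA-1 over uncompressed files are assumptions, and the real manifest format and compression handling are not modeled.

```python
import hashlib
from pathlib import Path


def sha1_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Stream a file through SHA-1 so large files need not fit in memory."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_recopy(repo_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return manifest entries that are missing or whose checksum differs.

    `manifest` maps a relative file name to its expected hex checksum
    (a hypothetical stand-in for the real backup manifest).
    """
    stale = []
    for name, expected in manifest.items():
        candidate = repo_dir / name
        if not candidate.is_file() or sha1_of(candidate) != expected:
            stale.append(name)
    return sorted(stale)
```

Files that pass the check can be skipped on resume; only the returned names would need to be copied again.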

Streaming Compression & Checksums

Compression and checksum calculations are performed in stream while files are being copied to the repository, whether the repository is located locally or remotely.

If the repository is on a backup server, compression is performed on the database server and files are transmitted in a compressed format and simply stored on the backup server. When compression is disabled, a lower level of compression is still applied in transit to make efficient use of available bandwidth while keeping CPU cost to a minimum.

Delta Restore

The manifest contains checksums for every file in the backup so that during a restore it is possible to use these checksums to speed processing enormously. On a delta restore any files not present in the backup are first removed and then checksums are taken for the remaining files. Files that match the backup are left in place and the rest of the files are restored as usual. Parallel processing can lead to a dramatic reduction in restore times.
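
The delta restore decision can be sketched as a small planning function. The function name and the two dicts (file name to checksum) are hypothetical stand-ins for the manifest and for checksums taken from the existing data directory; the sketch shows only the decision logic, not file I/O or parallelism.

```python
def plan_delta_restore(
    manifest: dict[str, str], on_disk: dict[str, str]
) -> tuple[list[str], list[str], list[str]]:
    """Split files into (remove, keep, restore) lists for a delta restore.

    manifest: file name -> checksum recorded in the backup
    on_disk:  file name -> checksum of the file currently on disk
    """
    # Files that are not part of the backup are removed first.
    remove = sorted(name for name in on_disk if name not in manifest)
    # Files whose checksum matches the backup are left in place.
    keep = {name for name, checksum in manifest.items() if on_disk.get(name) == checksum}
    # Everything else (missing or changed) is restored from the backup.
    restore = sorted(name for name in manifest if name not in keep)
    return remove, sorted(keep), restore


manifest = {"PG_VERSION": "aa", "base/1/1259": "bb", "base/1/2619": "cc"}
on_disk = {"PG_VERSION": "aa", "base/1/1259": "xx", "base/1/9999": "dd"}
remove, keep, restore = plan_delta_restore(manifest, on_disk)
print(remove, keep, restore)
```

In this example one stray file is removed, one unchanged file is kept, and the changed and missing files are restored.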

Parallel, Asynchronous WAL Push & Get

Dedicated commands are included for pushing WAL to the archive and getting WAL from the archive. Both commands support parallelism to accelerate processing and run asynchronously to provide the fastest possible response time to PostgreSQL.

WAL push automatically detects WAL segments that are pushed multiple times and de-duplicates when the segment is identical, otherwise an error is raised. Asynchronous WAL push allows transfer to be offloaded to another process which compresses WAL segments in parallel for maximum throughput. This can be a critical feature for databases with extremely high write volume.
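
The de-duplication rule can be sketched as follows. This is a minimal illustration under assumptions: the archive is modeled as an in-memory dict of segment name to checksum, and `push_wal` and `WalArchiveError` are hypothetical names, not part of pgBackRest.

```python
import hashlib


class WalArchiveError(Exception):
    """Raised when a pushed segment conflicts with one already archived."""


def push_wal(archive: dict[str, str], segment_name: str, data: bytes) -> str:
    """Store a WAL segment, skip an identical duplicate, or reject a conflict."""
    checksum = hashlib.sha1(data).hexdigest()
    existing = archive.get(segment_name)
    if existing is None:
        archive[segment_name] = checksum
        return "stored"
    if existing == checksum:
        # Same name and same content: pushing again is harmless.
        return "duplicate-skipped"
    # Same name but different content: refuse to overwrite.
    raise WalArchiveError(
        f"{segment_name} already exists in the archive with a different checksum"
    )


archive: dict[str, str] = {}
first = push_wal(archive, "000000010000000000000008", b"segment-bytes")
second = push_wal(archive, "000000010000000000000008", b"segment-bytes")
print(first, second)
```

A third push of the same segment name with different bytes would raise `WalArchiveError` instead of overwriting the archived copy.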

Asynchronous WAL get maintains a local queue of WAL segments that are decompressed and ready for replay. This reduces the time needed to provide WAL to PostgreSQL which maximizes replay speed. Higher-latency connections and storage (such as S3) benefit the most.

The push and get commands both ensure that the database and repository match by comparing PostgreSQL versions and system identifiers. This virtually eliminates the possibility of misconfiguring the WAL archive location.

Tablespace & Link Support

Tablespaces are fully supported and on restore tablespaces can be remapped to any location. It is also possible to remap all tablespaces to one location with a single command which is useful for development restores.

File and directory links are supported for any file or directory in the PostgreSQL cluster. When restoring it is possible to restore all links to their original locations, remap some or all links, or restore some or all links as normal files or directories within the cluster directory.

S3, Azure, and GCS Compatible Object Store Support

pgBackRest repositories can be located in S3, Azure, and GCS compatible object stores to allow for virtually unlimited capacity and retention.

Encryption

pgBackRest can encrypt the repository to secure backups wherever they are stored.

Compatibility with PostgreSQL >= 8.3

pgBackRest includes support for versions down to 8.3, since older versions of PostgreSQL are still regularly utilized.

Getting Started

pgBackRest strives to be easy to configure and operate:

Documentation for v1 can be found here. No further releases are planned for v1 because v2 is backward-compatible with v1 options and repositories.

Contributions

Contributions to pgBackRest are always welcome! Please see our Contributing Guidelines for details on how to contribute features, improvements or issues.

Support

pgBackRest is completely free and open source under the MIT license. You may use it for personal or commercial purposes without any restrictions whatsoever. Bug reports are taken very seriously and will be addressed as quickly as possible.

Creating a robust disaster recovery policy with proper replication and backup strategies can be a very complex and daunting task. You may find that you need help during the architecture phase and ongoing support to ensure that your enterprise continues running smoothly.

Crunchy Data provides packaged versions of pgBackRest for major operating systems and expert full life-cycle commercial support for pgBackRest and all things PostgreSQL. Crunchy Data is committed to providing open source solutions with no vendor lock-in, ensuring that cross-compatibility with the community version of pgBackRest is always strictly maintained.

Please visit Crunchy Data for more information.

Recognition

Primary recognition goes to Stephen Frost for all his valuable advice and criticism during the development of pgBackRest.

Crunchy Data has contributed significant time and resources to pgBackRest and continues to actively support development. Resonate also contributed to the development of pgBackRest and allowed early (but well tested) versions to be installed as their primary PostgreSQL backup solution.

Armchair graphic by Sandor Szabo.

Issues
  • Slow full-backup on deployment with millions of small files

    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: 2.28

    2. PostgreSQL version: 9.5

    3. Operating system/version: CentOS 7

    4. Did you install pgBackRest from source or from a package? Package

    5. Please attach the following as applicable:

    • pgbackrest.conf
    [utv-db10]
    pg1-path=/var/lib/pgsql/9.5/data
    
    [global]
    process-max=4
    repo1-path=/
    repo1-retention-full=2
    repo1-cipher-type=none
    repo1-s3-bucket=utv-db10
    repo1-s3-endpoint=svc-filestor.lab.int
    repo1-s3-host=utv-db10.svc-filestor.lab.int
    repo1-s3-verify-ssl=n
    repo1-s3-key=svc-filestor
    repo1-s3-key-secret=<redacted>
    repo1-s3-region=us-east-1
    repo1-type=s3
    compress-type=zst
    start-fast=y
    buffer-size=8388608
    
    [global:archive-push]
    compress-level=3
    
    6. Describe the issue:

    (This is a follow-up from issue #1118)

    I'm evaluating pgbackrest as a replacement for my company's current backup system, WAL-E. We use schema-level sharding for our tenants, where each of our database servers hosts 4000 customer schemas spread over 16 databases. This has worked great for scalability, but presents a challenge for many backup systems, since this layout results in a lot of files in the PostgreSQL data directory.

    Output of ls -1RL /var/lib/pgsql/9.5/data/ | wc -l: 6602805

    So over 6 million files in pgdata, with a total size of around 77 GB.

    We previously had problems with pgbackrest memory usage. Once that issue was quickly resolved, we ran into a second problem: full backups taking a very long time to complete. On the server mentioned above, performing a full backup took over 10 hours. The root cause was identified by the pgbackrest developers as pgbackrest processing and uploading each file in the pgdata dir individually, causing a large overhead when targeting S3 storage.

    It was suggested in that issue (#1118) to employ a similar technique as WAL-E, i.e. compress several small files into a single compressed tarball before uploading, to avoid too much overhead. It was agreed that the implementation of such a feature should be tracked in a separate issue, which would be the one created here. With WAL-E, a full backup of the same server can be done in about 70 minutes.
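
The bundling idea described above can be sketched as a greedy grouping of small files so that each upload request carries many files instead of one. This is an illustrative sketch only: `bundle_files` and the size limit are hypothetical, and it reflects neither WAL-E's nor pgBackRest's implementation.

```python
def bundle_files(sizes: dict[str, int], bundle_limit: int) -> list[list[str]]:
    """Greedily pack files (name -> size in bytes) into bundles.

    A bundle is closed as soon as adding the next file would exceed
    `bundle_limit`. A file larger than the limit ends up in a bundle
    of its own.
    """
    bundles: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for name in sorted(sizes):
        size = sizes[name]
        if current and current_size + size > bundle_limit:
            bundles.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        bundles.append(current)
    return bundles


# Hypothetical file sizes in bytes.
sizes = {"a": 40, "b": 40, "c": 40, "d": 200, "e": 10}
bundles = bundle_files(sizes, bundle_limit=100)
print(bundles)
```

Here five files become four uploads; with millions of small files and a bundle limit of several megabytes, the reduction in request count would be far larger.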

    I can help with testing any potential fixes for this problem on our lab environment. Please let me know if I can provide any more information or assist in other ways.

    enhancement 
    opened by MannerMan 57
  • intermittent 15 minute slowdown in S3 uploads after system update

    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: 2.23

    2. PostgreSQL version: 11.6

    3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each: Ubuntu 18.04.4 LTS (GNU/Linux 4.15.0-1058-aws x86_64) running on AWS i3.8xlarge (32 CPU 256GB RAM)

    4. Did you install pgBackRest from source or from a package? package

    5. Please attach the following as applicable:

      • pgbackrest.conf file(s)
    [global]
    process-max=12
    start-fast=y
    archive-async=y
    spool-path=/var/spool/pgbackrest
    repo1-type=s3
    repo1-path=/pgbackrest-repo
    repo1-retention-full=8
    repo1-retention-diff=20
    repo1-s3-endpoint=s3.amazonaws.com
    repo1-s3-region=us-west-2
    repo1-s3-bucket=pd-pg-warehouse
    repo1-s3-key=A****************
    repo1-s3-key-secret=O**********************

    [pg-warehouse]
    db-path=/data/pg-warehouse/

      • postgresql.conf settings applicable to pgBackRest (archive_command, archive_mode, listen_addresses, max_wal_senders, wal_level, port) (no problems here)

      • errors in the postgresql log file before or during the time you experienced the issue (no errors here)

      • log file in /var/log/pgbackrest for the commands run (e.g. /var/log/pgbackrest/mystanza_backup.log) see below for a portion of log with relevant information

    6. Describe the issue: probably this is similar to previously reported issue: https://github.com/pgbackrest/pgbackrest/issues/788

    After installing updates on 2020-02-05 we are seeing intermittent slowdowns in S3 transfer. The backup transfer runs for a few minutes and then sits for approximately 15 minutes, after which I see the following messages:

    common/io/http/client::httpClientRequest: retry KernelError: tls failed syscall: [110] Connection timed out
    storage/s3/storage::storageS3Request: retry RequestTimeTooSkewed: The difference between the request time and the current time is too large.
    

    Here are the log records around such slowdown.

    2020-02-14 21:52:26.464 P00  DEBUG:     storage/storage::storageNewWrite: (this: {type: s3, path: {"/pgbackrest-repo"}, write: true}, fileExp: {"<REPO:BACKUP>/20200209-110003F_20200214-214800D/backup.manifest.copy"}, param.modeFile: 0000, param.modePath: 0000, param.user: null, param.group: null, param.timeModified: 0, param.noCreatePath: false, param.noSyncFile: false, param.noSyncPath: false, param.noAtomic: false, param.compressible: false)
    2020-02-14 21:52:26.464 P00  DEBUG:     storage/s3/storage::storageS3NewWrite: (this: {StorageS3}, file: {"/pgbackrest-repo/backup/pg-warehouse/20200209-110003F_20200214-214800D/backup.manifest.copy"})
    2020-02-14 21:52:26.464 P00  DEBUG:     storage/s3/storage::storageS3NewWrite: => {type: s3, name: {"/pgbackrest-repo/backup/pg-warehouse/20200209-110003F_20200214-214800D/backup.manifest.copy"}, modeFile: 0000, modePath: 0000, createPath: true, syncFile: true, syncPath: true, atomic: true}
    2020-02-14 21:52:26.464 P00  DEBUG:     storage/storage::storageNewWrite: => {type: s3, name: {"/pgbackrest-repo/backup/pg-warehouse/20200209-110003F_20200214-214800D/backup.manifest.copy"}, modeFile: 0000, modePath: 0000, createPath: true, syncFile: true, syncPath: true, atomic: true}
    2020-02-14 21:52:26.464 P00  DEBUG:     info/manifest::manifestSave: (this: {Manifest}, write: {IoWrite})
    2020-02-14 21:52:26.473 P00  DEBUG:     info/info::infoSave: (this: {Info}, write: {IoWrite}, callbackFunction: (function *), callbackData: *void)
    2020-02-14 22:08:07.016 P00  DEBUG:     common/io/http/client::httpClientRequest: retry KernelError: tls failed syscall: [110] Connection timed out
    2020-02-14 22:08:07.069 P00  DEBUG:     storage/s3/storage::storageS3Request: retry RequestTimeTooSkewed: The difference between the request time and the current time is too large.
    2020-02-14 22:08:07.297 P00  DEBUG:     info/info::infoSave: => void
    2020-02-14 22:08:07.297 P00  DEBUG:     info/manifest::manifestSave: => void
    2020-02-14 22:08:07.297 P00  DEBUG:     command/backup/backup::backupManifestSaveCopy: => void
    2020-02-14 22:08:07.298 P00  DEBUG:     command/backup/backup::backupJobResult: (manifest: {Manifest}, host: null, fileName: {"/data/pg-warehouse/base/16401/677988007.4"}, fileRemove: {[]}, job: {state: done, key: {"pg_data/base/16401/677988007.4"}, command: {command: backupFile}, code: 0, message: null, result: {VariantList}}, sizeTotal: 1604360011819, sizeCopied: 64736969131, pageSize: 8192)
    

    I can attach the whole log file, but the slowdown cases all look essentially the same.
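
A stall like the one in the log excerpt above can be located by scanning for large jumps between consecutive timestamps. The sketch below assumes every log line starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp; `find_gaps` and the five-minute threshold are hypothetical, and the two sample lines are abbreviated from the excerpt.

```python
from datetime import datetime, timedelta


def find_gaps(lines, threshold=timedelta(minutes=5)):
    """Return (previous, current, gap) for consecutive log lines further apart than threshold."""
    gaps = []
    previous = None
    for line in lines:
        try:
            # The first 23 characters hold the timestamp, e.g. 2020-02-14 21:52:26.473
            stamp = datetime.strptime(line.strip()[:23], "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            continue  # skip continuation lines that carry no timestamp
        if previous is not None and stamp - previous > threshold:
            gaps.append((previous, stamp, stamp - previous))
        previous = stamp
    return gaps


sample = [
    "2020-02-14 21:52:26.473 P00  DEBUG:     info/info::infoSave: (...)",
    "2020-02-14 22:08:07.016 P00  DEBUG:     common/io/http/client::httpClientRequest: retry KernelError",
]
gaps = find_gaps(sample)
print(gaps[0][2])
```

For the two sample lines the gap is 15 minutes 40.543 seconds, which is just over the 15 minute skew limit that S3 reports in the retry message.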

    I have read through the related issues: https://github.com/pgbackrest/pgbackrest/issues/707 and https://github.com/pgbackrest/pgbackrest/issues/788. There seem to be issues related to kernel updates. I am attaching the update log after which the slowdown started to happen (everything was running smoothly for months before the update): ubuntu_updates_20200204.txt (https://github.com/pgbackrest/pgbackrest/files/4221931/ubuntu_updates_20200204.txt)

    I have opened a ticket with AWS support, so I will update here once I have more information. But it would be great to know if anyone else is experiencing similar issues or if there are any solutions.

    module [core] enhancement 
    opened by slava-pagerduty 54
  • archive command failed with exit code 45 after failover

    1. pgBackRest version: 2.04

    2. PostgreSQL version: 10.4

    3. Operating system/version - Debian Stretch

    4. Did you install pgBackRest from source or from a package? Source

    5. Please attach the following as applicable:

    pgbackrest.conf

    [global]
    log-level-console=warn
    log-level-file=info
    log-level-stderr=warn
    repo-s3-bucket=<redacted>
    repo-s3-endpoint=<redacted>
    repo-s3-key=<redacted>
    repo-s3-key-secret=<redacted>
    repo-s3-region=<redacted>
    repo-type=s3
    retention-full=2000
    retention-diff=2000
    start-fast=y
    backup-standby=n
    repo-path=/my_repo_path
    process-max=4
    recovery-option=standby_mode=on
    
    [mydb]
    db-path=/data/pg/10
    db-port=5432
    

    archive command: /usr/bin/pgbackrest --config /conf/pgbackrest.conf --stanza=mydb archive-push %p >> /dev/null 2>&1

    Relevant PG log segment:

    2018-08-27 11:44:41,525 INFO: no action.  i am a secondary and i am following a leader
    <master killed hard>
    2018-08-27 11:44:53,599 INFO: promoted self to leader by acquiring session lock
    server promoting
    2018-08-27 11:44:53,604 INFO: cleared rewind state after becoming the leader
    2018-08-27 11:45:01,471 INFO: Lock owner: member_x; I am member_x
    2018-08-27 11:45:01,492 INFO: updated leader lock during promote
    2018-08-27 11:45:07 UTC [70]: [8-1] user=,db=,client= LOG:  received promote request
    2018-08-27 11:45:07 UTC [70]: [9-1] user=,db=,client= LOG:  redo done at 0/9F00E388
    2018-08-27 11:45:07 UTC [70]: [10-1] user=,db=,client= LOG:  last completed transaction was at log time 2018-08-27 03:35:57.108013+00
    2018-08-27 11:45:08 UTC [70]: [11-1] user=,db=,client= LOG:  selected new timeline ID: 7
    2018-08-27 11:45:10 UTC [70]: [12-1] user=,db=,client= LOG:  archive recovery complete
    2018-08-27 11:45:10 UTC [95]: [5-1] user=,db=,client= LOG:  checkpoint starting: force
    2018-08-27 11:45:10 UTC [68]: [4-1] user=,db=,client= LOG:  database system is ready to accept connections
    2018-08-27 11:45:10 UTC [95]: [6-1] user=,db=,client= LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.000 s, sync=0.000 s, total=0.009 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16327 kB, estimate=90157 kB
    2018-08-27 11:45:10 UTC [6021]: [1-1] user=,db=,client= LOG:  archive command failed with exit code 45
    2018-08-27 11:45:10 UTC [6021]: [2-1] user=,db=,client= DETAIL:  The failed archive command was: /usr/bin/pgbackrest --config /conf/pgbackrest.conf --stanza=mydb archive-push %p >> /dev/null 2>&1
    
    6. Describe the issue:

    This is a master/subscriber setup controlled by Patroni running in Kubernetes.

    The old master was killed hard, failover happened as expected, and when the new master comes back up it comes back with the archive-duplicate: 45 error.

    Is this expected to work? I can see the potential for a race condition in here. I'm wondering if this is a pgBackRest issue or something I need to catch and handle myself.

    question 
    opened by bradnicholson 49
  • No error on stderr

    So, I'm trying to set up pgBackRest 1.11 with an instance of PostgreSQL 9.3.

    Both of them are running inside Docker containers, one with the PostgreSQL server and the other with pgBackRest. I'm trying to connect them and getting a very strange result.

    I've already set up SSH on both containers, with key verification, and I can connect manually between them.

    When I execute pgbackrest --stanza=main --log-level-console=detail backup I get this error:

    -------------------PROCESS START-------------------
    2016-12-07 10:47:08.297 P00   WARN: option retention-full is not set, the repository may run out of space
                                        HINT: to retain full backups indefinitely (without warning), set option 'retention-full' to the maximum.
    2016-12-07 10:47:08.297 P00   INFO: backup start 1.11: --db-host=banco --db-path=/var/lib/pgsql/9.3/data --db-user=postgres --repo-path=/var/lib/pgbackrest --stanza=main
    2016-12-07 10:47:08.308 P00   WARN: no prior backup exists, incr backup has been changed to full
    2016-12-07 10:47:09.417 P00   INFO: execute exclusive pg_start_backup() with label "pgBackRest backup started at 2016-12-07 10:47:08": backup begins after the next regular checkpoint completes
    2016-12-07 10:47:09.622 P00   INFO: backup start archive = 000000010000000000000008, lsn = 0/8000028
    2016-12-07 10:47:09.829 P00  ERROR: [103]: remote process terminated on banco host: no error on stderr
    2016-12-07 10:47:09.830 P00   INFO: backup stop
    
    -------------------PROCESS START-------------------
    2016-12-07 10:50:46.576 P00   WARN: option retention-full is not set, the repository may run out of space
                                        HINT: to retain full backups indefinitely (without warning), set option 'retention-full' to the maximum.
    2016-12-07 10:50:46.576 P00   INFO: backup start 1.11: --db-host=banco --db-path=/var/lib/pgsql/9.3/data --db-user=postgres --log-level-stderr=info --repo-path=/var/lib/pgbackrest --stanza=main
    2016-12-07 10:50:46.587 P00   WARN: no prior backup exists, incr backup has been changed to full
    2016-12-07 10:50:47.689 P00   INFO: execute exclusive pg_start_backup() with label "pgBackRest backup started at 2016-12-07 10:50:46": backup begins after the next regular checkpoint completes
    2016-12-07 10:50:47.797 P00  ERROR: [132]: raised on banco host: ERROR:  a backup is already in progress
                                               HINT:  Run pg_stop_backup() and try again.:
                                               select to_char(current_timestamp, 'YYYY-MM-DD HH24:MI:SS.US TZ'), pg_xlogfile_name(lsn), lsn::text from pg_start_backup('pgBackRest backup started at 2016-12-07 10:50:46', false) as lsn
    2016-12-07 10:50:47.800 P00   INFO: backup stop
    

    The second time I try to execute the backup, it says that "a backup is already in progress" which isn't true.

    The /etc/pgbackrest.conf found inside the PostgreSQL container:

    [main]
    db-path=/var/lib/pgsql/9.3/data
    
    [global]
    backup-host=backup
    backup-user=backrest
    repo-path=/var/lib/pgbackrest
    

    And the /etc/pgbackrest.conf found inside the pgBackRest container:

    [main]
    db1-host=banco
    db1-path=/var/lib/pgsql/9.3/data
    db1-user=postgres
    
    [global]
    repo-path=/var/lib/pgbackrest
    

    I've been able to execute both of them successfully inside one container, but I'd like to split them.

    question bug module [core] 
    opened by L30Bola 42
  • DB backup failure - S3 error 403: The difference between the request time and the current time is too large.

    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: 2.14

    2. PostgreSQL version: PostgreSQL 11.3 (Ubuntu 11.3-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0, 64-bit

    3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each: Ubuntu 18.04.2 LTS

    4. Did you install pgBackRest from source or from a package? PGDG packages

    5. Please attach the following as applicable:

      • pgbackrest.conf file(s)
    [global:archive-get]
    process-max=4
    
    [global:archive-push]
    process-max=4
    
    [global]
    process-max=4
    compress-level=3
    start-fast=y
    stop-auto=y
    buffer-size=16MB
    repo1-cipher-type=none
    repo1-path=/pgbackrest
    repo1-retention-diff=4
    repo1-retention-full=3
    spool-path=/var/spool/pgbackrest
    archive-async=y
    
    cat /etc/pgbackrest/conf.d/s3.conf
    [global]
    repo1-type=s3
    repo1-s3-region=us-east-1
    repo1-s3-endpoint=s3.us-east-1.amazonaws.com
    repo1-s3-key=XXX
    repo1-s3-key-secret=XXX
    repo1-s3-bucket=XXX
    
      • postgresql.conf settings applicable to pgBackRest (archive_command, archive_mode, listen_addresses, max_wal_senders, wal_level, port)

    issue not yet occurred on archive-push

      • errors in the postgresql log file before or during the time you experienced the issue

    no errors in log

      • log file in /var/log/pgbackrest for the commands run (e.g. /var/log/pgbackrest/mystanza_backup.log)

    6. Describe the issue: It looks like #707, except we are not on RHEL. The error is random; it happens most often in the US Virginia AWS data center, while in the Canada, Germany and Great Britain AWS data centers it is less frequent.

    The backup starts as usual and runs for a varying period of time before the error is reported. The issue was seen on incremental, differential and also on full backups.

    Error in log file:

    *** request header ***
    PUT /pgbackrest/backup/anl_master_prod/20190720-010507F_20190723-010501D/pg_data/base/16385/17228.gz?partNumber=5&uploadId=CK7Qtr_tJBmRHt8XY.49314uTwHBwjTj21C3EtiTV2q_RfnoYD3BTIVQ891KrAltLCDwN22cYTpfDXxfMgXqEdDuNOY4OG.qJMuWGU9G_s0Y5A5J1pgbLKt55fAr67D0 HTTP/1.1
    authorization: <redacted>
    content-length: 20442031
    content-md5: q+bIuRyXej+8q7LpAqn2uA==
    host: finmason-db-backups.s3.us-east-1.amazonaws.com
    x-amz-content-sha256: 743c1be5b3a5de70188bfee45c7fa9248f3ffbbd1ae9f94025434f2b32b7b85d
    x-amz-date: 20190723T083128Z
    *** canonical request ***
    PUT
    /pgbackrest/backup/anl_master_prod/20190720-010507F_20190723-010501D/pg_data/base/16385/17228.gz
    partNumber=5&uploadId=CK7Qtr_tJBmRHt8XY.49314uTwHBwjTj21C3EtiTV2q_RfnoYD3BTIVQ891KrAltLCDwN22cYTpfDXxfMgXqEdDuNOY4OG.qJMuWGU9G_s0Y5A5J1pgbLKt55fAr67D0
    content-length:20442031
    content-md5:q+bIuRyXej+8q7LpAqn2uA==
    host:finmason-db-backups.s3.us-east-1.amazonaws.com
    x-amz-content-sha256:743c1be5b3a5de70188bfee45c7fa9248f3ffbbd1ae9f94025434f2b32b7b85d
    x-amz-date:20190723T083128Z
    
    content-length;content-md5;host;x-amz-content-sha256;x-amz-date
    743c1be5b3a5de70188bfee45c7fa9248f3ffbbd1ae9f94025434f2b32b7b85d
    *** signed headers ***
    content-length;content-md5;host;x-amz-content-sha256;x-amz-date
    *** string to sign ***
    AWS4-HMAC-SHA256
    20190723T083128Z
    20190723/us-east-1/s3/aws4_request
    5da1e68473dc78c9dfcccb15f2b6676782c28b07b78658043a6b804e9109cfbf
    *** response header ***
    x-amz-request-id: 258D42A42A59FF5C
    x-amz-id-2: 19ZzjBrmxxR1i7nc8qtw/gU/EyLZyK3VGB/aAiUmR4FJrBPNBIzJUj5mqIoE3Diu0+NAWPZ4D48=
    Content-Type: application/xml
    Transfer-Encoding: chunked
    Date: Tue, 23 Jul 2019 08:47:17 GMT
    Connection: close
    Server: AmazonS3
    *** response body ***
    <?xml version="1.0" encoding="UTF-8"?>
    <Error><Code>RequestTimeTooSkewed</Code><Message>The difference between the request time and the current time is too large.</Message><RequestTime>20190723T083128Z</RequestTime><ServerTime>2019-07-23T08:47:18Z</ServerTime><MaxAllowedSkewMilliseconds>900000</MaxAllowedSkewMilliseconds><RequestId>258D42A42A59FF5C</RequestId><HostId>19ZzjBrmxxR1i7nc8qtw/gU/EyLZyK3VGB/aAiUmR4FJrBPNBIzJUj5mqIoE3Diu0+NAWPZ4D48=</HostId></Error>
    2019-07-23 08:58:50.280 P00   INFO: backup command end: aborted with exception [039]
    

    So the message I can read is: The difference between the request time and the current time is too large. And the limit is MaxAllowedSkewMilliseconds: 900000.
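
The skew can be checked directly from the values in the response above. The short calculation below compares the request's x-amz-date with the ServerTime in the error body against MaxAllowedSkewMilliseconds; the variable names are illustrative only.

```python
from datetime import datetime, timezone

# Values copied from the request header and the S3 error body above.
request_time = datetime.strptime("20190723T083128Z", "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)
server_time = datetime.strptime("2019-07-23T08:47:18Z", "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
max_skew_ms = 900000  # MaxAllowedSkewMilliseconds

# Difference between when the request was signed and when S3 evaluated it.
skew_ms = int((server_time - request_time).total_seconds() * 1000)

print(f"skew: {skew_ms} ms, limit: {max_skew_ms} ms, exceeded: {skew_ms > max_skew_ms}")
```

The request was signed 950 seconds before the server evaluated it, 50 seconds past the limit. That is consistent with the request stalling after it was signed rather than with a wrong system clock.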

    NTP is configured and working on the machine.

    The trace log shows there is a gap in trace messages for P00:

    2019-07-23 12:49:58.211 P00  TRACE:         Storage::Posix::Driver->info(): bIgnoreMissing = true, strFile = /pgsql/anl_master_prod/pgcluster/11/pgdata/base/16385/2607_fsm
    2019-07-23 12:49:58.211 P00  TRACE:         Storage::Posix::Driver->info=>: oInfo = [object]
    2019-07-23 12:49:58.211 P00  TRACE:         Storage::Posix::Driver->manifestStat=>: hFile = [hash]
    2019-07-23 12:49:58.211 P00  TRACE:         Storage::Posix::Driver->manifestStat(): strFile = /pgsql/anl_master_prod/pgcluster/11/pgdata/base/16385/2836_vm
    2019-07-23 12:49:58.211 P00  TRACE:         Storage::Posix::Driver->info(): bIgnoreMissing = true, strFile = /pgsql/anl_master_prod/pgcluster/11/pgdata/base/16385/2836_vm
    2019-07-23 12:49:58.211 P00  TRACE:         Storage::Posix::Driver->info=>: oInfo = [object]
    2019-07-23 13:35:39.649 P00  TRACE:         Common::Io::Base->new(): strId = local-2 process
    2019-07-23 13:35:39.649 P00  TRACE:         Common::Io::Base->new=>: self = [object]
    2019-07-23 13:35:39.649 P00  TRACE:         Common::Io::Handle->new=>: self = [object]
    2019-07-23 13:35:39.649 P00  TRACE:         Common::Io::Buffered->new(): iTimeout = 5, lBufferMax = 16777216, oParent = [object]
    2019-07-23 13:35:39.649 P00  TRACE:         Common::Io::Filter->new(): oParent = [object]
    2019-07-23 13:35:39.649 P00  TRACE:         Common::Io::Filter->new=>: self = [object]
    2019-07-23 13:35:39.649 P00  TRACE:         Common::Io::Buffered->new=>: self = [object]
    2019-07-23 13:35:39.649 P00  DEBUG:     Protocol::Command::Master->close=>: iExitStatus = 0
    2019-07-23 13:35:39.649 P00  TRACE:         Protocol::Command::Master->close(): bComplete = <false>
    2019-07-23 13:35:39.649 P00  TRACE:         sending exit command to process
    2019-07-23 13:35:39.649 P00  TRACE:         Protocol::Base::Master->cmdWrite(): hParam = [undef], strCommand = exit
    2019-07-23 13:35:39.649 P00  TRACE:         Protocol::Base::Master->cmdWrite: strProtocolCommand = {"cmd":"exit","param":null}
    2019-07-23 13:35:39.649 P00  TRACE:         Protocol::Base::Master->cmdWrite=>
    2019-07-23 13:36:28.664 P00  TRACE:         Common::Io::Handle->new(): fhRead = [object], fhWrite = [undef], strId = local-1 process
    2019-07-23 13:36:28.665 P00  TRACE:         Common::Io::Base->new(): strId = local-1 process
    2019-07-23 13:36:28.665 P00  TRACE:         Common::Io::Base->new=>: self = [object]
    2019-07-23 13:36:28.665 P00  TRACE:         Common::Io::Handle->new=>: self = [object]
    2019-07-23 13:36:28.665 P00  TRACE:         Common::Io::Buffered->new(): iTimeout = 5, lBufferMax = 16777216, oParent = [object]
    2019-07-23 13:36:28.665 P00  TRACE:         Common::Io::Filter->new(): oParent = [object]
    2019-07-23 13:36:28.665 P00  TRACE:         Common::Io::Filter->new=>: self = [object]
    2019-07-23 13:36:28.665 P00  TRACE:         Common::Io::Buffered->new=>: self = [object]
    2019-07-23 13:36:28.665 P00  DEBUG:     Protocol::Command::Master->close=>: iExitStatus = 0
    2019-07-23 13:36:28.747 P00  TRACE:         Db->DESTROY()
    2019-07-23 13:36:28.748 P00  TRACE:         Db->DESTROY=>
    2019-07-23 13:36:28.748 P00  DEBUG:     common/exit::exitSafe: (result: 0, error: true, signalType: 0)
    2019-07-23 13:36:28.748 P00  TRACE:         protocol/helper::protocolFree: (void)
    2019-07-23 13:36:28.748 P00  TRACE:         protocol/helper::protocolFree: => void
    2019-07-23 13:36:28.748 P00  DEBUG:     Main::mainCleanup(): iExitCode = 39
    2019-07-23 13:36:28.748 P00  DEBUG:     Protocol::Helper::protocolDestroy(): bComplete = false, iRemoteIdx = [undef], strRemoteType = [undef]
    2019-07-23 13:36:28.748 P00  TRACE:         Protocol::Helper::protocolList(): iRemoteIdx = [undef], strRemoteType = [undef]
    2019-07-23 13:36:28.748 P00  TRACE:         Protocol::Helper::protocolList=>: oyProtocol = ()
    2019-07-23 13:36:28.748 P00  DEBUG:     Protocol::Helper::protocolDestroy=>: iExitStatus = 0
    2019-07-23 13:36:28.749 P00  TRACE:         Main::mainCleanup=>
    2019-07-23 13:36:28.749 P00  TRACE:         command/command::cmdEnd: (code: 39, errorMessage: {"aborted with exception [039]"})
    2019-07-23 13:36:28.749 P00   INFO: backup command end: aborted with exception [039]
    2019-07-23 13:36:28.749 P00  TRACE:         command/command::cmdEnd: => void
    2019-07-23 13:36:28.749 P00  DEBUG:     common/lock::lockRelease: (failOnNoLock: false)
    2019-07-23 13:36:28.749 P00  TRACE:         common/lock::lockReleaseFile: (lockHandle: 4, lockFile: {"/tmp/pgbackrest/anl_master_prod-backup.lock"})
    2019-07-23 13:36:28.749 P00  TRACE:         storage/posix/storage::storagePosixNew: (path: {"/"}, modeFile: 0640, modePath: 0750, write: true, pathExpressionFunction: null)
    2019-07-23 13:36:28.749 P00  TRACE:         storage/posix/storage::storagePosixNewInternal: (type: {"posix"}, path: {"/"}, modeFile: 0640, modePath: 0750, write: true, pathExpressionFunction: null, syncPath: true)
    2019-07-23 13:36:28.749 P00  TRACE:         storage/storage::storageNew: (type: {"posix"}, path: {"/"}, modeFile: 0640, modePath: 0750, write: true, pathExpressionFunction: null, driver: *void, interface: {StorageInterface})
    2019-07-23 13:36:28.749 P00  TRACE:         storage/storage::storageNew: => {type: posix, path: {"/"}, write: true}
    2019-07-23 13:36:28.749 P00  TRACE:         storage/posix/storage::storagePosixNewInternal: => {type: posix, path: {"/"}, write: true}
    2019-07-23 13:36:28.749 P00  TRACE:         storage/posix/storage::storagePosixNew: => {type: posix, path: {"/"}, write: true}
    2019-07-23 13:36:28.749 P00  TRACE:         storage/storage::storageRemove: (this: {type: posix, path: {"/"}, write: true}, fileExp: {"/tmp/pgbackrest/anl_master_prod-backup.lock"}, param.errorOnMissing: false)
    2019-07-23 13:36:28.749 P00  TRACE:         storage/posix/storage::storagePosixRemove: (this: {StoragePosix *}, file: {"/tmp/pgbackrest/anl_master_prod-backup.lock"}, errorOnMissing: false)
    2019-07-23 13:36:28.749 P00  TRACE:         storage/posix/storage::storagePosixRemove: => void
    2019-07-23 13:36:28.749 P00  TRACE:         storage/storage::storageRemove: => void
    2019-07-23 13:36:28.749 P00  TRACE:         common/lock::lockReleaseFile: => void
    2019-07-23 13:36:28.749 P00  DEBUG:     common/lock::lockRelease: => true
    2019-07-23 13:36:28.749 P00  DEBUG:     common/exit::exitSafe: => 39
    2019-07-23 13:36:28.749 P00  DEBUG:     main::main: => 39
    

    This particular backup started at 2019-07-23 12:49:54. We saw ~500 Mbps of TX traffic on the network until 13:19; after that, TX fell step by step to nearly zero by 13:25. While the backup was running, RX traffic on the network was negligible.

    2019-07-23 12:49:58.211 P00  TRACE:         Storage::Posix::Driver->manifestStat=>: hFile = [hash]
    2019-07-23 12:49:58.211 P00  TRACE:         Storage::Posix::Driver->manifestStat(): strFile = /pgsql/anl_master_prod/pgcluster/11/pgdata/base/16385/2836_vm
    2019-07-23 12:49:58.211 P00  TRACE:         Storage::Posix::Driver->info(): bIgnoreMissing = true, strFile = /pgsql/anl_master_prod/pgcluster/11/pgdata/base/16385/2836_vm
    2019-07-23 12:49:58.211 P00  TRACE:         Storage::Posix::Driver->info=>: oInfo = [object]
    2019-07-23 13:35:39.649 P00  TRACE:         Common::Io::Base->new(): strId = local-2 process
    2019-07-23 13:35:39.649 P00  TRACE:         Common::Io::Base->new=>: self = [object]
    2019-07-23 13:35:39.649 P00  TRACE:         Common::Io::Handle->new=>: self = [object]
    

    These are a few corresponding lines from the log. The trace file has been saved so that it is not lost to log rotation, in case more details are needed.

    Previously, we used pg_basebackup and aws s3 cp .... for backups, and the S3 copy worked like a charm. As the database grew (3TB), we migrated to pgBackRest to be able to back up directly to S3.

    Is there something we can do to prevent backup failures? Thanks for any advice.

    question 
    opened by aleszeleny 39
  • High memory usage on deployment with millions of files in PGDATA

    High memory usage on deployment with millions of files in PGDATA

    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: 2.28

    2. PostgreSQL version: 9.5

    3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each: CentOS 7.8

    4. Did you install pgBackRest from source or from a package? Tried both 2.25 from packages and 2.28 from source

    5. Please attach the following as applicable:

    • pgbackrest.conf file(s):
    [utv-db10]
    pg1-path=/var/lib/pgsql/9.5/data
    
    [global]
    process-max=1
    repo1-path=/
    repo1-retention-full=2
    repo1-cipher-type=none
    repo1-s3-bucket=utv-db10
    repo1-s3-endpoint=svc-filestor.lab.int
    repo1-s3-host=utv-db10.svc-filestor.lab.int
    repo1-s3-verify-ssl=n
    repo1-s3-key=svc-filestor
    repo1-s3-key-secret=<redacted>
    repo1-s3-region=us-east-1
    repo1-type=s3
    compress-type=zst
    start-fast=y
    buffer-size=8388608
    
    [global:archive-push]
    compress-level=3
    
    • postgresql.conf
    archive_mode = on
    archive_command = '/usr/bin/pgbackrest --stanza=utv-db10 archive-push %p'
    max_wal_senders = 5
    wal_level = hot_standby
    
    • log file in /var/log/pgbackrest for the commands run (e.g. /var/log/pgbackrest/mystanza_backup.log)
    2020-07-24 14:25:53.144 P00   INFO: execute exclusive pg_start_backup(): backup begins after the requested immediate checkpoint completes
    2020-07-24 14:25:53.846 P00   INFO: backup start archive = 000000010000009400000058, lsn = 94/58000028
    2020-07-24 14:41:01.572 P00  ERROR: [094]: unable to allocate 8388616 bytes
    2020-07-24 14:41:01.578 P00   INFO: http statistics: objects 2, sessions 16, requests 9, retries 14, closes 0
    2020-07-24 14:41:01.578 P00   INFO: backup command end: aborted with exception [094]
    
    6. Describe the issue:

    Hi,

    I'm evaluating pgbackrest as a replacement for my company's current backup system, WAL-E. We shard our tenants at the schema level: each of our database servers hosts 4000 customer schemas spread over 16 databases. This has worked great for scalability but presents a challenge for many backup systems, since this layout results in a lot of files in the postgresql data directory.

    Output of ls -1RL /var/lib/pgsql/9.5/data/ | wc -l; 6602805
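
    For reference, ls -1RL | wc -l counts every line that ls prints, including the per-directory header lines and blank separators that -R emits, so the figure above somewhat overstates the number of files. A minimal sketch of a stricter count follows. The function name count_files is hypothetical; it counts non-directory entries and follows symlinked directories, as -L does.

```python
import os


def count_files(root):
    """Return (file_count, total_bytes) for non-directory entries under root."""
    file_count = 0
    total_bytes = 0
    # followlinks=True mirrors `ls -L`, which follows symlinked directories
    # (for example tablespace links under pg_tblspc).
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # Files can vanish while a live cluster is being scanned.
                continue
            file_count += 1
            total_bytes += st.st_size
    return file_count, total_bytes


if __name__ == "__main__":
    import sys

    count, size = count_files(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(f"{count} files, {size / 1024**3:.1f} GiB")
```

    Run against the data directory, this gives a file count and total size that can be compared directly with the numbers quoted in this report.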

    So over 6 million files, with a total size of 77 GB. Around 5 years ago we deployed WAL-E since it was one of the few backup systems that could handle so many files without a problem. However, since WAL-E is no longer maintained, we're looking for alternatives. When testing pgbackrest, I run out of memory when performing a full backup. It appears as if some kind of memory leak occurs when the process of uploading files to S3 kicks in. See the graph:

    [Image: Screenshot from 2020-07-24 14-43-41]

    One can see pgbackrest slowly allocating around 1.5 GB of RAM, then suddenly spiking to 4.7 GB+ when it appears to begin uploading to S3. This results in pgbackrest hitting the memory commit limit and crashing.

    Server specs:

    • 12gb of ram
    • shared_buffers set to 3gb
    • 1800 hugepages (~ 3.6 gb to fit shared_buffers )
    • vm.overcommit_memory = 2 (overcommit not allowed)
    • 7.8 gb regular memory limit
    • CentOS 7
    • postgres 9.5
    • pgbackrest 2.28
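
    With overcommit disabled (mode 2), the Linux kernel caps committed memory at roughly (RAM - huge pages) * overcommit_ratio / 100 + swap. The sketch below applies that formula to the specs listed above. The function name commit_limit_gib is hypothetical, and the 2 MiB huge page size, the overcommit_ratio values and the absence of swap are assumptions, since the report does not state them.

```python
def commit_limit_gib(ram_gib, hugepages, hugepage_mib=2,
                     overcommit_ratio=50, swap_gib=0.0):
    """Approximate CommitLimit (GiB) when overcommit is disabled (mode 2).

    overcommit_ratio=50 is the kernel default; hugepage_mib=2 and
    swap_gib=0 are assumptions, not values taken from the report.
    """
    hugepage_gib = hugepages * hugepage_mib / 1024
    return (ram_gib - hugepage_gib) * overcommit_ratio / 100 + swap_gib


if __name__ == "__main__":
    # 12 GB of RAM and 1800 huge pages, as listed in the server specs.
    for ratio in (50, 100):
        limit = commit_limit_gib(12, 1800, overcommit_ratio=ratio)
        print(f"overcommit_ratio={ratio}: ~{limit:.1f} GiB")
```

    The limit is shared with every other process on the host, including the PostgreSQL backends, so the headroom actually available to pgbackrest is smaller than the computed figure.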

    I have tested:

    • buffer-size: 1Mb, 32k, 16k, 8Mb
    • compress-type: gz, lz4, zst
    • pgbackrest version 2.25 & 2.28

    Without success.

    Could this be a bug, or is this use-case perhaps not supported?

    question 
    opened by MannerMan 36
  • time out in multi-thread backup

    time out in multi-thread backup

    I get the following errors when running a multi-threaded backup from a backup server against a test database server. It worked with --thread-max=1; then I re-ran with --thread-max=5 and received the following ASSERTs. Is there a setting I'm missing?

    [email protected]:/data$ pgbackrest --stanza=test --type=full --stop-auto backup
    2016-06-20 11:35:59.465 T00 INFO: backup start: --archive-check --buffer-size=8388608 --compress --compress-level-network=6 --db-host=dvsnode4 --db-path=/data/pg/9.5/cluster --db-timeout=600 --db-user=postgres --log-level-console=detail --log-level-file=detail --repo-path=/data/pgbr --retention-archive=7 --retention-archive-type=incr --retention-diff=7 --retention-full=3 --stanza=test --start-fast --stop-auto --thread-max=5 --thread-timeout=600 --type=full

    2016-06-20 11:07:20.118 T05 INFO: backup file /data/pg/9.5/cluster/pg_tblspc/16514/PG_9.5_201510051/16515/1000014 (0B) 2016-06-20 11:07:20.119 T04 INFO: backup file /data/pg/9.5/cluster/pg_tblspc/16514/PG_9.5_201510051/16515/1000011 (0B) 2016-06-20 11:17:20.148 T01 ASSERT: expected message 'continue' from controller but timed out after 600 second(s) 2016-06-20 11:17:20.148 T03 ASSERT: expected message 'continue' from controller but timed out after 600 second(s) 2016-06-20 11:17:20.152 T02 ASSERT: expected message 'continue' from controller but timed out after 600 second(s) Thread 1 terminated abnormally: expected message 'continue' from controller but timed out after 600 second(s) at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 223 thread 1. pgBackRest::Protocol::ThreadGroup::threadMessageExpect('Thread::Queue=HASH(0x32d08f0)', 'continue') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 156 thread 1 pgBackRest::Protocol::ThreadGroup::threadGroupThread(0) called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 1 eval {...} called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 1 at /usr/bin/pgbackrest line 14 thread 1. 
main::ANON('expected message 'continue' from controller but timed out a...') called at /usr/share/perl/5.18/Carp.pm line 101 thread 1 Carp::confess('expected message 'continue' from controller but timed out a...') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 223 thread 1 pgBackRest::Protocol::ThreadGroup::threadMessageExpect('Thread::Queue=HASH(0x32d08f0)', 'continue') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 156 thread 1 pgBackRest::Protocol::ThreadGroup::threadGroupThread(0) called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 1 eval {...} called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 1 Thread 3 terminated abnormally: expected message 'continue' from controller but timed out after 600 second(s) at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 223 thread 3. pgBackRest::Protocol::ThreadGroup::threadMessageExpect('Thread::Queue=HASH(0x3c7e868)', 'continue') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 156 thread 3 pgBackRest::Protocol::ThreadGroup::threadGroupThread(2) called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 3 eval {...} called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 3 at /usr/bin/pgbackrest line 14 thread 3. 
main::ANON('expected message 'continue' from controller but timed out a...') called at /usr/share/perl/5.18/Carp.pm line 101 thread 3 Carp::confess('expected message 'continue' from controller but timed out a...') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 223 thread 3 pgBackRest::Protocol::ThreadGroup::threadMessageExpect('Thread::Queue=HASH(0x3c7e868)', 'continue') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 156 thread 3 pgBackRest::Protocol::ThreadGroup::threadGroupThread(2) called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 3 eval {...} called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 3 Thread 2 terminated abnormally: expected message 'continue' from controller but timed out after 600 second(s) at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 223 thread 2. pgBackRest::Protocol::ThreadGroup::threadMessageExpect('Thread::Queue=HASH(0x370b588)', 'continue') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 156 thread 2 pgBackRest::Protocol::ThreadGroup::threadGroupThread(1) called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 2 eval {...} called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 2 at /usr/bin/pgbackrest line 14 thread 2. 
main::ANON('expected message 'continue' from controller but timed out a...') called at /usr/share/perl/5.18/Carp.pm line 101 thread 2 Carp::confess('expected message 'continue' from controller but timed out a...') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 223 thread 2 pgBackRest::Protocol::ThreadGroup::threadMessageExpect('Thread::Queue=HASH(0x370b588)', 'continue') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 156 thread 2 pgBackRest::Protocol::ThreadGroup::threadGroupThread(1) called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 2 eval {...} called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 2 2016-06-20 11:17:20.191 T05 ASSERT: expected message 'continue' from controller but timed out after 600 second(s) 2016-06-20 11:17:20.191 T04 ASSERT: expected message 'continue' from controller but timed out after 600 second(s) Thread 4 terminated abnormally: expected message 'continue' from controller but timed out after 600 second(s) at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 223 thread 4. pgBackRest::Protocol::ThreadGroup::threadMessageExpect('Thread::Queue=HASH(0x4246a08)', 'continue') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 156 thread 4 pgBackRest::Protocol::ThreadGroup::threadGroupThread(3) called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 4 eval {...} called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 4 at /usr/bin/pgbackrest line 14 thread 4. 
main::ANON('expected message 'continue' from controller but timed out a...') called at /usr/share/perl/5.18/Carp.pm line 101 thread 4 Carp::confess('expected message 'continue' from controller but timed out a...') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 223 thread 4 pgBackRest::Protocol::ThreadGroup::threadMessageExpect('Thread::Queue=HASH(0x4246a08)', 'continue') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 156 thread 4 pgBackRest::Protocol::ThreadGroup::threadGroupThread(3) called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 4 eval {...} called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 4 Thread 5 terminated abnormally: expected message 'continue' from controller but timed out after 600 second(s) at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 223 thread 5. pgBackRest::Protocol::ThreadGroup::threadMessageExpect('Thread::Queue=HASH(0x4813300)', 'continue') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 156 thread 5 pgBackRest::Protocol::ThreadGroup::threadGroupThread(4) called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 5 eval {...} called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 5 at /usr/bin/pgbackrest line 14 thread 5. 
main::ANON('expected message 'continue' from controller but timed out a...') called at /usr/share/perl/5.18/Carp.pm line 101 thread 5 Carp::confess('expected message 'continue' from controller but timed out a...') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 223 thread 5 pgBackRest::Protocol::ThreadGroup::threadMessageExpect('Thread::Queue=HASH(0x4813300)', 'continue') called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 156 thread 5 pgBackRest::Protocol::ThreadGroup::threadGroupThread(4) called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 5 eval {...} called at /usr/lib/perl5/pgBackRest/Protocol/ThreadGroup.pm line 53 thread 5 2016-06-20 11:18:54.739 T00 ERROR: [141]: remote process terminated: ERROR [141]: unable to read line after 600 seconds 2016-06-20 11:19:00.504 T00 INFO: backup stop

    bug module [core] 
    opened by ghost 36
  • pbackrest failed at the end of backup if ssh session is closed or stay unattended, after that we need to restart DB.

    pbackrest failed at the end of backup if ssh session is closed or stay unattended, after that we need to restart DB.

    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: 2.6

    2. PostgreSQL version: 9.6

    3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each: 2 nodes with 2.6.32-696.13.2.el6.x86_64

    4. Did you install pgBackRest from source or from a package? From package

    5. Please attach the following as applicable:

      • pgbackrest.conf file(s)

      • postgresql.conf settings applicable to pgBackRest (archive_command, archive_mode, listen_addresses, max_wal_senders, wal_level, port)

      • errors in the postgresql log file before or during the time you experienced the issue

      • log file in /var/log/pgbackrest for the commands run (e.g. /var/log/pgbackrest/mystanza_backup.log)

    6. Describe the issue:

    We run the backup, and if the SSH session is closed or left unattended, the backup fails.

    We are seeing the following error in the log:

    2022-01-25 03:04:26.956 P00 INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive
    2022-01-25 03:34:27.019 P00 ERROR: [057]: query 'select lsn::text as lsn, pg_catalog.pg_xlogfile_name(lsn)::text as wal_segment_name, labelfile::text as backuplabel_file, spcmapfile::text as tablespacemap_file from pg_catalog.pg_stop_backup(false)' timed out after 1800000ms
    2022-01-25 03:34:27.044 P00 INFO: backup command end: aborted with exception [057]
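
    The two timestamps in this log can be compared with the timeout quoted in the error. The sketch below does that arithmetic; the helper name elapsed_ms is hypothetical, and the format string assumes the millisecond timestamps shown in the log lines.

```python
from datetime import datetime, timedelta

LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_ms(start, end):
    """Whole milliseconds between two log timestamps in LOG_TS_FORMAT."""
    delta = datetime.strptime(end, LOG_TS_FORMAT) - datetime.strptime(start, LOG_TS_FORMAT)
    # Integer division by a 1 ms timedelta avoids float rounding.
    return delta // timedelta(milliseconds=1)


if __name__ == "__main__":
    waited = elapsed_ms("2022-01-25 03:04:26.956", "2022-01-25 03:34:27.019")
    print(f"waited {waited} ms, {waited - 1800000} ms past the reported timeout")
```

    The gap between the pg_stop_backup() line and the error is essentially the full 1800000 ms, which suggests the query waited for the entire timeout rather than failing early.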

    question 
    opened by rubenscarelo 34
  • WAL archives missing.  Have async archiving on.

    WAL archives missing. Have async archiving on.

    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: 2.09

    2. PostgreSQL version: 9.6.9

    3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each:

      • db master: CentOS Linux release 7.5.1804 (Core)
      • db sync slave streaming: CentOS Linux release 7.5.1804 (Core)
      • db async slave streaming: CentOS Linux release 7.5.1804 (Core)
      • repository host: CentOS Linux release 7.6.1810 (Core)
      • db server restoring to: CentOS Linux release 7.5.1804 (Core)

    4. Did you install pgBackRest from source or from a package? rpm package

    5. Please attach the following as applicable:

    • repository server

    [server1]
    db-path=/disk1/pg/9.6/data
    db-port=5432
    db-host=server1
    retention-full=1

    [server2]
    db-path=/disk1/pg/9.6/data
    db-port=5432
    db-host=server2
    retention-full=1

    [global]
    repo-path=/disk1/pgbackrest
    backup-user=postgres
    backup-cmd=/usr/bin/pgbackrest
    archive-timeout=600
    archive-async=y

    • db master

    [server2]
    db-path=/disk1/pg/9.6/data
    db-port=5432

    [global]
    backup-host=repositoryserver
    backup-user=postgres
    backup-cmd=/usr/bin/pgbackrest
    archive-timeout=600
    archive-async=y

    • postgresql.conf settings applicable to pgBackRest
    listen_addresses='*"
    wal_level = logical
    archive_mode = on
    archive_command = 'pgbackrest --stanza=server2 archive-push %p --log-level-console=debug'
    max_wal_senders = 20
    wal_keep_segments = 40
    max_worker_processes = 24
    max_replication_slots = 20
    random_page_cost = 1.0
    maintenance_work_mem = 10485760
    autovacuum_work_mem = -1
    checkpoint_completion_target = 0.9
    max_wal_size = 256
    synchronous_standby_names = 'sync_repl_standy1'
    shared_preload_libraries = 'pglogical,pg_stat_statements'
    pg_stat_statements.max = 30000
    pg_stat_statements.track = all
    hot_standby = on

    • errors in the postgresql log file before or during the time you experienced the issue
    2019-01-23 21:21:58.576 CST,"postgres","postgres",53466,"[local]",5c492a36.d0da,41,"SELECT",2019-01-23 21:00:06 CST,360/1783364,0,WARNING,01000,"pg_stop_backup still waiting for all required WAL segments to be archived (60 seconds elapsed)",,"Check that your archive_command is executing properly. pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.",,,,,,,"pgBackRest [backup]"

    • log file in /var/log/pgbackrest for the commands run (e.g. /var/log/pgbackrest/mystanza_backup.log) no errors in the backup or restore logs.

    6. Describe the issue: We get a WAL error in the Postgres log when trying to start after the following restore command:
    2019-01-23 21:13:53.202 CST,,,26359,,5c492d3b.66f7,61,,2019-01-23 21:12:59 CST,1/0,0,LOG,00000,"restored log file ""0000000100017E40000000BC"" from archive",,,,,,,,,""
    2019-01-23 21:13:53.346 CST,,,26359,,5c492d3b.66f7,62,,2019-01-23 21:12:59 CST,1/0,0,LOG,00000,"restored log file ""0000000100017E40000000BD"" from archive",,,,,,,,,""
    2019-01-23 21:13:55.766 CST,,,26359,,5c492d3b.66f7,63,,2019-01-23 21:12:59 CST,1/0,0,LOG,00000,"redo done at 17E40/BEFFF878",,,,,,,,,""
    2019-01-23 21:13:55.766 CST,,,26359,,5c492d3b.66f7,64,,2019-01-23 21:12:59 CST,1/0,0,LOG,00000,"last completed transaction was at log time 2019-01-23 15:20:56.687791-06",,,,,,,,,""
    2019-01-23 21:13:57.789 CST,,,26359,,5c492d3b.66f7,65,,2019-01-23 21:12:59 CST,1/0,0,FATAL,XX000,"WAL ends before end of online backup",,"All WAL generated while online backup was taken must be available at recovery.",,,,,,,""
    2019-01-23 21:13:57.998 CST,,,26353,,5c492d38.66f1,2,,2019-01-23 21:12:56 CST,,0,LOG,00000,"startup process (PID 26359) exited with exit code 1",,,,,,,,,""
    2019-01-23 21:13:57.998 CST,,,26353,,5c492d38.66f1,3,,2019-01-23 21:12:56 CST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,""
    2019-01-23 21:13:59.132 CST,,,26353,,5c492d38.66f1,4,,2019-01-23 21:12:56 CST,,0,LOG,00000,"database system is shut down",,,,,,,,,""
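
    When checking which WAL segments the archive must contain, it helps to map an LSN from the log to the segment file that holds it. The sketch below does this for PostgreSQL 9.3 and later; the function name wal_segment_name is hypothetical, and the default 16 MiB segment size and timeline 1 are assumptions taken from the file names in the log above.

```python
def wal_segment_name(timeline, lsn, segment_size=16 * 1024 * 1024):
    """Return the WAL file name containing an LSN written as 'HI/LO' in hex."""
    hi_text, lo_text = lsn.split("/")
    hi, lo = int(hi_text, 16), int(lo_text, 16)
    # Each 4 GiB "log" is divided into fixed-size segments.
    segments_per_log = 0x100000000 // segment_size
    segment_no = hi * segments_per_log + lo // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segment_no // segments_per_log,
        segment_no % segments_per_log,
    )


if __name__ == "__main__":
    # LSN from the "redo done at" line in the log above.
    print(wal_segment_name(1, "17E40/BEFFF878"))
```

    For the "redo done at 17E40/BEFFF878" line this yields 0000000100017E40000000BE, the segment after the last "restored log file" entry quoted in the log, which is one name to look for in the archive.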
    question 
    opened by bryanconlon 33
  • Dry run implementation

    Dry run implementation

    Supersedes PR #840 (which will be closed). Provides a possible implementation of the --dry-run option, following the suggestions in the comments on #840. Related to issue #838.

    module [core] enhancement 
    opened by fluca1978 30
  • ERROR: [029]: raised from local-2 protocol: invalid xml

    ERROR: [029]: raised from local-2 protocol: invalid xml

    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: 2.37

    2. PostgreSQL version: 10.15, 11.x 13.x

    3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each: pgbackrest server : RHEL8, postgresql servers : RHEL7 and RHEL8

    4. Did you install pgBackRest from source or from a package? Package from PGDG

    5. Please attach the following as applicable:

      • pgbackrest.conf file(s)
      • postgresql.conf settings applicable to pgBackRest (archive_command, archive_mode, listen_addresses, max_wal_senders, wal_level, port)
      • errors in the postgresql log file before or during the time you experienced the issue
      • log file in /var/log/pgbackrest for the commands run (e.g. /var/log/pgbackrest/mystanza_backup.log)
    6. Describe the issue: After updating pgbackrest from 2.34 to 2.37, some servers report the error "[029]: raised from local-2 protocol: invalid xml". Other servers backed up by the same pgbackrest server do not show this error. The postgresql servers run RHEL7 and RHEL8, with postgresql 10, 11 and 13.

    bug module [core] 
    opened by vidierr 29
  • access pgbackrest with .pgpassfile ,

    access pgbackrest with .pgpassfile ,

    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: pgBackRest 2.34

    2. PostgreSQL version: Postgres 10

    3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each:

    Linux ip-172-31-32-99.us-east-2.compute.internal 4.14.252-195.483.amzn2.x86_64 #1 SMP Mon Nov 1 20:58:46 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

    4. Did you install pgBackRest from source or from a package? package

    5. Please attach the following as applicable:

      • pgbackrest.conf file(s)
      • postgresql.conf settings applicable to pgBackRest (archive_command, archive_mode, listen_addresses, max_wal_senders, wal_level, port)
      • errors in the postgresql log file before or during the time you experienced the issue
      • log file in /var/log/pgbackrest for the commands run (e.g. /var/log/pgbackrest/mystanza_backup.log)
    6. Describe the issue:

    I want pgbackrest to authenticate via a .pgpass file instead of "trust" in the pg_hba.conf file. But when I tried to check connectivity, I got the error message below. Please help.

    [[email protected] ec2-user]# pgbackrest --stanza=dbtest --log-level-console=info check
    2022-06-24 13:41:03.663 P00 INFO: check command begin 2.34: --exec-id=5239-43a0dcc1 --log-level-console=info --pg1-path=/opt/PostgreSQL-10/data --pg1-port=5432 --pg1-socket-path=/tmp --pg1-user=postgres --repo1-path=/opt/pgbackrest --stanza=dbtest
    WARN: unable to check pg-1: [DbConnectError] unable to connect to 'dbname='postgres' port=5432 user='postgres' host='/tmp'': fe_sendauth: no password supplied
    ERROR: [027]: no database found
    HINT: check indexed pg-path/pg-host configurations
    2022-06-24 13:41:03.665 P00 INFO: check command end: aborted with exception [027]
    [[email protected] ec2-user]#

    =pgbackrest.conf:

    [email protected] pgbackrest]# cat pgbackrest.conf

    [global]
    repo1-path=/opt/pgbackrest
    repo1-retention-full=2

    [dbtest]
    pg1-path=/opt/PostgreSQL-10/data
    pg1-port=5432
    pg1-user=postgres
    pg1-socket-path=/tmp
    [[email protected] pgbackrest]#

    ==DB Log message (log_message.txt):

    2022-06-24 09:25:41.614 EDT,"postgres","postgres",4240,"[local]",62b5bb55.1090,2,"authentication",2022-06-24 09:25:41 EDT,3/22916,0,LOG,00000,"connection authorized: user=postgres database=postgres",,,,,,,,,""
    2022-06-24 09:28:15.216 EDT,"postgres","postgres",4240,"[local]",62b5bb55.1090,3,"idle",2022-06-24 09:25:41 EDT,,0,LOG,00000,"disconnection: session time: 0:02:33.602 user=postgres database=postgres host=[local]",,,,,,,,,"psql"
    2022-06-24 09:28:20.503 EDT,,,4443,"[local]",62b5bbf4.115b,1,"",2022-06-24 09:28:20 EDT,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,""
    2022-06-24 09:28:47.118 EDT,,,4470,"[local]",62b5bc0f.1176,1,"",2022-06-24 09:28:47 EDT,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,""
    2022-06-24 09:41:03.664 EDT,,,5240,"[local]",62b5beef.1478,1,"",2022-06-24 09:41:03 EDT,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,""
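
    For reference, libpq looks up passwords in a .pgpass file in the home directory of the operating-system user that opens the connection. Each line has the form hostname:port:database:username:password, and on Unix the file is ignored unless group and world have no access to it. The sketch below builds and writes such an entry. The helper names, the placeholder password and the "*" wildcard in the host field are assumptions for illustration, not values taken from this report.

```python
import os


def pgpass_escape(value):
    """Escape backslashes and colons, as the .pgpass format requires."""
    return value.replace("\\", "\\\\").replace(":", "\\:")


def pgpass_line(host, port, database, user, password):
    """Build one hostname:port:database:username:password entry."""
    fields = [host, str(port), database, user, password]
    return ":".join(pgpass_escape(field) for field in fields)


def write_pgpass(path, lines):
    """Write entries to path with mode 0600, which libpq insists on."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    # Also tighten permissions if the file already existed.
    os.chmod(path, 0o600)


if __name__ == "__main__":
    # "*" matches any host; "example-password" is a placeholder.
    print(pgpass_line("*", 5432, "postgres", "postgres", "example-password"))
```

    The file has to belong to the user that actually runs the pgbackrest command, since that is the process making the libpq connection.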

    opened by daulat0 0
  • Multiple repo Questions

    Multiple repo Questions

    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: 2.39

    2. PostgreSQL version: 12.9

    3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each: RHEL 7

    4. Did you install pgBackRest from source or from a package? source

    5. Please attach the following as applicable:

      • pgbackrest.conf file(s)
      • postgresql.conf settings applicable to pgBackRest (archive_command, archive_mode, listen_addresses, max_wal_senders, wal_level, port)
      • errors in the postgresql log file before or during the time you experienced the issue
      • log file in /var/log/pgbackrest for the commands run (e.g. /var/log/pgbackrest/mystanza_backup.log)
    6. Describe the issue: I have a multiple-repository configuration, with one repository on S3 and the other on a local filesystem. I want to delete some backups from the local repository and some specific backups from S3, not the entire stanza. How can I delete them?

    question 
    opened by Kamal-Villupuram 11
  • Differencial backup failed

    Differencial backup failed

    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: 2.39

    2. PostgreSQL version: 14.3

    3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each:

      • 4.1: Linux 3.10.0-1160.59.1.el7.x86_64 #1 SMP Wed Feb 23 16:47:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
      • 4.2: Executed on standby
      • 4.3: stanza on GCP bucket
      • 4.4: backup command begin 2.39: --archive-timeout=3600 --backup-standby --config=/var/lib/pgsql/pgbackrest_wal.conf --exec-id=21183-892ee11e --log-level-console=info --pg1-host= --pg2-host= --pg1-path=/pgsql/cluster/data --pg2-path=/pgsql/cluster/data --process-max=16 --repo1-bundle --repo1-cipher-pass= --repo1-cipher-type=aes-256-cbc --repo1-gcs-bucket=pgsql-us-central1-wal-xmatters-eng-prd --repo1-gcs-key= --repo1-path=/naprd3 --repo1-retention-full=8 --repo1-type=gcs --stanza=backup --start-fast --type=diff

    4. Did you install pgBackRest from source or from a package? from package

    5. Please attach the following as applicable:

      • pgbackrest.conf file(s)
      • postgresql.conf settings applicable to pgBackRest (archive_command, archive_mode, listen_addresses, max_wal_senders, wal_level, port)
      • errors in the postgresql log file before or during the time you experienced the issue
      • log file in /var/log/pgbackrest for the commands run (e.g. /var/log/pgbackrest/mystanza_backup.log)
    6. Describe the issue: The backup failed with:
    2022-06-21 08:00:45.687 P00 INFO: check archive for prior segment 0000000400000E33000000E3
    *** Error in `pgbackrest': double free or corruption (fasttop): 0x0000000000c1f160 ***
    ======= Backtrace: =========
    /lib64/libc.so.6(+0x81329)[0x7efcbe3e8329]
    pgbackrest[0x40d640]
    pgbackrest[0x40d72c]
    pgbackrest[0x406e85]
    pgbackrest[0x405932]
    /lib64/libc.so.6(__libc_start_main+0xf5)[0x7efcbe389555]
    pgbackrest[0x405e53]
    ======= Memory map: ========
    00400000-004cc000 r-xp 00000000 08:02 17344605 /usr/bin/pgbackrest
    006cb000-006cc000 r--p 000cb000 08:02 17344605 /usr/bin/pgbackrest
    006cc000-006ce000 rw-p 000cc000 08:02 17344605 /usr/bin/pgbackrest
    006ce000-006e8000 rw-p 00000000 00:00 0
    00c1f000-06223000 rw-p 00000000 00:00 0 [heap]
    7efcb4000000-7efcb4021000 rw-p 00000000 00:00 0
    7efcb4021000-7efcb8000000 ---p 00000000 00:00 0
    7efcba2c3000-7efcba2d8000 r-xp 00000000 08:02 33575048 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
    7efcba2d8000-7efcba4d7000 ---p 00015000 08:02 33575048 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
    7efcba4d7000-7efcba4d8000 r--p 00014000 08:02 33575048 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
    7efcba4d8000-7efcba4d9000 rw-p 00015000 08:02 33575048 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
    7efcba4d9000-7efcba4df000 r-xp 00000000 08:02 33609835 /usr/lib64/libnss_dns-2.17.so
    7efcba4df000-7efcba6de000 ---p 00006000 08:02 33609835 /usr/lib64/libnss_dns-2.17.so
    7efcba6de000-7efcba6df000 r--p 00005000 08:02 33609835 /usr/lib64/libnss_dns-2.17.so
    7efcba6df000-7efcba6e0000 rw-p 00006000 08:02 33609835 /usr/lib64/libnss_dns-2.17.so
    7efcba6e0000-7efcba6ec000 r-xp 00000000 08:02 33609837 /usr/lib64/libnss_files-2.17.so
    7efcba6ec000-7efcba8eb000 ---p 0000c000 08:02 33609837 /usr/lib64/libnss_files-2.17.so
    7efcba8eb000-7efcba8ec000 r--p 0000b000 08:02 33609837 /usr/lib64/libnss_files-2.17.so
    7efcba8ec000-7efcba8ed000 rw-p 0000c000 08:02 33609837 /usr/lib64/libnss_files-2.17.so
    7efcba8ed000-7efcba8f3000 rw-p 00000000 00:00 0
    7efcba8f3000-7efcba8f5000 r-xp 00000000 08:02 33600797 /usr/lib64/libfreebl3.so
    7efcba8f5000-7efcbaaf4000 ---p 00002000 08:02 33600797 /usr/lib64/libfreebl3.so
    7efcbaaf4000-7efcbaaf5000 r--p 00001000 08:02 33600797 /usr/lib64/libfreebl3.so
    7efcbaaf5000-7efcbaaf6000 rw-p 00002000 08:02 33600797 /usr/lib64/libfreebl3.so
    7efcbaaf6000-7efcbab56000 r-xp 00000000 08:02 33609912 /usr/lib64/libpcre.so.1.2.0
    7efcbab56000-7efcbad56000 ---p 00060000 08:02 33609912 /usr/lib64/libpcre.so.1.2.0
    7efcbad56000-7efcbad57000 r--p 00060000 08:02 33609912 /usr/lib64/libpcre.so.1.2.0
    7efcbad57000-7efcbad58000 rw-p 00061000 08:02 33609912 /usr/lib64/libpcre.so.1.2.0
    7efcbad58000-7efcbad5f000 r-xp 00000000 08:02 33609849 /usr/lib64/librt-2.17.so
    7efcbad5f000-7efcbaf5e000 ---p 00007000 08:02 33609849 /usr/lib64/librt-2.17.so
    7efcbaf5e000-7efcbaf5f000 r--p 00006000 08:02 33609849 /usr/lib64/librt-2.17.so
    7efcbaf5f000-7efcbaf60000 rw-p 00007000 08:02 33609849 /usr/lib64/librt-2.17.so
    7efcbaf60000-7efcbaf68000 r-xp 00000000 08:02 33609823 /usr/lib64/libcrypt-2.17.so
    7efcbaf68000-7efcbb167000 ---p 00008000 08:02 33609823 /usr/lib64/libcrypt-2.17.so
    7efcbb167000-7efcbb168000 r--p 00007000 08:02 33609823 /usr/lib64/libcrypt-2.17.so
    7efcbb168000-7efcbb169000 rw-p 00008000 08:02 33609823 /usr/lib64/libcrypt-2.17.so
    7efcbb169000-7efcbb197000 rw-p 00000000 00:00 0
    7efcbb197000-7efcbb1bb000 r-xp 00000000 08:02 33614084 /usr/lib64/libselinux.so.1

    bug 
    opened by banlex73 20
  •  pgbackrest Problems collecting all wals and sending confirmation to postgres (database at least 110TB)


    Please provide the following information when submitting an issue (feature requests or general comments can skip this):

    1. pgBackRest version: pgBackRest 2.39

    2. PostgreSQL version: 13.3

    3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each: Oracle Linux Server 7.9

    4. Did you install pgBackRest from source or from a package? Package on pgdg-common

    5. Please attach the following as applicable:

      • pgbackrest.conf file(s)
      • postgresql.conf settings applicable to pgBackRest (archive_command, archive_mode, listen_addresses, max_wal_senders, wal_level, port)
      • errors in the postgresql log file before or during the time you experienced the issue
      • log file in /var/log/pgbackrest for the commands run (e.g. /var/log/pgbackrest/mystanza_backup.log)

    postgresql.conf

    archive_command: pgbackrest --stanza=prod_backup archive-push /var/lib/pgsql/13/data/%p
    archive_mode: 'on'
    listen_addresses='*'
    max_wal_senders: 10
    wal_level: replica
    port:5432

    pgbackrest.conf

    [global]
    repo1-host=repo01
    repo1-host-user=postgres
    repo1-path=/backups/pgsql_lc/pgbackrest
    process-max=10
    start-fast=y
    compress-level=1
    archive-async=y
    spool-path=/var/spool/pgbackrest
    log-path=/var/log/pgbackrest/
    log-level-file=info
    log-level-stderr=error
    log-level-console=error

    [prod_backup]
    pg1-path=/var/lib/pgsql/13/data

    [global:archive-push]
    process-max=10
    compress-level=1

    6. Describe the issue:

    Hi

    My question:

    I have a PostgreSQL cluster that currently stores 110TB of data. Because the database is very large, as a safety measure we keep 24h of WAL files inside pg_wal and periodically delete them (after they have already been applied to the replica). We know this is not the best way to manage WAL, but because the load on this database is very high, it is the only alternative we have found so far that gives us some safety.

    Here is some information about our pg_wal

    Current status of the database:

      • Current base size: 110TB
      • Current size of pg_wal: 7.78TB
      • Retention of WAL files: 24h
      • Average WAL files per minute: 622
      • Average number of WAL files per day: 33,024
      • Total WAL files inside pg_wal (not considering archive_status): 518,689

    To make a full backup with pgBackRest, we need to archive the entire range of WAL required to make the backup consistent. The problem is that both the synchronous and the asynchronous approach run into situations that slow us down so much that backing up this database seems unfeasible. I want to point out that the problem does not appear to be pgBackRest itself but rather our current settings and the size of our database.

    First approach used to collect WAL files: synchronous

      • One thread collects a WAL file and sends an "OK" informing PostgreSQL that the file has been archived (.ready to .done).

    Problems:

      • WAL is collected much more slowly than it is generated
      • Inconsistent full backup due to missing WAL
      • Slowness from using a single thread to collect many files and to confirm each one to PostgreSQL before the collection process can finish

    Second approach used to collect WAL files: asynchronous

      • Configured with 10 threads to collect the available WAL. Parallel collection gave us a lot of speed, but in the confirmation step only one thread informs PostgreSQL that each segment was archived (.ready to .done). Even though confirming each file takes between 0ms and 5ms (maximum), there are so many files that confirming everything that was collected takes a long time.

    Problems:

      • Very good collection speed, but slow confirmation of archiving to PostgreSQL
      • Inconsistent full backup due to missing WAL

    Example scenario: using 10 threads and the asynchronous approach, we collected a total of 100 thousand WAL files in 2h, but it took much longer than 2 hours for PostgreSQL to confirm the archiving of those 100 thousand segments. As a result, the next run, which should have been faster, started with even more accumulated WAL, and the speed we gained in collection was lost.
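The backlog arithmetic in this scenario can be sketched as follows. This is a minimal model, not a pgBackRest feature: it assumes constant generation and archiving rates, and the function name `hours_to_drain` and its parameters are hypothetical. The figures come from the issue text: 518,689 files in pg_wal, 100 thousand files collected in 2h, and the two generation figures quoted (33,024 files per day and 622 files per minute, which do not agree with each other, so both are evaluated).

```python
from typing import Optional


def hours_to_drain(backlog_files: int,
                   archived_per_hour: float,
                   generated_per_hour: float) -> Optional[float]:
    """Hours until the backlog is empty, or None if it never shrinks.

    Hypothetical helper: assumes both rates stay constant over time.
    """
    net_per_hour = archived_per_hour - generated_per_hour
    if net_per_hour <= 0:
        # Archiving is not faster than generation, so the backlog only grows.
        return None
    return backlog_files / net_per_hour


BACKLOG_FILES = 518_689          # total WAL files inside pg_wal
# Collection rate from the example scenario. Confirmation was reported to be
# slower than this, so the real archiving rate is lower and real drain times
# would be longer than the values computed here.
ARCHIVED_PER_HOUR = 100_000 / 2

drain_daily_figure = hours_to_drain(BACKLOG_FILES, ARCHIVED_PER_HOUR, 33_024 / 24)
drain_minute_figure = hours_to_drain(BACKLOG_FILES, ARCHIVED_PER_HOUR, 622 * 60)

print(f"using 33,024 files/day: {drain_daily_figure:.1f} h")
print(f"using 622 files/min:    {drain_minute_figure:.1f} h")
```

Under these assumptions the backlog only shrinks while the confirmed archiving rate stays above the generation rate, which is why a slow confirmation step can cancel out fast parallel collection.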

    What we thought of to solve the problem: reducing WAL retention from 24h to 1h or less, so that there would be fewer files to collect and less confirmation time. Before proceeding with this or any other action, we would like to know if you know of any alternatives or have any suggestions. Would it be possible to evaluate our case, as your availability allows?

    question 
    opened by mazocollo 5
  • Backup to S3 in different account


    When an EC2 instance role is used (repo1-s3-key-type=auto) but the S3 bucket is in a different account, it is not possible to assume the remote role, because repo1-s3-role only takes the role's name. One solution could be to also allow passing the full ARN (or to add a new repo1-s3-role-arn parameter). https://aws.amazon.com/pt/premiumsupport/knowledge-center/s3-instance-access-bucket/

    opened by ManuelPombo 0
  • Add retry to storagePosixPathRemove().


    Add retries to handle the case where readdir() skips over existing files when files are simultaneously being deleted.

    This is a bit of a band aid. There are lots of other places where this readdir() issue could bite us.

    NOTE: this could use better testing. There is 100% coverage because of the way the code is written, but a regression of the retry would likely not be detected.
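The retry idea described above can be illustrated with a short sketch. This is not the code from the pull request (pgBackRest is written in C and storagePosixPathRemove() is not reproduced here). The function name `remove_dir_with_retry` and the `max_passes` parameter are hypothetical, and the sketch only handles a flat directory of files. It assumes that a listing pass may miss entries, so it lists again whenever the final rmdir reports that the directory is not empty.

```python
import errno
import os
import tempfile


def remove_dir_with_retry(path: str, max_passes: int = 3) -> None:
    """Remove a flat directory, re-listing it if entries were missed.

    Hypothetical helper: each pass deletes every file it can see, then tries
    to remove the directory. If the directory is still not empty, the listing
    must have skipped something, so another pass is made.
    """
    for _ in range(max_passes):
        for name in os.listdir(path):
            try:
                os.unlink(os.path.join(path, name))
            except FileNotFoundError:
                # Another process already removed this file; nothing to do.
                pass
        try:
            os.rmdir(path)
            return
        except OSError as error:
            # Only "directory not empty" is worth retrying; re-raise the rest.
            if error.errno not in (errno.ENOTEMPTY, errno.EEXIST):
                raise
    raise OSError(errno.ENOTEMPTY, "directory not empty after retries", path)


# Usage: build a temporary directory with a few files, then remove it.
demo_path = tempfile.mkdtemp()
for index in range(5):
    with open(os.path.join(demo_path, f"file{index}"), "w") as handle:
        handle.write("x")
remove_dir_with_retry(demo_path)
```

Bounding the number of passes keeps the function from looping forever when something else keeps creating files in the directory.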

    question enhancement 
    opened by dwsteele 0
Releases(release/2.39)
  • release/2.39(May 16, 2022)

    Bug Fixes:

    • Fix error thrown from FINALLY() causing an infinite loop. (Reviewed by Stephen Frost.)
    • Error on all lock failures except another process holding the lock. (Reviewed by Reid Thompson, Geir Råness. Reported by Geir Råness.)

    Features:

    • Backup file bundling for improved small file support. (Reviewed by Reid Thompson, Stefan Fercot, Chris Bandy.)
    • Verify command to validate the contents of a repository. (Contributed by Cynthia Shang, Reid Thompson. Reviewed by David Steele, Stefan Fercot.)
    • PostgreSQL 15 support. (Reviewed by Stefan Fercot.)
    • Show backup percent complete in info output. (Contributed by Reid Thompson. Reviewed by David Steele.)
    • Auto-select backup for restore command --type=lsn. (Contributed by Reid Thompson. Reviewed by Stefan Fercot, David Steele.)
    • Suppress existing WAL warning when archive-mode-check is disabled. (Contributed by Reid Thompson. Reviewed by David Steele.)
    • Add AWS IMDSv2 support. (Contributed by Nuno Pires. Reviewed by David Steele.)

    Improvements:

    • Allow repo-hardlink option to be changed after full backup. (Reviewed by Reid Thompson.)
    • Increase precision of percent complete logging for backup and restore. (Contributed by Reid Thompson. Reviewed by David Steele.)
    • Improve path validation for repo-* commands. (Contributed by Reid Thompson. Reviewed by David Steele.)
    • Improve stop command to honor stanza option. (Contributed by Reid Thompson. Reviewed by David Steele. Suggested by ragaoua.)
    • Improve error message for invalid repo-azure-key. (Contributed by Reid Thompson. Reviewed by David Steele. Suggested by Seth Daniel.)
    • Add hint to check the log on archive-get/archive-push async error. (Reviewed by Reid Thompson.)
    • Add ClockError for unexpected clock skew and timezone changes. (Reviewed by Greg Sabino Mullane, Stefan Fercot. Suggested by Greg Sabino Mullane.)
    • Strip extensions from history manifest before showing in error message. (Reviewed by Stefan Fercot.)
    • Add user:group to lock permission error. (Reviewed by Reid Thompson.)

    Documentation Bug Fixes:

    • Fix incorrect reference to stanza-update in the user guide. (Fixed by Abubakar Mohammed. Reviewed by David Steele.)
    • Fix example for repo-gcs-key-type option in configuration reference. (Reviewed by Reid Thompson.)
    • Fix tls-server-auth example and add clarifications. (Reviewed by Reid Thompson.)

    Documentation Improvements:

    • Simplify messaging around supported versions in the documentation. (Reviewed by Stefan Fercot, Reid Thompson, Greg Sabino Mullane.)
    • Add option type descriptions. (Contributed by Reid Thompson. Reviewed by David Steele.)
    • Add FAQ about backup types and restore speed. (Contributed by David Christensen. Reviewed by Reid Thompson.)
    • Document required base branch for pull requests. (Contributed by David Christensen. Reviewed by Reid Thompson.)
  • release/2.38(Mar 6, 2022)

    IMPORTANT NOTE: Repository size reported by the info command is now entirely based on what pgBackRest has written to storage. Previously, in certain cases, pgBackRest could detect if additional compression was being applied by the storage but this is no longer supported.

    Bug Fixes:

    • Retry errors in S3 batch file delete. (Reviewed by Reid Thompson. Reported by Alex Richman.)
    • Allow case-insensitive matching of HTTP connection header values. (Reviewed by Reid Thompson. Reported by Rémi Vidier.)

    Features:

    • Add support for AWS S3 server-side encryption using KMS. (Contributed by Christoph Berg. Reviewed by David Steele, Tharindu Amila.)
    • Add archive-missing-retry option. (Reviewed by Stefan Fercot.)
    • Add backup type filter to info command. (Contributed by Stefan Fercot. Reviewed by David Steele.)

    Improvements:

    • Retry on page validation failure during backup. (Reviewed by Stephen Frost, David Christensen.)
    • Handle TLS servers that do not close connections gracefully. (Reviewed by Rémi Vidier, David Christensen, Stephen Frost.)
    • Add backup LSNs to info command output. (Contributed by Stefan Fercot. Reviewed by David Steele.)
    • Automatically strip trailing slashes for repo-ls paths. (Contributed by David Christensen. Reviewed by David Steele.)
    • Do not retry fatal errors. (Reviewed by Reid Thompson.)
    • Remove support for PostgreSQL 8.3/8.4. (Reviewed by Reid Thompson, Stefan Fercot.)
    • Remove logic that tried to determine additional file system compression. (Reviewed by Reid Thompson, Stefan Fercot.)

    Documentation Bug Fixes:

    • Move repo options in TLS documentation to the global section. (Reported by Anton Kurochkin.)
    • Remove unused backup-standby option from stanza commands. (Reported by Stefan Fercot.)
    • Fix typos in help and release notes. (Fixed by Daniel Gustafsson. Reviewed by David Steele.)

    Documentation Improvements:

    • Add aliveness check to systemd service configuration. (Suggested by Yogesh Sharma.)
    • Add FAQ explaining WAL archive suffix. (Contributed by Stefan Fercot. Reviewed by David Steele.)
    • Note that replication slots are not restored. (Contributed by Reid Thompson. Reviewed by David Steele, Stefan Fercot. Suggested by Christophe Courtois.)
  • release/2.37(Jan 3, 2022)

    IMPORTANT NOTE: If the restore command is unable to find a backup that matches a specified time target then an error will be thrown, whereas before a warning was logged.

    Bug Fixes:

    • Fix restore delta link mapping when path/file already exists. (Reviewed by Reid Thompson. Reported by Younes Alhroub.)
    • Fix socket leak on connection retries. (Reviewed by Reid Thompson. Reported by James Coleman.)

    Features:

    • Add TLS server. (Reviewed by Stephen Frost, Reid Thompson, Andrew L'Ecuyer.)
    • Add --cmd option. (Contributed by Reid Thompson. Reviewed by Stefan Fercot, David Steele. Suggested by Virgile CREVON.)

    Improvements:

    • Check archive immediately after backup start. (Reviewed by Reid Thompson, David Christensen.)
    • Add timeline and checkpoint checks to backup. (Reviewed by Stefan Fercot, Reid Thompson.)
    • Check that clusters are alive and correctly configured during a backup. (Reviewed by Stefan Fercot.)
    • Error when restore is unable to find a backup to match the time target. (Reviewed by Reid Thompson, Douglas J Hunley. Suggested by Douglas J Hunley.)
    • Parse protocol/port in S3/Azure endpoints. (Contributed by Reid Thompson. Reviewed by David Steele.)
    • Add warning when checkpoint_timeout exceeds db-timeout. (Contributed by Stefan Fercot. Reviewed by David Steele.)
    • Add verb to HTTP error output. (Contributed by Christoph Berg. Reviewed by David Steele.)
    • Allow y/n arguments for boolean command-line options. (Contributed by Reid Thompson. Reviewed by David Steele.)
    • Make backup size logging exactly match info command output. (Contributed by Reid Thompson. Reviewed by David Steele. Suggested by Mahomed Hussein.)

    Documentation Improvements:

    • Display size option default and allowed values with appropriate units. (Reviewed by Reid Thompson.)
    • Fix typos and improve documentation for the tablespace-map-all option. (Reviewed by Reid Thompson. Suggested by Reid Thompson.)
    • Remove obsolete statement about future multi-repository support. (Suggested by David Christensen.)
  • release/2.36(Nov 1, 2021)

    Bug Fixes:

    • Allow "global" as a stanza prefix. (Reviewed by Stefan Fercot. Reported by Younes Alhroub.)
    • Fix segfault on invalid GCS key file. (Reviewed by Stephen Frost. Reported by Henrik Feldt.)

    Improvements:

    • Allow link-map option to create new links. (Reviewed by Don Seiler, Stefan Fercot, Chris Bandy. Suggested by Don Seiler.)
    • Increase max index allowed for pg/repo options to 256. (Reviewed by Cynthia Shang.)
    • Add WebIdentity authentication for AWS S3. (Reviewed by James Callahan, Reid Thompson, Benjamin Blattberg, Andrew L'Ecuyer.)
    • Report backup file validation errors in backup.info. (Contributed by Stefan Fercot. Reviewed by David Steele.)
    • Add recovery start time to online backup restore log. (Reviewed by Tom Swartz, Stefan Fercot. Suggested by Tom Swartz.)
    • Report original error and retries on local job failure. (Reviewed by Stefan Fercot.)
    • Rename page checksum error to error list in info text output. (Reviewed by Stefan Fercot.)
    • Add hints to standby replay timeout message. (Reviewed by Cynthia Shang, Stefan Fercot. Suggested by Leigh Downs.)
  • release/2.35(Aug 23, 2021)

    IMPORTANT NOTE: The log level for copied files in the backup/restore commands has been changed to detail. This makes the info log level less noisy but if these messages are required then set the log level for the backup/restore commands to detail.

    Bug Fixes:

    • Detect errors in S3 multi-part upload finalize. (Reviewed by Cynthia Shang, Marco Montagna. Reported by Marco Montagna, Lev Kokotov, Anderson A. Mallmann.)
    • Fix detection of circular symlinks. (Reviewed by Stefan Fercot. Reported by Rohit Raveendran.)
    • Only pass selected repo options to the remote. (Reviewed by David Christensen, Cynthia Shang. Reported by Greg Sabino Mullane, David Christensen.)

    Improvements:

    • Binary protocol. (Reviewed by Cynthia Shang.)
    • Automatically create data directory on restore. (Contributed by Stefan Fercot. Reviewed by David Steele. Suggested by Chris Bandy.)
    • Allow restore --type=lsn. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang. Suggested by James Coleman.)
    • Change level of backup/restore copied file logging to detail. (Reviewed by Stefan Fercot. Suggested by Jens Wilke.)
    • Loop while waiting for checkpoint LSN to reach replay LSN. (Contributed by Stefan Fercot. Reviewed by David Steele. Suggested by Fatih Mencutekin.)
    • Log backup file total and restore size/file total. (Reviewed by Cynthia Shang.)

    Documentation Bug Fixes:

    • Fix incorrect host names in user guide. (Reviewed by Stefan Fercot. Reported by Greg Sabino Mullane.)

    Documentation Improvements:

    • Update contributing documentation and add pull request template. (Contributed by Cynthia Shang. Reviewed by David Steele.)
    • Rearrange backup documentation in user guide. (Reviewed by Cynthia Shang.)
    • Clarify restore --type behavior in command reference. (Contributed by Cynthia Shang. Reviewed by David Steele.)
    • Fix documentation and comment typos. (Contributed by Eric Radman. Reviewed by David Steele.)

    Test Suite Improvements:

    • Add check for test path inside repo path. (Reviewed by Greg Sabino Mullane. Suggested by Greg Sabino Mullane.)
    • Add CodeQL static code analysis. (Reviewed by Cynthia Shang.)
    • Update tests to use standard patterns. (Contributed by Cynthia Shang. Reviewed by David Steele.)
  • release/2.34(Jun 7, 2021)

    Bug Fixes:

    • Fix issues with leftover spool files from a prior restore. (Reviewed by Cynthia Shang, Stefan Fercot, Floris van Nee. Reported by Floris van Nee.)
    • Fix issue when checking links for large numbers of tablespaces. (Reviewed by Cynthia Shang, Avinash Vallarapu. Reported by Avinash Vallarapu.)
    • Free no longer needed remotes so they do not timeout during restore. (Reviewed by Cynthia Shang. Reported by Francisco Miguel Biete.)
    • Fix help when a valid option is invalid for the specified command. (Reviewed by Stefan Fercot. Reported by Cynthia Shang.)

    Features:

    • Add PostgreSQL 14 support. (Reviewed by Cynthia Shang.)
    • Add automatic GCS authentication for GCE instances. (Reviewed by Jan Wieck, Daniel Farina.)
    • Add repo-retention-history option to expire backup history. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang, David Steele.)
    • Add db-exclude option. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang.)

    Improvements:

    • Change archive expiration logging from detail to info level. (Contributed by Cynthia Shang. Reviewed by David Steele.)
    • Remove stanza archive spool path on restore. (Reviewed by Cynthia Shang, Stefan Fercot.)
    • Do not write files atomically or sync paths during backup copy. (Reviewed by Stephen Frost, Stefan Fercot, Cynthia Shang.)

    Documentation Improvements:

    • Update contributing documentation. (Contributed by Cynthia Shang. Reviewed by David Steele, Stefan Fercot.)
    • Consolidate RHEL/CentOS user guide into a single document. (Reviewed by Cynthia Shang.)
    • Clarify that repo-s3-role is not an ARN. (Contributed by Isaac Yuen. Reviewed by David Steele.)
  • release/2.33(Apr 5, 2021)

    Bug Fixes:

    • Fix option warnings breaking async archive-get/archive-push. (Reviewed by Cynthia Shang. Reported by Lev Kokotov.)
    • Fix memory leak in backup during archive copy. (Reviewed by Cynthia Shang. Reported by Christian ROUX, Efremov Egor.)
    • Fix stack overflow in cipher passphrase generation. (Reviewed by Cynthia Shang. Reported by bsiara.)
    • Fix repo-ls / on S3 repositories. (Reviewed by Cynthia Shang. Reported by Lesovsky Alexey.)

    Features:

    • Multiple repository support. (Contributed by Cynthia Shang, David Steele. Reviewed by Stefan Fercot, Stephen Frost.)
    • GCS support for repository storage. (Reviewed by Cynthia Shang.)
    • Add archive-header-check option. (Reviewed by Stephen Frost, Cynthia Shang. Suggested by Hans-Jürgen Schönig.)

    Improvements:

    • Include recreated system databases during selective restore. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang.)
    • Exclude content-length from S3 signed headers. (Reviewed by Cynthia Shang. Suggested by Brian P Bockelman.)
    • Consolidate less commonly used repository storage options. (Reviewed by Cynthia Shang.)
    • Allow custom config-path default with ./configure --with-configdir. (Contributed by Michael Schout. Reviewed by David Steele.)
    • Log archive copy during backup. (Reviewed by Cynthia Shang, Stefan Fercot.)

    Documentation Improvements:

    • Update reference to include links to user guide examples. (Contributed by Cynthia Shang. Reviewed by David Steele.)
    • Update selective restore documentation with caveats. (Reviewed by Cynthia Shang, Stefan Fercot.)
    • Add compress-type clarification to archive-copy documentation. (Reviewed by Cynthia Shang, Stefan Fercot.)
    • Add compress-level defaults per compress-type value. (Contributed by Cynthia Shang. Reviewed by David Steele.)
    • Add note about required NFS settings being the same as PostgreSQL. (Contributed by Cynthia Shang. Reviewed by David Steele.)
  • release/2.32(Feb 8, 2021)

    Bug Fixes:

    • Fix resume after partial delete of backup by prior resume. (Reviewed by Cynthia Shang. Reported by Tom Swartz.)

    Features:

    • Add repo-ls command. (Reviewed by Cynthia Shang, Stefan Fercot.)
    • Add repo-get command. (Contributed by Stefan Fercot, David Steele. Reviewed by Cynthia Shang.)
    • Add archive-mode-check option. (Contributed by Stefan Fercot. Reviewed by David Steele, Michael Banck.)

    Improvements:

    • Improve archive-get performance. (Reviewed by Cynthia Shang.)
  • release/2.31(Dec 7, 2020)

    Bug Fixes:

    • Allow [, #, and space as the first character in database names. (Reviewed by Stefan Fercot, Cynthia Shang. Reported by Jefferson Alexandre.)
    • Create standby.signal only on PostgreSQL 12 when restore type is standby. (Fixed by Stefan Fercot. Reviewed by David Steele. Reported by Keith Fiske.)

    Features:

    • Expire history files. (Contributed by Stefan Fercot. Reviewed by David Steele.)
    • Report page checksum errors in info command text output. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang.)
    • Add repo-azure-endpoint option. (Reviewed by Cynthia Shang, Brian Peterson. Suggested by Brian Peterson.)
    • Add pg-database option. (Reviewed by Cynthia Shang.)

    Improvements:

    • Improve info command output when a stanza is specified but missing. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang, David Steele. Suggested by uspen.)
    • Improve performance of large file lists in backup/restore commands. (Reviewed by Cynthia Shang, Oscar.)
    • Add retries to PostgreSQL sleep when starting a backup. (Reviewed by Cynthia Shang. Suggested by Vitaliy Kukharik.)

    Documentation Improvements:

    • Replace RHEL/CentOS 6 documentation with RHEL/CentOS 8.
  • release/2.30(Oct 5, 2020)

    Bug Fixes:

    • Error with hints when backup user cannot read pg_settings. (Reviewed by Stefan Fercot, Cynthia Shang. Reported by Mohamed Insaf K.)

    Features:

    • PostgreSQL 13 support. (Reviewed by Cynthia Shang.)

    Improvements:

    • Improve PostgreSQL version identification. (Reviewed by Cynthia Shang, Stephen Frost.)
    • Improve working directory error message. (Reviewed by Stefan Fercot.)
    • Add hint about starting the stanza when WAL segment not found. (Contributed by David Christensen. Reviewed by David Steele.)
    • Add hint for protocol version mismatch. (Reviewed by Cynthia Shang. Suggested by loop-evgeny.)

    Documentation Improvements:

    • Add note that pgBackRest versions must match when running remotely. (Reviewed by Cynthia Shang. Suggested by loop-evgeny.)
    • Move info command text to the reference and link to user guide. (Reviewed by Cynthia Shang. Suggested by Christophe Courtois.)
    • Update yum repository path for CentOS/RHEL user guide. (Contributed by Heath Lord. Reviewed by David Steele.)
  • release/2.29(Aug 31, 2020)

    Bug Fixes:

    • Suppress errors when closing local/remote processes. Since the command has completed it is counterproductive to throw an error but still warn to indicate that something unusual happened. (Reviewed by Cynthia Shang. Reported by argdenis.)
    • Fix issue with = character in file or database names. (Reviewed by Bastian Wegge, Cynthia Shang. Reported by Brad Nicholson, Bastian Wegge.)

    Features:

    • Automatically retrieve temporary S3 credentials on AWS instances. (Contributed by David Steele, Stephen Frost. Reviewed by Cynthia Shang, David Youatt, Aleš Zelený, Jeanette Bromage.)
    • Add archive-mode option to disable archiving on restore. (Reviewed by Stephen Frost. Suggested by Stephen Frost.)

    Improvements:

    • PostgreSQL 13 beta3 support. Changes to the control/catalog/WAL versions in subsequent betas may break compatibility but pgBackRest will be updated with each release to keep pace.
    • Asynchronous list/remove for S3/Azure storage. (Reviewed by Cynthia Shang, Stephen Frost.)
    • Improve memory usage of unlogged relation detection in manifest build. (Reviewed by Cynthia Shang, Stephen Frost, Brad Nicholson, Oscar. Suggested by Oscar, Brad Nicholson.)
    • Proactively close file descriptors after forking async process. (Reviewed by Stephen Frost, Cynthia Shang.)
    • Delay backup remote connection close until after archive check. (Contributed by Floris van Nee. Reviewed by David Steele.)
    • Improve detailed error output. (Reviewed by Cynthia Shang.)
    • Improve TLS error reporting. (Reviewed by Cynthia Shang, Stephen Frost.)

    Documentation Bug Fixes:

    • Add none to compress-type option reference and fix example. (Reported by Ugo Bellavance, Don Seiler.)
    • Add missing azure type in repo-type option reference. (Fixed by Don Seiler. Reviewed by David Steele.)
    • Fix typo in repo-cipher-type option reference. (Fixed by Don Seiler. Reviewed by David Steele.)

    Documentation Improvements:

    • Clarify that expire must be run regularly when expire-auto is disabled. (Reviewed by Douglas J Hunley. Suggested by Douglas J Hunley.)
  • release/2.28(Jul 20, 2020)

    Bug Fixes:

    • Fix restore --force acting like --force --delta. This caused restore to replace files based on timestamp and size rather than overwriting, which meant some files that should have been updated were left unchanged. Normal restore and restore --delta were not affected by this issue. (Reviewed by Cynthia Shang.)

    Features:

    • Azure support for repository storage. (Reviewed by Cynthia Shang, Don Seiler.)
    • Add expire-auto option. This allows automatic expiration after a successful backup to be disabled. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang, David Steele.)

    Improvements:

    • Asynchronous S3 multipart upload. (Reviewed by Stephen Frost.)
    • Automatic retry for backup, restore, archive-get, and archive-push. (Reviewed by Cynthia Shang.)
    • Disable query parallelism in PostgreSQL sessions used for backup control. (Reviewed by Stefan Fercot.)
    • PostgreSQL 13 beta2 support. Changes to the control/catalog/WAL versions in subsequent betas may break compatibility but pgBackRest will be updated with each release to keep pace.
    • Improve handling of invalid HTTP response status. (Reviewed by Cynthia Shang.)
    • Improve error when pg1-path option missing for archive-get command. (Reviewed by Cynthia Shang.)
    • Add hint when checksum delta is enabled after a timeline switch. (Reviewed by Matt Bunter, Cynthia Shang.)
    • Use PostgreSQL instead of postmaster where appropriate. (Reviewed by Cynthia Shang.)

    Documentation Bug Fixes:

    • Fix incorrect example for repo-retention-full-type option. (Reported by Hüseyin Sönmez.)
    • Remove internal commands from HTML and man command references. (Reported by Cynthia Shang.)

    Documentation Improvements:

    • Update PostgreSQL versions used to build user guides. Also add version ranges to indicate that a user guide is accurate for a range of PostgreSQL versions even if it was built for a specific version. (Reviewed by Stephen Frost.)
    • Update FAQ for expiring a specific backup set. (Contributed by Cynthia Shang. Reviewed by David Steele.)
    • Update FAQ to clarify default PITR behavior. (Contributed by Cynthia Shang. Reviewed by David Steele.)
  • release/2.27(May 26, 2020)

    Bug Fixes:

    • Fix issue checking if file links are contained in path links. (Reviewed by Cynthia Shang. Reported by Christophe Cavallié.)
    • Allow pg1-path to be optional for synchronous archive-push. (Reviewed by Cynthia Shang. Reported by Jerome Peng.)
    • The expire command now checks if a stop file is present. (Fixed by Cynthia Shang. Reviewed by David Steele.)
    • Handle missing reason phrase in HTTP response. (Reviewed by Cynthia Shang. Reported by Tenuun.)
    • Increase buffer size for lz4 compression flush. (Reviewed by Cynthia Shang. Reported by Eric Radman.)
    • Ignore pg-host* and repo-host* options for the remote command. (Reviewed by Cynthia Shang. Reported by Pavel Suderevsky.)
    • Fix possibly missing pg1-* options for the remote command. (Reviewed by Cynthia Shang. Reported by Andrew L'Ecuyer.)

    Features:

    • Time-based retention for full backups. The --repo-retention-full-type option allows retention of full backups based on a time period, specified in days. (Contributed by Cynthia Shang, Pierre Ducroquet. Reviewed by David Steele.)
    • Ad hoc backup expiration. Allow the user to remove a specified backup regardless of retention settings. (Contributed by Cynthia Shang. Reviewed by David Steele.)
    • Zstandard compression support. Note that setting compress-type=zst will make new backups and archive incompatible (unrestorable) with prior versions of pgBackRest. (Reviewed by Cynthia Shang.)
    • bzip2 compression support. Note that setting compress-type=bz2 will make new backups and archive incompatible (unrestorable) with prior versions of pgBackRest. (Contributed by Stephen Frost. Reviewed by David Steele, Cynthia Shang.)
    • Add backup/expire running status to the info command. (Contributed by Stefan Fercot. Reviewed by David Steele.)
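The time-based retention feature listed above can be illustrated with a simplified sketch. This is not pgBackRest's implementation. The function name `expired_full_backups` is hypothetical, and the sketch assumes that the newest full backup older than the retention period is always kept so that the whole period remains restorable.

```python
from datetime import datetime, timedelta
from typing import List


def expired_full_backups(backup_times: List[datetime],
                         now: datetime,
                         retention_days: int) -> List[datetime]:
    """Return the full backups that a time-based policy would expire.

    Hypothetical helper: backups at or before the cutoff are candidates, but
    the newest candidate is kept (assumption) so that any point inside the
    retention period can still be restored.
    """
    cutoff = now - timedelta(days=retention_days)
    older_than_cutoff = sorted(t for t in backup_times if t <= cutoff)
    # Drop everything except the most recent backup that predates the cutoff.
    return older_than_cutoff[:-1]


# Usage: four full backups taken 40, 20, 10, and 3 days ago, 7-day retention.
now = datetime(2020, 5, 26)
backups = [now - timedelta(days=age) for age in (40, 20, 10, 3)]
for expired in expired_full_backups(backups, now, retention_days=7):
    print("expire", expired.date())
```

With these inputs the 40-day-old and 20-day-old backups are selected for expiration, while the 10-day-old backup is kept as the anchor for the 7-day period.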

    Improvements:

    • Expire WAL archive only when repo-retention-archive threshold is met. WAL prior to the first full backup was previously expired after the first full backup. Now it is preserved according to retention settings. (Contributed by Cynthia Shang. Reviewed by David Steele.)
    • Add local MD5 implementation so S3 works when FIPS is enabled. (Reviewed by Cynthia Shang, Stephen Frost. Suggested by Brian Almeida, John Kelley.)
    • PostgreSQL 13 beta1 support. Changes to the control/catalog/WAL versions in subsequent betas may break compatibility but pgBackRest will be updated with each release to keep pace. (Reviewed by Cynthia Shang.)
    • Reduce buffer-size default to 1MiB. (Reviewed by Stephen Frost.)
    • Throw user-friendly error if expire is not run on repository host. (Contributed by Cynthia Shang. Reviewed by David Steele.)
  • release/2.26(Apr 20, 2020)

    Bug Fixes:

    • Remove empty subexpression from manifest regular expression. macOS did not accept the empty subexpression, though other platforms seemed to work fine. (Fixed by David Raftis.)

    Improvements:

    • Non-blocking TLS implementation. (Reviewed by Slava Moudry, Cynthia Shang, Stephen Frost.)
    • Only limit backup copy size for WAL-logged files. The prior behavior could possibly lead to postgresql.conf or postgresql.auto.conf being truncated in the backup. (Reviewed by Cynthia Shang.)
    • TCP keep-alive options are configurable. (Suggested by Marc Cousin.)
    • Add io-timeout option.
  • release/2.25(Mar 26, 2020)

    Features:

    • Add lz4 compression support. Note that setting compress-type=lz4 will make new backups and archive incompatible (unrestorable) with prior versions of pgBackRest. (Reviewed by Cynthia Shang.)
    • Add --dry-run option to the expire command. Use dry-run to see which backups/archive would be removed by the expire command without actually removing anything. (Contributed by Cynthia Shang, Luca Ferrari.)

    Improvements:

    • Improve performance of remote manifest build. (Suggested by Jens Wilke.)
    • Fix detection of keepalive options on Linux. (Contributed by Marc Cousin.)
    • Add configure host detection to set standards flags correctly. (Contributed by Marc Cousin.)
    • Remove compress/compress-level options from commands where unused. These commands (e.g. restore, archive-get) never used the compress options but allowed them to be passed on the command line. Now they will error when these options are passed on the command line. If these errors occur then remove the unused options. (Reviewed by Cynthia Shang.)
    • Limit backup file copy size to size reported at backup start. If a file grows during the backup it will be reconstructed by WAL replay during recovery so there is no need to copy the additional data. (Reviewed by Cynthia Shang.)
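
    The copy-size limit described above can be sketched as follows. This is a minimal illustration and not pgBackRest's implementation: the function name copy_limited, its parameters, and the in-memory streams are hypothetical, and the only behavior taken from the release note is that bytes beyond the size recorded at backup start are not copied.

```python
import io


def copy_limited(src, dst, size_at_start, chunk_size=65536):
    """Copy at most size_at_start bytes from src to dst and return the count."""
    remaining = size_at_start
    copied = 0
    while remaining > 0:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:
            # The source ended before the recorded size was reached.
            break
        dst.write(chunk)
        copied += len(chunk)
        remaining -= len(chunk)
    return copied


# Hypothetical case: the file was 8 bytes at backup start and later grew to 12.
source = io.BytesIO(b"AAAABBBBCCCC")
target = io.BytesIO()
copied = copy_limited(source, target, size_at_start=8)
```

    The trailing 4 bytes are left out of the copy because, per the note above, WAL replay reconstructs them during recovery.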
  • release/2.24(Feb 26, 2020)

    Bug Fixes:

    • Prevent defunct processes in asynchronous archive commands. (Reviewed by Stephen Frost. Reported by Adam Brusselback, ejberdecia.)
    • Error when archive-get/archive-push/restore are not run on a PostgreSQL host. (Reviewed by Stephen Frost. Reported by Jesper St John.)
    • Read HTTP content to eof when size/encoding not specified. (Reviewed by Cynthia Shang. Reported by Christian ROUX.)
    • Fix resume when the resumable backup was created by Perl. In this case the resumable backup should be ignored, but the C code was not able to load the partial manifest written by Perl since the format differs slightly. Add validations to catch this case and continue gracefully. (Reported by Kacey Holston.)

    Features:

    • Auto-select backup set on restore when time target is specified. Auto-selection is performed only when --set is not specified. If a backup set for the given target time cannot be found, the latest (default) backup set will be used. (Contributed by Cynthia Shang.)
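
    A rough sketch of the selection logic, under stated assumptions: the rule "pick the newest backup that stopped at or before the target time" is an assumption about how a matching set is found, and the function name, tuple layout, and backup labels are hypothetical. Only the fallback to the latest backup set comes from the note above.

```python
from datetime import datetime


def select_backup_set(backups, target_time):
    """Pick a backup label for a time-targeted restore.

    backups is a list of (label, stop_time) tuples sorted oldest to newest.
    """
    candidates = [label for label, stop_time in backups if stop_time <= target_time]
    if candidates:
        # Assumed rule: the newest backup that completed before the target.
        return candidates[-1]
    # No backup precedes the target, so fall back to the latest set.
    return backups[-1][0]


# Hypothetical backup labels and stop times.
backups = [
    ("20200201-000000F", datetime(2020, 2, 1, 0, 30)),
    ("20200208-000000F", datetime(2020, 2, 8, 0, 30)),
    ("20200215-000000F", datetime(2020, 2, 15, 0, 30)),
]

chosen = select_backup_set(backups, datetime(2020, 2, 10, 12, 0))
fallback = select_backup_set(backups, datetime(2020, 1, 1, 0, 0))
```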

    Improvements:

    • Skip pg_internal.init temp file during backup. (Reviewed by Cynthia Shang. Suggested by Michael Paquier.)
    • Add more validations to the manifest on backup. (Reviewed by Cynthia Shang.)

    Documentation Improvements:

    • Prevent lock-bot from adding comments to locked issues. (Suggested by Christoph Berg.)
  • release/2.23(Jan 27, 2020)

    Bug Fixes:

    • Fix missing files corrupting the manifest. If a file was removed by PostgreSQL during the backup (or was missing from the standby) then the next file might not be copied and updated in the manifest. If this happened then the backup would error when restored. (Reviewed by Cynthia Shang. Reported by Vitaliy Kukharik.)

    Improvements:

    • Use pkg-config instead of xml2-config for libxml2 build options. (Contributed by David Steele, Adrian Vondendriesch.)
    • Validate checksums are set in the manifest on backup/restore. (Reviewed by Cynthia Shang.)
  • release/2.22(Jan 21, 2020)

    Bug Fixes:

    • Fix error in timeline conversion. The timeline is required to verify WAL segments in the archive after a backup. The conversion was performed base 10 instead of 16, which led to errors when the timeline was ≥ 0xA. (Reported by Lukas Ertl, Eric Veldhuyzen.)
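
    The effect of the wrong base can be illustrated with a short sketch. The function name is hypothetical and this is not the pgBackRest code. It relies only on the fact that the first 8 characters of a WAL segment name are the timeline in hexadecimal.

```python
TIMELINE_HEX_DIGITS = 8


def timeline_from_wal_segment(segment_name, base=16):
    """Return the timeline encoded in the first 8 characters of a WAL segment name."""
    return int(segment_name[:TIMELINE_HEX_DIGITS], base)


# Timeline 16 is written as 00000010 in a WAL segment name.
segment = "00000010" + "00000000" + "00000001"

correct = timeline_from_wal_segment(segment)         # parsed as base 16
wrong = timeline_from_wal_segment(segment, base=10)  # silently misread as 10

# Timeline 10 is written as 0000000A, which a base 10 parse rejects outright.
try:
    timeline_from_wal_segment("0000000A" + "00000000" + "00000001", base=10)
    base10_failed = False
except ValueError:
    base10_failed = True
```

    Timelines below 0xA contain only the digits 0 to 9 and have the same value in both bases, which is why the problem only appeared at timeline 0xA and above.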
  • release/2.21(Jan 15, 2020)

    Bug Fixes:

    • Fix options being ignored by asynchronous commands. The asynchronous archive-get/archive-push processes were not loading options configured in command configuration sections, e.g. [global:archive-get]. (Reviewed by Cynthia Shang. Reported by Urs Kramer.)
    • Fix handling of \ in filenames. \ was not being properly escaped when calculating the manifest checksum which prevented the manifest from loading. Since instances of \ in cluster filenames should be rare to nonexistent this does not seem likely to be a serious problem in the field.

    Features:

    • pgBackRest is now pure C.
    • Add pg-user option. Specifies the database user name when connecting to PostgreSQL. If not specified pgBackRest will connect with the local OS user or PGUSER, which was the previous behavior. (Contributed by Mike Palmiotto.)
    • Allow path-style URIs in S3 driver.

    Improvements:

    • The backup command is implemented entirely in C. (Reviewed by Cynthia Shang.)
  • release/2.20(Dec 12, 2019)

    Bug Fixes:

    • Fix archive-push/archive-get when PGDATA is symlinked. These commands tried to use cwd() as PGDATA but this would disagree with the path configured in pgBackRest if PGDATA was symlinked. If cwd() does not match the pgBackRest path then chdir() to the path and make sure the next cwd() matches the result from the first call. (Reported by Stephen Frost, Milosz Suchy.)
    • Fix reference list when backup.info is reconstructed in expire command. Since the backup command is still using the Perl version of reconstruct, this issue will not manifest unless 1) there is a backup missing from backup.info and 2) the expire command is run directly instead of running after backup as usual. This unlikely combination of events means this is probably not a problem in the field.
    • Fix segfault on unexpected EOF in gzip decompression. (Reported by Stephen Frost.)
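
    The cwd()/chdir() check described in the PGDATA symlink fix above can be sketched as follows, assuming a POSIX system where symlinks can be created. The function name resolve_pgdata and the temporary directory layout are hypothetical. The actual fix is in pgBackRest's own code, not Python.

```python
import os
import tempfile


def resolve_pgdata(configured_path):
    """Check that the current directory is the configured PGDATA, even via a symlink."""
    first_cwd = os.getcwd()
    if first_cwd == configured_path:
        return configured_path
    # The paths differ as strings, so enter the configured path and compare again.
    os.chdir(configured_path)
    if os.getcwd() != first_cwd:
        raise RuntimeError("current directory is not the configured PGDATA")
    return configured_path


# Hypothetical layout: the configured path is a symlink to the real data directory.
original_cwd = os.getcwd()
base = os.path.realpath(tempfile.mkdtemp())
real_pgdata = os.path.join(base, "real_pgdata")
linked_pgdata = os.path.join(base, "pgdata_link")
os.mkdir(real_pgdata)
os.symlink(real_pgdata, linked_pgdata)

os.chdir(real_pgdata)                   # the server runs inside the real directory
resolved = resolve_pgdata(linked_pgdata)
cwd_after = os.getcwd()                 # still reported as the real directory
os.chdir(original_cwd)
```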
  • release/2.19(Nov 12, 2019)

    Bug Fixes:

    • Fix remote timeout in delta restore. When performing a delta restore on a largely unchanged cluster the remote could time out if no files were fetched from the repository within protocol-timeout. Add keep-alives to prevent remote timeout. (Reported by James Sewell, Jens Wilke.)
    • Fix handling of repeated HTTP headers. When HTTP headers are repeated they should be considered equivalent to a single comma-separated header rather than generating an error, which was the prior behavior. (Reported by donicrosby.)
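
    A minimal sketch of the repeated-header handling described above, with a hypothetical function name and sample headers. It folds repeated headers into one comma-separated value instead of raising an error, and treats header names as case-insensitive as HTTP requires.

```python
def merge_headers(header_pairs):
    """Combine repeated HTTP headers into single comma-separated values."""
    merged = {}
    for name, value in header_pairs:
        key = name.lower()  # header names are case-insensitive
        if key in merged:
            merged[key] = merged[key] + ", " + value
        else:
            merged[key] = value
    return merged


# Hypothetical response with a repeated Cache-Control header.
headers = merge_headers([
    ("Cache-Control", "no-cache"),
    ("cache-control", "no-store"),
    ("Content-Length", "0"),
])
```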

    Improvements:

    • JSON output from the info command is no longer pretty-printed. Monitoring systems can more easily ingest the JSON without linefeeds. External tools such as jq can be used to pretty-print if desired. (Contributed by Cynthia Shang.)
    • The check command is implemented entirely in C. (Contributed by Cynthia Shang.)
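
    For tooling that still wants readable output, single-line JSON can be re-indented after parsing, much as jq would do. The sample below is a hypothetical, heavily trimmed stand-in for info output. Its field names are assumptions, not the actual schema.

```python
import json

# Hypothetical single-line JSON, similar in spirit to compact info output.
compact = '[{"name":"demo","status":{"code":0,"message":"ok"}}]'

info = json.loads(compact)           # monitoring systems can parse this directly
pretty = json.dumps(info, indent=4)  # re-indent only when a human needs to read it
```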

    Documentation Improvements:

    • Document how to contribute to pgBackRest. (Contributed by Cynthia Shang.)
    • Document maximum version for auto-stop option. (Contributed by Brad Nicholson.)

    Test Suite Improvements:

    • Fix container test path being used when --vm=none. (Suggested by Stephen Frost.)
    • Fix mismatched timezone in expect test. (Suggested by Stephen Frost.)
    • Don't autogenerate embedded libc code by default. (Suggested by Stephen Frost.)
  • release/2.18(Oct 1, 2019)

    Features:

    • PostgreSQL 12 support.
    • Add info command set option for detailed text output. The additional details include databases that can be used for selective restore and a list of tablespaces and symlinks with their default destinations. (Contributed by Cynthia Shang. Suggested by Stephen Frost, ejberdecia.)
    • Add standby restore type. This restore type automatically adds standby_mode=on to recovery.conf for PostgreSQL < 12 and creates standby.signal for PostgreSQL ≥ 12, creating a common interface between PostgreSQL versions. (Reviewed by Cynthia Shang.)
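
    The version split behind the standby restore type can be sketched as a small decision function. The function name and return shape are hypothetical. The two outcomes are the ones stated in the note above.

```python
def standby_restore_action(pg_version):
    """Return (file, content) describing how standby mode is enabled for a version."""
    if pg_version < 12:
        # Before PostgreSQL 12, standby mode is a setting in recovery.conf.
        return ("recovery.conf", "standby_mode=on")
    # From PostgreSQL 12 on, an empty standby.signal file is created instead.
    return ("standby.signal", "")


old_style = standby_restore_action(9.6)
new_style = standby_restore_action(12)
```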

    Improvements:

    • The restore command is implemented entirely in C. (Reviewed by Cynthia Shang.)

    Documentation Improvements:

    • Document the relationship between db-timeout and protocol-timeout. (Contributed by Cynthia Shang. Suggested by James Chanco Jr.)
    • Add documentation clarifications regarding standby repositories. (Contributed by Cynthia Shang.)
    • Add FAQ for time-based Point-in-Time Recovery. (Contributed by Cynthia Shang.)
  • release/2.17(Sep 3, 2019)

    Bug Fixes:

    • Improve slow manifest build for very large quantities of tables/segments. (Reported by Jens Wilke.)
    • Fix exclusions for special files. (Reported by CluelessTechnologist, Janis Puris, Rachid Broum.)

    Improvements:

    • The stanza-create/update/delete commands are implemented entirely in C. (Contributed by Cynthia Shang.)
    • The start/stop commands are implemented entirely in C. (Contributed by Cynthia Shang.)
    • Create log directories/files with 0750/0640 mode. (Suggested by Damiano Albani.)

    Documentation Bug Fixes:

    • Fix yum.p.o package being installed when custom package specified. (Reported by Joe Ayers, John Harvey.)

    Documentation Improvements:

    • Build pgBackRest as an unprivileged user. (Suggested by Laurenz Albe.)
  • release/2.16(Aug 5, 2019)

    Bug Fixes:

    • Retry S3 RequestTimeTooSkewed errors instead of immediately terminating. (Reported by sean0101n, Tim Garton, Jesper St John, Aleš Zelený.)
    • Fix incorrect handling of transfer-encoding response to HEAD request. (Reported by Pavel Suderevsky.)
    • Fix scoping violations exposed by optimizations in gcc 9. (Reported by Christian Lange, Ned T. Crigler.)

    Features:

    • Add repo-s3-port option for setting a non-standard S3 service port.

    Improvements:

    • The local command for backup is implemented entirely in C. (Contributed by David Steele, Cynthia Shang.)
    • The check command is implemented partly in C. (Reviewed by Cynthia Shang.)
  • release/2.15.1(Jun 27, 2019)

    Bug Fixes:

    • Fix archive retention expiring too aggressively. (Fixed by Cynthia Shang. Reported by Mohamad El-Rifai.)

    Improvements:

    • The expire command is implemented entirely in C. (Contributed by Cynthia Shang.)
    • The local command for restore is implemented entirely in C.
    • Remove hard-coded PostgreSQL user so $PGUSER works. (Suggested by Julian Zhang, Janis Puris.)
    • Honor configure --prefix option. (Suggested by Daniel Westermann.)
    • Rename repo-s3-verify-ssl option to repo-s3-verify-tls. The new name is preferred because pgBackRest does not support any SSL protocol versions (they are all considered to be insecure). The old name will continue to be accepted.

    Documentation Improvements:

    • Add FAQ to the documentation. (Contributed by Cynthia Shang.)
    • Use wal_level=replica in the documentation for PostgreSQL ≥ 9.6. (Suggested by Patrick McLaughlin.)
  • release/2.14(May 20, 2019)

    Bug Fixes:

    • Fix segfault when process-max > 8 for archive-push/archive-get. (Reported by Jens Wilke.)

    Improvements:

    • Bypass database checks when stanza-delete issued with force. (Contributed by Cynthia Shang. Suggested by hatifnatt.)
    • Add configure script for improved multi-platform support.

    Documentation Features:

    • Add user guides for CentOS/RHEL 6/7.
  • release/2.13(Apr 19, 2019)

    Bug Fixes:

    • Fix zero-length reads causing problems for IO filters that did not expect them. (Reported by brunre01, jwpit, Tomasz Kontusz, guruguruguru.)
    • Fix reliability of error reporting from local/remote processes.
    • Fix Posix/CIFS error messages reporting the wrong filename on write/sync/close.
  • release/2.12(Apr 11, 2019)

    IMPORTANT NOTE: The new TLS/SSL implementation forbids dots in S3 bucket names per RFC 2818. This security fix is required for compliant hostname verification.

    Bug Fixes:

    • Fix issues when a path option is / terminated. (Reported by Marc Cousin.)
    • Fix issues when log-level-file=off is set for the archive-get command. (Reported by Brad Nicholson.)
    • Fix C code to recognize host:port option format like Perl does. (Reported by Kyle Nevins.)
    • Fix issues with remote/local command logging options.

    Improvements:

    • The archive-push command is implemented entirely in C.
    • Increase process-max limit to 999. (Suggested by Rakshitha-BR.)
    • Improve error message when an S3 bucket name contains dots.

    Documentation Improvements:

    • Clarify that S3-compatible object stores are supported. (Suggested by Magnus Hagander.)
  • release/2.11(Mar 11, 2019)

    Bug Fixes:

    • Fix possible truncated WAL segments when an error occurs mid-write. (Reported by blogh.)
    • Fix info command missing WAL min/max when stanza specified. (Fixed by Stefan Fercot.)
    • Fix non-compliant JSON for options passed from C to Perl. (Reported by Leo Khomenko.)

    Improvements:

    • The archive-get command is implemented entirely in C.
    • Enable socket keep-alive on older Perl versions. (Contributed by Marc Cousin.)
    • Error when parameters are passed to a command that does not accept parameters. (Suggested by Jason O'Donnell.)
    • Add hints when unable to find a WAL segment in the archive. (Suggested by Hans-Jürgen Schönig.)
    • Improve error when hostname cannot be found in a certificate. (Suggested by James Badger.)
    • Add additional options to backup.manifest for debugging purposes. (Contributed by blogh.)

    Documentation Improvements:

    • Update default documentation version to PostgreSQL 10.
  • release/2.10(Feb 9, 2019)

    Bug Fixes:

    • Add unimplemented S3 driver method required for archive-get. (Reported by mibiio.)
    • Fix check for improperly configured pg-path. (Reported by James Chanco Jr.)