=head1 NAME Pg::Reindex - rebuild postgresql indexes concurrently without locking. =head1 VERSION Version 0.01 =head1 SYNOPSIS =head3 Use as a module use Pg::Reindex qw(prepare rebuild); prepare($dbh, \@namespaces, \@tables, \@indexes); rebuild($dbh, \%options, $dryrun); =head3 Run as a perl script perl `perldoc -l Pg::Reindex` \ [--help] \ [--server=localhost] \ [--port=5432] \ [--user=postgres] \ [--password=PASSWORD] \ [--table=TABLE] ... \ [--namespace=NAMESPACE] ... \ [--index=INDEX] ... \ [--[no]validate] \ [--high_txn_lag=BYTES] \ [--log_txn_lag=BYTES] \ [--[no]dryrun] \ [prepare|continue] =head1 DESCRIPTION Postgresql indexes should be rebuilt on a regular basis for good performance. This can be done with the C<REINDEX> command, however, building indexes this way requires an exclusive lock on the table. On the other hand, using C<CREATE INDEX CONCURRENTLY> avoids this lock. C<Pg::Reindex> builds new indexes using C<CREATE INDEX CONCURRENTLY>. Then it starts a transaction for each index in which it drops the old index and renames the new one. It handles normal indexes and C<PRIMARY KEY>, C<FOREIGN KEY> and C<UNIQUE> constraints. =head2 Streaming replication and throttling Before creating the next index, the streaming replication lag is checked to be below a certain limit. If so, nothing special happens and the index is built. Otherwise, C<rebuild> waits for the replicas to catch up. When the lag drops under a second limit, the C<rebuild> does not immediately continue. Instead it waits for another 30 seconds and checks the lag every second within that period. Only if the lag stays below the limit for the whole time, execution is continued. This grace period is to deal with the fact that a wal sender process may suddenly disappear and reappear after a few seconds. Without the grace period the program may encounter a false drop below the limit and hence continue. For large indexes this adds a lot of lag. =head1 USING AS A MODULE To use Pg::Reindex as a module, first you need to load the Pg::Reindex module: use Pg::Reindex qw(prepare rebuild); use strict; (The C<use strict;> isn't required but is strongly recommended.) Then you need to L</prepare> the indexes that you want rebuilt. You can filter by combinations of namespace, tables, and indexes. prepare($dbh, \@opt_namespaces,\@opt_tables, \@opt_indexes); After "preparing" the set of indexes to be rebuilt, then you rebuild them: rebuild( $dbh, { ThrottleOn => 10000000, ThrottleOff => 100000, Validate => 1 }, $opt_dryrun); =head2 SUBROUTINES/METHODS =head3 prepare C<prepare> determines the list of indexes that would be re-indexed, and sets up the data structures used by C<rebuild>. C<prepare> must be called before C<rebuild> is called. C<prepare> creates a new schema named C<reindex> with 2 tables, C<worklist> and C<log>. C<Worklist> is created as C<UNLOGGED> table. C<prepare> saves information on all indexes that need to be rebuilt to C<worklist>. The information in C<worklist> is used by C<rebuild>. =over =item $dbh DBI database handle to the database whose indexes are to be reindexed. =item \@namespaces Rebuild only indexes in the C<namespaces>. If C<namespaces> is empty, indexes in all namespaces except the following are considered: those beginning with C<pg_>, in C<information_schema>i, or are C<sequences> namespaces. =item \@tables Rebuild only indexes that belong to the specified tables. =item \@indexes List of indexes to reindex. =back If C<tables>, C<namespaces> and C<indexes> are given simultaneously, only indexes satisfying all conditions are considered. =head3 rebuild =over =item $dbh DBI database handle to the database whose indexes are to be reindexed. =item \%options ThrottleOn ThrottleOff Validate =item $dryrun =back =head1 USING AS A PERL SCRIPT To use Pg::Reindex as a perl script you need to have perl run it. The command below would do that by using C<perldoc> to determine C<Pg::Reindex>'s location. perl `perldoc -l Pg::Reindex` \ [--help] \ [--server=localhost] \ [--port=5432] \ [--user=postgres] \ [--password=PASSWORD] \ [--table=TABLE] ... \ [--namespace=NAMESPACE] ... \ [--index=INDEX] ... \ [--[no]validate] \ [--high_txn_lag=BYTES] \ [--log_txn_lag=BYTES] \ [--[no]dryrun] \ [prepare|continue] =head2 OPTIONS Options can be abbreviated. =over 4 =item --server Hostname / IP address or directory path to use to connect to the Postgres server. If you want to use a local UNIX domain socket, specify the socket directory path. Default: localhost =item --port The port to connect to. Default: 5432 =item --user The user. Default: postgres =item --password a file name or open file descriptor where to read the password from. If the parameter value consists of only digits, it's evaluated as file descriptor. There is no default. A convenient way to specify the password on the BASH command line is reindex.pl --password=3 3<<<my_secret That way the password appears in F<.bash_history>. But that file is usually only readable to the owner. =item --table Reindex only indexes that belong to the specified table. This option can be given multiple times. If C<--table>, C<--namespace> and C<--index> are given simultaneously, only indexes satisfying all conditions are considered. =item --namespace Without this option only namespaces are considered that are not in beginning with C<pg_>. Also C<information_schema> or C<sequences> namespaces are omitted. If C<--table>, C<--namespace> and C<--index> are given simultaneously, only indexes satisfying all conditions are considered. =item --index If C<--table>, C<--namespace> and C<--index> are given simultaneously, only indexes satisfying all conditions are considered. =item --[no]validate validate C<FOREIGN KEY> constraints or leave them C<NOT VALID>. Default it to validate. =item --[no]dryrun don't modify the database but print the essential SQL statements. =item --high-txn-lag the upper limit streaming replicas may lag behind in bytes. Default is 10,000,000. =item --low-txn-lag the lower limit in bytes when execution may be continued after it has been interrupted due to exceeding C<high_txn_lag>. Default is 100,000 =item --help print this help =back =head1 AUTHOR BINARY, C<< <binary at cpan.org> >> =head1 BUGS Please report any bugs or feature requests to C<bug-pg-reindex at rt.cpan.org>, or through the web interface at L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Pg-Reindex>. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes. =head1 LICENSE AND COPYRIGHT Copyright 2015 Binary Ltd. This program is free software; you can redistribute it and/or modify it under the terms of the the Artistic License (2.0). You may obtain a copy of the full license at: L<http://www.perlfoundation.org/artistic_license_2_0> Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license. If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license. This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder. This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed. Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.