Q: How do I configure a slave if the master is already running and I do not want to stop it?
A: There are several options. If you have taken a backup of the master at some point and recorded the binlog name and offset (from the output of SHOW MASTER STATUS) corresponding to the snapshot, do the following:
Execute the following statement on the slave, filling in the appropriate values:
mysql> CHANGE MASTER TO
    ->     MASTER_HOST='master_host-name',
    ->     MASTER_USER='master_user_name',
    ->     MASTER_PASSWORD='master_pass',
    ->     MASTER_LOG_FILE='recorded_log_name',
    ->     MASTER_LOG_POS=recorded_log_pos;
Execute START SLAVE on the slave.
If you do not have a backup of the master already, here is a quick way to do it consistently:
FLUSH TABLES WITH READ LOCK
gtar zcf /tmp/backup.tar.gz /var/lib/mysql (or a variation of this)
SHOW MASTER STATUS - make sure to record the output - you will need it later
UNLOCK TABLES
An alternative is taking an SQL dump of the master instead of a binary copy like above; for this you can use mysqldump --master-data on your master and later run this SQL dump into your slave. However, this is slower than making a binary copy.
No matter which of the two methods you use, afterwards follow the instructions for the case when you have a snapshot and have recorded the log name and offset. You can use the same snapshot to set up several slaves. As long as the binary logs of the master are left intact, you can wait as long as several days or in some cases maybe a month to set up a slave once you have the snapshot of the master. In theory the waiting gap can be infinite. The two practical limitations are the disk space of the master getting filled with old logs, and the amount of time it will take the slave to catch up.
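If you script the slave setup, the CHANGE MASTER TO statement can be generated from the values you recorded. The following Python sketch only builds the statement text; the helper names quote() and change_master_sql() are hypothetical and not part of MySQL, and sending the statement to the slave is left to whatever client library you use.

```python
def quote(value):
    """Return value as a single-quoted SQL string literal.

    Backslashes and single quotes are backslash-escaped.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def change_master_sql(host, user, password, log_file, log_pos):
    """Build a CHANGE MASTER TO statement from recorded coordinates.

    log_file and log_pos are the File and Position values recorded from
    SHOW MASTER STATUS when the snapshot was taken.
    """
    log_pos = int(log_pos)
    if log_pos < 0:
        raise ValueError("log_pos must not be negative")
    return (
        "CHANGE MASTER TO "
        "MASTER_HOST=%s, MASTER_USER=%s, MASTER_PASSWORD=%s, "
        "MASTER_LOG_FILE=%s, MASTER_LOG_POS=%d"
        % (quote(host), quote(user), quote(password), quote(log_file), log_pos)
    )


if __name__ == "__main__":
    # Example values only; substitute the ones you recorded.
    print(change_master_sql("master_host-name", "master_user_name",
                            "master_pass", "recorded_log_name", 73))
```

Because the same snapshot can serve several slaves, the same generated statement can be reused on each of them, followed by START SLAVE.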
You can also use LOAD DATA FROM MASTER. This is a convenient command that takes a snapshot, restores it to the slave, and adjusts the log name and offset on the slave all at once. In the future, LOAD DATA FROM MASTER will be the recommended way to set up a slave. Be warned, however, that the read lock may be held for a long time if you use this command. It is not yet implemented as efficiently as we would like to have it. If you have large tables, the preferred method at this time is still with a local tar snapshot after executing FLUSH TABLES WITH READ LOCK.
Q: Does the slave need to be connected to the master all the time?
A: No, it does not. The slave can go down or stay disconnected for hours or even days, then reconnect and catch up on the updates. For example, you can set up a master/slave relationship over a dial-up link where the link is up only sporadically and for short periods of time. The implication of this is that at any given time the slave is not guaranteed to be in sync with the master unless you take some special measures. In the future, we will have the option to block the master until at least one slave is in sync.
Q: How do I know how late a slave is compared to the master? In other words, how do I know the date of the last query replicated by the slave?
A: If the slave is 4.1.1 or newer, read the Seconds_Behind_Master column in SHOW SLAVE STATUS. For older versions, the following applies.
This is possible only if the slave SQL thread exists (that is, if it shows up in SHOW PROCESSLIST; see section 6.3 Replication Implementation Details; in MySQL 3.23, if the slave thread exists, that is, shows up in SHOW PROCESSLIST), and if it has executed at least one event from the master. When the slave SQL thread executes an event read from the master, it sets its own time to the event's timestamp (this is why TIMESTAMP values are replicated correctly). So in the Time column in the output of SHOW PROCESSLIST, the number of seconds displayed for the slave SQL thread is the number of seconds between the timestamp of the last replicated event and the real time of the slave machine. You can use this to determine the date of the last replicated event. Note that if your slave has been disconnected from the master for one hour and then reconnects, you may immediately see Time values like 3600 for the slave SQL thread in SHOW PROCESSLIST. This is because the slave is executing queries that are one hour old.
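The conversion from the Time value to a date is a simple subtraction. As a minimal Python sketch (the function name is hypothetical, and reading the Time column from SHOW PROCESSLIST is assumed to have been done already by your client code):

```python
from datetime import datetime, timedelta


def last_replicated_event_time(time_column_seconds, now=None):
    """Estimate the timestamp of the last event executed by the slave SQL thread.

    time_column_seconds is the Time value shown for the slave SQL thread
    in SHOW PROCESSLIST; now is the current time on the slave machine
    (defaults to the local clock).
    """
    if now is None:
        now = datetime.now()
    return now - timedelta(seconds=time_column_seconds)


if __name__ == "__main__":
    # A slave that is replaying one-hour-old queries shows Time = 3600.
    print(last_replicated_event_time(3600))
```

The result is only as accurate as the slave machine's clock, since the Time value is measured against the slave's own real time.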
Q: How do I force the master to block updates until the slave catches up?
A: Use the following procedure:
On the master, execute these statements:
mysql> FLUSH TABLES WITH READ LOCK;
mysql> SHOW MASTER STATUS;
Record the log name and the offset from the output of the SHOW statement.
On the slave, issue the following statement, where the arguments to the MASTER_POS_WAIT() function are the values recorded in the previous step:
mysql> SELECT MASTER_POS_WAIT('log_name', log_offset);
The SELECT statement will block until the slave reaches the specified log file and offset. At that point, the slave will be in sync with the master and the statement will return.
On the master, issue the following statement to allow the master to begin processing updates again:
mysql> UNLOCK TABLES;
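The procedure above can be sketched as a small Python function. This is an illustration under stated assumptions, not a definitive implementation: master_query and slave_query are hypothetical callables that send one SQL statement over an already open connection and return the result rows as a list of tuples. The master callable must reuse a single connection, because the read lock belongs to the connection that took it.

```python
def block_master_until_slave_in_sync(master_query, slave_query):
    """Hold a read lock on the master until the slave has caught up.

    master_query, slave_query: hypothetical callables taking an SQL string
    and returning a list of row tuples. Returns the (log_name, log_offset)
    pair that the slave was made to reach.
    """
    master_query("FLUSH TABLES WITH READ LOCK")
    try:
        rows = master_query("SHOW MASTER STATUS")
        # The first two columns of SHOW MASTER STATUS are File and Position.
        log_name, log_offset = rows[0][0], int(rows[0][1])
        if "'" in log_name or "\\" in log_name:
            raise ValueError("unexpected character in log name")
        result = slave_query(
            "SELECT MASTER_POS_WAIT('%s', %d)" % (log_name, log_offset)
        )
        # MASTER_POS_WAIT() returns NULL on error, for example when the
        # slave SQL thread is not running.
        if result and result[0][0] is None:
            raise RuntimeError("MASTER_POS_WAIT() returned NULL")
    finally:
        # Always release the lock so the master can process updates again.
        master_query("UNLOCK TABLES")
    return log_name, log_offset
```

Releasing the lock in a finally clause is a design choice of this sketch: if the wait on the slave fails, the master is not left blocked.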
Q: What issues should I be aware of when setting up two-way replication?
A: MySQL replication currently does not support any locking protocol between master and slave to guarantee the atomicity of a distributed (cross-server) update. In other words, it is possible for client A to make an update to co-master 1, and in the meantime, before it propagates to co-master 2, client B could make an update to co-master 2 that will make the update of client A work differently than it did on co-master 1. Thus, when the update of client A reaches co-master 2, it will produce tables that are different from what you have on co-master 1, even after all the updates from co-master 2 have also propagated. So you should not chain two servers together in a two-way replication relationship unless you are sure that your updates can safely happen in any order, or unless you take care of mis-ordered updates somehow in the client code.
You must also realize that two-way replication actually does not improve performance very much (if at all), as far as updates are concerned. Both servers need to do the same amount of updates each, as you would have one server do. The only difference is that there will be a little less lock contention, because the updates originating on another server will be serialized in one slave thread. Even so, this benefit might be offset by network delays.
Q: How can I use replication to improve performance of my system?
A: You should set up one server as the master and direct all writes to it. Then configure as many slaves as you have the money and rackspace for, and distribute the reads among the master and the slaves. You can also start the slaves with --skip-bdb, --low-priority-updates and --delay-key-write=ALL to get speed improvements for the slave. In this case the slave will use non-transactional MyISAM tables instead of BDB tables to get more speed.
Q: What should I do to prepare client code in my own applications to use performance-enhancing replication?
A: If the part of your code that is responsible for database access has been properly abstracted/modularised, converting it to run with a replicated setup should be very smooth and easy. Just change the implementation of your database access to send all writes to the master, and to send reads to either the master or a slave. If your code does not have this level of abstraction, setting up a replicated system will give you the opportunity and motivation to clean it up. You should start by creating a wrapper library or module with the following functions:
safe_writer_connect()
safe_reader_connect()
safe_reader_query()
safe_writer_query()
safe_ in each function name means that the function will take care of handling all the error conditions. You can use different names for the functions. The important thing is to have a unified interface for connecting for reads, connecting for writes, doing a read, and doing a write.
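As one possible shape for such a wrapper, here is a minimal Python sketch. It assumes a hypothetical connect(host) callable, supplied by you, that returns a connection object with a query(sql) method and raises OSError when the server cannot be reached; the class name SafeDB and the round-robin read policy are choices of this sketch, not something MySQL prescribes.

```python
import itertools


class SafeDB:
    """Unified interface: writes go to the master, reads rotate over all servers."""

    def __init__(self, connect, master_host, slave_hosts):
        self._connect = connect  # hypothetical: connect(host) -> connection
        self._master_host = master_host
        read_hosts = list(slave_hosts) + [master_host]
        self._read_host_count = len(read_hosts)
        self._read_hosts = itertools.cycle(read_hosts)

    def safe_writer_connect(self):
        # All writes must reach the master.
        return self._connect(self._master_host)

    def safe_reader_connect(self):
        # Try each read host at most once, skipping servers that are down.
        last_error = None
        for _ in range(self._read_host_count):
            host = next(self._read_hosts)
            try:
                return self._connect(host)
            except OSError as exc:
                last_error = exc
        raise ConnectionError("no server available for reads") from last_error

    def safe_writer_query(self, sql):
        return self.safe_writer_connect().query(sql)

    def safe_reader_query(self, sql):
        return self.safe_reader_connect().query(sql)
```

A real wrapper would probably cache connections instead of reconnecting for every query, and this is also the natural place to add logging of query times and errors.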
You should then convert your client code to use the wrapper library. This may be a painful and scary process at first, but it will pay off in the long run. All applications that use the approach just described will be able to take advantage of a master/slave configuration, even one involving multiple slaves. The code will be a lot easier to maintain, and adding troubleshooting options will be trivial. You will just need to modify one or two functions, for example, to log how long each query took, or which query, among your many thousands, gave you an error.
If you have written a lot of code already, you may want to automate the conversion task by using the replace utility that comes with the standard distribution of MySQL, or just write your own Perl script. Hopefully, your code follows some recognizable pattern. If not, then you are probably better off rewriting it anyway, or at least going through and manually beating it into a pattern.
Q: When and how much can MySQL replication improve the performance of my system?
A: MySQL replication is most beneficial for a system with frequent reads and infrequent writes. In theory, by using a single-master/multiple-slave setup, you can scale the system by adding more slaves until you either run out of network bandwidth, or your update load grows to the point that the master cannot handle it.
In order to determine how many slaves you can get before the added benefits begin to level out, and how much you can improve performance of your site, you need to know your query patterns, and empirically (by benchmarking) determine the relationship between the throughput on reads (reads per second, or max_reads) and on writes (max_writes) on a typical master and a typical slave. The example here will show you a rather simplified calculation of what you can get with replication for a hypothetical system.
Let's say that system load consists of 10% writes and 90% reads, and we have determined max_reads to be 1200 - 2 * max_writes. In other words, the system can do 1200 reads per second with no writes, the average write is twice as slow as the average read, and the relationship is linear. Let us suppose that the master and each slave have the same capacity, and that we have 1 master and N slaves. Then we have for each server (master or slave):
reads = 1200 - 2 * writes   (from benchmarks)
reads = 9 * writes / (N + 1)   (reads split, but writes go to all servers)
9 * writes / (N + 1) + 2 * writes = 1200
writes = 1200 / (2 + 9/(N + 1))
This analysis yields the following conclusions:
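One way to see the numbers behind these conclusions is to evaluate the last formula for a few values of N. The following is a minimal Python sketch; the function and parameter names are illustrative only, and the defaults are the figures of the hypothetical system above (1200 reads per second with no writes, a write costing twice a read, 9 reads per write).

```python
def max_writes_per_second(n_slaves, read_capacity=1200.0, write_cost=2.0,
                          reads_per_write=9.0):
    """Evaluate writes = 1200 / (2 + 9/(N + 1)) for N slaves.

    Every server executes all writes, while the reads are split evenly
    over the master and the N slaves.
    """
    return read_capacity / (write_cost + reads_per_write / (n_slaves + 1))


if __name__ == "__main__":
    for n in (0, 1, 8, 17):
        print(n, round(max_writes_per_second(n)))
```

Under these assumptions, a system with no slaves (N = 0) handles about 109 writes per second, one slave raises this to about 185, 8 slaves give 400, and 17 slaves give 480. As N grows, the result approaches but never reaches 1200/2 = 600 writes per second, so each additional slave adds less than the one before it.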
Note that these computations assume infinite network bandwidth and neglect several other factors that could turn out to be significant on your system. In many cases, you may not be able to perform a computation similar to the one above that will accurately predict what will happen on your system if you add N replication slaves. However, answering the following questions should help you decide whether and how much replication will improve the performance of your system:
Q: How can I use replication to provide redundancy/high availability?
A: With the currently available features, you would have to set up a master and a slave (or several slaves), and write a script that will monitor the master to see whether it is up, and instruct your applications and the slaves of the master change in case of failure. Some suggestions:
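To tell a slave to change its master, use the CHANGE MASTER TO command.
With bind you can use `nsupdate' to dynamically update your DNS, which is one way to keep your applications informed of the master's location.
Run your slaves with the --log-bin option and without --log-slave-updates. This way the slave will be ready to become a master as soon as you issue STOP SLAVE; RESET MASTER, and CHANGE MASTER TO on the other slaves.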
For example, consider you have the following setup ("M" means the
master, "S" the slaves, "WC" the clients that issue database
writes and reads; clients that issue only database reads are not
represented, because they need not switch):
       WC
        \
         v
 WC----> M
       / | \
      /  |  \
     v   v   v
    S1   S2  S3
S1 (like S2 and S3) is a slave running with --log-bin and without --log-slave-updates. As the only writes executed on S1 are those replicated from M, the binary log on S1 is empty (remember, S1 runs without --log-slave-updates).
Then, for some reason, M becomes unavailable, and you want S1 to
become the new master (that is, direct all WC to S1, and make S2 and S3
replicate S1).
Make sure that all slaves have processed any queries in their relay log. On each slave, issue STOP SLAVE IO_THREAD, then check the output of SHOW PROCESSLIST until you see Has read all relay log. When this is true for all slaves, they can be reconfigured to the new setup. Issue STOP SLAVE on all slaves, RESET MASTER on the slave being promoted to master, and CHANGE MASTER on the other slaves.
No WC accesses M. Instruct all WC to direct their queries
to S1. From now on, all queries sent by WC to S1 are written to the binary log
of S1. The binary log of S1 contains exactly every writing query sent
to S1 since M died.
On S2 (and S3) do STOP SLAVE, CHANGE MASTER TO MASTER_HOST='S1' (where 'S1' is replaced by the real hostname of S1). To CHANGE MASTER, add all information about how to connect to S1 from S2 or S3 (user, password, port). In CHANGE MASTER, there is no need to specify the name of S1's binary log or the binary log position to read from: we know it is the first binary log, from position 4, and these are the defaults of CHANGE MASTER. Finally do START SLAVE on S2 and S3, and now you have this:
       WC
      /
      |
 WC   |  M(unavailable)
  \   |
   \  |
    v v
     S1<--S2  S3
      ^       |
      +-------+
When M is up again, you just have to issue on it the same CHANGE MASTER as the one issued on S2 and S3, so that M becomes a slave of S1 and picks up all the WC writes it has missed while it was down. Now to make M a master again (because it is the most powerful machine, for example), follow the preceding procedure as if S1 was unavailable and M was to be the new master; then during the procedure don't forget to run RESET MASTER on M before making S1, S2, S3 slaves of M, or they may pick up old WC writes from before M's unavailability.
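Until such a system exists, the promotion steps described above could be generated by your own tooling. The following Python sketch only produces the ordered list of statements for promoting one slave; the function name promotion_plan() and its parameters are hypothetical, and the manual check of SHOW PROCESSLIST between the first two phases is represented only by a comment.

```python
def promotion_plan(new_master, other_slaves, repl_user, repl_password):
    """Return (server, statement) pairs for promoting new_master.

    Assumes host names and credentials contain no quote characters.
    """
    for value in (new_master, repl_user, repl_password):
        if "'" in value or "\\" in value:
            raise ValueError("unexpected character in %r" % value)
    all_slaves = [new_master] + list(other_slaves)
    plan = []
    # Phase 1: stop fetching from the old master so the relay logs can drain.
    for server in all_slaves:
        plan.append((server, "STOP SLAVE IO_THREAD"))
    # Here you would check SHOW PROCESSLIST on each slave until it reports
    # that it has read all of its relay log.
    # Phase 2: reconfigure.
    for server in all_slaves:
        plan.append((server, "STOP SLAVE"))
    plan.append((new_master, "RESET MASTER"))
    change = (
        "CHANGE MASTER TO MASTER_HOST='%s', MASTER_USER='%s', "
        "MASTER_PASSWORD='%s'" % (new_master, repl_user, repl_password)
    )
    for server in other_slaves:
        # No log name or position: the defaults are the first binary log
        # of the new master, from its beginning.
        plan.append((server, change))
        plan.append((server, "START SLAVE"))
    return plan


if __name__ == "__main__":
    for server, statement in promotion_plan("S1", ["S2", "S3"], "repl", "secret"):
        print(server + ": " + statement)
```

Keeping the plan as data rather than executing it directly makes it easy to review or log the statements before anything is sent to a server.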
We are currently working on integrating an automatic master election system into MySQL, but until it is ready, you will have to create your own monitoring tools.