
Tuesday, May 27, 2014

MySQL Fabric: Musings on Release 1.4.3

As you might have noticed in the press release, we just released MySQL Utilities 1.4.3, containing MySQL Fabric, as a General Availability (GA) release. This concludes the first chapter of the MySQL Fabric story.

It all started with the idea that it should be as easy to set up and manage a distributed deployment of MySQL servers as it is to manage the MySQL servers themselves. We also noted that the features of most interest were sharding and high-availability. Since we recognized that every user has different needs and wants to customize the solution, we set out to create a framework that would support sharding and high-availability, but also other solutions.

With the release of 1.4.3, we have a range of features that are now available to the community, and all under an open source license and wrapped in an easy-to-use package:

  • High-availability support using built-in slave promotion in a master-slave configuration.
  • A framework with an execution machinery, monitoring, and interfaces to support management of large server farms.
  • Sharding using hash and range sharding. Range sharding is currently limited to integers, but hash sharding supports anything that looks like a string.
  • Shard management support to move and split shards.
  • Support for failure detectors, both built-in and custom ones.
  • Connectors with built-in load balancing and fail-over in the event of a master failure.

Beyond MySQL Fabric 1.4.3

As the MySQL Fabric story develops, we have a number of challenges ahead.

Loss-less Fail-over. MySQL 5.7 has extended the support for semi-synchronous replication so that a transaction that has not been replicated to a slave server will not be committed. With this support, we can have truly loss-less fail-over: you cannot lose a transaction if a single server fails.

More Fabric-aware connectors. We currently have support for Connector/J, Connector/PHP, and Connector/Python, but one common request is a Fabric-aware C API. This is useful both for applications developed in C/C++ and for adding Fabric support to connectors based on the MySQL C API, such as the Perl and Ruby connectors.

Multi-Node Fabric Instance. Many have pointed out that the Fabric node is a single point of failure. It is indeed a single node, but if the Fabric node goes down, the system does not stop working. Since the connectors cache the data, they can "run on the cache" for the time it takes to bring the Fabric node up again. Procedures being executed will stop, but once the Fabric node is on-line again, execution resumes from where it left off. To ensure that the meta-data (the information about the servers in the farm) is not lost in the event of a machine failure, MySQL Cluster can be used as the storage engine, which ensures that your meta-data is safe.

There are, however, a few advantages in having support for multiple Fabric nodes:

  • The most obvious advantage is that execution can fail over to another node and there will be no interruption in the execution of procedures. If the fail-over is built-in, you avoid the need for external clusterware to manage several Fabric nodes.
  • If you have several Fabric nodes available to deliver data, you improve responsiveness to bursts of meta-data requests. Such bursts can happen if a large number of connectors are brought on-line at the same time.
  • If you have multiple data centers, having a local copy of the data to serve the applications deployed in the same center improves locality and avoids an unnecessary round-trip over the WAN to fetch meta-data.
  • With several nodes able to execute management procedures, you can improve scaling by executing several management procedures in parallel. This would require some mechanism to ensure that procedures do not step on each other's toes.

Location Awareness. In deployments spread over several data centers, the location of all the components suddenly becomes important. There is no reason for a connector to be directed to a remote server when a local one suffices, but that requires some sort of location awareness in the model, allowing the location of servers (and other components) to be given.

Extending the model by adding data centers is not enough, though. The location of components within a data center might also matter: if a connector is located in a particular rack, going to a different rack to fetch data might be undesirable. For this reason, the location awareness needs to be hierarchical and support several levels, e.g., continent, city, data center, hall, rack, etc.
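
To give an idea of what I mean, here is a small sketch of how hierarchical locations could be modeled; the Location class and the server list are just illustrations, not part of MySQL Fabric:

    class Location(object):
        """A location given as a path: e.g. continent, city, data center, rack."""
        def __init__(self, *path):
            self.path = path

        def closeness(self, other):
            """Number of leading path components shared with another location."""
            count = 0
            for mine, theirs in zip(self.path, other.path):
                if mine != theirs:
                    break
                count += 1
            return count

    def pick_closest(client, servers):
        """Pick the server sharing the longest location prefix with the client."""
        return max(servers, key=lambda entry: client.closeness(entry[1]))

    servers = [("server1.example.com", Location("eu", "stockholm", "dc1", "rack4")),
               ("server2.example.com", Location("eu", "london", "dc2", "rack1"))]
    client = Location("eu", "stockholm", "dc1", "rack9")
    print(pick_closest(client, servers)[0])     # server1.example.com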

Multi-Shard Queries. Sharding can improve performance significantly since it splits the data horizontally across several machines, so each query goes directly to the right shard of the data. In some cases, however, you also need to send queries to multiple shards (see the sketch after this list). There are a few reasons for this:

  • You do not have the shard key available, so you want to search all servers for some object of interest. This of course affects performance, but in some cases there are few alternatives. Consider, for example, searching for a person given a name and address when the database is sharded on the SSN.
  • You want to generate a report of several items in the database, for example, find all customers above 50 that have more than 2 cars.
  • You want a summary of some statistic over the database, for example, generate a histogram over the age of all your customers.
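
To illustrate the idea, here is a sketch of a scatter-gather helper that sends the same statement to every shard and merges the result client-side; the shard hosts, credentials, and the customers table are made up for the example:

    from collections import defaultdict
    import mysql.connector

    SHARD_HOSTS = ["shard1.example.com", "shard2.example.com"]

    def multi_shard_query(stmt, params=()):
        """Run stmt on every shard and return the concatenated rows."""
        rows = []
        for host in SHARD_HOSTS:
            conn = mysql.connector.connect(host=host, user="app",
                                           password="xyzzy", database="test")
            cursor = conn.cursor()
            cursor.execute(stmt, params)
            rows.extend(cursor.fetchall())
            conn.close()
        return rows

    # The histogram case: per-shard counts have to be re-aggregated.
    histogram = defaultdict(int)
    for age, count in multi_shard_query("SELECT age, COUNT(*)"
                                        " FROM customers GROUP BY age"):
        histogram[age] += count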

Session Consistency Guarantees. As Alfranio points out, when you use multiple servers in your farm and transactions are sent to different servers at different times, it may well be that you write one transaction to the master of a group and then try to read something from the same group. If the write transaction has not reached the server you read from, you might get an incorrect result. In some cases this is fine, but in other cases you want certain guarantees on your session. For example, you may want to ensure that anything you write is also visible to reads in transactions following the write, or that successive reads never observe older data than earlier reads (called "read monotonicity"), or other guarantees on the result sets you get back from the distributed database. This might require connectors to wait for transactions to reach slaves before reading, but it should be transparent to the application (the sketch below shows one possible building block).
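
As an illustration, here is a sketch of one building block for "read your writes": wait on the slave for the master's binlog position before reading. Both arguments are assumed to be plain Connector/Python connections, and MASTER_POS_WAIT is the standard server function:

    def wait_for_slave(master, slave, timeout=10):
        """Block until slave has applied everything committed on master."""
        cursor = master.cursor()
        cursor.execute("SHOW MASTER STATUS")
        logfile, pos = cursor.fetchone()[0:2]
        cursor = slave.cursor()
        cursor.execute("SELECT MASTER_POS_WAIT(%s, %s, %s)",
                       (logfile, pos, timeout))
        # NULL or -1 means the position was not reached in time.
        return cursor.fetchone()[0]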

This is just a small set of the possibilities for the future, so it is really going to be interesting to see how the MySQL Fabric story develops.

Tuesday, April 01, 2014

MySQL Fabric 1.4.2 Released

As you saw in the press release, MySQL Fabric 1.4.2 is now released! If you're interested in learning more about MySQL Fabric, there is a session titled Sharding and Scale-out using MySQL Fabric in Ballroom G. MySQL Fabric is a relatively new project in the MySQL ecosystem and it focuses on building a framework for working with large deployments of MySQL Servers. The architecture of MySQL Fabric is such that it allows extensions to be added, and the first two extensions that we added were support for high-availability using High-Availability groups (HA groups) and sharding to manage very large databases. The first version of sharding has hash and range sharding implemented, as well as procedures for moving and splitting shards.
A critical part of working with a collection of servers is the ability to route transactions to the correct servers, and for efficiency reasons we decided quite early to put this routing logic into the connectors. This avoids an extra network hop and hence improves performance by reducing latency, but it does require that the connectors contain routing logic, caches, and support for fetching data from MySQL Fabric. Putting the routing logic into the connector also makes it easy to extend the API with new support that applications may require.
MySQL Fabric 1.4.2 is distributed as part of MySQL Utilities 1.4.2. To avoid confusion, we have changed the version numbering to match the version of MySQL Utilities it is distributed in.
We have made just a few public releases so far (plus a few internal ones), but here is a brief history of our releases this far:
  • MySQL Fabric 1.4.0
    • First public release
    • High-Availability groups for modeling farms
    • Event-driven Executor for execution of management procedures.
    • Simple failure detector with fail-over procedures.
    • Hash and Range sharding allowing management of large databases.
    • Shard move and shard split to support management of a sharded database.
    • Connector interfaces to support federated database systems.
    • Fabric-aware Connector/Python (labs)
    • Fabric-aware Connector/J (labs)
    • Fabric-aware Connector/PHP (labs)
  • MySQL Fabric 1.4.1
    • More solid scale-out support in connectors and MySQL Fabric
    • Improvements to the Executor to avoid stalling reads
    • Connector/Python 1.2.0 containing:
      • Range and Hash sharding
      • Load-balancing support
    • Labs release of Connector/J with Fabric-support
  • MySQL Fabric 1.4.2
    • Credentials in MySQL Fabric
    • External failure reporting interfaces supporting external failure detectors
    • Support for unreliable failure detectors in MySQL Fabric
    • Credentials support in Connector/Python
    • Connector/Python 1.2.1 containing:
      • Failure reporting
      • Credentials Support
    • Connector/J 5.1.30 containing Fabric support

Do you want to participate?

There is a lot you can do if you want to help improve MySQL Fabric.


    Monday, February 20, 2012

    Pythonic Database API: Now with Launchpad

    In a previous post, I demonstrated a simple Python database API with a syntax similar to jQuery. The goal was to provide a simple API that would allow Python programmers to use a database without having to resort to SQL, nor having to use any of the good, but quite heavy, ORM implementations that exist. The code was just an experimental implementation, and I was considering putting it up on Launchpad.
    I did some basic cleaning of the code, turned it into a Python package, and pushed it to Launchpad. I also added some minor changes, such as introducing a define function to define new tables instead of automatically creating one when an insert was executed. Automatically constructing a table from values seems neat, but in reality it is quite difficult to ensure that it has the right types for the application. Here is a small code example demonstrating how to use the define function together with some other operations.
    import mysql.api.simple as api
    
    server = api.Server(host="example.com")
    
    server.test_api.tbl.define(
        { 'name': 'more', 'type': int },
        { 'name': 'magic', 'type': str },
    )
    
    items = [
        {'more': 3, 'magic': 'just a test'},
        {'more': 3, 'magic': 'just another test'},
        {'more': 4, 'magic': 'quadrant'},
        {'more': 5, 'magic': 'even more magic'},
    ]
    
    for item in items:
        server.test_api.tbl.insert(item)
    
    The table is defined by providing a dictionary for each column that you want in the table. The two most important fields in the dictionary are name and type. The name field supplies the name of the column, and the type field provides its type. The type is denoted using a basic Python type constructor, which then maps internally to an SQL type. So, for example, int maps to the SQL INT type, and bool maps to the SQL type BIT(1). I decided to use Python types simply because it is more natural for a Python programmer to define tables from the data that the programmer wants to store in the database. In this case, I would be less concerned with how the types are mapped, just assuming that they are mapped in a way that works. It is currently not possible to register your own mappings, but that is easy to add (see the sketch below).

    So, why provide the type object and not just a string with the type name? The idea I had here is that since Python has introspection (it is a dynamic language, after all), it would be possible to add code that reads the provided type objects and does things with them, such as figuring out what fields there are in the type. It's not that I plan to implement this, but even though this is intended to be a simple database interface, there is no reason to tie one's hands from the start, so this simple approach provides some flexibility if needed in the future.
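
    To give an idea of how registering your own mappings could look, here is a small sketch; the function names are made up and this is not in the package:

    _TYPE_MAP = {
        int: "INT",          # the mappings described above
        bool: "BIT(1)",
    }

    def register_type(python_type, sql_type):
        """Let users map their own Python types to SQL column types."""
        _TYPE_MAP[python_type] = sql_type

    def sql_type_for(python_type):
        return _TYPE_MAP[python_type]

    register_type(str, "VARCHAR(256)")
    print(sql_type_for(str))      # VARCHAR(256)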

    Links

    Some additional links that you might find useful:
    Connector/Python
    You need to have Connector/Python installed to be able to use this package.
    Sequelize
    This is a JavaScript library that provides a similar interface to a database. It claims to be an ORM layer, but is not really; it is more similar to what I have written above.
    Roland's MQL to SQL and his Presentation on SlideShare are also sources of inspiration for alternatives.

    Monday, January 23, 2012

    MySQL: Python, Meta-Programming, and Interceptors

    I recently found Todd's posts on interceptors, which allow callbacks (called interceptors) to be registered with the connector so that you can intercept a statement execution, a commit, or any of the many extension points supported by Connector/J. This is a feature that allows you to implement a number of new behaviors without having to change the application code, such as load-balancing policies, profiling queries or transactions, or debugging an application.

    Since Python is a dynamic language, it is easy to add interceptors to any method in Connector/Python, without having to extend the connector with specific code. This is something that is possible in dynamic languages such as Python, Perl, JavaScript, and even some lesser known languages such as Lua and Self. In this post, I will describe how and also give an introduction to some of the (in my view) more powerful features of Python.

    In order to create an interceptor, you need to be able to do these things:

    • Catch an existing method in a class and replace it with a new one.
    • Call the original function, if necessary.
    • For extra points: catch an existing method in an object and replace it with a new one.
    You will in this post see how all three of these problems are solved in Python. You will see how to use decorators to define methods in existing classes and objects, and closures to call the original version of the methods. With this approach, it is not necessary to change the implementation: in fact, you can use this code to replace any method in any class, not only in Connector/Python.

    Table 1. Attributes for methods

    Attribute   Unbound method                     Bound method
    __name__    Name of the method                 Name of the method
    im_func     "Inner" function of the method     "Inner" function of the method
    im_self     None                               Class instance for the method
    im_class    Class that the method belongs to   Class that the method belongs to
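
    As a quick illustration, here is a small Python 2 session showing the attributes from Table 1 on a toy class:

    class Example(object):
        def method(self):
            pass

    instance = Example()
    unbound = Example.method        # fetched from the class: unbound
    bound = instance.method         # fetched from the instance: bound

    print(unbound.im_self)               # None
    print(bound.im_self is instance)     # True
    print(bound.im_class)                # <class '__main__.Example'>
    print(bound.__name__)                # method
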
    In addition to being able to replace methods in the class, we would also like to be able to replace methods in instances of a class ("objects" in the traditional sense). This is useful to create specialized objects, for example for tracking particular cases where a method is used.

    In order to understand how the replacement works, you should understand that in Python (and the dynamic languages mentioned above), all objects can have attributes, including classes, functions, and a bunch of other esoteric constructions. Each type of object has a set of pre-defined attributes with well-defined meaning. For classes (and class instances), methods are stored as attributes of the class (or class instance) and can therefore be replaced with other methods that you build dynamically. However, it requires some tinkering to take an existing "normal" function definition and "imbue" it with whatever "tincture" that makes it behave as a method of the class or class instance.

    Depending on where the method comes from, it can be either unbound or bound. Unbound methods are roughly equivalent to member function pointers in C++: they reference a function, but not the instance. In contrast, bound methods have an instance tied to them, so when you call them, they already know what instance they belong to and will use it. Methods have a set of attributes, of which the four in Table 1 interest us. If a method is fetched from a class (to be precise, from a class object), it will be unbound and im_self will be None. If the method is fetched from a class instance, it will be bound and im_self will be set to the instance it belongs to. These attributes are all the "tincture" you need to make your own instance methods. The code for doing the replacement described above is simply:

    import functools, types
    
    def replace_method(orig, func):
        # Make func look like the original method (name, module, docstring).
        functools.update_wrapper(func, orig.im_func)
        # Build a new method with func as the "inner" function.
        new = types.MethodType(func, orig.im_self, orig.im_class)
        # Install it on the instance if the method was bound, else on the class.
        obj = orig.im_self or orig.im_class
        setattr(obj, orig.__name__, new)
    
    The function uses two standard modules to make the job simpler, but the steps are:
    1. Copy the meta-information from the original method function to the new function using update_wrapper. This copies the name, module information, and documentation from the original method function to make it look like the original method.
    2. Create a new method instance from the method information of the original method using the constructor MethodType, but replace the "inner" function with the new function.
    3. Install the new instance method by replacing the attribute denoting the original method with the new method. Depending on whether the given method was bound or unbound, the method in the instance or in the class is replaced.
    Using this function you can now replace a method in a class like this:
    from mysql.connector import MySQLCursor
    
    def my_execute(self, operation, params=None):
      ...
    
    replace_method(MySQLCursor.execute, my_execute)
    
    This is already pretty useful, but note that you can also replace a method on just a specific instance by using replace_method(cursor.execute, my_execute). It was not necessary to change anything inside Connector/Python to intercept a method there, so you can apply this to any method in any of the classes in Connector/Python that you have available. To make it even easier to use, you'll see how to define a decorator that installs the function in the correct place at the same time as it is defined. The code for defining the decorator and an example usage is:
    import functools, types
    from mysql.connector import MySQLCursor
    
    def intercept(orig):
        def wrap(func):
            functools.update_wrapper(func, orig.im_func)
            meth = types.MethodType(func, orig.im_self, orig.im_class)
            obj = orig.im_self or orig.im_class
            setattr(obj, orig.__name__, meth)
            return func
        return wrap
    
    # Define a function using the decorator
    @intercept(MySQLCursor.execute)
    def my_execute(self, operation, params=None):
      ...
    
    The @intercept line before the definition of my_execute is where the new decorator is used. The syntax is a shorthand that can be used to do some things with the function when defining it. It behaves as if the following code had been executed:
    def _temporary(self, operation, params=None):
      ...
    my_execute = intercept(MySQLCursor.execute)(_temporary)
    
    As you can see here, whatever is given after the @ is used as a function and called with the function-being-defined as argument. This explains why the wrap function is returned from the decorator (it will be called with a reference to the function that is being defined), and also why the original function is returned from the wrap function (the result will be assigned to the function name).

    Using a statement interceptor, you can catch the execution of statements and do some special magic on them. In our case, let's define an interceptor to catch the execution of a statement and log the result using the standard logging module. If you read the wrap function carefully, you probably noted that it uses a closure to access the value of orig when the decorator was called, not the value it happens to have when the wrap function is executed. This feature is very useful since a closure can also be used to get access to the original execute function and call it from within the new function. So, to intercept an execute call and log information about the statement using the logging module, you could use code like this:

    import logging
    
    from mysql.connector import MySQLCursor
    
    original_execute = MySQLCursor.execute
    
    @intercept(MySQLCursor.execute)
    def my_execute(self, operation, params=None):
        if params is not None:
            stmt = operation % self._process_params(params)
        else:
            stmt = operation
        result = original_execute(self, operation, params)
        logging.debug("Executed '%s', rowcount: %d", stmt, self.rowcount)
        logging.debug("Columns: %s", ', '.join(c[0] for c in self.description))
        return result
    
    Now with this, you could implement your own caching layer to, for example, do a memcached lookup before sending the statement to the server for execution. I leave this as an exercise for the reader, or maybe I'll show you in a later post. Implementing a lifecycle interceptor is similar; you just replace, for example, the commit or rollback calls. However, implementing an exception interceptor is not as obvious. Catching the exception is straightforward and can be done using the intercept decorator:
    import logging
    
    from mysql.connector import ProgrammingError
    
    original_init = ProgrammingError.__init__
    
    @intercept(ProgrammingError.__init__)
    def catch_error(self, msg, errno):
        logging.debug("This statement didn't work: '%s', errno: %d", msg, errno)
        original_init(self, msg, errno=errno)
    
    However, in order to do something more interesting, such as asking for some additional information from the database, it is necessary to either get hold of the cursor that was used to execute the query, or at least the connection. It is possible to dig through the interpreter stack, or try to override one of the internal methods that Connector/Python uses, but since that is very dependent on the implementation, I will not present that in this post. It would be good if the cursor is passed down to the exception constructor, but this requires some changes to the connector code.

    Even though I have been programming in dynamic languages for decades (literally), it always amazes me how easy it is to accomplish things in these languages. If you are interested in playing around with this code, you can always fetch Connector/Python on Launchpad and try out the examples above.

    Monday, September 26, 2011

    Python Interface to MySQL

    There has been a lot of discussion lately about various non-SQL languages that provide access to databases without having to resort to using SQL. I wondered how difficult it would be to implement such an interface, so as an experiment I implemented a simple interface in Python, similar to the document-oriented interfaces available elsewhere. The interface generates SQL queries to query the database, but does not require any knowledge of SQL to use. The syntax is inspired by jQuery, but since jQuery works with documents, the semantics is slightly different.

    A simple example would look like this:

    from native_db import *
    server = Server(host='127.0.0.1')
    server.test.t1.insert({'more': 3, 'magic': 'just a test', 'count': 0})
    server.test.t1.insert({'more': 3, 'magic': 'just another test', 'count': 0})
    server.test.t1.insert({'more': 4, 'magic': 'quadrant', 'count': 0})
    server.test.t1.insert({'more': 5, 'magic': 'even more magic', 'count': 0})
    for row in server.test.t1.find({'more': 3}):
      print "The magic is:", row['magic']
    server.test.t1.update({'more': 3}, {'count': 'count+1'})
    for row in server.test.t1.find({'more': 3}, ['magic', 'count']):
      print "The magic is:", row['magic'], "and the count is", row['count']
    server.test.t1.delete({'more': 5})
    
    The first line defines a server to communicate with, which is simply done by creating a Server object with the necessary parameters. The constructor accepts the normal parameters for Connector/Python (which is what I'm using internally), but the user defaults to whatever getpass.getuser() returns, and the host defaults to 127.0.0.1, even though I've provided it here.

    After that, the necessary methods are overridden so that server.database.table refers to the table named table in the database named database on the given server. One possibility would be to skip the database and go directly to the table (using some default database name), but since this is just an experiment, I did this instead. After that, there are various methods defined to support searching, inserting, deleting, and updating.

    Since this is intended to be a simple interface, autocommit is on. Each of the functions generate a single SQL statement, so they will be executed atomically if you're using InnoDB.

    table.insert(row)
    This function will insert the contents of the dictionary into the table, using the keys of the dictionary as column names. If the table does not exist, it is created with a "best effort" guess of what types to use for the columns.
    table.delete(condition)
    This function will remove all rows in the table that match the supplied dictionary. Currently, only equality matching is supported, but see below for how it could be extended.
    table.find(condition, fields="*")
    This will search the table and return an iterable over the rows that match condition. If fields is supplied (as a list of field names), only those fields are returned (see the sketch below for how this maps to SQL).
    table.update(condition, update)
    This will search for rows matching condition and update each matching row according to the update dictionary. The values of the dictionary are used on the right side of the assignments of the UPDATE statement, so expressions can be given here as strings.
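
    To make the mapping to SQL concrete, here is an illustrative sketch (not the actual implementation) of how find can build its SELECT statement:

    def build_find(table, condition, fields="*"):
        """Build a SELECT statement and its parameter tuple from a condition."""
        if fields != "*":
            fields = ", ".join(fields)
        where = " AND ".join("%s = %%s" % key for key in condition)
        stmt = "SELECT %s FROM %s WHERE %s" % (fields, table, where)
        return stmt, tuple(condition.values())

    print(build_find("t1", {'more': 3}, ['magic', 'count']))
    # ('SELECT magic, count FROM t1 WHERE more = %s', (3,))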

    That's all folks!

    The code is available at http://mats.kindahl.net/python/native_db.py if you're interested in trying it out. The code is very basic, and there's potential for a lot of extensions. If there's interest, I could probably create a repository somewhere.

    Note that this is not a replacement for an ORM library. The intention is not to allow storing arbitrary objects in the database: the intention is to be able to query the database using a Python interface without resorting to using SQL.

    I'm just playing around and testing some things out, and I'm not really sure if there is any interest in anything like this, so what do you think? Personally, I have no problems with using SQL, but since I'm working with MySQL on a daily basis, I'm strongly biased on the subject. For simple jobs, this is probably easier to work with than a "real" SQL interface, but it cannot handle as complex queries as SQL can (at least not without extensions).

    There are a number of open issues for the implementation (this is just a small list of obvious ones):

    Only equality searching supported
    Searching can only be done with equality matches, but it is trivial to extend to support more complex comparisons. To allow more complex conditions, the condition supplied to find, delete, and update can actually be a string, in which case it is used "raw".

    Conditions could be extended to support something like {'more': '>3'}, or a more object-oriented approach would be to support something similar to {'more': operator.gt(3)}.
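
    Here is a sketch of how the object-oriented variant could work, with condition values as small operator objects that render themselves as SQL; the class and method names are just for illustration:

    class eq(object):
        def __init__(self, value):
            self.value = value
        def to_sql(self, column):
            return "%s = %%s" % column, self.value

    class gt(object):
        def __init__(self, value):
            self.value = value
        def to_sql(self, column):
            return "%s > %%s" % column, self.value

    def render_condition(condition):
        """Turn a condition dictionary into a WHERE clause and parameters."""
        parts, params = [], []
        for column, test in condition.items():
            if not hasattr(test, "to_sql"):
                test = eq(test)          # plain values mean equality
            sql, value = test.to_sql(column)
            parts.append(sql)
            params.append(value)
        return " AND ".join(parts), tuple(params)

    print(render_condition({'more': gt(3)}))   # ('more > %s', (3,))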

    No support for indexes
    There's no support for indexes yet, but that can easily be added. The complication is what kind of indexes should be generated.

    For example, right now rows are identified by their content, but what if we want unique rows to be handled as a set? Imagine the following (not supported) query where we insert some rows and immediately operate on them:

    server.test.t1.insert(content with some more=3).find({'more': eq(3)})
    In this case, we have to fetch the row identifiers for the inserted rows to be able to manipulate exactly those rows and none other. I am not sure how to do this right now, but auto-inventing a row identifier would mean that tables lacking one cannot be handled naturally.

    Creating and dropping tables
    The support for creating tables is to create them automatically if they do not exist. A simple heuristic is used to figure out the table definition, but this has obvious flaws if later inserts have more fields than the first one.

    To support extending the table, one would have to generate an ALTER TABLE statement to "fix" the table.
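
    A sketch of what that could look like; the helper names are made up and the type guessing mirrors the heuristic above:

    def guess_sql_type(value):
        """Best-effort mapping from a Python value to an SQL column type."""
        if isinstance(value, bool):     # test bool before int: bool is an int
            return "BIT(1)"
        if isinstance(value, int):
            return "INT"
        return "VARCHAR(256)"

    def extend_table(cursor, table, row, existing_columns):
        """Add a column for each field in row that the table lacks."""
        for name, value in row.items():
            if name not in existing_columns:
                cursor.execute("ALTER TABLE %s ADD COLUMN %s %s"
                               % (table, name, guess_sql_type(value)))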

    There is no support for dropping tables... or databases.

    Friday, December 18, 2009

    MySQL Replicant: a library for controlling replication deployments

    Keeping a MySQL installation up and running can be quite tricky at times, especially when having many servers to manage and monitor. In the replication tutorials at the annual MySQL Users' Conference, we demonstrate how to set up replication appropriately and also how to handle various issues that can arise. Many of these procedures are routine: bring down the server, edit the configuration file, bring the server up again, start a mysql client and add a user, etc.

    It has always annoyed me that these procedures are perfect candidates for automation, but that we do not have the necessary interfaces to manipulate an entire installation of MySQL servers.

    If there were an interface with a relatively small set of primitives (re-directing servers, bringing servers down, adding a line to the configuration file, etc.), it would be possible to create pre-canned procedures that can just be executed.

    To that end, I started writing a library that would provide an interface like this. Although I am more familiar with Perl, I picked Python for this project, since it seems to be widely used by many database administrators (it's just a feeling I have; I have no figures to support it), and, just to have a cool name for the library, we call it MySQL Replicant. It is (of course) available at Launchpad.

    So what do we want to achieve with a library like this? Well... the goal is to provide a generic interface to complete installations and thereby make administration of large installations easy.

    Such an interface would allow procedures to be described in an executable format, namely as Python scripts.

    In addition to making it easy to implement common tasks for experienced database administrators, it also promotes sharing by providing a way to write complete scripts for solving common problems. Having a pool of such scripts makes it easier for newcomers to get up and running.

    The basic idea is that you create a model of the installation on a computer and then manipulate the model. When doing these manipulations, the appropriate commands—either as SQL commands to a running server or shell commands to the host where the server is running—will then be sent to the servers in the installation to configure them correctly.

    So, to take small example, how does the code for re-directing a bunch of servers to a master look?

    import mysqlrep, my_servers
    
    # Re-direct every slave in the installation to the master.
    for slave in my_servers.slaves:
       mysqlrep.change_master(slave, my_servers.master)
    
    In this case, the installation is defined in a separate file and is imported as a Python module. Right now, the interface for specifying a topology is quite rough, but this is going to change.
    from mysqlrep import Server, User, Linux
    
    servers = [Server(server_id=1, host="server1.example.com",
                      sql_user=User("mysql_replicant", "xyzzy"),
                      ssh_user=User("mysql_replicant"),
                      machine=Linux()),
               Server(server_id=2, host="server2.example.com",
                      sql_user=User("mysql_replicant", "xyzzy"),
                      ssh_user=User("mysql_replicant"),
                      machine=Linux()),
               Server(server_id=3, host="server3.example.com",
                      sql_user=User("mysql_replicant", "xyzzy"),
                      ssh_user=User("mysql_replicant"),
                      machine=Linux()),
               Server(server_id=4, host="server4.example.com",
                      sql_user=User("mysql_replicant", "xyzzy"),
                      ssh_user=User("mysql_replicant"),
                      machine=Linux())]
    master = servers[0]
    slaves = servers[1:]
    
    Here, the Server class represents a server, and to be able to do its job, it needs one MySQL account on the server and one shell account on the host machine. Right now, it is also necessary to specify the server ID, but the plan is to only require the host, port, socket, SQL account name, and SSH account information. The remaining information can then be fetched from the configuration file of the server. Each server has a small set of primitives on top of which everything else is built (a sketch of a possible implementation follows the list):
    Server.sql(SQL command)
    Execute the SQL command and return a result set.
    Server.ssh(command list)
    Execute the command given by the command list and return an iterator over the output.
    Server.start()
    Start the server.
    Server.stop()
    Stop the server.
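
    To give an idea of what the primitives amount to, here is a hedged sketch of how sql and ssh could be implemented on top of Connector/Python and the ssh command; the class is illustrative, not the actual Replicant code:

    import subprocess
    import mysql.connector

    class SimpleServer(object):
        def __init__(self, host, sql_user, sql_passwd, ssh_user):
            self.host, self.ssh_user = host, ssh_user
            self._conn = mysql.connector.connect(host=host, user=sql_user,
                                                 password=sql_passwd)

        def sql(self, command, params=()):
            """Execute an SQL command and return the result set."""
            cursor = self._conn.cursor()
            cursor.execute(command, params)
            if cursor.description is None:    # statement returned no rows
                return []
            return cursor.fetchall()

        def ssh(self, command):
            """Run a command list on the host; iterate over the output lines."""
            proc = subprocess.Popen(
                ["ssh", "%s@%s" % (self.ssh_user, self.host)] + list(command),
                stdout=subprocess.PIPE)
            return iter(proc.stdout.readline, "")
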
    There is a small set of commands defined on top of these primitives that can be used. Here is a list of just a few of them, but there are some more in the library at Launchpad; a sketch of how two of them map to SQL follows the list.
    change_master(slave, master, position=None)
    Change the master of slave to be master and start replicating from position.
    fetch_master_pos(server)
    Fetch the master position of server, which is the position where the last executed statement ends in the binary log.
    fetch_slave_pos(server)
    Fetch the slave position of server, which is the position where the last executed event ends.
    flush_and_lock_database(server)
    Flush all tables on server and lock the database for read.
    unlock_database(server)
    Unlock a previously locked database.
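
    As an illustration, here is a sketch of how two of these commands could map to SQL on top of the sql primitive from the sketch above; SHOW MASTER STATUS and CHANGE MASTER TO are the standard server statements:

    def fetch_master_pos(server):
        """Binlog file name and position where the last statement ends."""
        row = server.sql("SHOW MASTER STATUS")[0]
        return row[0], row[1]

    def change_master(slave, master, position=None):
        """Point slave at master, starting from position."""
        if position is None:
            position = fetch_master_pos(master)
        slave.sql("STOP SLAVE")
        slave.sql("CHANGE MASTER TO MASTER_HOST = %s,"
                  " MASTER_LOG_FILE = %s, MASTER_LOG_POS = %s",
                  (master.host, position[0], position[1]))
        slave.sql("START SLAVE")
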
    Using these primitives, it is easy to clone a master by executing the code below. For this example, I use the quite naive method of backing up a database by creating an archive of the database files and copying them to the new slave.
    from mysqlrep import (flush_and_lock_database, unlock_database,
                          fetch_master_pos, start_replication)
    from subprocess import call
    
    backup_name = "backup.tar.gz"
    
    # Take the backup while the master is locked, remembering the position.
    flush_and_lock_database(master)
    position = fetch_master_pos(master)
    master.ssh(["tar", "Pzcf", backup_name, "/usr/var/mysql"])
    unlock_database(master)
    # Copy the backup to the slave and restore it there.
    call(["scp", master.host + ":" + backup_name, slave.host + ":."])
    slave.stop()
    slave.ssh(["tar", "Pzxf", backup_name, "/usr/var/mysql"])
    slave.start()
    start_replication(slave)
    
    What do you think? Would this be a valuable project to pursue?