Monday, January 21, 2008

Stored Procedures in MySQL

Stored procedures were introduced in the MySQL database server in version 5.0. It has been interesting to see the growth of stored procedures in the MySQL community because it is completely different with how stored procedures grew in the Oracle community.

Stored Procedures in Oracle
Stored procedures were introduced in Oracle 7, back in the client/server glory days. DBAs and developers using Oracle Forms and Reports seen the immediate benefit of using stored procedures in the database and PL/SQL stored procedures became very popular. A large part of this was due to developers considered themselves "database developers" so they tried to always leverage the features in the database. The PL/SQL language has grown through all the releases of Oracle and now in Oracle 11g Database Server, PL/SQL is still extremely popular and experienced Oracle DBAs and developers find tremendous benefits in writing stored procedures.

Stored Procedures in MySQL
I have not seen a great adoption of stored procedures in MySQL. I believe one of the reasons is due to the community of developers writing code for MySQL and open source applications. The core developers who write applications for MySQL are using Perl, Python, Ruby on Rails and PHP. They think first of how to write applications leveraging the benefits of the language they are working with and not the database. They write feature and GUI rich applications that run in the middle tier. So it is not natural for most of they to look and leverage the features in the MySQL database server. As web applications grow larger and access larger data sets this approach can effect performance as applications grow popular.

Remember the Wisdom of the Ancient Ones
If you study history or watch the history channel you are probably aware that long time ago, there were a group of ancient ones who had acquired great knowledge that has slowly been lost through the ages. They were called the Main Framers. One of the core beliefs of these ancient ones was that the closer you put your applications to the data the faster they run if they are data intensive. Slowly with the growth of client/server and then multi-tiered applications, this great wisdom has been slowly forgotten over time. There are new generations of developers who now even question if the people known as the Main Framers even really existed or if it is a myth.

The Benefits of using Stored Procedures
When writing data intensive algorithms, the most efficient place to run this code is in the database server. Runtime environments leverage the execution of the application code, not the database code. It does not make sense to access data, send it across the network, then filter and massage it in the middle tier application code. It is best to run this data intensive code in the database server, filter it and massage it in the database server and send what needs to be displayed to the middle tier. This approach will reduce network traffic and leverage what the database server and the middle tier each do best.

Growth of Stored Procedures in MySQL
Version 5 of MySQL introduced stored procedures with basic functionality. In version 6 of MySQL and future releases you can expect to see a lot of enhancements that will continue to leverage running data intensive code in the database server.

It would be wise for open source developers to look at how they can filter and massage data in the database server and send to the middle tier only the data they need to process and display for their application.

9 comments:

ryan said...

What sort of feature have you heard are planned for stored procs in MySQL 6?

I would love to be able to pass around a rowset, or use the output from a stored proc as the input for another without having to rely on temporary session tables.

Unknown said...

Ryan,
I do not know all the details of which features will make the final cut for version 6.0. I can tell you that everyone understands that stored procedures were introduced with basic functionality and that stored procedures are going to be an area of higher focus for future releases.

Below, I have follow-up on passing cursors with stored procedures (sorry not the answer you were looking for).

29.4.18: Can I pass a cursor as an IN parameter to a stored procedure?

In MySQL 6.0, cursors are available inside stored procedures only.

29.4.19: Can I return a cursor as an OUT parameter from a stored procedure?

In MySQL 6.0, cursors are available inside stored procedures only. However, if you do not open a cursor on a SELECT, the result will be sent directly to the client. You can also SELECT INTO variables.

Anonymous said...

"When writing data intensive algorithms, the most efficient place to run this code is in the database server"

Not necessarily. If you want to truly scale, you want to use the database for reading and writing data, and not processing it. As much work as you can do in the application tiers, the better, because, in general, it is easier and cheaper to add capacity on the application tier than on the database tier.

The reason stored procedures became popular with Oracle and other RDBMS was because of the old client-server and scale-up paradigms; these paradigms have changed now to one of scale-out, commodity architecture. One reason MySQL became so popular was because it *didn't* try to do everything in the database; architects could scale their applications on commodity hardware because they could add application capacity without needing to upgrade to expensive database servers...

A third reason to strongly consider not using stored procedures for modern web applications is that MySQL's stored routine compile cache is on *the connection thread* and not in a global allocation area like it is with Oracle and others. This means each connecting thread will allocate resources for the stored routine compilation cache. In addition, languages like PHP, which generally do not use persistent connections, get very little benefit (and in fact a degradation in performance in some cases) when you use stored procedures, because you open a connection, compile the routine code, then immediately throw away the compile cache on disconnect of the PHP-MySQL resource...

Just my two cents...

-jay

Unknown said...

Jay,

You raised some good points. Get a bunch of DBAs and developers together and this debate can rage all night. A good point about Oracle. Oracle has optimizd the PL/SQL engine, optimizer and compiler for the code and native compilation allows stored procedures to run blazing fast in Oracle.

The point of the article is that most developers are not even considering using stored procedures in MySQL. Our development has gone to far to the other extreme. When doing consulting I often see developers bringing across hundreds of thousands and millions of records across the Internet for tables that are growing and then doing the filtering in the application. Performance continues to degrade due to the increasing number of records that are coming across the network. As a database person, I'd prefer to filter the data in the database server to minimize the data going across the network that will not be displayed. Also, I believe MySQL is going to improve the performance of stored procedures so more data processing in the database server will increase in performance.
A lot of companies are being hurt today by inefficient database code being written in web applications. If you look at data intensive processing, most database servers will process data entensive code faster that application environments. If/when native compilation comes to stored procedures in MySQL, data manipulation will be as fast in the database server. Remember if you are going to send large data streams down to the application server, the application server environment had better manipulate that data fast enough to justify all the data going across the network that will not be displayed.

ryan said...

Jay, I agree with your analysis about how oracle-style stored procedures don't always make sense on MySQL. However we find it useful absorb the (very minor) performance hit in order to have clear separation between database and application code.

You can build up a set stored procs and then optimize or otherwise alter the SQL without having to redeploy your application code, which is a nice benefit. And the DBA or architect can quickly see what is going on without having to dig through pages of PHP or Java code.

Unknown said...

Ryan,
I was talking to Roland Bouman about not being able to use cursors outside of stored procedures with the current GA release of MySQL and he offered his solution that I thought I would share with you.

"Well, it can't be done like you can in oracle - i.e. you cannot pass the handle to a cursor. However, there are ways around it - you can:

1) put the result in a TEMPORARY table. Usually people use the MEMORY engine to get the equivalent of a PLSQL table (associative array). Obviously the procs need to be aware of the existence of the temporary table as nothing is explicitly passed as a procedure parameter.
To pass a cursor, nothing really changes - the 'receiving' proc simply declares a cursor, referencing the temporary table in the query expression of the cursor declaration.

2) if it is not possible or convenient to copy results to a temporary table, you can pass a SQL string that makes up the query that delivers the resultset. The receiving proc can then use PREPARE to create a VIEW, and declare a cursor on that view if so required. Obviously the proc needs to clean up the VIEW, or declare it with CREATE OR REPLACE. It's a hack, but it works ;)"

Right now the limit of using cursors only in a stored procedure is forcing us to be a little more creative with our code.

ryan said...

George,

That's precisely what we do. The one problem we have with the temporary table solution is that in case of replication failure, the slaves could resume with an empty or missing temporary table, so anything in the binlog that depends on the temporary table is unreliable.

You can work around that like PeterZ suggests by using real tables, but then you need to grant your web accounts DROP privileges which is risky. Thankfully 5.1 will have data-based replication instead of just statement-based, so that will plug the problem I mentioned with temporary and slaves.

If you're also going to MySQLConf 2008 we'll have to try to convince more of our fellow MySQL DBAs of the value of stored procs. Then maybe future versions will add some of these useful features and improve cursor performance. :-)

Unknown said...

Ryan,

I think you have started an important thread regarding stored procedures. We may be adding a presentation for stored procedures to address the areas you have brought to attention.

Thanks, you have highlighted some important areas we need to address within the product and the community.

Anonymous said...

Using stored procedures in MySQL is a no brainer, the advantages are so obvious.
When using inline queries (where sql query is in the code) the MySQL database server has to determine the query path each time its runs causing a considerable performance overhead, it also parses the data between where ever the application server and MySQL database server is based (network bandwidth, bad bad when can be avoided!)
When using a stored procedures the procedure is compiled first time it runs and the query optimizer does not have to determine a query path for next time.

Yes MS SQL Server have has had some fancy work done on it so it remembers any inline query and does not have to keep reworking the query path and optimizer each time but it does not work that way with MySQL. So with SQL Server stored proc’s and inline queries theoretically offer the same performance.

There are also many other advantage for stored procedures but the above covers the main ones yes keeping the data querying logic close to the data source makes real good sense!

I’ve been using MySQL with asp.net it does work dam brilliantly I cant think of any reason why I’d fork out a large amount of $$$ for SQL Server when I can upgrade to MySQL enterprise for a comparatively small amount of money.