Why XML: Most of the applications out there use xml as the transport medium so someday or other we have to face it.
Tutorial For whom :
- This tutorial is for all the people who want an easy way to parse xml, extract data and put the data into database.
- It’s for the people who migrated from T-SQL and missing the excellent support of xml in mysql.( Atleast I miss it) This will serve as a workaround till the mysql core develops.
Tools needed : MySql database ( I tested it on v 5.5.8), Toad for MySql (optional)
Knowledge required on : extractValue() function of mysql, basic xml knowledge.
NOTE: All the important data will be highlighted in red colour and has courier new font.
Lets start with gaining some basic knowledge of extractValue() function used in MySql.
Well extractValue() as the name suggests it extracts text(cdata) from the xml element.
For more information please have a look at MySql XML Functions Documentation
A basic example of extractValue()
SELECT EXTRACTVALUE('<root><element>data</element></root>','/root/element') AS data;
Make sure that you connect to some database to test and type the above code into your Toad editor window.
You can see the result as below
<params> <param> <paramname>paramname 1</paramname> <paramvalue>paramvalue 1</paramvalue> </param> <param> <paramname>paramname 2</paramname> <paramvalue>paramvalue 2</paramvalue> </param> <param> <paramname>paramname 3</paramname> <paramvalue>paramvalue 3</paramvalue> </param> </params>
There are 6 steps in doing that
1) Setting up xml as input parameter for stored procedure.
Create a stored procedure something like this
DROP PROCEDURE IF EXISTS newdb1.import_xml; CREATE PROCEDURE newdb1.`import_xml`(params TEXT) BEGIN END;
Here params is given as text because our xml can grow very long in real time scenarios.
2) Declaring necessary variables.
DECLARE rowcount INT UNSIGNED DEFAULT 0; DECLARE paramcount INT UNSIGNED; DECLARE paramname VARCHAR(255); DECLARE paramvalue VARCHAR(255);
Here rowcount is to iterate through param’s in xml, paramcount is number of param elements in xml, paramname and paramvalue are variables to store extracted data from respective elements.
3) A temporary table to store the extracted data.
CREATE TEMPORARY TABLE temp(paramname VARCHAR(255), paramvalue VARCHAR(2000));
Please note that paramname has given 255 characters and paramvalue is given 2000 because paramname is value of variable name in business context and we don’t expect people write stories as variable names and paramvalue is value of variable and here we can expect large data so we gave it 2000. It depends on your business context which length suffices you. Please change as needed.
4) Count the number of params in whole xml.
SET paramcount := EXTRACTVALUE(params,'COUNT(/params/param)');
This will give count of params in xml which in our case is 3.
5) Iterate over each param, extract paramname, paramvalue and inserting them into temp table.
WHILE rowcount < paramcount DO SET rowcount := rowcount + 1; SET paramname := CONCAT('/params/param[',rowcount,']/paramname[1]'); SET paramvalue := CONCAT('/params/param[',rowcount,']/paramvalue[1]'); INSERT INTO temp(paramname, paramvalue) VALUES(EXTRACTVALUE(params, paramname), EXTRACTVALUE(params, paramvalue)); END WHILE;
Here is the interesting part.
A while loop which iterates over all the param elements, extracts data from elements and then it inserts into temp table.
Note : If we use concat(‘params/param/paramname’) then we will get 3 matching elements and if we extract, then all the data in those 3 elements will get concatenated separated by comma which we don’t want. So we iterate over elements based on our row count.
Below is the result if we use concat as above
6) Deleting temporary table for the subsequent uses.
DROP TABLE temp;
This deletes the temporary table created and it is a good practice to remove resources when we don’t need them anymore. We can have a select statement before deleting to see what data we have got in our table. But in real time scenarios we might not need them.
Note : Please comment all the statements which are for testing, unnecessary select statements and unload all unnecessary resources from memory after completion of task.
Now the final stored procedure with all the bits attached together.
DROP PROCEDURE IF EXISTS newdb1.import_xml; CREATE PROCEDURE newdb1.`import_xml`( params TEXT ) /* '<params> <param> <paramname>paramname 1</paramname> <paramvalue>paramvalue 1</paramvalue> </param> <param> <paramname>paramname 2</paramname> <paramvalue>paramvalue 2</paramvalue> </param> <param> <paramname>paramname 3</paramname> <paramvalue>paramvalue 3</paramvalue> </param> </params>' */ BEGIN DECLARE rowcount INT UNSIGNED DEFAULT 0; DECLARE paramcount INT UNSIGNED; DECLARE paramname VARCHAR(255); DECLARE paramvalue VARCHAR(255); CREATE TEMPORARY TABLE temp(paramname VARCHAR(255), paramvalue VARCHAR(2000)); -- calculate the number of row elements. SET paramcount := EXTRACTVALUE(params,'COUNT(/params/param)'); -- loop through all the row elements WHILE rowcount < paramcount DO SET rowcount := rowcount + 1; SET paramname := CONCAT('/params/param[',rowcount,']/paramname[1]'); SET paramvalue := CONCAT('/params/param[',rowcount,']/paramvalue[1]'); INSERT INTO temp(paramname, paramvalue) VALUES (EXTRACTVALUE(params, paramname), EXTRACTVALUE(params, paramvalue)); END WHILE; SELECT * FROM temp; DROP TABLE temp; END;
One can call the stored procedure as below. I used concat just to make it visually better for readers but there is no need for that.
CALL import_xml(CONCAT('<params>' '<param><paramname>paramname 1</paramname><paramvalue>paramvalue 1</paramvalue></param>' '<param><paramname>paramname 2</paramname><paramvalue>paramvalue 2</paramvalue></param>' '<param><paramname>paramname 3</paramname><paramvalue>paramvalue 3</paramvalue></param>' '</params>') );
The final output looks something like this
Since we have the data in temp table we can do anything with it. After this stage we don’t use xml to get and set data.