Tuesday, 29 April 2014

Handy HTA to format SQL results as Table for email

Have you ever tried to email the results of a SQL query in SSMS and found that the format is a mess, just tab delimited text?  I've just searched for neat way to get the results to show as a table in a new email, and it seems that plenty of people have resorted to messing about in Programmability to wrangle SQL into creating HTML formatted text.

However, I remember from my experiments with PHP that if you cut-n-paste from a table on a webpage, the table format is preserved onto the email. I decided to play with a new HTA to show the clipboard contents....


... convert to HTML and show that in a new webpage (another HTA just for simple display purposes). From there, with any CSS tarting up you like, it can then be copied into an email, looking most spiffing. Shame that the tr:nth-child(odd/even) CSS doesn't work in HTA/MSIE for alternating row colours!

 The HTA below gives results like this :


 I hope you find this useful, either functionally or for educational purposes :)

Instructions

  1. Ctrl-C your SSMS SQL query results
  2. Double-click the HTA file/shortcut
  3. Observe the clipboard data is correct, press the button
  4. Marvel at the new HTA showing your data looking more presentable
  5. Cut-n-paste to your email. Optional step: Allow yourself a smile.

With thanks to Notepad++ and the NppExport.."Copy HTML to clipboard" plugin for the syntax colouring, here is my HTA file :  Clip Table.hta


<head> <title>My SQL Table pretty-upper</title> <HTA:APPLICATION ID="oHTA" APPLICATIONNAME="HTA Test" SCROLL="no" SINGLEINSTANCE="yes" WINDOWSTATE="maximize" > </head> <SCRIPT LANGUAGE="javascript"> function NoContextMenu(){ alert("Sorry, but right-click is not allowed."); }; function replaceAll(find, replace, str) { return str.replace(new RegExp(find, 'g'), replace); }; function Window_onLoad(){ // Use the following line to disable right-click menu and prevent View Source document.oncontextmenu = new Function ( "NoContextMenu(); return false" ); // Resize and center window.resizeTo(screen.availWidth*.75, screen.availHeight*.75); window.moveTo(screen.availWidth*0.125, screen.availHeight*0.125); // Read Clipboard, convert tabs and newlines to HTML if (window.clipboardData) // Internet Explorer { txt_str1.value = window.clipboardData.getData("Text"); contents = window.clipboardData.getData("Text"); contents = replaceAll("\t","</TD><TD>",contents); // Tabs : table column boundaries contents = replaceAll("\r\n","</TD></TR>\r\n<TR><TD>",contents); // CrLf : new table row contents = "<TR><TD>" +contents +"</TD></TR></TABLE></HTML>"; // wrap up, except for TABLE/CSS }; }; function fn_run_button(){ txt_str1.value = contents; // Write a new smarter table in new HTA in %TEMP% folder var oWSS = new ActiveXObject("wscript.shell"); hta_file = oWSS.ExpandEnvironmentStrings("%TEMP%") + "\\testhta.hta"; // alert(hta_file); // line by line alternative : var a = escaped.split("\r\n"); for (var i = 0; i < oFil.length; i++) { *use* a[i] } // Write to file ( \x22 in the CSS is escape code for " ) var oFSO = new ActiveXObject("Scripting.FileSystemObject"); var oFil = oFSO.CreateTextFile(hta_file, true); oFil.WriteLine("<HTML><style type=\x22text/css\x22>"); oFil.WriteLine("#box-table { font-family: \x22Lucida Sans Unicode\x22, \x22Lucida Grande\x22, Sans-Serif;"); oFil.WriteLine(" font-size: 12px; text-align: left; border-collapse: collapse; border-top: 3px solid #9baff1; border-bottom: 3px solid #9baff1; }"); oFil.WriteLine("#box-table th { font-size: 13px; font-weight: normal; background: #b9c9fe; border-right: 2px solid #9baff1; "); oFil.WriteLine("border-left: 2px solid #9baff1; border-bottom: 2px solid #9baff1; color: #039; }"); oFil.WriteLine("#box-table td { border-right: 1px solid #aabcfe; border-left: 1px solid #aabcfe; border-bottom: 1px solid #aabcfe; color: #669; }"); oFil.WriteLine("</style> <table id=\x22box-table\x22 >"); oFil.WriteLine(contents); oFil.Close(); alert( "continue...") oWSS.run("cmd /C start " +hta_file, 0,false); oWSS = null; window.close(); }; window.onload=Window_onLoad; </SCRIPT> <BODY bgcolor="#90BBEE" > <font face="Lucida Sans"> <CENTER> <h1>SQL Tables made better for emailing</h1> This creates a new HTA from the clipboard data, which looks better and can be pasted into an email. <P><P> Clipboard <BR> <textarea name="txt_str1" cols="120" rows="40" ></textarea> <P> <input type="button" value="Create smarter table" name="run_button" onClick="fn_run_button()"> </CENTER> </BODY>

Wednesday, 26 March 2014

Do the SQL Shuffle

A little puzzle arose when I was looking an anonymising client data. At one point I wondered if simply shuffling around some fairly innocuous text fields would be enough to separate some items from other parts of the data. The answer was no, but I still spent some time in the evening figuring out how I'd do it if I needed to - remind me to Get A Life at some point! :)

How to shuffle?


Suppose you have a simple table with an ID field, and a text description. You want to shuffle the 'des' data around so that they all still exist, but in different rows to where they started.

ID

DES
11

line 11
12

line 12
13

line 13
14

line 14
15

line 15
16

line 16
17

line 17
21

line 21
23

line 23

You can play with this using a temporary table (don't forget to DROP it later) :


-- create temp test data
CREATE TABLE #DataTable (
ID int,
Des varchar(10) );

WITH DATA AS (
   SELECT 11 AS ID, 'line 11' AS dez UNION ALL
   SELECT 12, 'line 12' UNION ALL
   SELECT 13, 'line 13' UNION ALL
   SELECT 14, 'line 14' UNION ALL
   SELECT 15, 'line 15' UNION ALL
   SELECT 16, 'line 16' UNION ALL
   SELECT 17, 'line 17' UNION ALL
   SELECT 21, 'line 21' UNION ALL
   SELECT 23, 'line 23' )
INSERT INTO #DataTable (ID, Des)
SELECT * FROM DATA


--  Insert Shuffle code here :)

SELECT * FROM #DataTable;
DROP TABLE #DataTable;


After finding that Rand() only executes once per SQL statement and is therefore useless for a random value per row..... eventually I found that  Abs(CheckSum(NewID())) %90 is a handy way to get the kind of random values I was after. Deciding to try a multi-stage CTE approach, I got this far :


With d1 AS
(
SELECT  -- shove a random value in
  ID,
  Des,
  Abs(CheckSum(NewID())) % 90 as Randm
FROM #DataTable
),
d2 as
(
SELECT  -- now get what we need with order sequences
  ID as ID,
  Row_number() OVER (ORDER BY ID) AS OrigRow,
  Row_number() OVER (ORDER BY Randm) as NewRow,
  Des as ChangedDes
FROM d1
)
SELECT * FROM d2 ORDER BY OrigRow



This gave me the following, and you can play with it at
http://sqlfiddle.com/#!3/fcba5/4/0

ID ORIGROW NEWROW CHANGEDDES
11 1 8 line 11
12 2 1 line 12
13 3 7 line 13
14 4 2 line 14
15 5 5 line 15
16 6 6 line 16
17 7 9 line 17
21 8 3 line 21
23 9 4 line 23

After some deep thought (very deep for that time of night) before retiring to my chamber, I came up with the idea of joining one row to another from the CTEs, from old order to new random order, and linking that to the original row being updated. :


-- Shuffle
With d1 AS
(
SELECT
  ID, Des, Abs(CheckSum(NewID())) % 90 as Randm
FROM #DataTable
),
d2 as
(
SELECT
  ID as ID,
  Row_number() OVER (ORDER BY ID) AS OrigRow,
  Row_number() OVER (ORDER BY Randm) as NewRow,
  Des as ChangedDes
FROM d1
)
UPDATE #DataTable
SET Des =
 (SELECT swap.ChangedDes
  FROM d2 swap
   INNER JOIN d2 old ON
     swap.NewRow = old.OrigRow
  WHERE old.ID = #DataTable.ID
 )
FROM #DataTable



That does the job nicely, and runs quickly enough too. See it in action at
http://sqlfiddle.com/#!3/fcba5/6/0

If you have any better ways to do it, I'm all ears. Comments welcomed, as always!

Monday, 24 March 2014

Anonymising Client Data

(I'm back working again after many long months unemployed, so I should be able to post little odds and ends here again)

If you want to demo your application using real data that you've brought back to the office from a client site, it's rather important to remove real names and addresses, etc., from the data before letting other parties set eyes on it. It won't inspire much confidence to show a potential client another client's data!

If you're using MS SQL Server then you may have the AdventureWorks database to hand, chock full of random names that can be used to overwrite the names and addresses that you wish to remove. Getting a random number to work in a simple UPDATE statement is a bit tricky because the RAND() is only evaluated once - so you'll get a different result each time but all the rows will be set the same :(

So I called upon my favourite CTE know-how and cooked up the following. It's a cascading multi-step CTE affair. First it retrieves the Max(ContactID)from the Contact table, so that the 0-to-1 random number can be scaled up to the full range of the table from which we'll be grabbing random names.

Next, we create a "ChangeData1" temporary table of IDs for the table we wish to change, which will also contain a random row ID from the Contacts table (a starting point to look one up) thanks to getting a random number from the checksum of a NewID(), modulus the maximum row number.

At this point you can then use the next stage of the CTEs to grab a real row number (using TOP 1 where the row ID > the random number) and JOIN to that row, and you'll have full access to the AdventureWorks row to change a number of fields in one UPDATE.

Or you can have this simpler example, where ChangeData2 becomes a list of IDs and one new field from the randomly selected AdventureWorks row.

USE [MyDatabase];

With AW_Names as  -- Get Maximum ID for range of random numbers
( SELECT Max(ContactID) as MaxID
  FROM [AdventureWorks].[Person].[Contact]
),
ChangeData1 as  -- Get ID numbers for table we want to change, and a random num for each row
(
 SELECT
   ID,
   ABS(CHECKSUM(NewId())) % (SELECT MaxID FROM AW_Names) as Rando
 FROM myTable
),
ChangeData2 as  -- Get ID numbers for table we want to change, and a name from AdventureWorks for each random field
(
 SELECT
   ID AS ChangeID,
   ( SELECT TOP 1 LastName FROM [AdventureWorks].[Person].[Contact]
     WHERE ContactID > Rando
   ) As NewField
 FROM ChangeData1
)
UPDATE myTable
SET
  Lastname = ChangeData2.NewField
FROM ChangeData2 WHERE ChangeData2.ChangeID=ID





It seems to run acceptably fast, even a more complex version that updates numerous fields at once, and randomises a Date of Birth column too. Thousands of records safely cleansed in the blink of an eye. This has become my starting point for a long script that will cleanse a number of tables.

Alternatively, you could take a look at Anonimatron, which was my first port of call before I decided to go it alone...

Anonimatron

This free software (a Java .jar) can whizz through tables and fields that you specify in an XML config file, using a variety of new data types (crazy ancient Roman names for example!) and can remember how it did it so that future runs will change the same fields to the same new values if you wish to keep certain parts of your data familiar.

See the author's blog for a guide : http://rolfje.wordpress.com/2013/11/03/anonimatron-quick-start/

Having updated my Java installation (so that .jar programs.can run), I then downloaded Anonimatron from http://sourceforge.net/projects/anonimatron/

I used the folder within the zip download to unzip into a folder :
C:\Program Files\Anonimatron\anonimatron-1.7\
I took a look at the batch file supplied, and converted it to Windows line feeds, and ran it. It complained about memory, not being able to "reserve enough space for object heap". Not a great intro to the joys of Java, but an opportunity to learn...

I eventually figured out what the -Xmx2G parameter was doing, so I tried -Xmx1600m instead, and got that much working. Keeping the original batch safe, I was now using a copy called ano.bat.

Realising that I wanted another batch to play with (to call the main one) I set up ano2.bat to call the first batch as required :
ano.bat -config config.xml -synonyms synonyms.xml

This was enough to do the trick, after I got it connecting to MS SQL, which called for the installation of a free 'JDBC' driver download and install.
Grab Microsoft JDBC Driver 4.0 for SQL Server  from http://www.microsoft.com/en-us/download/details.aspx?id=11774
and then place a copy of sqljdbc4.jar from that installation into
C:\Program Files\Anonimatron\anonimatron-1.7\jdbcdrivers
alongside the MySql and PostgreSQL ones.
If you wish to use integratedSecurity=true in the connection string, you'll need to copy sqljdbc_auth.dll to the main C:\Program Files\Anonimatron\anonimatron-1.7\  folder (or sort out the library path issues!)

Example

If this command line works :
sqlcmd -S myServer\test -d myDatabase -Q "SELECT ID,LastName FROM myTable"
then this XML file config.xml should work - in this case changing a name column to wacky Life of Brian names like Augusnus, Dibocus, Ebodabogubinus etc.!


<?xml version="1.0" encoding="UTF-8"?>
<configuration jdbcurl="jdbc:sqlserver://myServer;instanceName=test;DatabaseName=myDatabase;integratedSecurity=true;">

  <table name="myTable">
    <column name="LastName" type="ROMAN_NAME" />
  </table>

</configuration>


The command parameter -configexample is used to display help, running for me via the batch file:
ano.bat -configexample

This reveals the following possibilities for anonymising your data fields (and you can also write you own 'anonymizers' too, to bolt in)  :

DEFAULT_TYPES
java.lang.String
java.util.Date

CUSTOM_TYPES
RANDOMCHARACTERS
DUTCHBANKACCOUNT
UUID
ELVEN_NAME
ROMAN_NAME
PREFETCHCHARACTERS
RANDOMDIGITS
EMAIL_ADDRESS
STRING
BURGERSERVICENUMMER

The author is responsive to requests for assistance on his blog, or via the dev website, so it's worth considering Anonimatron if it seems to suit your needs. I'm going down the route of rolling my own, any other suggestions are welcomed if you'd like to comment. Thanks.