Clean up corrupted SQL files

This is VBScript code I used to clean up massive SQL files that had random corruption, which appeared as long strings of repeated characters:

    'Written by Ben Penney
    'Script requires sub-folders CLEAN, BAD and ERROR to be in the script path
    Option Explicit

    Dim objShell, objFSO
    Set objShell = WScript.CreateObject("WScript.Shell")
    Set objFSO = CreateObject("Scripting.FileSystemObject")

    Dim objFile, objInput, objOutput, objBad, objError
    Dim intLineNum, strRead, blnLineGood, CharNum, CharNum2

    For Each objFile in objFSO.GetFolder(objShell.CurrentDirectory).Files
        ' Process only files whose name ends in "sql", regardless of case
        If LCase(Right(objFile.Name, 3)) = "sql" Then
            Set objInput = objFSO.OpenTextFile(objFile.Name)
            Set objOutput = objFSO.CreateTextFile("CLEAN\" & objFile.Name)
            Set objBad = objFSO.CreateTextFile("BAD\" & objFile.Name)
            Set objError = objFSO.CreateTextFile("ERROR\" & objFile.Name)
            intLineNum = 1

            Do While Not objInput.AtEndOfStream
                strRead = objInput.ReadLine
                blnLineGood = True

                ' ---------- START: Check for repeat characters ----------
                For CharNum = 1 to Len(strRead) - 102
                    If Mid(strRead,CharNum,1) <> " " Then
                        For CharNum2 = 1 to 100
                            blnLineGood = False ' Assume bad line unless we prove it is not
                            If Mid(strRead,CharNum,1) <> Mid(strRead,CharNum+CharNum2,1) Then
                                blnLineGood = True ' Line is not bad
                                Exit For
                            End If
                        Next
                        If blnLineGood = False Then Exit For
                    End If
                Next
                ' ----------- END: Check for repeat characters -----------

                ' ---------- START: Check for SQL syntax ----------
                If Left(strRead,6) <> "INSERT" And Right(strRead,2) <> ")," And Right(strRead,2) <> ");" Then blnLineGood = False
                If Left(strRead,6) <> "INSERT" And Left(strRead,1) <> "(" Then blnLineGood = False
                ' ----------- END: Check for SQL syntax -----------

                ' ---------- START: Write line to output files. Catch errors. ----------
                On Error Resume Next
                Err.Clear
                If blnLineGood = True Then
                    objOutput.WriteLine strRead
                Else
                    objBad.WriteLine strRead
                End If
                If Err.Number <> 0 Then objError.WriteLine "error writing line " & intLineNum
                On Error Goto 0
                ' ----------- END: Write line to output files. Catch errors. -----------

                intLineNum = intLineNum + 1
            Loop

            objInput.Close
            objOutput.Close
            objBad.Close
            objError.Close
        End If
    Next

    Wscript.Echo "All Done."

 **Script requires sub-folders CLEAN, BAD and ERROR to be in the script path**

It is pretty rough, but basically it scans each line for runs of a repeated character (ignoring spaces). If it finds a character followed by 100 copies of itself, it writes that line to the BAD file.
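For comparison, the same test can be written as a regular expression instead of nested loops. This is a swapped-in alternative rather than what the script above does, and the function name `HasRepeatedRun` is made up for illustration. Treat it as a minimal sketch; one difference is that it also flags a run that sits at the very end of a line, which the loop's bounds can skip.

```vbscript
Option Explicit

' Hypothetical helper, not part of the original script.
' Returns True when strLine contains a non-space character that is
' immediately followed by intRepeats copies of itself.
Function HasRepeatedRun(strLine, intRepeats)
    Dim objRegEx
    Set objRegEx = New RegExp
    ' ([^ ]) captures one non-space character;
    ' \1{n} then requires n more copies of that same character.
    objRegEx.Pattern = "([^ ])\1{" & intRepeats & "}"
    HasRepeatedRun = objRegEx.Test(strLine)
End Function

' An ordinary line with no long run of one character
WScript.Echo CStr(HasRepeatedRun("INSERT INTO t VALUES", 100))

' A line containing 150 consecutive "x" characters
WScript.Echo CStr(HasRepeatedRun("(1,'" & String(150, "x") & "'),", 100))
```

Passing the run length as a parameter makes it easy to loosen or tighten the threshold if legitimate data contains long repeats.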

Additionally, it checks that each line starts with ( and ends with either ), or ); (lines beginning with INSERT are exempt). A line that fails these tests goes to the BAD file; otherwise it goes into the CLEAN file.
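The two syntax tests can be folded into a single function, which makes the rule easier to read and to change. The name `LooksLikeDumpLine` is hypothetical; the logic is intended to match the script's two `If` statements, as a sketch rather than a drop-in replacement.

```vbscript
Option Explicit

' Hypothetical helper mirroring the script's two syntax tests.
' A line passes if it starts with INSERT, or if it starts with "("
' and ends with ")," or ");".
Function LooksLikeDumpLine(strLine)
    If Left(strLine, 6) = "INSERT" Then
        LooksLikeDumpLine = True
    Else
        LooksLikeDumpLine = (Left(strLine, 1) = "(") And _
            (Right(strLine, 2) = ")," Or Right(strLine, 2) = ");")
    End If
End Function

' A well-formed value row, then a row truncated before its closing bracket
WScript.Echo CStr(LooksLikeDumpLine("(1,'abc'),"))
WScript.Echo CStr(LooksLikeDumpLine("(2,'abc"))
```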

The ERROR file is there to catch errors when writing a line to either file, which seem to pop up occasionally. The line number of each failed line is recorded in this file.
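If you want to know why a write failed, not just where, the error number and description can be logged alongside the line number. The sketch below wraps the write in a helper; `SafeWriteLine` and the demo file names are assumptions made for illustration and are not part of the original script.

```vbscript
Option Explicit

' Hypothetical helper: writes strLine to objTarget and, if that fails,
' records the line number plus the VBScript error details in objLog.
Sub SafeWriteLine(objTarget, objLog, strLine, intLineNum)
    On Error Resume Next
    Err.Clear
    objTarget.WriteLine strLine
    If Err.Number <> 0 Then
        objLog.WriteLine "error writing line " & intLineNum & _
            " (" & Err.Number & ": " & Err.Description & ")"
        Err.Clear
    End If
    On Error GoTo 0
End Sub

' Usage with throwaway files in the current directory (names are illustrative)
Dim objFSO, objOut, objLog
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOut = objFSO.CreateTextFile("demo_clean.sql", True)
Set objLog = objFSO.CreateTextFile("demo_error.txt", True)
SafeWriteLine objOut, objLog, "(1,'abc'),", 1
objOut.Close
objLog.Close
```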

It can easily be tweaked to suit any sort of syntax check you are interested in.


