Script to split large files

This VBscript will split all files with the matching extension (‘sql’ in this example) in the script folder into separate files of size intSize (1gb in this example). Files will not be exactly of that size but will be slightly larger depending on your line sizes and how often the script checks the output file size (can be adjusted by modifying the line with the ‘Mod’ expression in it). I did try doing it in PowerShell but it took roughly 10,000 times as long to process!

'Written by Ben Penney https://sysadminben.wordpress.com Option Explicit Dim strExtension, intSize strExtension = "sql" 'The file type to split intSize = 1000000000 'The size to split into in bytes Dim objShell, objFSO Set objShell = WScript.CreateObject("WScript.Shell") Set objFSO = CreateObject("Scripting.FileSystemObject") Dim objFile, objInput, objOutput, objOutputFile, objError Dim intLineNum, intFilenum, strRead For Each objFile in objFSO.GetFolder(objShell.CurrentDirectory).Files If LCase(Right(objFile.Name, Len(strExtension))=strExtension) Then intFilenum = 1 Set objInput = objFSO.OpenTextFile(objFile.Name) Set objOutput = objFSO.CreateTextFile(Replace(objFile.Name,"." & strExtension,intFilenum & "." & strExtension)) Set objOutputFile = objFSO.GetFile(Replace(objFile.Name,"." & strExtension,intFilenum & "." & strExtension)) Set objError = objFSO.CreateTextFile(Replace(objFile.Name,"." & strExtension,"ERROR." & strExtension)) intLineNum = 1 Do While Not objInput.AtEndOfStream strRead = objInput.ReadLine If intLineNum Mod 10000 = 0 Then 'Put this in as it slowed down the process checking size every line If objOutputFile.Size > intSize Then objOutput.Close Set objOutput = objFSO.CreateTextFile(Replace(objFile.Name,"." & strExtension,intFilenum & "." & strExtension)) Set objOutputFile = objFSO.GetFile(Replace(objFile.Name,"." & strExtension,intFilenum & "." & strExtension)) End If End If Err.Clear On Error Resume Next objOutput.WriteLine strRead If Err.Number 0 Then objError.WriteLine "error writing line " & intFilenum On Error Goto 0 intLineNum = intLineNum + 1 Loop objInput.Close objOutput.Close objError.Close End If Next Wscript.Echo "All Done."

Advertisements

Clean up corrupted SQL files

This is vbscript code I used to clean up massive SQL files that had random corruption that appeared as long strings of repeated characters,

'Written by Ben Penney https://sysadminben.wordpress.com 'Script requires sub-folders CLEAN, BAD and ERROR to be in the script path Option Explicit Dim objShell, objFSO Set objShell = WScript.CreateObject("WScript.Shell") Set objFSO = CreateObject("Scripting.FileSystemObject") Dim objFile, objInput, objOutput, objBad, objError Dim intLineNum, strRead, blnLineGood, CharNum, CharNum2 For Each objFile in objFSO.GetFolder(objShell.CurrentDirectory).Files If LCase(Right(objFile.Name, 3)="sql") Then Set objInput = objFSO.OpenTextFile(objFile.Name) Set objOutput = objFSO.CreateTextFile("CLEAN\" & objFile.Name) Set objBad = objFSO.CreateTextFile("BAD\" & objFile.Name) Set objError = objFSO.CreateTextFile("ERROR\" & objFile.Name) intLineNum = 1 Do While Not objInput.AtEndOfStream strRead = objInput.ReadLine blnLineGood = True ' ---------- START: Check for repeat characters ---------- For CharNum = 1 to Len(strRead) - 102 If Mid(strRead,CharNum,1) <> " " Then For CharNum2 = 1 to 100 blnLineGood = False ' Assume bad line unless we prove it is not If Mid(strRead,CharNum,1) <> Mid(strRead,CharNum+CharNum2,1) Then blnLineGood = True ' Line is not bad Exit For End If Next If blnLineGood = False Then Exit For End If Next ' ----------- END: Check for repeat characters ----------- ' ---------- START: Check for SQL syntax ---------- If Left(strRead,6) <> "INSERT" And Right(strRead,2) <> ")," And Right(strRead,2) <> ");" Then blnLineGood = False If Left(strRead,6) <> "INSERT" And Left(strRead,1) <> "(" Then blnLineGood = False ' ----------- END: Check for SQL syntax ----------- ' ---------- START: Write line to output files. Catch errors. ---------- On Error Resume Next Err.Clear If blnLineGood = True Then objOutput.WriteLine strRead Else objBad.WriteLine strRead End If If Err.Number <> 0 Then objError.WriteLine "error writing line " & intLineNum On Error Goto 0 ' ----------- END: Write line to output files. Catch errors. ----------- intLineNum = intLineNum + 1 Loop objInput.Close objOutput.Close objBad.Close objError.Close End If Next Wscript.Echo "All Done."


 **Script requires sub-folders CLEAN, BAD and ERROR to be in the script path**

It is pretty rough but basically it checks each character in each line for repeated characters (ignoring spaces). If it finds the same character repeated 100 times it will output that line to the BAD file.

Additionally it checks that each line starts in ( and ends in either ), or ); (excluding INSERT lines). If it fails these tests it will output the line to the BAD file otherwise the line goes into the CLEAN file.

The ERROR file is there to catch errors with writing the line to either file which seem to pop up occasionally. The line numbers of the failed line are recorded in this file.

Can be easily tweaked to suit any sort of syntax check you were interested in.

Thanks,
Ben

MySQL handy commands

After spending 3 days so far manually recovering databases in MySQL I thought I would make my own cheat sheet for useful commands.

List size of all databases in MySql,

SELECT table_schema "DB Name", Round(Sum(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB" FROM information_schema.tables GROUP BY table_schema;

List the row count of all tables in a database (estimate only),

SELECT table_name, table_rows FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'DB_NAME';

List the row count of all tables in a database accurately (still not sure how to add the quotes to the output correctly),

mysql -B -uusername -ppassword --disable-column-names --execute "SELECT CONCAT('SELECT ""',table_name,'"" AS table_name, COUNT(*) AS exact_row_count FROM ""',table_schema,'"".""',table_name,'"" UNION ') FROM INFORMATION_SCHEMA.TABLES WHERE table_schema = 'DB_NAME';"

This will output lines such as,

SELECT DB_NAME AS table_name, COUNT(*) AS exact_row_count FROM DB_NAME.TABLE_NAME UNION

Grab the output leaving off the last UNION and finish the command with a ‘;’

Return the number of tables in the database,

SELECT COUNT(*) FROM information_schema.tables WHERE table_schema = 'DB_NAME';

Import all .sql files in current directory (can modify wildcard search to be more exclusive),

find . -name '*.sql' | awk '{ print "source",$0 }' | mysql --batch DB_NAME