PowerShell – Removing data in brackets from a string

I was given a task to auto-generate email addresses from first and last names. Some people put extra data in brackets in their names to denote nicknames or maiden names, so one of the sub-tasks was to reliably remove this data before working out the email address.

Trusty Google came up with the following solution from this blog:

"string" -replace "\([^\)]+\)",""

This is a pretty elegant solution that produces the following results:

TEST 1
PS> "test(adsf)" -replace "\([^\)]+\)",""
test
TEST 2
PS> "test(adsf)(test)" -replace "\([^\)]+\)",""
test
TEST 3
PS> "test(adsf) (test)" -replace "\([^\)]+\)",""
test
TEST 4
PS> "test(adsf)xx(test)" -replace "\([^\)]+\)",""
testxx
TEST 5
PS> "test()" -replace "\([^\)]+\)",""
test()
TEST 6
PS> "test(a(dsf) (t)est)" -replace "\([^\)]+\)",""
test est)
TEST 7
PS> "test(adsf" -replace "\([^\)]+\)",""
test(adsf
TEST 8
PS> "test)" -replace "\([^\)]+\)",""
test)

Since I wanted a solution that removes every bracket no matter what, I needed to take this a little further: TESTs 5 to 8 all leave brackets behind.

Replacing the + in the RegEx string with a * allows an empty string between the opening and closing brackets, which resolves TEST 5:

TEST 5
PS> "test()" -replace "\([^\)]*\)",""
test

This works because the + symbol means "one or more of the preceding element", which here is [^\)], meaning any character that is not a closing bracket. Replacing it with the * symbol means "zero or more" of those characters, so an empty pair of brackets also matches.
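The difference between the two quantifiers can be checked directly with the -match operator. This is only a small sketch; the variable names $plus and $star are mine and are not part of the original solution.

```powershell
# Same character class, two quantifiers.
$plus = "\([^\)]+\)"   # + : at least one character between the brackets
$star = "\([^\)]*\)"   # * : zero or more characters between the brackets

"test()"  -match $plus   # False: nothing between the brackets to satisfy +
"test()"  -match $star   # True: * accepts the empty pair
"test(x)" -match $plus   # True: one character is enough for +
```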

Adding a * after the final \) in the RegEx string makes the closing bracket optional. This lets us accommodate a string with a missing closing bracket, on the assumption that there should have been a closing bracket at the end of the string (ignore this if you do not want to make this assumption):

TEST 7
PS> "test(adsf" -replace "\([^\)]*\)*",""
test
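
Two side effects of the trailing \)* are worth knowing about. An unclosed bracket swallows everything up to the end of the string, and a run of several closing brackets is consumed in one go. The sketch below illustrates both. The $strict variant using ? (zero or one) is my own suggestion for comparison and is not part of the original solution, and the sample names are made up.

```powershell
$pattern = "\([^\)]*\)*"   # the pattern used in TEST 7 (zero or more closing brackets)
$strict  = "\([^\)]*\)?"   # hypothetical variant (at most one closing bracket)

# An unclosed bracket removes the rest of the string, leaving "John " (trailing space).
"John (Johnny Smith" -replace $pattern, ""

# \)* consumes both closing brackets, leaving "test".
"test(a))" -replace $pattern, ""

# \)? consumes only one closing bracket, leaving "test)".
"test(a))" -replace $strict, ""
```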

As for the other two failures (TEST 6 and TEST 8), we will just have to take it on the chin and accept that we can’t resolve everything. Instead, we simply remove any leftover brackets with an additional replace statement:

"string" -replace "\([^\)]*\)*","" -replace "[\(\)]",""

This provides the following results:

TEST 1
PS> "test(adsf)" -replace "\([^\)]*\)*","" -replace "[\(\)]",""
test
TEST 2
PS> "test(adsf)(test)" -replace "\([^\)]*\)*","" -replace "[\(\)]",""
test
TEST 3
PS> "test(adsf) (test)" -replace "\([^\)]*\)*","" -replace "[\(\)]",""
test
TEST 4
PS> "test(adsf)xx(test)" -replace "\([^\)]*\)*","" -replace "[\(\)]",""
testxx
TEST 5
PS> "test()" -replace "\([^\)]*\)*","" -replace "[\(\)]",""
test
TEST 6
PS> "test(a(dsf) (t)est)" -replace "\([^\)]*\)*","" -replace "[\(\)]",""
test est
TEST 7
PS> "test(adsf" -replace "\([^\)]*\)*","" -replace "[\(\)]",""
test
TEST 8
PS> "test)" -replace "\([^\)]*\)*","" -replace "[\(\)]",""
test
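
For reuse in a script, the two replace statements could be wrapped in a small function. This is a minimal sketch, not a definitive implementation: the function name Remove-BracketedText and the sample name are hypothetical, and the whitespace clean-up at the end is my addition (removing a bracketed section from the middle of a name, as in TEST 3, can leave extra spaces behind).

```powershell
function Remove-BracketedText {
    param([string]$Text)

    # Step 1: remove bracketed sections, treating a missing closing bracket
    #         as if it were at the end of the string.
    # Step 2: remove any stray brackets that are left over.
    $clean = $Text -replace "\([^\)]*\)*", "" -replace "[\(\)]", ""

    # Assumption (not in the original one-liner): collapse runs of
    # whitespace and trim the ends.
    ($clean -replace "\s+", " ").Trim()
}

Remove-BracketedText "Jane (Janie) Doe (nee Smith)"   # Jane Doe
```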

As always, let me know if I have forgotten anything or if you have any questions.
Thanks
