Hey, Scripting Guy!

The Scripting Guys Answer Your Questions

Hey, Scripting Guy!

Welcome to the TechNet column in which the Microsoft Scripting Guys address commonly-asked questions about system administration scripting. Got a question about system administration scripting? Then send an email to scripter@microsoft.com (in English, if possible). We can’t guarantee we’ll be able to answer every question that comes our way, but we’ll do our best.

Resources

Hey, Scripting Guy! Archive by Category

Hey, Scripting Guy! Archive by Date

Hey, Scripting Guy! Download

Spacer

*

How Can I Remove All the Double Quote Marks in a Text File?

Hey, Scripting Guy! Question

Hey, Scripting Guy! How can I remove all the double quote marks in a text file?

-- DD

SpacerHey, Scripting Guy! AnswerScript Center

Hey, DD. As we’ve noted before, we could probably answer a text file question every day in this column. (Why don’t we? Well, for one thing, Hey, Text File Guy!just doesn’t have quite the same ring as Hey, Scripting Guy! If you know what we mean.) This seemed like a good question for us to answer, however, and for at least two reasons.

For one, it’s very common to find individual fields within a text file to be bracketed by double quote marks; for example, many applications that save data as a comma-separated-values (CSV) file save that data in this format:

"Ken Myer","Research Director","North American Research Lab" 
"Pilar Ackerman","Associate Vice-President","European Sales"

There’s actually a good reason why data is often saved this way. For example, suppose Pilar Ackerman actually worked for this division:

Sales, European

If the individual fields weren’t enclosed in double quote marks the CSV file would look like this:

Ken Myer,Research Director,North American Research Lab 
Pilar Ackerman,Associate Vice-President,Sales, European

As you can see, Ken Myer’s record contains three fields:

Ken Myer

Research Director

North American Research Lab

By contrast, Pilar Ackerman’s record appears to contain four fields:

Pilar Ackerman

Associate Vice-President

Sales

European

That’s why you’ll often see the double quotes surrounding individual fields.

But while these double quotes are often necessary, any time they aren’t necessary they create a problem for script writers. For example, suppose you want to use the VBScript Split function to split each line in the file into its constituent parts. What the heck do you split on? In a case like that, you’ll find it much easier to work with the text file if you first remove all the double quotes.

For today’s column we’re going to assume that you’ve already determined that you can remove all the double quotes in the file without affecting the number of fields found on each line. Of course, doing so leads to a second problem: although we can use the VBScript Replace function to replace each double quote with nothing, how do we indicate this in code? Here’s a hint: this line of code will not work:

strContents = Replace(strContents, """, "")

In other words, trying to enclose double quotes within double quotes just leads to an “Unterminated string” error.

So what do we do about that? Well, there are a couple different ways to work around this issue, but the one we like best uses the Chr function. As you probably know, anything you can type on the keyboard has an ASCII equivalent value; for example, the uppercase A has an ASCII value of 65. The Chr function converts an ASCII value to the actual character; give Chr an ASCII value and it will return the character. Thus we can search for and replace the ASCII value 34 (which corresponds to the double quote mark) using code like this:

strContents = Replace(strContents, Chr(34), "")

Note. For a list of characters and their ASCII equivalents, see the VBScript documentation on MSDN. One nice thing about using the Chr function is that it makes it easy to search for “weird” characters like carriage returns, tabs, blank spaces and non-printing characters.

Confused? Maybe seeing the entire script - and the subsequent explanation - will help:

Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objFile = objFSO.OpenTextFile("C:\Scripts\Test.txt", ForReading)
strContents = objFile.ReadAll
objFile.Close

strContents = Replace(strContents, Chr(34), "")

Set objFile = objFSO.OpenTextFile("C:\Scripts\Test.txt", ForWriting)
objFile.WriteLine strContents
objFile.Close

We begin by defining two constants - ForReading and ForWriting - that we’ll use when working with the text file. We create an instance of the FileSystemObject and then call the OpenTextFile method to open the file C:\Scripts\Test.txt. After the file is open we use the ReadAll method to read the entire file and store the contents in a variable name strContents. With the contents safely stored in memory we then close the file.

Why do we close the file? Well, we can’t use the FileSystemObject to directly modify the text file itself; that’s why we opened the file and read the contents into memory. What we’ll end up doing is changing the contents in memory and then writing this revised information back to the file. Because a file can be open for reading or for writing (but not both) we need to close Test.txt then open it again, this time for writing.

Next we use this line of code to remove all the double quotes from the file contents stored in memory:

strContents = Replace(strContents, Chr(34), "")

All we’re doing here is using the Replace function to search the variable strContents and look for all the characters with an ASCII value of 34; as we’ve already noted, that will be all the double quote marks. And what are we replacing these quote marks with? Nothing, as represented by an empty string (“”).

After modifying the file contents in memory we then re-open our text file and use the WriteLine method to replace the existing file with the new, modified contents. Upon closing the file we’ll have exactly the same text file as we had before, minus the double quote marks.


Top of pageTop of page