Subject: Excel users: here's a proc to decode CSV data lines - DN [1]


Dobeedo <dobeedo@my-deja.com> - 23 Nov 1999 - comp.lang.tcl

 Here's a proc I wrote to decode lines read from CSV files (comma-
 separated format) as created by Excel. I thought other people could
 have the same needs as myself. So here's the proc.  If you think you
 have got a better proc, please post it here.  Thanks.

 You may be tempted to use a line like:
    set cells [split $line ,]
 since a comma is used to separate the cell contents on each line.
 However, several cases will cause trouble: Excel wraps a cell that
 contains a comma in double quotes, and doubles any double quote that
 appears inside a cell.  The proc below handles occurrences of
 double quotes and commas in the Excel cells.
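
 To illustrate why the plain split falls short, here is a small sketch
 using a shortened version of the test line given further down.  The
 variable names are only for illustration.

```tcl
# Naive approach: split on every comma, ignoring CSV quoting.
set line {a,"un,deux,trois",c}
set cells [split $line ,]

# The quoted cell is cut at its embedded commas, so the line yields
# more pieces than the three cells it actually holds, and the pieces
# still carry the surrounding quote characters.
puts [llength $cells]
puts [lindex $cells 1]
```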

 # {{{ decode_csv_line

 # Decode a line read from a CSV formatted file (as saved by MS Excel)
 # and return the data as a list of items.
 # Example used for testing the proc:
 #   set line {a,"un,deux,trois",c,"le ""voici""
 #             le ""voila""",e,"""voici""","""voila"" x"}

 proc decode_csv_line {line} {
     # Handle multiple occurrences of double quote character
     regsub -all {\"\"} $line {_CSV:DBLQUOTE_} line

     # Remove csv quotation around cell content and replace comma within
     # cell content with special symbol
     set newline {}
     set inquote 1
     # Split on the double-quote character only; the pieces then
     # alternate between unquoted and quoted text
     foreach item [split $line "\""] {
         set inquote [expr {1 - $inquote}]

         if {$inquote} {
             # replace commas within cells by special code
             regsub -all {,} $item {_CSV:COMMA_} item
         }
         append newline $item
     }

     # Split on the remaining (real) separators and restore the
     # protected characters in each cell
     set items {}
     foreach item [split $newline ,] {
         regsub -all _CSV:DBLQUOTE_ $item "\"" item
         regsub -all _CSV:COMMA_ $item {,} item
         lappend items $item
     }

     return $items
 }

 # }}}
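
 The placeholder technique above has a few edge cases: an empty quoted
 cell written as "" is read as a doubled quote and comes back as a
 single quote character, and cell text that happens to contain
 _CSV:COMMA_ or _CSV:DBLQUOTE_ would be altered.  As a rough sketch
 only, and not the proc above, a character-by-character scan avoids the
 placeholders.  The name decode_csv_line_scan is made up for this
 example, and the code assumes Tcl 8.4 or later for the eq operator.

```tcl
# Hypothetical alternative: scan the line one character at a time and
# track whether the current position is inside a quoted cell.
proc decode_csv_line_scan {line} {
    set items {}
    set cell {}
    set inquote 0
    set len [string length $line]
    for {set i 0} {$i < $len} {incr i} {
        set ch [string index $line $i]
        if {$inquote} {
            if {$ch eq "\""} {
                if {[string index $line [expr {$i + 1}]] eq "\""} {
                    # doubled quote inside a quoted cell: one literal quote
                    append cell "\""
                    incr i
                } else {
                    # closing quote of the cell
                    set inquote 0
                }
            } else {
                # any other character, including a comma, belongs to the cell
                append cell $ch
            }
        } elseif {$ch eq "\""} {
            # opening quote of a quoted cell
            set inquote 1
        } elseif {$ch eq ","} {
            # unquoted comma: end of the current cell
            lappend items $cell
            set cell {}
        } else {
            append cell $ch
        }
    }
    # the last cell has no trailing comma
    lappend items $cell
    return $items
}

set cells [decode_csv_line_scan {a,"un,deux,trois",c,"",e,"""voici"""}]
puts [llength $cells]
```

 Like the proc above, it expects a complete record, so a cell that
 spans several physical lines has to be joined into one string before
 the call.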

 Voila!  Hope it's useful to you.
 Dobeedo.


Last modified
1999-12-10
