An interesting Python csv trick

A ‘typical’ csv file has a header row followed by uniform rows of data (say, username and password columns), which means you could nicely parse and use it in Python along the lines of

import csv

# p and self are defined by the surrounding test code
with open("my.csv", newline="") as f:
    for row in csv.DictReader(f):
        p.username = row['username']
        p.password = row['password']
        self.verify_equal(p.error, "invalid login!")
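For a version you can run without any test harness, here is a minimal sketch that parses a uniform csv held in memory. The sample rows and the `read_logins` helper are made up for illustration; only the `username` and `password` column names come from the snippet above.

```python
import csv
import io

# Hypothetical sample data: a header row followed by uniform rows.
SAMPLE = "username,password\nalice,secret1\nbob,secret2\n"


def read_logins(text):
    """Return one dict per data row, keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


logins = read_logins(SAMPLE)
for login in logins:
    print(login["username"], login["password"])
```

With no fieldnames given, DictReader takes the keys from the first line of the file, which is why this only works when the file has a header row.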

But what if your input is a csv file that you have absolutely no control over and that isn’t nice and uniform in its composition? Now you need to be clever. So where would you find such a csv? Well, how about an airline that lists origin airport codes in the first column and fills the rest of the columns with the airport codes those flights can go to? Since not every origin serves the same number of destinations, this will be a very ragged-edged csv. (And having seen just this, it is.) Oh, and there are no column headers.

Actually, the lack of column headers is a good thing in this case.

If you read the docs on DictReader, there are two optional parameters that help here.

  • fieldnames – a list of headers to use, in column order
  • restkey – the key under which any extra values in a row (beyond the named fields) get glommed together as a list
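To see what those two parameters actually produce, here is a small sketch run against in-memory data. The airport codes and the `RAGGED` name are invented for illustration and are not taken from any real airline file.

```python
import csv
import io

# Hypothetical ragged data: origin first, then however many destinations.
RAGGED = "YYZ,YVR,YUL,YOW\nYVR,YYZ\nYXE\n"

reader = csv.DictReader(
    io.StringIO(RAGGED),
    fieldnames=["origin"],
    restkey="destinations",  # a plain string, used as the dict key
)
rows = [dict(row) for row in reader]
for row in rows:
    print(row)
```

Note that a row holding only an origin gets no `destinations` key at all, so code that reads it should be prepared for the key to be missing.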

Armed with this, the parsing of the file and later usage look something like this

import csv
import random

with open("codes.csv", newline="") as f:
    csv_content = csv.DictReader(f, fieldnames=["origin"], restkey="destinations")
    for row in csv_content:
        p.origin = row['origin']
        # restkey must be a string; the leftover columns arrive as a list
        p.destination = random.choice(row['destinations'])
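If the file might contain origins with no destinations, or rows with trailing commas, a slightly more defensive variant may be worth having. The sketch below is one possible approach, not part of the original trick: the `pick_routes` helper, its `rng` parameter, and the sample codes are all hypothetical.

```python
import csv
import io
import random


def pick_routes(text, rng=random):
    """Yield (origin, destination) pairs, skipping origins with no destinations."""
    reader = csv.DictReader(
        io.StringIO(text), fieldnames=["origin"], restkey="destinations"
    )
    for row in reader:
        # The restkey entry is absent when a row has only the origin column;
        # empty strings (from trailing commas) are filtered out as well.
        destinations = [d for d in row.get("destinations", []) if d]
        if destinations:
            yield row["origin"], rng.choice(destinations)


routes = list(pick_routes("YYZ,YVR,YUL\nYXE\nYVR,YYZ\n"))
for origin, destination in routes:
    print(origin, "->", destination)
```

Passing a seeded `random.Random` instance as `rng` makes the choice repeatable, which can be handy when the result feeds a test.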

You likely won’t need this very often, but if you do, here you go.

Comments 1

  1. Abhijeet Shukla wrote:

    Hey Adam,
    Not sure if you are still maintaining this blog, but
    shouldn’t restkey=["destinations"] actually be restkey="destinations"? If a row has multiple extra values/columns (destinations in your case), restkey collects them into a list, so the key itself should be "destinations",
    not ["destinations"].

    Posted 26 Feb 2018 at 4:47 am
