Saturday, September 23, 2006

Learning Python

I just started picking up Python.

I am sure everyone who took COSC 2320 with Dr. Anderson remembers the word count program. Well, after going through some of the Python tutorials, here is the same program in Python.

Not only is it smaller, but it's also easier to understand. Hopefully there is more to follow.


import re
import string

#dictionary to store words and their counts
word_count = {}

#delimiters: whitespace, punctuation, digits
#punctuation is escaped so characters like ] and \ are taken literally inside the brackets
delimiters = "[" + string.whitespace + re.escape(string.punctuation) + string.digits + "]"

#read in text document line by line
with open("trial.txt") as f:
    for line in f:

        #remove leading and trailing whitespace
        line = line.strip()

        #split the string into words
        #based on whitespace, punctuation, digits
        for word in re.split(delimiters, line):

            #make the word lower case
            word = word.lower()

            #check if it is actually a word
            if re.match("^[" + string.ascii_lowercase + "]+$", word):

                #increment count if true
                if word in word_count:
                    word_count[word] += 1

                #else add entry
                else:
                    word_count[word] = 1

#print each word with its count
for w in word_count:
    print("%s : %d" % (w, word_count[w]))
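
For trying the counting logic on a short string instead of a file, the loop can be wrapped in a function. This is only a rough sketch: count_words and the sample sentence are names I made up for illustration, and it swaps the split-then-check steps for re.findall plus dict.get. For plain ASCII text it should give the same counts.

```python
import re

def count_words(text):
    """Return a dict mapping each lower-cased word in text to its count.

    count_words is a hypothetical helper, not part of the original script.
    A "word" here is a run of the letters a-z after lower-casing.
    """
    counts = {}
    # findall returns every maximal run of a-z in the lower-cased text
    for word in re.findall("[a-z]+", text.lower()):
        # dict.get supplies 0 the first time a word is seen
        counts[word] = counts.get(word, 0) + 1
    return counts

if __name__ == "__main__":
    sample = "The cat saw the other cat. 2 cats!"
    for w, n in sorted(count_words(sample).items()):
        print("%s : %d" % (w, n))
```

dict.get with a default of 0 folds the "increment if present, else add" branches into one line, which is why the body is shorter than the if/else version.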



Edit: Some more playing around


import re
import string

word_count = {}

#read the whole file into one string
with open("trial.txt") as f:
    text = f.read()

#list of words delimited by whitespace, punctuation and digits
#lower case all the words in the text before splitting
#punctuation is escaped so characters like ] and \ are taken literally inside the brackets
delimiters = "[" + string.whitespace + re.escape(string.punctuation) + string.digits + "]"
words = re.split(delimiters, text.lower())

#go through the list, including the last entry
for word in words:

    #as long as the word in the list is a word and is not already a key
    if re.match("^[" + string.ascii_lowercase + "]+$", word) and word not in word_count:

        #add to the dictionary and get the count from the list
        word_count[word] = words.count(word)

#print each word with its count
for w in word_count:
    print("%s : %d" % (w, word_count[w]))
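
One thing worth noting about this second version: words.count() walks the whole list again for every distinct word, so it slows down on a large file. A single pass avoids that. Below is a minimal sketch using collections.Counter from the standard library (it arrived in Python 2.7, so it was not an option when this post was written). The helper name top_words and the sample string are assumptions for illustration only.

```python
import re
from collections import Counter

def top_words(text, n=3):
    """Return the n most frequent words in text as (word, count) pairs.

    top_words is a hypothetical helper, not part of the original script.
    """
    # one pass over the text: Counter tallies each word as it is seen
    counts = Counter(re.findall("[a-z]+", text.lower()))
    # most_common sorts by count, highest first
    return counts.most_common(n)

if __name__ == "__main__":
    sample = "a b a c a b"
    for word, count in top_words(sample):
        print("%s : %d" % (word, count))
```

Counter is a dict subclass, so the printing loop from the script works on it unchanged. most_common is just a convenient way to see the frequent words first.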

1 comment:

Anonymous said...

Thanks, Kashif. Just learning Python and surfed the internet for a word count program until I finally found your excellent script. best, la