(G)AWK Script for frequency count and ratio calculation
a7n9
Hello all,

I am trying to reproduce Figure 1 of http://magix.fri.uni-lj.si/blaz/papers/2004-PKDD.pdf using the Titanic data: http://hakank.org/weka/titanic.arff. I was able to get frequency counts for each combination; however, I am unable to get the ratio for particular cases.

If you look at the Titanic data, the records are laid out like this:
status, age, sex, survived
1st,adult,male,yes
1st,adult,male,yes
1st,adult,male,no

I was able to get the counts for every unique combination; however, what I want is the ratio of 1st-class passengers who survived to 1st-class passengers who did not survive. In the example above, that ratio would be 2/1 = 2.

Here's the code I have written so far. It is meant to be generic for any dataset and any class value (in this case "yes", meaning survived):
#!/usr/bin/gawk -f
BEGIN {
    FS = OFS = ",";
    Fields = 4;
    Flds2use = 1;
    #PredVar = 4;
    ClassVal = "yes";
    IGNORECASE = 1;
}

### patterns1: skip blanks and comments
{sub(/\%.*/,"")} ;
/^[ \t]*$/ {next};
/@/ {next};

#\s(\w*)/
# /("[^"]*")|('[^\r]*)(\r\n)?/

{
    #Records++;
    Last[$NF]++;
    #Data[Records,NF]=$NF;
    for (i = 1; i <= NF-1; i++)
        freq[$i$NF]++;
    #Data[Records,i]=$i;
    #prob[$i$NF]=freq[$i$NF]/Last[$NF];
}

END {
    for (class in Last)
        if (class != ClassVal) sum += Last[class];
    UnCondProb = Last[ClassVal] / sum;
    #for (f in freq)

    #print UnCondProb;
    for (word in freq) {
        #print word, freq[word]
        print (word ~ ClassVal)
        if (word ~ ClassVal) {
            print word, freq[word]
            Num[word] = freq[word]
            print Num[word]
        } else {
            for (class in Last)
                Denom[word] = freq[word]
        }
    }
    print word, Num[word], Denom[word], Num[word] / Denom[word]
}
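
One likely reason the ratio never comes out: the keys in freq are concatenations such as "1styes" and "1stno", so Num and Denom are never filled in under the same key, and the final print sits outside the loop. A minimal sketch of one way around this is to key both counts on the attribute value alone. It assumes the class label is the last field, that column 1 (status) is the attribute of interest, and that "yes" is the positive class; the function name class_ratio and the "NA" marker for a zero denominator are made up for illustration and are not taken from the paper.

```shell
#!/bin/sh
# class_ratio: hypothetical helper that prints, for each value of one column,
#   value,positive_count,negative_count,positive/negative
# Assumptions: comma-separated input, class label in the last field,
# column 1 is the attribute of interest, "yes" is the positive class.
class_ratio() {
    awk -F, -v OFS=, -v col=1 -v pos=yes '
        { sub(/%.*/, "") }                # strip ARFF comments
        /^[ \t]*$/ || /^@/ { next }       # skip blank lines and ARFF header lines
        {
            key = $col
            seen[key] = 1
            if (tolower($NF) == tolower(pos))
                yes[key]++                # e.g. 1st class, survived
            else
                no[key]++                 # e.g. 1st class, did not survive
        }
        END {
            for (key in seen) {
                if (no[key] > 0)
                    print key, yes[key] + 0, no[key] + 0, yes[key] / no[key]
                else
                    print key, yes[key] + 0, 0, "NA"   # ratio undefined
            }
        }
    ' "$@"
}
```

Usage would be along the lines of class_ratio titanic.arff after sourcing the function. For the three sample rows shown earlier it should print 1st,2,1,2. The order of the output lines is whatever awk's for-in loop produces, so pipe through sort if a fixed order matters.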
a7n9
I have also posted this at Odesk. If someone wants to make some quick money, he or she can complete this project there: http://www.odesk.com/jobs/AWK-and-LaTex-Sc...7e0e5ad6dfed9e6

Thanks