Linux Help
guides forums blogs
Home Desktops Distributions ISO Images Logos Newbies Reviews Software Support & Resources Linuxhelp Wiki

Welcome Guest ( Log In | Register )



Advanced DNS Management
New ZoneEdit. New Managment.

FREE DNS Is Back

Sign Up Now

a7n9
Posted on: Apr 3 2008, 08:48 AM


Whats this Lie-nix Thing?
*

Group: Members
Posts: 2
Joined: 2-April 08
Member No.: 13,413


I have posted this at Odesk, if someone wants to make some quick money, he or she can complete this project at Odesk: http://www.odesk.com/jobs/AWK-and-LaTex-Sc...7e0e5ad6dfed9e6

Thanks
  Forum: Programming in Linux · Post Preview: #30131 · Replies: 1 · Views: 6,030

a7n9
Posted on: Apr 2 2008, 03:27 PM


Whats this Lie-nix Thing?
*

Group: Members
Posts: 2
Joined: 2-April 08
Member No.: 13,413


Hello all,

I am trying to reproduce a graph given in http://magix.fri.uni-lj.si/blaz/papers/2004-PKDD.pdf that is Figure 1. Using the titanic data: http://hakank.org/weka/titanic.arff, I was able to get frequency counts for each combination; however, I am unable to get the ratio for particular cases.

If you look at the titanic data, the data are laid like these:
status, age, sex, survived
1st,adult,male,yes
1st,adult,male,yes
1st,adult,male,no

I was able to get the counts for all the unique combinations for all the cases; however, I want to get the ratio of 1st class survived to 1st class did not survive. In the given example, that ratio would be 2/1 = 2.

Here's the code that I have written so far to make it generic for any dataset and any variable value (in this case it is "yes" survived):
#!/usr/bin/gawk -f
BEGIN {
FS = OFS = ",";
Fields = 4;
Flds2use = 1;
#PredVar = 4;
ClassVal = "yes";
IGNORECASE =1;
}

### patterns1: skip blanks and comments
{sub(/\%.*/,"")} ;
/^[ \t]*$/ {next};
/@/ {next};




#\s(\w*)/

# /("[^"]*")|('[^\r]*)(\r\n)?/


{ #Records++;
Last[$NF]++;
#Data[Records,NF]=$NF;
for (i = 1; i <= NF-1; i++)
freq[$i$NF]++;
#Data[Records,i]=$i;
#prob[$i$NF]=freq[$i$NF]/Last[$NF];
}

END {
for (class in Last)
if (class != ClassVal) sum+=Last[class];
UnCondProb=Last[ClassVal]/sum;
#for (f in freq)

#print UnCondProb;
for (word in freq) {
#print word, freq[word]
print ( word ~ ClassVal)
if ( word ~ ClassVal) {
print word, freq[word]
Num[word]=freq[word]
print Num[word]}
else {
for (class in Last)
Denom[word]= freq[word]}

}
print word, Num[word],Denom[word],Num[word]/Denom[word]


}
  Forum: Programming in Linux · Post Preview: #30130 · Replies: 1 · Views: 6,030


New Posts  New Replies
No New Posts  No New Replies
Hot topic  Hot Topic (New)
No new  Hot Topic (No New)
Poll  Poll (New)
No new votes  Poll (No New)
Closed  Locked Topic
Moved  Moved Topic
 

RSS Lo-Fi Version Time is now: 21st October 2017 - 03:51 AM