Hi!
I am trying to build a decision tree using my own implementation of ID3, but I am having problems with one of my functions.
For example, I have the following data frame:
julia> df
14×6 DataFrame
 Row │ pa       as       ic       aa       oa       af
     │ String7  String7  String7  String3  String3  String3
─────┼──────────────────────────────────────────────────────
   1 │ alta     alto     alto     no       no       si
   2 │ alta     alto     alto     si       no       si
   3 │ baja     alto     bajo     no       no       si
   4 │ media    alto     alto     no       si       no
   5 │ media    bajo     alto     si       si       no
   6 │ baja     bajo     alto     si       si       si
   7 │ alta     bajo     alto     si       no       si
   8 │ alta     bajo     bajo     no       si       si
   9 │ alta     alto     bajo     si       si       no
  10 │ baja     bajo     alto     si       si       si
  11 │ media    bajo     bajo     si       si       si
  12 │ alta     bajo     alto     si       si       no
  13 │ baja     alto     alto     si       si       si
  14 │ baja     alto     bajo     no       no       si
The class is af. In my code I wrote a function that “creates” a tree; it is not done properly, but it gets the job done for my purposes.
First, I load the database file:
julia> df = dt.read_database("data/administar_farmaco.csv"; dropcols=[:n])
Then I can just generate the tree as:
julia> tree = dt.tree("af", df)
4-element Vector{Any}:
"pa"
Any[InlineStrings.String7("alta"), "oa", Any[InlineStrings.String3("no"), InlineStrings.String3["si"]], Any[InlineStrings.String3("si"), "aa", Any[InlineStrings.String3("no"), InlineStrings.String3["si"]], Any[InlineStrings.String3("si"), InlineStrings.String3["no"]]]]
Any[InlineStrings.String7("baja"), InlineStrings.String3["si"]]
Any[InlineStrings.String7("media"), "ic", Any[InlineStrings.String7("alto"), InlineStrings.String3["no"]], Any[InlineStrings.String7("bajo"), InlineStrings.String3["si"]]]
I was able to write a function that takes that output, tree, and parses it:
julia> dt.preetyprint(tree)
pa
alta
oa
no si
si
aa
no si
si no
baja si
media
ic
alto no
bajo si
This is easier to read if I format it a little bit by hand:
pa
|--- alta
|    |--- oa
|         |--- no --- si
|         |--- si
|              |--- aa
|                   |--- no --- si
|                   |--- si --- no
|--- baja --- si
|--- media
     |--- ic
          |--- alto --- no
          |--- bajo --- si
It’s not pretty haha but it gets the job done to allow me to visualize the decision tree.
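In case it helps to see the idea, the hand formatting could probably be automated with something like the sketch below. It is not the dt.preetyprint from my file: print_branches is a hypothetical name, the indentation scheme is my own choice, and it assumes the nested-vector layout shown above, where a node is an attribute name followed by its branches and a leaf is a one-element vector holding the class. Plain String values stand in for the InlineStrings ones.

```julia
# Hypothetical helper: prints a tree stored as Any[attribute, branch1, branch2, ...],
# where each branch is Any[value, leaf_vector] or Any[value, attribute, branches...].
function print_branches(io::IO, node::AbstractVector, indent::AbstractString = "")
    println(io, indent, node[1])                 # attribute name of this node
    for branch in node[2:end]
        value, rest = branch[1], branch[2:end]
        if length(rest) == 1 && rest[1] isa AbstractVector
            # Leaf: a one-element vector holding the class label.
            println(io, indent, "|--- ", value, " --- ", first(rest[1]))
        else
            # Internal node: `rest` is again [attribute, branches...].
            println(io, indent, "|--- ", value)
            print_branches(io, rest, indent * "     ")
        end
    end
end

# Convenience method that prints to the terminal.
print_branches(node::AbstractVector) = print_branches(stdout, node)

# Same structure as the output of dt.tree("af", df), written with plain Strings.
tree = Any["pa",
    Any["alta", "oa",
        Any["no", ["si"]],
        Any["si", "aa", Any["no", ["si"]], Any["si", ["no"]]]],
    Any["baja", ["si"]],
    Any["media", "ic", Any["alto", ["no"]], Any["bajo", ["si"]]]]

print_branches(tree)
```

Passing the IO explicitly means the same function can write to a file or to a string through sprint.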
The problem is that what I wanted in the first place was a dictionary in which each entry is itself another dictionary.
I found something similar to what I want to achieve in this Kaggle notebook written in Python.
The owner of that notebook managed to make each key of each dictionary a node of the tree I showed above. The image attached below shows what I mean.
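To make the target concrete, this is roughly the shape I have in mind for the tree above, written out by hand. It is only a sketch: the layout Dict(attribute => Dict(value => subtree or class)) is my assumption about what the notebook does, and classify is a hypothetical helper I added just to show how such a dictionary would be walked.

```julia
# Assumed target shape: each node is a one-key Dict(attribute => branches),
# and each branch maps an attribute value to a subtree or to a class label.
target = Dict("pa" => Dict(
    "alta"  => Dict("oa" => Dict(
        "no" => "si",
        "si" => Dict("aa" => Dict("no" => "si", "si" => "no")))),
    "baja"  => "si",
    "media" => Dict("ic" => Dict("alto" => "no", "bajo" => "si")),
))

# Hypothetical helper: follow the branches selected by `row` until a label is reached.
function classify(tree, row)
    tree isa AbstractDict || return tree     # reached a class label
    attribute, branches = first(tree)        # each node has exactly one key
    return classify(branches[row[attribute]], row)
end

# Row 8 of the table (pa = alta, oa = si, aa = no), whose class af is si.
row = Dict("pa" => "alta", "oa" => "si", "aa" => "no")
println(classify(target, row))
```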
Here is my code attached: DecisionTrees.jl (5.4 KB)
Currently, I have a function that attempts to create a dictionary of dictionaries but… well, I couldn’t figure it out. I called that function tree_dict for lack of a better name.
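One route I can imagine is converting the nested vector after the fact instead of building the dictionary during the ID3 recursion. The sketch below is not the tree_dict from my file: nested_to_dict is a hypothetical name, and it assumes the nested-vector layout shown above (attribute name followed by branches, with each leaf a one-element vector holding the class). I use plain String values here in place of the InlineStrings ones.

```julia
# Hypothetical converter: turns Any[attribute, branch1, branch2, ...] into
# Dict(attribute => Dict(value => subtree_or_label, ...)).
function nested_to_dict(node::AbstractVector)
    attribute = String(node[1])
    branches = Dict{String,Any}()
    for branch in node[2:end]
        value = String(branch[1])
        rest = branch[2:end]
        if length(rest) == 1 && rest[1] isa AbstractVector
            # Leaf: a one-element vector holding the class label.
            branches[value] = String(first(rest[1]))
        else
            # Internal node: `rest` is again [attribute, branches...].
            branches[value] = nested_to_dict(rest)
        end
    end
    return Dict{String,Any}(attribute => branches)
end

# Same structure as the output of dt.tree("af", df), written with plain Strings.
tree = Any["pa",
    Any["alta", "oa",
        Any["no", ["si"]],
        Any["si", "aa", Any["no", ["si"]], Any["si", ["no"]]]],
    Any["baja", ["si"]],
    Any["media", "ic", Any["alto", ["no"]], Any["bajo", ["si"]]]]

d = nested_to_dict(tree)
println(d["pa"]["media"]["ic"])
```

This only reshapes an existing result, so it does not answer how to produce the dictionary directly inside the recursive function.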
Can anyone point me in the right direction on how I could implement this with my current code?