Map data from one dictionary into another

I have data from two APIs in JSON format and parsing them into 2 dictionaries (d1, d2).

I need to derive two additional fields FieldName(from d2), attrVal from d1() by mapping subAttrDefId present in d1(“attrValues”) with subAttribDefId present in d2.

I have tried below code, but it’s giving me “BoundsError”

Link to question posted in Stack overflow.

I’m not sure I understand what you’re looking for, but you could try this way

json2 = JSON3.read(API2)
df2=DataFrame(json2)

json1 = JSON3.read(API1)
df1=DataFrame(json1[1][:attrValues])

select(innerjoin(df1,df2, on=:subAttrDefId=>:subAttribDefId, makeunique=true), :subAttrDefId, :FieldName,:attrVal)

There is a delete! statement in the code which is deleting elements from the vector being iterated on. This can lead to strange results. Can you try without the delete! (note that the for loop will go through each element once, there is no fear of repeating the processing of an element).

@rocco_sprmnt21 Thanks for replying! Yes, I am able to create df’s but data will/should look something like below. Please let me know it’s still confusing.


df1:

Row │ subDefId  DefVersionId  attrValues
1   │ 4143      4142          JSON3.Object[{\n   "subAttr…
2   │ 4144      4145          JSON3.Object[{\n   "subAttr…
3   │ 4145      4146          JSON3.Object[{\n   "subAttr…

julia> **df1.attrValues**(Displaying for one row only)
1-element Vector{JSON3.Array{JSON3.Object, Base.CodeUnits{UInt8, String}, SubArray{UInt64, 1, Vector{UInt64}, Tuple{UnitRange{Int64}}, true}}}:
 [{
   "subAttrValId": 4144, 
   "subDefId": 4143, 
   "subAttrDefId": 3888, 
   "attrVal": "200"
},
{
   "subAttrValId": 4145, 
   "subDefId": 4144, 
   "subAttrDefId": 3889, 
   "attrVal": "201"
}
{
   "subAttrValId": 4146, 
   "subDefId": 4149, 
   "subAttrDefId": 3890, 
   "attrVal": "202"
}
]

df2:

 Row │ subAttribDefId  attribName              attribDataType  attrFormat
   1 │              1  numOfbucket              number          ^[\\d]*$
   2 │              2  fbucket                  number          ^[\\d]*$
   3 │              3  jbucket                  number          ^[\\d]*$
   4 │              4  obucket                  number          ^[\\d]*$
   5 │              5  yPlusbucket              number          ^[\\d]*$

I would like to create new columns for each row in df1 with column names as df2.attribName by joining the (df1.attrValues).subAttrDefId = df2.subAttrDefId and column values will be (df1.attrValues).attrVal

Yes, I tried but no luck.

This is what I did using the data you provided and based on what I understand.
Indicate if there is something to change about this idea or if it is completely out of the way, explain the table or data structure you want to obtain on this data.

julia> API1="""[
           {
              "DefId":4327,
              "DefVersionId":4326,
              "attrValues":[
                 {
                    "subAttrValId":4328,
                    "DefId":4327,
                    "subAttrDefId":3888, 
                    "attrVal":"276",
                    "crtUserid":"3456789",
                    "crtTmstmp":null,
                    "updtUserid":null,
                    "updtTmstmp":"2023-01-10T14:08:36.000+00:00"
                 },
                 {
                    "subAttrValId":4329,
                    "DefId":4327,
                    "subAttrDefId":3889,
                    "attrVal":"52",
                    "crtUserid":"3456789",
                    "crtTmstmp":null,
                    "updtUserid":null,
                    "updtTmstmp":"2023-01-10T14:08:36.000+00:00"
                 },
                 {
                    "subAttrValId":4330,
                    "DefId":4327,
                    "subAttrDefId":3890,
                    "attrVal":"50",
                    "crtUserid":"3456789",
                    "crtTmstmp":null,
                    "updtUserid":null,
                    "updtTmstmp":"2023-01-10T14:08:36.000+00:00"
                 },
                 {
                    "subAttrValId":4331,
                    "DefId":4327,
                    "subAttrDefId":3891,
                    "attrVal":"24",
                    "crtUserid":"3456789",
                    "crtTmstmp":null,
                    "updtUserid":null,
                    "updtTmstmp":"2023-01-10T14:08:36.000+00:00"
                 }
        ],
              "code":"AB",
              "typeCode":"00",
              "crtUserid":"12345",
              "crtTmstmp":null,
              "updtUserid":"23456",
              "updtTmstmp":"2023-01-10T14:08:36.000+00:00"
           }
        ]"""
"[\n   {\n      \"DefId\":4327,\n      \"DefVersionId\":4326,\n      \"attrValues\":[\n         {\n            \"subAttrValId\":4328,\n            \"DefId\":4327,\n            \"subAttrDefId\":3888, \n            \"attrVal\":\"276\",\n            \"crtUseri" ⋯ 984 bytes ⋯ "mp\":\"2023-01-10T14:08:36.000+00:00\"\n         }\n],\n      \"code\":\"AB\",\n      \"typeCode\":\"00\",\n      \"crtUserid\":\"12345\",\n      \"crtTmstmp\":null,\n      \"updtUserid\":\"23456\",\n      \"updtTmstmp\":\"2023-01-10T14:08:36.000+00:00\"\n   }\n]"      

julia>  API2="""[
            {
               "subAttribDefId":3886,
               "displayLabel":"string",
               "attribDesc":"string",
               "attribName":"string",
               "attribDataType":"string",
               "attrFormat":"string",
               "crtUserid":"string",
               "crtTmstmp":"2022-12-21T19:07:14.000+00:00",
               "updtUserid":"string",
               "updtTmstmp":"2022-12-21T19:07:14.000+00:00",        
               "FieldName":null,
               "category":null,
               "subcategory":null
            },
            {
               "subAttribDefId":3887,
               "displayLabel":"string",
               "attribDesc":"string",
               "attribName":"string",
               "attribDataType":"string",
               "attrFormat":"string",
               "crtUserid":"string",
               "crtTmstmp":"2022-12-21T19:08:18.000+00:00",
               "updtUserid":"string",
               "updtTmstmp":"2022-12-21T19:08:18.000+00:00",        
               "FieldName":null,
               "subAttribDefId":3888,
               "displayLabel":"Number of bucket",
               "attribDesc":"Number of bucket",
               "attribName":"numbuckets",
               "attribDataType":"number",
               "attrFormat":"",
               "crtUserid":"12345",
               "crtTmstmp":"2022-12-21T19:08:56.000+00:00",
               "updtUserid":null,
               "updtTmstmp":"2022-12-21T19:08:56.000+00:00",
               "FieldName":null,
               "category":null,
               "subcategory":null
            },
            {
               "subAttribDefId":3889,
               "displayLabel":"Cabin",
               "attribDesc":"class Cabin",
               "attribName":"aCabin",
               "attribDataType":"number",
               "attrFormat":"",
               "crtUserid":"1234",
               "crtTmstmp":"2022-12-21T19:08:56.000+00:00",
               "updtUserid":null,
               "updtTmstmp":"2022-12-21T19:08:56.000+00:00",
               "FieldName":null,
               "category":null,
               "subcategory":null
            }
            ]"""
"[\n{\n   \"subAttribDefId\":3886,\n   \"displayLabel\":\"string\",\n   \"attribDesc\":\"string\",\n   \"attribName\":\"string\",\n   \"attribDataType\":\"string\",\n   \"attrFormat\":\"string\",\n   \"crtUserid\":\"string\",\n   \"crtTmstmp\":\"2022-12-21T19:07:14.0" ⋯ 1063 bytes ⋯ "   \"attrFormat\":\"\",\n   \"crtUserid\":\"1234\",\n   \"crtTmstmp\":\"2022-12-21T19:08:56.000+00:00\",\n   \"updtUserid\":null,\n   \"updtTmstmp\":\"2022-12-21T19:08:56.000+00:00\",\n   \"FieldName\":null,\n   \"category\":null,\n   \"subcategory\":null\n}\n]"
julia> using DataFrames,  JSON3
julia> json2 = JSON3.read(API2);

julia> df2=DataFrame(json2)
4×13 DataFrame
 Row │ subAttribDefId  displayLabel      attribDesc        attribName  attribDataType  attrFormat  crtUserid  crtTmstmp      ⋯
     │ Int64           String            String            String      String          String      String     String         ⋯
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │           3886  string            string            string      string          string      string     2022-12-21T19: ⋯
   2 │           3887  string            string            string      string          string      string     2022-12-21T19:  
   3 │           3888  Number of bucket  Number of bucket  numbuckets  number                      12345      2022-12-21T19:  
   4 │           3889  Cabin             class Cabin       aCabin      number                      1234       2022-12-21T19:  
                                                                                                             6 columns omitted

julia> json1 = JSON3.read(API1);

julia> df1=DataFrame(json1[1][:attrValues])
4×8 DataFrame
 Row │ subAttrValId  DefId  subAttrDefId  attrVal  crtUserid  crtTmstmp  updtUserid  updtTmstmp
     │ Int64         Int64  Int64         String   String     Nothing    Nothing     String
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │         4328   4327          3888  276      3456789                           2023-01-10T14:08:36.000+00:00
   2 │         4329   4327          3889  52       3456789                           2023-01-10T14:08:36.000+00:00
   3 │         4330   4327          3890  50       3456789                           2023-01-10T14:08:36.000+00:00
   4 │         4331   4327          3891  24       3456789                           2023-01-10T14:08:36.000+00:00

julia> select(innerjoin(df1,df2, on=:subAttrDefId=>:subAttribDefId, makeunique=true), :subAttrDefId, :FieldName,:attrVal)     
2×3 DataFrame
 Row │ subAttrDefId  FieldName  attrVal 
     │ Int64         Nothing    String
─────┼──────────────────────────────────
   1 │         3888             276
   2 │         3889             52