On Developing Common Interfaces for the Julia Ecosystem

Hi folks,

I have spent a lot of time ruminating on the idea of interfaces within the Julia ecosystem. Before I start a discussion about interfaces, let me first share my understanding of them.

Background

Working Definitions

Conceptually, I understand an interface to be “a programming structure/syntax that allows the computer to enforce certain properties on an object” (borrowing this definition from Wikipedia’s definition of an interface in the context of Object Oriented Programming). Although it comes from Object Oriented Programming, I see this definition as still applicable and usable within the context of the discussion here. Additionally, I see an interface as comprising three parts:

  1. The interface internals. This is the part where a developer spends much time thinking about what the inputs to an interface represent in a given state (whether that state is a single object or a large piece of software).

  2. The user interface. This is the part where someone uses the interface based on a previously defined purpose.

  3. The scope of an interface. This is not so much a physical piece of the interface as an internal guide that says what the interface does and does not support, as well as what it could support one day.

Please let me know if any of this background seems off, or amend my understanding in the subsequent discussion.

Observed Interface Patterns within the Julia Community

Having hung around the Julia Community for quite some time, I’ve noticed a number of high-level interface species occurring organically throughout the ecosystem. They are as follows:

  1. Foundational Interfaces. These are the interfaces baked into Julia itself, such as AbstractArray, AbstractString, etc.

  2. Common Interfaces. These are the interfaces that are not part of Julia itself but are commonly used throughout the Julia ecosystem. They become common either through proliferation or perceived need, whether across the whole Julia ecosystem or within some small niche of it. Examples of these common interfaces are the Table interface (found in Tables.jl) and the DBInterface (found in DBInterface.jl).

  3. Transient Interfaces. These are interfaces that can (often unknowingly) affect other common or foundational interfaces. An example of such an interface is an OffsetArray (found in OffsetArrays.jl).

These classifications are just based on my own observations and experiences with packages. Feel free to amend my understanding here in case this is completely off base.
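As a rough illustration of the first category, here is a minimal sketch of a type opting into the AbstractArray interface. The `Squares` type is hypothetical and made up for this post; the only real pieces are the `Base` methods it extends (`size`, `getindex`, `IndexStyle`). The point is that implementing those few methods is enough for generic functions such as `sum` and `collect` to work.

```julia
# Hypothetical example type: a read-only vector whose i-th element is i^2.
struct Squares <: AbstractVector{Int}
    n::Int
end

# The methods the AbstractArray interface asks for on a read-only array.
Base.size(s::Squares) = (s.n,)
Base.IndexStyle(::Type{Squares}) = IndexLinear()
function Base.getindex(s::Squares, i::Int)
    checkbounds(s, i)  # throws a BoundsError for out-of-range indices
    return i * i
end

s = Squares(4)

# Generic Base functions now work without any extra code.
total = sum(s)       # 1 + 4 + 9 + 16
values = collect(s)  # materializes a plain Vector{Int}
```

Nothing here is specific to `Squares`: any package that accepts an `AbstractVector` can consume it, which is what I mean by a foundational interface.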

When To Create Interfaces

With this background, the interface that I have spent the most time thinking about these days is what I call the Common Interface species, which is also the one that I refer to exclusively throughout the remainder of this post. I have long admired the Table interface and have taken small steps to implement the interface for some projects I have worked on (special thanks to @visr for once walking me through the implementation!). Additionally, the work being done in the MLJ and LearnAPI package ecosystem is, I would say, nothing short of heroic in its attempts to standardize ML interfaces.

However, something that has always puzzled me is when it is appropriate to create a common interface: not in the sense of software design implementations, but in the sense of when and how to formalize an idea as a common interface. Let me illustrate this with an example:

Example Potential Interface

I work within healthcare research. One interface that I regularly use is known as a cohort definition. This is a specification which describes the constraints and conditions of a group of medical patients; this group is known as a cohort and its associated description is known as a cohort definition. One example of a cohort definition could be “Any patient who has had at least one myocardial infarction (heart attack) over the age of 65,” which could form a possible cohort of interest to examine. This has been loosely instantiated within the health research community in JSON expressions like:

{
  "ConceptSets": [
    {
      "id": 0,
      "name": "Myocardial Infarction1",
      "expression": {
        "items": [
          {
            "concept": {
              "CONCEPT_CLASS_ID": "Clinical Finding",
              "CONCEPT_CODE": "57054005",
              "CONCEPT_ID": 312327,
              "CONCEPT_NAME": "Acute myocardial infarction",
              "DOMAIN_ID": "Condition",
              "INVALID_REASON": "V",
              "INVALID_REASON_CAPTION": "Valid",
              "STANDARD_CONCEPT": "S",
              "STANDARD_CONCEPT_CAPTION": "Standard",
              "VOCABULARY_ID": "SNOMED",
              "VALID_START_DATE": "2002-01-30",
              "VALID_END_DATE": "2099-12-30"
            },
            "includeDescendants": true
          }
        ]
      }
    }
  ],
  "PrimaryCriteria": {
    "CriteriaList": [
      {
        "ConditionOccurrence": {
          "CodesetId": 0
        }
      }
    ],
    "ObservationWindow": {
      "PriorDays": 0,
      "PostDays": 0
    },
    "PrimaryCriteriaLimit": {
      "Type": "First"
    }
  },
  "QualifiedLimit": {
    "Type": "First"
  },
  "ExpressionLimit": {
    "Type": "First"
  },
  "InclusionRules": [
    {
      "name": "Geriatric Population",
      "expression": {
        "Type": "ALL",
        "CriteriaList": [],
        "DemographicCriteriaList": [
          {
            "Age": {
              "Value": 65,
              "Op": "gte"
            }
          }
        ],
        "Groups": []
      }
    }
  ],
  "CensoringCriteria": [],
  "CollapseSettings": {
    "CollapseType": "ERA",
    "EraPad": 0
  },
  "CensorWindow": {}
}

In this case, there really is not a defined interface for working with this JSON expression broadly. In my mind, it would make sense to define some kind of Julia abstract type, maybe something like AbstractCohortDefinition, with a concrete subtype called CohortDefinition which could contain this expression as a Schema (e.g. Schema(JSON3.read(read("cohort_def.json", String))) or something to that effect). Alongside that, I would imagine implementing interface methods for reading (e.g. read), loading (e.g. load), and indexing (maybe indexconcept or something). I would say the “when” in this situation is that I want to work with, parse, and/or somehow manipulate the definition right now.
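To make that idea a bit more concrete, here is a minimal sketch of what such an interface might look like. Every name in it (`AbstractCohortDefinition`, `CohortDefinition`, `expression`, `conceptsets`, `indexconcept`) is hypothetical and does not come from an existing package. To keep the sketch self-contained, I also assume the JSON has already been parsed into a plain `Dict` and use a trimmed-down stand-in for the expression above rather than calling a JSON library.

```julia
# Hypothetical abstract type that all cohort definitions would subtype.
abstract type AbstractCohortDefinition end

# Hypothetical concrete subtype wrapping an already-parsed JSON expression.
struct CohortDefinition <: AbstractCohortDefinition
    expression::Dict{String,Any}
end

# The one function a subtype must implement; the fallback gives a clear error.
expression(def::AbstractCohortDefinition) =
    error("expression is not implemented for $(typeof(def))")
expression(def::CohortDefinition) = def.expression

# Generic functions written only against `expression`,
# so they would work for any conforming subtype.
conceptsets(def::AbstractCohortDefinition) =
    get(expression(def), "ConceptSets", Any[])

function indexconcept(def::AbstractCohortDefinition, id::Integer)
    for cs in conceptsets(def)
        cs["id"] == id && return cs
    end
    return nothing  # no concept set with that id
end

# Trimmed-down stand-in for the parsed JSON shown above.
raw = Dict{String,Any}(
    "ConceptSets" => Any[
        Dict{String,Any}("id" => 0, "name" => "Myocardial Infarction1"),
    ],
)

def = CohortDefinition(raw)
found = indexconcept(def, 0)
missing_set = indexconcept(def, 99)
```

The design choice I am assuming here is a small required surface (just `expression`) with everything else built generically on top, which is roughly how I understand interfaces like Tables.jl to be organized.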

How To Create Common Interfaces

In the last section, I was focused on the “when” but now I wonder about the “how” of creating an interface. There was some good advice from @Roger-luo in the discussion: Any guides on designing an interface? - #2 by Roger-luo, which I really agreed with, namely that one should not be afraid to refactor. While @Wikunia and I have been working on Javis.jl, the most interesting story to me concerning this post was the development of what I nicknamed the “JObjects”: basically, a more simplified interface in our Javis API to allow users faster manipulation of different shapes and geometries. These emerged over a period of trial and error, based on user feedback and how we perceived users working with Javis.

The best answer to “how” to build interfaces, it would seem, is to just try building them and refactor as necessary. But I wonder if that is really the case, or if there is a more systematic or better way to approach building interfaces?

Concluding Thoughts

What lies at the heart of the matter here are some of my personal beliefs about interfaces, namely that they:

  • Are incredibly important to a successful and thriving ecosystem. Knowing what we are all talking about can quickly remedy miscommunications or at least better pinpoint where confusion may be originating.
  • Support a diverse ecosystem. Having well-thought-out interfaces can enable a plethora of novel ideas and projects to emerge. I have to look no further than my own GitHub repos to see some projects I have developed that are built on other interfaces.
  • Reduce mental burden. Whether for developers or users, if I know you adhere to an interface, I know what to expect from you. No need to pass around a ton of metadata explaining what this data structure is or how that method should be used, etc.

Additionally, I am also driven to take interfaces seriously after being privy to multiple discussions about interfaces (I list some below for reference):

I’d love to hear people’s thoughts and experiences on when and how to build interfaces, why interfaces might be important, and any additional perspectives on what I stated above. Also, I apologize if any of this was too vague or rambling; admittedly, I am trying my best to improve my thinking around this topic and to set up future work well.

Thanks folks!

~ tcp :deciduous_tree:
