Citation for definition of unit testing vs integration testing

I want to use the term integration testing in a paper I am writing (documenting the math behind a Julia package of mine in detail).

I was not trained in CS so I don’t know what the standard reference is. Is there a paper/textbook you would recommend that describes unit testing, integration testing, etc?

If the source is in the context of testing scientific software, even better, but this is not a strict requirement.

Extreme Programming Explained by Kent Beck

Unfortunately, these terms are not at all well or consistently defined. (Software Engineering is a bit bad at that: this part of “Informatics” has not inherited the desire to define everything properly, unlike actual Computer Science.) Different industries use these terms to mean vastly different things, sometimes to the point where they mean the same thing. Some companies use “integration testing” to mean “testing with all dependencies integrated into one binary (so without mocks!), together with our code (but still in isolation)”. Others take it to mean “our code and its dependencies, integrated into the final runtime environment (which may include customer-provided code, so that they can check whether they’ve integrated our stuff correctly)”. These definitions are highly context-dependent, and I think neither of them applies to your use case.

What do you take the term “integration testing” to mean? Your best bet is to look for a citation that matches your understanding of it, or to define the term for your own context.

This is a fair question, I should have clarified.

Suppose there is a calculation y = f(x), where f is about 8k LOC of Julia code (not counting dependencies), and involves numerical approximations, stochastic sampling, etc. It is quite a beast.

  • A: test that y_i \approx f(x_i) for a set of pairs (x_i, y_i), i = 1, \dots, where I more or less know the solution, or some characteristic of it. Here f is a black-box monster that spits out solutions to problems (I wish… each “spit” takes days)

  • B: test a tiny part of f, a building block, not longer than 50–100 LOC by comparing inputs and outputs and checking for invariants etc.

I am pretty sure that (B) is “unit testing”.

I thought (A) was close to “integration testing”, but as I said, I don’t know these terms very well. Suggestions welcome, especially if they come with references.
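To make the distinction between (A) and (B) concrete, here is a minimal sketch using Julia's `Test` standard library. Both `trapezoid` (the building block) and `f` are hypothetical toy stand-ins chosen for illustration, not code from the actual package; the real `f` would be far larger and slower, and its reference values would come from problems with known solutions.

```julia
using Test

# Hypothetical building block: composite trapezoid rule with n subintervals.
function trapezoid(g, a, b; n = 1_000)
    h = (b - a) / n
    s = (g(a) + g(b)) / 2
    for k in 1:(n - 1)
        s += g(a + k * h)
    end
    return h * s
end

# Hypothetical stand-in for the big calculation y = f(x).
# Analytically, the integral of 1 / (1 + t^2) from 0 to x is atan(x).
f(x) = trapezoid(t -> 1 / (1 + t^2), 0.0, x)

@testset "B: building block" begin
    # The trapezoid rule is exact (up to rounding) for linear integrands.
    @test trapezoid(t -> 2t + 1, 0.0, 1.0) ≈ 2.0
    # Invariant: swapping the integration limits flips the sign.
    @test trapezoid(sin, 0.0, 1.0) ≈ -trapezoid(sin, 1.0, 0.0)
end

@testset "A: whole calculation" begin
    # Pairs (x_i, y_i) for which the solution is known.
    known = [(0.0, 0.0), (1.0, π / 4), (sqrt(3.0), π / 3)]
    for (x, y) in known
        # Loose tolerance: f is only a numerical approximation.
        @test isapprox(f(x), y; atol = 1e-6)
    end
end
```

The tests of type (B) check exact properties and invariants of a small piece, while the tests of type (A) treat `f` as a black box and compare against known solutions with a tolerance that reflects the approximation error.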

I would consider both of these as unit testing, the “unit under test” being f. “Integration testing” really is more about how your code integrates into something external to your codebase, hence the name. It’s just the depth of that integration that varies.

As you say, these terms aren’t necessarily well-defined, but that’s definitely not what I would have in mind as part of the definition for “integration testing”.

In my own usage, “unit tests” always cover a single function, and test that the function returns the correct output for a specific input. That’s just how you verify that the implementation matches the specification (docstring) of the function. It’s the smallest unit of testing.

“Integration tests” check that multiple functions correctly work together. That is, that the units correctly integrate into something larger. Integration tests are much more nebulous, but usually are best done by running through some complete “user experience”, i.e., solving a high-level problem, and checking that everything works as expected. Depending on the project, that might involve things that are external to the codebase. Certainly, checking that something in a library is compatible with some third-party code would fall into the category of “integration test”. But I’d understand “integration” to mean “integration of atomic parts covered by unit tests”, not necessarily just “integration with external code”.

When you put it like that, that’s still a unit test, since you’re testing a single function. It doesn’t matter in principle how many lines of code f has. The question is whether f calls other functions that are (or should be) individually unit-tested.

If the “tiny part” is another Julia function, then that’s a unit test, and then (A) becomes an integration test, testing that those functions work together correctly. If the “tiny part” is just a snippet of code, then you shouldn’t be testing that at all. The smallest unit of testable code is a function.

Unless f is the highest-level function in your API (a function a user would call to solve some complete problem), you’d be doing some kind of mid-level integration testing. My recommendation would be to avoid that; it’s a form of over-testing that’s time-consuming without giving additional insight into the correctness of your code. The best bang for your buck is from combining unit testing with integration testing only at the highest level. Feeling the need for mid-level integration testing is often a code smell.

But all of that depends on circumstances, the size of the codebase, the size and organization of the development team, etc.

P.S.: I have no citation for this definition of “integration testing”. It’s just how I understand and use the term. I would be interested to see how that lines up with definitions in the formal literature.

In data engineering, unit testing a pipeline often doesn’t capture emergent correctness issues. We often depend on “end-to-end” (E2E) testing, which means feeding known inputs into the system and checking that you get the expected output.

It seems like that is basically what you’re describing for your type A tests.

Anyway, just wanted to throw out another terminology option in case that’s helpful in finding the right taxonomy to Google for.

I roughly see it like @goerz and @mrufsvold: There is a hierarchy with ~3 levels. From small to big:

  • Unit tests: Test a single functional unit. This may or may not be a single function, but in Julia it typically is, in my experience (although this gets a bit muddy when there are helper functions etc. I typically try to test only public functions, and “private” functions only when I deem it worthwhile because they are particularly complex or central to the “business logic”).
  • Integration tests: Test the interplay of two functional units. This layer may or may not exist, depending on the complexity of the overall code, imo. Usually, when codebases get larger, you find internal interfaces (in your example, perhaps the stochastic sampling code is one?). An integration test, for me, would try to test that these internal interfaces are fulfilling their promises.
  • End-to-end tests: These test the experience a “user” of your whole code would have. E.g. in DifferentialEquations.jl, this would be setting up an ODEProblem, solving it, and verifying the solution.
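As a rough illustration of the three levels, here is a small sketch built on a toy Monte Carlo integrator. All function names (`draw`, `mean_and_stderr`, `integrate_mc`) are hypothetical and made up for this example. The stochastic checks use seeded generators and a five-standard-error band, which is an assumption about what counts as “close enough”, not a universal rule.

```julia
using Test, Random

# Functional unit 1 (hypothetical): n samples from the uniform distribution on [a, b].
draw(rng, n, a, b) = a .+ (b - a) .* rand(rng, n)

# Functional unit 2 (hypothetical): sample mean and its standard error.
function mean_and_stderr(v)
    n = length(v)
    m = sum(v) / n
    se = sqrt(sum(abs2, v .- m) / (n - 1) / n)
    return m, se
end

# Top-level, user-facing function (hypothetical): Monte Carlo estimate of
# the integral of g over [a, b], together with its standard error.
function integrate_mc(g, a, b; n = 100_000, rng = Random.default_rng())
    m, se = mean_and_stderr(g.(draw(rng, n, a, b)))
    return (b - a) * m, (b - a) * se
end

@testset "unit" begin
    v = draw(MersenneTwister(1), 100, 2.0, 3.0)
    @test length(v) == 100
    @test all(x -> 2.0 <= x <= 3.0, v)
    m, se = mean_and_stderr([1.0, 2.0, 3.0])
    @test m ≈ 2.0
    @test se ≈ 1 / sqrt(3)
end

@testset "integration" begin
    # Internal interface: the sampler's output feeds the estimator, and the
    # result should be statistically consistent with the known mean 0.5.
    m, se = mean_and_stderr(draw(MersenneTwister(2), 10_000, 0.0, 1.0))
    @test se > 0
    @test abs(m - 0.5) < 5se
end

@testset "end-to-end" begin
    # What a user would do: integrate t^2 over [0, 1]; the exact value is 1/3.
    est, se = integrate_mc(t -> t^2, 0.0, 1.0; rng = MersenneTwister(3))
    @test abs(est - 1 / 3) < 5se
end
```

Passing the random number generator as an argument keeps every level reproducible, which matters once tests involve stochastic sampling.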

In industry (well, at least at my company), the current trend is moving away from religiously unit-testing everything and toward putting more emphasis on integration tests.
