I’ve been jealous of the Python ecosystem’s incredible tooling for GenAI applications - the ability to see what’s happening under the hood, track costs, debug weird responses. So I’m excited to bring some of that to Julia!
Fun fact: You can use this combo from Julia, Python, and TypeScript, making it easier to share your work.
Logfire.jl
Logfire.jl is an unofficial SDK for Pydantic Logfire, an observability service built by the team behind Pydantic (probably one of the most popular Python packages, and one that makes Python a bit more like Julia…). Docs: Logfire - Pydantic Logfire Documentation
It provides OpenTelemetry-based observability for LLM apps with native PromptingTools.jl integration.
using Logfire, PromptingTools
Logfire.configure(service_name = "my-app")
Logfire.instrument_promptingtools!()
aigenerate("Hello!") # automatically traced with tokens, cost, latency
What I love about it:
- Flexible instrumentation - choose which models to trace, set different service names per environment (see the sketch after this list)
- Manual spans with do-block syntax for tracing any code:
with_span("my-pipeline") do # your code here end - Error tracking - API keys wrong? Provider out of credits or hit rate limits? These are such a pain to debug on remote servers. Now every error is captured with full context. No more guessing what went wrong.
On pricing: Logfire cloud has a generous free tier (hundreds of thousands of conversations/month). But you can also export to local backends like Jaeger or Langfuse - all the observability without logs leaving your machine.
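A hedged sketch of the local-export route, assuming Logfire.jl honors the standard OpenTelemetry exporter environment variable (that part is an assumption on my side; check the package README for the supported settings):

# Assumption: the SDK respects the standard OTLP env var; 4318 is the default
# OTLP/HTTP port exposed by Jaeger and most local collectors.
ENV["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4318"
Logfire.configure(service_name = "my-app-local")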
For experts: it has a SQL-powered Explore mode, rich navigation and organization features, and an easy way to copy out / download your traces (or any of their attributes).
Give it a try!
TextPrompts.jl
TextPrompts.jl stores prompts as text files with optional TOML metadata. Two perks: it makes it easier to version and evolve your prompts (use your favorite AI agent – they love to auto-optimize text-based prompts), and it catches placeholder typos before they become confusing LLM responses.
using TextPrompts
using PromptingTools: SystemMessage, UserMessage, aigenerate
prompt = load_prompt("system.txt")
sys_msg = prompt(; language = "Julia", task = "explain macros") |> SystemMessage
user_msg = UserMessage("...")
aigenerate([sys_msg, user_msg])
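For reference, the system.txt loaded above could look something like this (same TOML front-matter style as the full example further down; the exact fields and wording here are just an illustration):

---
title = "Julia Expert Assistant"
version = "1.0.0"
---
You are an expert {language} developer. Your current task: {task}.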
The power combo: Version your prompts in git, trace every call in Logfire, point your coding agent at both → continuous prompt improvement loop.
Both packages are cross-language compatible with their Python/TypeScript versions. Feedback welcome!
Example to run
# =============================================================================
# Logfire.jl + TextPrompts.jl - Simple Example
# =============================================================================
#
# This example uses a temporary environment to demonstrate Logfire with TextPrompts.
#
# In practice, prompts are saved as .txt files with TOML metadata headers.
# The metadata can include version, changelog, and notes for a coding agent
# to track what worked, what to avoid, etc. - creating a feedback loop where
# the agent improves prompts based on Logfire traces.
#
# WHY TextPrompts.format()?
# =========================
# The biggest error is unfilled placeholders or typos in variable names.
# Format catches these BEFORE the LLM call - not in a confusing response.
#
# CROSS-LANGUAGE:
# ===============
# Logfire + TextPrompts are available in Python, TypeScript, and Julia.
# Your prompts and observability setup work across all three languages!
#
# Run: julia scratch.jl
# =============================================================================
# Step 1: Temporary environment setup
# =============================================================================
using Pkg
Pkg.activate(temp = true)
Pkg.add(["Logfire", "TextPrompts", "PromptingTools", "DotEnv"])
using DotEnv
DotEnv.load!()
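# The .env file loaded above is expected to hold the provider key that
# PromptingTools reads (e.g. OPENAI_API_KEY) and, if you use the cloud
# backend, your Logfire write token. The Python SDK calls that variable
# LOGFIRE_TOKEN; check the Logfire.jl README for the exact name it expects
# (that part is an assumption).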
using Logfire
using TextPrompts
using PromptingTools
using PromptingTools: SystemMessage, UserMessage
# =============================================================================
# Step 2: Configure Logfire
# =============================================================================
Logfire.configure(service_name = "textprompts-example")
Logfire.instrument_promptingtools!()
# =============================================================================
# Step 3: Create a prompt template
# =============================================================================
# In practice, this would be a .txt file in your prompts/ directory.
# The TOML header tracks metadata - version, author, and agent notes.
prompt_content = """
---
title = "Julia Expert Assistant"
version = "1.1.0"
description = "System prompt for Julia coding assistant"
# Agent notes (for coding agents improving this prompt):
# - v1.1.0: Added "be concise" - responses were too long
# - v1.0.0: Initial version
# - TODO: Add examples of good responses
# - AVOID: Don't ask clarifying questions, just answer
---
You are an expert {language} developer.
Your specialty is {specialty}.
Be concise and practical. No preamble.
"""
# For this example, we write it to a temp file
prompt_file = tempname() * ".txt"
write(prompt_file, prompt_content)
# =============================================================================
# Step 4: Load and use the prompt
# =============================================================================
template = load_prompt(prompt_file)
println("Loaded: $(template.meta.title) v$(template.meta.version)")
println("Placeholders: $(template.placeholders)")
# Format the prompt - this validates all placeholders are filled!
system_msg = template(; language = "Julia", specialty = "DataFrames.jl") |> SystemMessage
user_msg = UserMessage("How do I filter rows where column :age > 30?")
# Call the LLM - automatically traced by Logfire
response = aigenerate([system_msg, user_msg]; model = "gpt4om")
println("\nResponse:\n", response.content)
# =============================================================================
# Step 5: See why Format matters
# =============================================================================
println("\n" * "="^50)
println("Catching placeholder typos:")
try
# Typo: 'langauge' instead of 'language'
template(; langauge = "Julia", specialty = "DataFrames.jl")
catch e
println("Error caught BEFORE LLM call: ", e.message)
end
# =============================================================================
# Step 6: API errors are also traced
# =============================================================================
println("\n" * "="^50)
println("API errors captured in Logfire:")
try
# Bad api_kwargs - this will fail at the API level
# Logfire traces the error so you can debug later!
logfire_span = with_span("api-error-example") do span
aigenerate(
"Hello";
model = "gpt4om",
api_kwargs = (; max_tookens = 100, temperatur = 0.5) # typos!
)
end
catch e
println("API error caught: ", first(string(e), 100), "...")
println("Check Logfire - the failed span shows exactly what went wrong!")
end
# =============================================================================
# Cleanup
# =============================================================================
Logfire.shutdown!()
println("\nDone! Check traces at https://logfire.pydantic.dev")
println("""
The workflow:
1. Store prompts as .txt files with metadata headers
2. Agent notes in metadata track what works/what to avoid
3. Logfire captures every LLM call
4. Point your coding agent at traces + prompts
5. Agent improves prompts, bumps version, logs changes
6. Repeat - continuous prompt improvement!
""")