Parameter Fitting

Note

Parameter fitting requires the retrieval extras:

pip install gasfir[retrieval]

All fitting-related imports come from gasfir.fitting or gasfir.retrieval, not from the top-level gasfir namespace.

Results are saved to <medium_name>/ by default (JSON, HDF5 chain, corner plot, trace plot, LaTeX summary).

Automated Pipeline (recommended)

retrieve() runs the full CMA-ES → least-squares → emcee pipeline. After each phase it prints progress and a fit report using lmfit.fit_report().

Known medium — initial guess and bounds from the stored reference:

import pandas as pd
from gasfir import create_pulse
from gasfir.retrieval import RetrievalConfig, retrieve

data = pd.DataFrame({
    "pulses": [create_pulse(800, I, 0, 6) for I in intensities],
    "Y":      measured_probabilities,
})

cfg = RetrievalConfig(
    medium_name="Hydrogen_SFA",   # output saved to ./Hydrogen_SFA/
    emcee_nsteps=3000,
    latex_mapping={"E_g": r"$E_g$ [a.u.]", "a1": r"$a_1$"},
)
result = retrieve(data_NA=data, config=cfg)

After the LS phase the pipeline automatically prints a Nσ comparison of fitted values against stored reference parameters. After emcee it saves:

<name>_publication_corner.pdf
<name>_trace.pdf
<name>_fit_summary.tex
<name>_retrieval_result.json

Atom cold start — E_g looked up from the NIST atomic database:

# No initial_params needed; E_g resolved automatically for known elements
cfg = RetrievalConfig(medium_name="Kr", cma_maxiter=5000)
result = retrieve(data_NA=data, config=cfg)

Crystal cold start — E_g must always be supplied for crystals:

cfg = RetrievalConfig(
    medium_name="MyNewCrystal",
    initial_params={"E_g": 0.30},   # a.u. — only required field
    is_crystal=True,
    ret_electron_density=True,       # Y is electron density, not P
    cma_maxiter=5000,
)
result = retrieve(data_NA=data, config=cfg)

RetrievalConfig key options

Option	Default	Description
`medium_name`	`"unknown"`	Used for output file names and stored-parameter lookup
`output_dir`	`None`	Defaults to `./medium_name/`
`initial_params`	`None`	Dict of starting values; overrides stored params; only `E_g` is required for cold start
`fixed_params`	`None`	Names to hold fixed (default: `["E_g"]` for atoms, `["a4"]` for crystals)
`is_crystal`	`False`	Enables `m_eff` as free parameter and crystal cold-start bounds
`ret_electron_density`	`False`	Fit electron density instead of ionisation probability
`relative_bound`	`5.0`	Search range = `[val − 5\|val\|, val + 5\|val\|]`
`global_method`	`"cma"`	`"cma"` or `"differential_evolution"`
`cma_maxiter`	`10000`	CMA-ES evaluation budget (plateau-stopped early when flat)
`emcee_nsteps`	`3000`	MCMC steps per walker
`latex_mapping`	`None`	`{name: latex_string}` for axis labels in plots and .tex output

Post-processing an existing chain

Reload a chain and apply transformations (drop/derive variables, re-snap):

from gasfir.retrieval import post_process_mcmc

post_process_mcmc(
    medium_name="Diamond_TBmBJ",
    output_dir="Diamond_TBmBJ",
    original_stored_params={"E_g": 0.235, "a1": 14, "a2": 1.7,
                             "a3": -3.0, "a0_per_au3": 0.35,
                             "m_eff": 0.115, "lattice_constant_au": 6.74},
    drop_vars=["a0"],
    derived_exprs={"a0_per_au3": "a0 / 6.74**3"},
    latex_mapping={"E_g": r"$E_g$ [a.u.]", "m_eff": r"$m_\mathrm{eff}$"},
    is_crystal=True,
)

Simultaneous Fit to QS and Non-Adiabatic Data

Pass data_QS to concatenate quasi-static residuals:

from gasfir.retrieval import RetrievalConfig, retrieve
import numpy as np

qs_data = pd.DataFrame({
    "field": field_amplitudes,
    "Y":     np.log(qs_rates),
})

cfg = RetrievalConfig(medium_name="Hydrogen_SFA", simultaneous=True)
result = retrieve(data_NA=data, data_QS=qs_data, config=cfg)

Low-level: residual function + manual lmfit

For custom optimisers or more control:

import lmfit
from gasfir.fitting import ret_residual_function

residual = ret_residual_function(data, uncertainty_Nadiabatic=0.05)

p = lmfit.Parameters()
p.add("E_g", value=0.5, vary=False)
p.add("a0",  value=5.0, min=0.01)
p.add("a1",  value=3.5, min=0.1)
p.add("a2",  value=2.0, min=0.0)
p.add("a3",  value=1.0)
p.add("a4",  value=0.0, vary=False)

result = lmfit.minimize(residual, p, method="least_squares")
lmfit.report_fit(result)

Uncertainty Quantification with emcee

Use the automated pipeline — emcee runs automatically after LS. For process-level walker parallelism:

from gasfir.retrieval import RetrievalConfig, retrieve, make_emcee_pool

cfg = RetrievalConfig(medium_name="Hydrogen_SFA", emcee_nsteps=3000)

pool = make_emcee_pool()          # each worker sets NUMBA_NUM_THREADS=1
cfg.emcee_pool = pool
result = retrieve(data_NA=data, config=cfg)
pool.close(); pool.join()

for name in result.var_names:
    v = result.params[name]
    print(f"  {name} = {v.value:.4g} ± {v.stderr:.2g}")