Parameter Fitting ================= .. note:: Parameter fitting requires the retrieval extras:: pip install gasfir[retrieval] All fitting-related imports come from :mod:`gasfir.fitting` or :mod:`gasfir.retrieval`, **not** from the top-level ``gasfir`` namespace. Results are saved to ``/`` by default (JSON, HDF5 chain, corner plot, trace plot, LaTeX summary). Automated Pipeline (recommended) --------------------------------- :func:`~gasfir.retrieval.retrieve` runs the full CMA-ES → least-squares → emcee pipeline. After each phase it prints progress and a fit report using :func:`lmfit.fit_report`. **Known medium** — initial guess and bounds from the stored reference: .. code-block:: python import pandas as pd from gasfir import create_pulse from gasfir.retrieval import RetrievalConfig, retrieve data = pd.DataFrame({ "pulses": [create_pulse(800, I, 0, 6) for I in intensities], "Y": measured_probabilities, }) cfg = RetrievalConfig( medium_name="Hydrogen_SFA", # output saved to ./Hydrogen_SFA/ emcee_nsteps=3000, latex_mapping={"E_g": r"$E_g$ [a.u.]", "a1": r"$a_1$"}, ) result = retrieve(data_NA=data, config=cfg) After the LS phase the pipeline automatically prints a Nσ comparison of fitted values against stored reference parameters. After emcee it saves: * ``_publication_corner.pdf`` * ``_trace.pdf`` * ``_fit_summary.tex`` * ``_retrieval_result.json`` **Atom cold start** — E_g looked up from the NIST atomic database: .. code-block:: python # No initial_params needed; E_g resolved automatically for known elements cfg = RetrievalConfig(medium_name="Kr", cma_maxiter=5000) result = retrieve(data_NA=data, config=cfg) **Crystal cold start** — E_g must always be supplied for crystals: .. code-block:: python cfg = RetrievalConfig( medium_name="MyNewCrystal", initial_params={"E_g": 0.30}, # a.u. — only required field is_crystal=True, ret_electron_density=True, # Y is electron density, not P cma_maxiter=5000, ) result = retrieve(data_NA=data, config=cfg) RetrievalConfig key options ----------------------------- .. list-table:: :header-rows: 1 :widths: 25 10 65 * - Option - Default - Description * - ``medium_name`` - ``"unknown"`` - Used for output file names and stored-parameter lookup * - ``output_dir`` - ``None`` - Defaults to ``./medium_name/`` * - ``initial_params`` - ``None`` - Dict of starting values; overrides stored params; only ``E_g`` is required for cold start * - ``fixed_params`` - ``None`` - Names to hold fixed (default: ``["E_g"]`` for atoms, ``["a4"]`` for crystals) * - ``is_crystal`` - ``False`` - Enables ``m_eff`` as free parameter and crystal cold-start bounds * - ``ret_electron_density`` - ``False`` - Fit electron density instead of ionisation probability * - ``relative_bound`` - ``5.0`` - Search range = ``[val − 5|val|, val + 5|val|]`` * - ``global_method`` - ``"cma"`` - ``"cma"`` or ``"differential_evolution"`` * - ``cma_maxiter`` - ``10000`` - CMA-ES evaluation budget (plateau-stopped early when flat) * - ``emcee_nsteps`` - ``3000`` - MCMC steps per walker * - ``latex_mapping`` - ``None`` - ``{name: latex_string}`` for axis labels in plots and .tex output Post-processing an existing chain ---------------------------------- Reload a chain and apply transformations (drop/derive variables, re-snap): .. code-block:: python from gasfir.retrieval import post_process_mcmc post_process_mcmc( medium_name="Diamond_TBmBJ", output_dir="Diamond_TBmBJ", original_stored_params={"E_g": 0.235, "a1": 14, "a2": 1.7, "a3": -3.0, "a0_per_au3": 0.35, "m_eff": 0.115, "lattice_constant_au": 6.74}, drop_vars=["a0"], derived_exprs={"a0_per_au3": "a0 / 6.74**3"}, latex_mapping={"E_g": r"$E_g$ [a.u.]", "m_eff": r"$m_\mathrm{eff}$"}, is_crystal=True, ) Simultaneous Fit to QS and Non-Adiabatic Data ---------------------------------------------- Pass ``data_QS`` to concatenate quasi-static residuals: .. code-block:: python from gasfir.retrieval import RetrievalConfig, retrieve import numpy as np qs_data = pd.DataFrame({ "field": field_amplitudes, "Y": np.log(qs_rates), }) cfg = RetrievalConfig(medium_name="Hydrogen_SFA", simultaneous=True) result = retrieve(data_NA=data, data_QS=qs_data, config=cfg) Low-level: residual function + manual lmfit ------------------------------------------- For custom optimisers or more control: .. code-block:: python import lmfit from gasfir.fitting import ret_residual_function residual = ret_residual_function(data, uncertainty_Nadiabatic=0.05) p = lmfit.Parameters() p.add("E_g", value=0.5, vary=False) p.add("a0", value=5.0, min=0.01) p.add("a1", value=3.5, min=0.1) p.add("a2", value=2.0, min=0.0) p.add("a3", value=1.0) p.add("a4", value=0.0, vary=False) result = lmfit.minimize(residual, p, method="least_squares") lmfit.report_fit(result) Uncertainty Quantification with emcee -------------------------------------- Use the automated pipeline — emcee runs automatically after LS. For process-level walker parallelism: .. code-block:: python from gasfir.retrieval import RetrievalConfig, retrieve, make_emcee_pool cfg = RetrievalConfig(medium_name="Hydrogen_SFA", emcee_nsteps=3000) pool = make_emcee_pool() # each worker sets NUMBA_NUM_THREADS=1 cfg.emcee_pool = pool result = retrieve(data_NA=data, config=cfg) pool.close(); pool.join() for name in result.var_names: v = result.params[name] print(f" {name} = {v.value:.4g} ± {v.stderr:.2g}")