# HG changeset patch # User Tuomo Valkonen # Date 1670793953 -7200 # Node ID d29d1fcf542310e46b5d479b4dabab8b133f2c5a # Parent 9869fa1e0ccd24c7b3ffe03ec0aafe10958e275f Support arbitrary regularisation terms; implement non-positivity-constrained regularisation. * Fixes the conditional gradient methods that were incorrectly solving positivity constrained subproblems although the infinite-dimensional versions do not include such constraints. * Introduces the `ExperimentV2` struct that has `regularisation` in place of `α`. The `Experiment` struct is now deprecated. * The L^2-squared experiments were switch to be unconstrained, as the Franke-Wolfe implementations do not support constraints. (This would be easy to add for the “fully corrective” variant, but is not immediate for the “relaxed” variant.) diff -r 9869fa1e0ccd -r d29d1fcf5423 Cargo.lock --- a/Cargo.lock Sun Dec 11 23:19:17 2022 +0200 +++ b/Cargo.lock Sun Dec 11 23:25:53 2022 +0200 @@ -917,7 +917,7 @@ [[package]] name = "pointsource_algs" -version = "1.0.0" +version = "1.0.1-dev" dependencies = [ "GSL", "alg_tools", diff -r 9869fa1e0ccd -r d29d1fcf5423 Cargo.toml --- a/Cargo.toml Sun Dec 11 23:19:17 2022 +0200 +++ b/Cargo.toml Sun Dec 11 23:25:53 2022 +0200 @@ -1,6 +1,6 @@ [package] name = "pointsource_algs" -version = "1.0.0" +version = "1.0.1-dev" edition = "2021" rust-version = "1.67" authors = ["Tuomo Valkonen "] diff -r 9869fa1e0ccd -r d29d1fcf5423 src/experiments.rs --- a/src/experiments.rs Sun Dec 11 23:19:17 2022 +0200 +++ b/src/experiments.rs Sun Dec 11 23:25:53 2022 +0200 @@ -20,13 +20,14 @@ use crate::types::*; use crate::run::{ RunnableExperiment, - Experiment, + ExperimentV2, Named, DefaultAlgorithm, AlgorithmConfig }; //use crate::fb::FBGenericConfig; use crate::rand_distr::{SerializableNormal, SaltAndPepper}; +use crate::regularisation::Regularisation; /// Experiments shorthands, to be used with the command line parser @@ -136,10 +137,10 @@ Experiment1D => { let base_spread = Gaussian { variance : Variance1 }; let spread_cutoff = BallIndicator { r : CutOff1, exponent : Linfinity }; - Box::new(Named { name, data : Experiment { + Box::new(Named { name, data : ExperimentV2 { domain : [[0.0, 1.0]].into(), sensor_count : [N_SENSORS_1D], - α : cli.alpha.unwrap_or(0.09), + regularisation : Regularisation::Radon(cli.alpha.unwrap_or(0.09)), noise_distr : SerializableNormal::new(0.0, cli.variance.unwrap_or(0.2))?, dataterm : DataTerm::L2Squared, μ_hat : MU_TRUE_1D_BASIC.into(), @@ -153,10 +154,10 @@ }, Experiment1DFast => { let base_spread = HatConv { radius : Hat1 }; - Box::new(Named { name, data : Experiment { + Box::new(Named { name, data : ExperimentV2 { domain : [[0.0, 1.0]].into(), sensor_count : [N_SENSORS_1D], - α : cli.alpha.unwrap_or(0.06), + regularisation : Regularisation::Radon(cli.alpha.unwrap_or(0.06)), noise_distr : SerializableNormal::new(0.0, cli.variance.unwrap_or(0.2))?, dataterm : DataTerm::L2Squared, μ_hat : MU_TRUE_1D_BASIC.into(), @@ -171,10 +172,10 @@ Experiment2D => { let base_spread = Gaussian { variance : Variance1 }; let spread_cutoff = BallIndicator { r : CutOff1, exponent : Linfinity }; - Box::new(Named { name, data : Experiment { + Box::new(Named { name, data : ExperimentV2 { domain : [[0.0, 1.0]; 2].into(), sensor_count : [N_SENSORS_2D; 2], - α : cli.alpha.unwrap_or(0.19), // 0.18, //0.17, //0.16, + regularisation : Regularisation::Radon(cli.alpha.unwrap_or(0.19)), noise_distr : SerializableNormal::new(0.0, cli.variance.unwrap_or(0.25))?, dataterm : DataTerm::L2Squared, μ_hat : MU_TRUE_2D_BASIC.into(), @@ -190,10 +191,10 @@ }, Experiment2DFast => { let base_spread = HatConv { radius : Hat1 }; - Box::new(Named { name, data : Experiment { + Box::new(Named { name, data : ExperimentV2 { domain : [[0.0, 1.0]; 2].into(), sensor_count : [N_SENSORS_2D; 2], - α : cli.alpha.unwrap_or(0.12), //0.10, //0.14, + regularisation : Regularisation::Radon(cli.alpha.unwrap_or(0.12)), noise_distr : SerializableNormal::new(0.0, cli.variance.unwrap_or(0.15))?, //0.25 dataterm : DataTerm::L2Squared, μ_hat : MU_TRUE_2D_BASIC.into(), @@ -210,10 +211,10 @@ Experiment1D_L1 => { let base_spread = Gaussian { variance : Variance1 }; let spread_cutoff = BallIndicator { r : CutOff1, exponent : Linfinity }; - Box::new(Named { name, data : Experiment { + Box::new(Named { name, data : ExperimentV2 { domain : [[0.0, 1.0]].into(), sensor_count : [N_SENSORS_1D], - α : cli.alpha.unwrap_or(0.1), + regularisation : Regularisation::NonnegRadon(cli.alpha.unwrap_or(0.1)), noise_distr : SaltAndPepper::new( cli.salt_and_pepper.as_ref().map_or(0.6, |v| v[0]), cli.salt_and_pepper.as_ref().map_or(0.4, |v| v[1]) @@ -230,10 +231,10 @@ }, Experiment1D_L1_Fast => { let base_spread = HatConv { radius : Hat1 }; - Box::new(Named { name, data : Experiment { + Box::new(Named { name, data : ExperimentV2 { domain : [[0.0, 1.0]].into(), sensor_count : [N_SENSORS_1D], - α : cli.alpha.unwrap_or(0.12), + regularisation : Regularisation::NonnegRadon(cli.alpha.unwrap_or(0.12)), noise_distr : SaltAndPepper::new( cli.salt_and_pepper.as_ref().map_or(0.6, |v| v[0]), cli.salt_and_pepper.as_ref().map_or(0.4, |v| v[1]) @@ -251,10 +252,10 @@ Experiment2D_L1 => { let base_spread = Gaussian { variance : Variance1 }; let spread_cutoff = BallIndicator { r : CutOff1, exponent : Linfinity }; - Box::new(Named { name, data : Experiment { + Box::new(Named { name, data : ExperimentV2 { domain : [[0.0, 1.0]; 2].into(), sensor_count : [N_SENSORS_2D; 2], - α : cli.alpha.unwrap_or(0.35), + regularisation : Regularisation::NonnegRadon(cli.alpha.unwrap_or(0.35)), noise_distr : SaltAndPepper::new( cli.salt_and_pepper.as_ref().map_or(0.8, |v| v[0]), cli.salt_and_pepper.as_ref().map_or(0.2, |v| v[1]) @@ -273,10 +274,10 @@ }, Experiment2D_L1_Fast => { let base_spread = HatConv { radius : Hat1 }; - Box::new(Named { name, data : Experiment { + Box::new(Named { name, data : ExperimentV2 { domain : [[0.0, 1.0]; 2].into(), sensor_count : [N_SENSORS_2D; 2], - α : cli.alpha.unwrap_or(0.40), + regularisation : Regularisation::NonnegRadon(cli.alpha.unwrap_or(0.40)), noise_distr : SaltAndPepper::new( cli.salt_and_pepper.as_ref().map_or(0.8, |v| v[0]), cli.salt_and_pepper.as_ref().map_or(0.2, |v| v[1]) diff -r 9869fa1e0ccd -r d29d1fcf5423 src/fb.rs --- a/src/fb.rs Sun Dec 11 23:19:17 2022 +0200 +++ b/src/fb.rs Sun Dec 11 23:25:53 2022 +0200 @@ -6,8 +6,8 @@ * Valkonen T. - _Proximal methods for point source localisation_, [arXiv:2212.02991](https://arxiv.org/abs/2212.02991). -The main routine is [`pointsource_fb`]. It is based on [`generic_pointsource_fb`], which is also -used by our [primal-dual proximal splitting][crate::pdps] implementation. +The main routine is [`pointsource_fb_reg`]. It is based on [`generic_pointsource_fb_reg`], which is +also used by our [primal-dual proximal splitting][crate::pdps] implementation. FISTA-type inertia can also be enabled through [`FBConfig::meta`]. @@ -83,17 +83,17 @@ use numeric_literals::replace_float_literals; use serde::{Serialize, Deserialize}; use colored::Colorize; -use nalgebra::DVector; +use nalgebra::{DVector, DMatrix}; use alg_tools::iterate::{ AlgIteratorFactory, AlgIteratorState, }; use alg_tools::euclidean::Euclidean; -use alg_tools::norms::Norm; use alg_tools::linops::Apply; use alg_tools::sets::Cube; use alg_tools::loc::Loc; +use alg_tools::mapping::Mapping; use alg_tools::bisection_tree::{ BTFN, PreBTFN, @@ -113,7 +113,6 @@ use crate::measures::{ DiscreteMeasure, DeltaMeasure, - Radon }; use crate::measures::merging::{ SpikeMergingMethod, @@ -124,7 +123,8 @@ DiscreteMeasureOp, Lipschitz }; use crate::subproblem::{ - quadratic_nonneg, + nonneg::quadratic_nonneg, + unconstrained::quadratic_unconstrained, InnerSettings, InnerMethod, }; @@ -134,6 +134,10 @@ Plotting, PlotLookup }; +use crate::regularisation::{ + NonnegRadonRegTerm, + RadonRegTerm, +}; /// Method for constructing $μ$ on each iteration #[derive(Clone, Copy, Eq, PartialEq, Serialize, Deserialize, Debug)] @@ -156,7 +160,7 @@ InertiaFISTA, } -/// Settings for [`pointsource_fb`]. +/// Settings for [`pointsource_fb_reg`]. #[derive(Clone, Copy, Eq, PartialEq, Serialize, Deserialize, Debug)] #[serde(default)] pub struct FBConfig { @@ -169,7 +173,7 @@ } /// Settings for the solution of the stepwise optimality condition in algorithms based on -/// [`generic_pointsource_fb`]. +/// [`generic_pointsource_fb_reg`]. #[derive(Clone, Copy, Eq, PartialEq, Serialize, Deserialize, Debug)] #[serde(default)] pub struct FBGenericConfig { @@ -236,7 +240,7 @@ } } -/// Trait for specialisation of [`generic_pointsource_fb`] to basic FB, FISTA. +/// Trait for specialisation of [`generic_pointsource_fb_reg`] to basic FB, FISTA. /// /// The idea is that the residual $Aμ - b$ in the forward step can be replaced by an arbitrary /// value. For example, to implement [primal-dual proximal splitting][crate::pdps] we replace it @@ -301,7 +305,7 @@ } } -/// Specialisation of [`generic_pointsource_fb`] to basic μFB. +/// Specialisation of [`generic_pointsource_fb_reg`] to basic μFB. struct BasicFB< 'a, F : Float + ToNalgebraRealField, @@ -340,7 +344,7 @@ } } -/// Specialisation of [`generic_pointsource_fb`] to FISTA. +/// Specialisation of [`generic_pointsource_fb_reg`] to FISTA. struct FISTA< 'a, F : Float + ToNalgebraRealField, @@ -423,62 +427,290 @@ } } -/// Iteratively solve the pointsource localisation problem using forward-backward splitting -/// -/// The settings in `config` have their [respective documentation](FBConfig). `opA` is the -/// forward operator $A$, $b$ the observable, and $\lambda$ the regularisation weight. -/// The operator `op𝒟` is used for forming the proximal term. Typically it is a convolution -/// operator. Finally, the `iterator` is an outer loop verbosity and iteration count control -/// as documented in [`alg_tools::iterate`]. -/// -/// For details on the mathematical formulation, see the [module level](self) documentation. -/// -/// Returns the final iterate. + +/// Abstraction of regularisation terms for [`generic_pointsource_fb_reg`]. +pub trait RegTerm +: for<'a> Apply<&'a DiscreteMeasure, F>, Output = F> { + /// Approximately solve the problem + ///
$$ + /// \min_{x ∈ ℝ^n} \frac{1}{2} x^⊤Ax - g^⊤ x + τ G(x) + /// $$
+ /// for $G$ depending on the trait implementation. + /// + /// The parameter `mA` is $A$. An estimate for its opeator norm should be provided in + /// `mA_normest`. The initial iterate and output is `x`. The current main tolerance is `ε`. + /// + /// Returns the number of iterations taken. + fn solve_findim( + &self, + mA : &DMatrix, + g : &DVector, + τ : F, + x : &mut DVector, + mA_normest : F, + ε : F, + config : &FBGenericConfig + ) -> usize; + + /// Find a point where `d` may violate the tolerance `ε`. + /// + /// If `skip_by_rough_check` is set, do not find the point if a rough check indicates that we + /// are in bounds. `ε` is the current main tolerance and `τ` a scaling factor for the + /// regulariser. + /// + /// Returns `None` if `d` is in bounds either based on the rough check, or a more precise check + /// terminating early. Otherwise returns a possibly violating point, the value of `d` there, + /// and a boolean indicating whether the found point is in bounds. + fn find_tolerance_violation( + &self, + d : &mut BTFN, + τ : F, + ε : F, + skip_by_rough_check : bool, + config : &FBGenericConfig, + ) -> Option<(Loc, F, bool)> + where BT : BTSearch>, + G : SupportGenerator, + G::SupportType : Mapping,Codomain=F> + + LocalAnalysis, N>; + + /// Verify that `d` is in bounds `ε` for a merge candidate `μ` + /// + /// `ε` is the current main tolerance and `τ` a scaling factor for the regulariser. + fn verify_merge_candidate( + &self, + d : &mut BTFN, + μ : &DiscreteMeasure, F>, + τ : F, + ε : F, + config : &FBGenericConfig, + ) -> bool + where BT : BTSearch>, + G : SupportGenerator, + G::SupportType : Mapping,Codomain=F> + + LocalAnalysis, N>; + + fn target_bounds(&self, τ : F, ε : F) -> Option>; + + /// Returns a scaling factor for the tolerance sequence. + /// + /// Typically this is the regularisation parameter. + fn tolerance_scaling(&self) -> F; +} + #[replace_float_literals(F::cast_from(literal))] -pub fn pointsource_fb<'a, F, I, A, GA, 𝒟, BTA, G𝒟, S, K, const N : usize>( - opA : &'a A, - b : &A::Observable, - α : F, - op𝒟 : &'a 𝒟, - config : &FBConfig, - iterator : I, - plotter : SeqPlotter -) -> DiscreteMeasure, F> -where F : Float + ToNalgebraRealField, - I : AlgIteratorFactory>, - for<'b> &'b A::Observable : std::ops::Neg, - //+ std::ops::Mul, <-- FIXME: compiler overflow - A::Observable : std::ops::MulAssign, - GA : SupportGenerator + Clone, - A : ForwardModel, F, PreadjointCodomain = BTFN> - + Lipschitz<𝒟, FloatType=F>, - BTA : BTSearch>, - G𝒟 : SupportGenerator + Clone, - 𝒟 : DiscreteMeasureOp, F, PreCodomain = PreBTFN>, - 𝒟::Codomain : RealMapping, - S: RealMapping + LocalAnalysis, N>, - K: RealMapping + LocalAnalysis, N>, - BTNodeLookup: BTNode, N>, - Cube: P2Minimise, F>, - PlotLookup : Plotting, - DiscreteMeasure, F> : SpikeMerging { +impl RegTerm for NonnegRadonRegTerm +where Cube : P2Minimise, F> { + fn solve_findim( + &self, + mA : &DMatrix, + g : &DVector, + τ : F, + x : &mut DVector, + mA_normest : F, + ε : F, + config : &FBGenericConfig + ) -> usize { + let inner_tolerance = ε * config.inner.tolerance_mult; + let inner_it = config.inner.iterator_options.stop_target(inner_tolerance); + let inner_τ = config.inner.τ0 / mA_normest; + quadratic_nonneg(config.inner.method, mA, g, τ * self.α(), x, + inner_τ, inner_it) + } + + #[inline] + fn find_tolerance_violation( + &self, + d : &mut BTFN, + τ : F, + ε : F, + skip_by_rough_check : bool, + config : &FBGenericConfig, + ) -> Option<(Loc, F, bool)> + where BT : BTSearch>, + G : SupportGenerator, + G::SupportType : Mapping,Codomain=F> + + LocalAnalysis, N> { + let τα = τ * self.α(); + let keep_below = τα + ε; + let maximise_above = τα + ε * config.insertion_cutoff_factor; + let refinement_tolerance = ε * config.refinement.tolerance_mult; - let initial_residual = -b; - let τ = config.τ0/opA.lipschitz_factor(&op𝒟).unwrap(); + // If preliminary check indicates that we are in bonds, and if it otherwise matches + // the insertion strategy, skip insertion. + if skip_by_rough_check && d.bounds().upper() <= keep_below { + None + } else { + // If the rough check didn't indicate no insertion needed, find maximising point. + d.maximise_above(maximise_above, refinement_tolerance, config.refinement.max_steps) + .map(|(ξ, v_ξ)| (ξ, v_ξ, v_ξ <= keep_below)) + } + } - match config.meta { - FBMetaAlgorithm::None => generic_pointsource_fb( - opA, α, op𝒟, τ, &config.insertion, iterator, plotter, initial_residual, - BasicFB{ b, opA } - ), - FBMetaAlgorithm::InertiaFISTA => generic_pointsource_fb( - opA, α, op𝒟, τ, &config.insertion, iterator, plotter, initial_residual, - FISTA{ b, opA, λ : 1.0, μ_prev : DiscreteMeasure::new() } - ), + fn verify_merge_candidate( + &self, + d : &mut BTFN, + μ : &DiscreteMeasure, F>, + τ : F, + ε : F, + config : &FBGenericConfig, + ) -> bool + where BT : BTSearch>, + G : SupportGenerator, + G::SupportType : Mapping,Codomain=F> + + LocalAnalysis, N> { + let τα = τ * self.α(); + let refinement_tolerance = ε * config.refinement.tolerance_mult; + let merge_tolerance = config.merge_tolerance_mult * ε; + let keep_below = τα + merge_tolerance; + let keep_supp_above = τα - merge_tolerance; + let bnd = d.bounds(); + + return ( + bnd.lower() >= keep_supp_above + || + μ.iter_spikes().map(|&DeltaMeasure{ α : β, ref x }| { + (β == 0.0) || d.apply(x) >= keep_supp_above + }).all(std::convert::identity) + ) && ( + bnd.upper() <= keep_below + || + d.has_upper_bound(keep_below, refinement_tolerance, config.refinement.max_steps) + ) + } + + fn target_bounds(&self, τ : F, ε : F) -> Option> { + let τα = τ * self.α(); + Some(Bounds(τα - ε, τα + ε)) + } + + fn tolerance_scaling(&self) -> F { + self.α() } } -/// Generic implementation of [`pointsource_fb`]. +#[replace_float_literals(F::cast_from(literal))] +impl RegTerm for RadonRegTerm +where Cube : P2Minimise, F> { + fn solve_findim( + &self, + mA : &DMatrix, + g : &DVector, + τ : F, + x : &mut DVector, + mA_normest: F, + ε : F, + config : &FBGenericConfig + ) -> usize { + let inner_tolerance = ε * config.inner.tolerance_mult; + let inner_it = config.inner.iterator_options.stop_target(inner_tolerance); + let inner_τ = config.inner.τ0 / mA_normest; + quadratic_unconstrained(config.inner.method, mA, g, τ * self.α(), x, + inner_τ, inner_it) + } + + fn find_tolerance_violation( + &self, + d : &mut BTFN, + τ : F, + ε : F, + skip_by_rough_check : bool, + config : &FBGenericConfig, + ) -> Option<(Loc, F, bool)> + where BT : BTSearch>, + G : SupportGenerator, + G::SupportType : Mapping,Codomain=F> + + LocalAnalysis, N> { + let τα = τ * self.α(); + let keep_below = τα + ε; + let keep_above = -τα - ε; + let maximise_above = τα + ε * config.insertion_cutoff_factor; + let minimise_below = -τα - ε * config.insertion_cutoff_factor; + let refinement_tolerance = ε * config.refinement.tolerance_mult; + + // If preliminary check indicates that we are in bonds, and if it otherwise matches + // the insertion strategy, skip insertion. + if skip_by_rough_check && Bounds(keep_above, keep_below).superset(&d.bounds()) { + None + } else { + // If the rough check didn't indicate no insertion needed, find maximising point. + let mx = d.maximise_above(maximise_above, refinement_tolerance, + config.refinement.max_steps); + let mi = d.minimise_below(minimise_below, refinement_tolerance, + config.refinement.max_steps); + + match (mx, mi) { + (None, None) => None, + (Some((ξ, v_ξ)), None) => Some((ξ, v_ξ, keep_below >= v_ξ)), + (None, Some((ζ, v_ζ))) => Some((ζ, v_ζ, keep_above <= v_ζ)), + (Some((ξ, v_ξ)), Some((ζ, v_ζ))) => { + if v_ξ - τα > τα - v_ζ { + Some((ξ, v_ξ, keep_below >= v_ξ)) + } else { + Some((ζ, v_ζ, keep_above <= v_ζ)) + } + } + } + } + } + + fn verify_merge_candidate( + &self, + d : &mut BTFN, + μ : &DiscreteMeasure, F>, + τ : F, + ε : F, + config : &FBGenericConfig, + ) -> bool + where BT : BTSearch>, + G : SupportGenerator, + G::SupportType : Mapping,Codomain=F> + + LocalAnalysis, N> { + let τα = τ * self.α(); + let refinement_tolerance = ε * config.refinement.tolerance_mult; + let merge_tolerance = config.merge_tolerance_mult * ε; + let keep_below = τα + merge_tolerance; + let keep_above = -τα - merge_tolerance; + let keep_supp_pos_above = τα - merge_tolerance; + let keep_supp_neg_below = -τα + merge_tolerance; + let bnd = d.bounds(); + + return ( + (bnd.lower() >= keep_supp_pos_above && bnd.upper() <= keep_supp_neg_below) + || + μ.iter_spikes().map(|&DeltaMeasure{ α : β, ref x }| { + use std::cmp::Ordering::*; + match β.partial_cmp(&0.0) { + Some(Greater) => d.apply(x) >= keep_supp_pos_above, + Some(Less) => d.apply(x) <= keep_supp_neg_below, + _ => true, + } + }).all(std::convert::identity) + ) && ( + bnd.upper() <= keep_below + || + d.has_upper_bound(keep_below, refinement_tolerance, + config.refinement.max_steps) + ) && ( + bnd.lower() >= keep_above + || + d.has_lower_bound(keep_above, refinement_tolerance, + config.refinement.max_steps) + ) + } + + fn target_bounds(&self, τ : F, ε : F) -> Option> { + let τα = τ * self.α(); + Some(Bounds(-τα - ε, τα + ε)) + } + + fn tolerance_scaling(&self) -> F { + self.α() + } +} + + +/// Generic implementation of [`pointsource_fb_reg`]. /// /// The method can be specialised to even primal-dual proximal splitting through the /// [`FBSpecialisation`] parameter `specialisation`. @@ -497,16 +729,18 @@ /// /// Returns the final iterate. #[replace_float_literals(F::cast_from(literal))] -pub fn generic_pointsource_fb<'a, F, I, A, GA, 𝒟, BTA, G𝒟, S, K, Spec, const N : usize>( +pub fn generic_pointsource_fb_reg< + 'a, F, I, A, GA, 𝒟, BTA, G𝒟, S, K, Spec, Reg, const N : usize +>( opA : &'a A, - α : F, + reg : Reg, op𝒟 : &'a 𝒟, mut τ : F, config : &FBGenericConfig, iterator : I, mut plotter : SeqPlotter, mut residual : A::Observable, - mut specialisation : Spec, + mut specialisation : Spec ) -> DiscreteMeasure, F> where F : Float + ToNalgebraRealField, I : AlgIteratorFactory>, @@ -522,17 +756,16 @@ S: RealMapping + LocalAnalysis, N>, K: RealMapping + LocalAnalysis, N>, BTNodeLookup: BTNode, N>, - Cube: P2Minimise, F>, PlotLookup : Plotting, - DiscreteMeasure, F> : SpikeMerging { + DiscreteMeasure, F> : SpikeMerging, + Reg : RegTerm { // Set up parameters let quiet = iterator.is_quiet(); let op𝒟norm = op𝒟.opnorm_bound(); - // We multiply tolerance by τ for FB since - // our subproblems depending on tolerances are scaled by τ compared to the conditional - // gradient approach. - let tolerance = config.tolerance * τ * α; + // We multiply tolerance by τ for FB since our subproblems depending on tolerances are scaled + // by τ compared to the conditional gradient approach. + let tolerance = config.tolerance * τ * reg.tolerance_scaling(); let mut ε = tolerance.initial(); // Initialise operators @@ -568,17 +801,6 @@ // Run the algorithm iterator.iterate(|state| { - // Calculate subproblem tolerances, and update main tolerance for next iteration - let τα = τ * α; - let target_bounds = Bounds(τα - ε, τα + ε); - let merge_tolerance = config.merge_tolerance_mult * ε; - let merge_target_bounds = Bounds(τα - merge_tolerance, τα + merge_tolerance); - let inner_tolerance = ε * config.inner.tolerance_mult; - let refinement_tolerance = ε * config.refinement.tolerance_mult; - let maximise_above = τα + ε * config.insertion_cutoff_factor; - let ε_prev = ε; - ε = tolerance.update(ε, state.iteration()); - // Maximum insertion count and measure difference calculation depend on insertion style. let (m, warn_insertions) = match (state.iteration(), config.bootstrap_insertions) { (i, Some((l, k))) if i <= l => (k, false), @@ -626,12 +848,10 @@ // ≤ sup_{|z|_2 ≤ 1} |Cz|_ℳ |𝒟Cx|_∞ ≤ sup_{|z|_2 ≤ 1} |Cz|_ℳ |𝒟| |Cx|_ℳ // ≤ sup_{|z|_2 ≤ 1} |z|_1 |𝒟| |x|_1 ≤ sup_{|z|_2 ≤ 1} n |z|_2 |𝒟| |x|_2 // = n |𝒟| |x|_2, where n is the number of points. Therefore - let inner_τ = config.inner.τ0 / (op𝒟norm * F::cast_from(μ.len())); + let Ã_normest = op𝒟norm * F::cast_from(μ.len()); // Solve finite-dimensional subproblem. - let inner_it = config.inner.iterator_options.stop_target(inner_tolerance); - inner_iters += quadratic_nonneg(config.inner.method, &Ã, &g̃, τ*α, &mut x, - inner_τ, inner_it); + inner_iters += reg.solve_findim(&Ã, &g̃, τ, &mut x, Ã_normest, ε, config); // Update masses of μ based on solution of finite-dimensional subproblem. μ.set_masses_dvector(&x); @@ -644,36 +864,23 @@ // If no merging heuristic is used, let's be more conservative about spike insertion, // and skip it after first round. If merging is done, being more greedy about spike // insertion also seems to improve performance. - let may_break = if let SpikeMergingMethod::None = config.merging { + let skip_by_rough_check = if let SpikeMergingMethod::None = config.merging { false } else { count > 0 }; - // If preliminary check indicates that we are in bonds, and if it otherwise matches - // the insertion strategy, skip insertion. - if may_break && target_bounds.superset(&d.bounds()) { - break 'insertion (true, d) - } - - // If the rough check didn't indicate stopping, find maximising point, maintaining for - // the calculations in the beginning of the loop that v_ξ = (ω0-τv-𝒟μ)(ξ) = d(ξ), - // where 𝒟μ is now distinct from μ0 after the insertions already performed. - // We do not need to check lower bounds, as a solution of the finite-dimensional - // subproblem should always satisfy them. - - // If μ has some spikes, only find a maximum of d if it is above a threshold - // defined by the refinment tolerance. - let (ξ, v_ξ) = match d.maximise_above(maximise_above, refinement_tolerance, - config.refinement.max_steps) { + // Find a spike to insert, if needed + let (ξ, _v_ξ, in_bounds) = match reg.find_tolerance_violation( + &mut d, τ, ε, skip_by_rough_check, config + ) { None => break 'insertion (true, d), Some(res) => res, }; // Break if maximum insertion count reached if count >= max_insertions { - let in_bounds2 = target_bounds.upper() >= v_ξ; - break 'insertion (in_bounds2, d) + break 'insertion (in_bounds, d) } // No point in optimising the weight here; the finite-dimensional algorithm is fast. @@ -695,21 +902,8 @@ μ.merge_spikes(config.merging, |μ_candidate| { let mut d = &minus_τv + op𝒟.preapply(μ_diff(&μ_candidate, &μ_base)); - if merge_target_bounds.superset(&d.bounds()) { - return Some(()) - } - - let d_min_supp = μ_candidate.iter_spikes().filter_map(|&DeltaMeasure{ α, ref x }| { - (α != 0.0).then(|| d.apply(x)) - }).reduce(F::min); - - if d_min_supp.map_or(true, |b| b >= merge_target_bounds.lower()) && - d.has_upper_bound(merge_target_bounds.upper(), refinement_tolerance, - config.refinement.max_steps) { - Some(()) - } else { - None - } + reg.verify_merge_candidate(&mut d, μ_candidate, τ, ε, &config) + .then_some(()) }); debug_assert!(μ.len() >= n_before_merge); merged += μ.len() - n_before_merge; @@ -725,6 +919,10 @@ this_iters += 1; + // Update main tolerance for next iteration + let ε_prev = ε; + ε = tolerance.update(ε, state.iteration()); + // Give function value if needed state.if_verbose(|| { let value_μ = specialisation.value_μ(&μ); @@ -732,12 +930,12 @@ plotter.plot_spikes( format!("iter {} end; {}", state.iteration(), within_tolerances), &d, "start".to_string(), Some(&minus_τv), - Some(target_bounds), value_μ, + reg.target_bounds(τ, ε_prev), value_μ, ); // Calculate mean inner iterations and reset relevant counters. // Return the statistics let res = IterInfo { - value : specialisation.calculate_fit(&μ, &residual) + α * value_μ.norm(Radon), + value : specialisation.calculate_fit(&μ, &residual) + reg.apply(value_μ), n_spikes : value_μ.len(), inner_iters, this_iters, @@ -757,6 +955,128 @@ specialisation.postprocess(μ, config.final_merging) } +/// Iteratively solve the pointsource localisation problem using forward-backward splitting +/// +/// The settings in `config` have their [respective documentation](FBConfig). `opA` is the +/// forward operator $A$, $b$ the observable, and $\lambda$ the regularisation weight. +/// The operator `op𝒟` is used for forming the proximal term. Typically it is a convolution +/// operator. Finally, the `iterator` is an outer loop verbosity and iteration count control +/// as documented in [`alg_tools::iterate`]. +/// +/// For details on the mathematical formulation, see the [module level](self) documentation. +/// +/// Returns the final iterate. +#[replace_float_literals(F::cast_from(literal))] +pub fn pointsource_fb_reg<'a, F, I, A, GA, 𝒟, BTA, G𝒟, S, K, Reg, const N : usize>( + opA : &'a A, + b : &A::Observable, + reg : Reg, + op𝒟 : &'a 𝒟, + config : &FBConfig, + iterator : I, + plotter : SeqPlotter, +) -> DiscreteMeasure, F> +where F : Float + ToNalgebraRealField, + I : AlgIteratorFactory>, + for<'b> &'b A::Observable : std::ops::Neg, + //+ std::ops::Mul, <-- FIXME: compiler overflow + A::Observable : std::ops::MulAssign, + GA : SupportGenerator + Clone, + A : ForwardModel, F, PreadjointCodomain = BTFN> + + Lipschitz<𝒟, FloatType=F>, + BTA : BTSearch>, + G𝒟 : SupportGenerator + Clone, + 𝒟 : DiscreteMeasureOp, F, PreCodomain = PreBTFN>, + 𝒟::Codomain : RealMapping, + S: RealMapping + LocalAnalysis, N>, + K: RealMapping + LocalAnalysis, N>, + BTNodeLookup: BTNode, N>, + Cube: P2Minimise, F>, + PlotLookup : Plotting, + DiscreteMeasure, F> : SpikeMerging, + Reg : RegTerm { + + let initial_residual = -b; + let τ = config.τ0/opA.lipschitz_factor(&op𝒟).unwrap(); + + match config.meta { + FBMetaAlgorithm::None => generic_pointsource_fb_reg( + opA, reg, op𝒟, τ, &config.insertion, iterator, plotter, initial_residual, + BasicFB{ b, opA }, + ), + FBMetaAlgorithm::InertiaFISTA => generic_pointsource_fb_reg( + opA, reg, op𝒟, τ, &config.insertion, iterator, plotter, initial_residual, + FISTA{ b, opA, λ : 1.0, μ_prev : DiscreteMeasure::new() }, + ), + } +} + +// +// Deprecated interfaces +// + +#[deprecated(note = "Use `pointsource_fb_reg`")] +pub fn pointsource_fb<'a, F, I, A, GA, 𝒟, BTA, G𝒟, S, K, const N : usize>( + opA : &'a A, + b : &A::Observable, + α : F, + op𝒟 : &'a 𝒟, + config : &FBConfig, + iterator : I, + plotter : SeqPlotter +) -> DiscreteMeasure, F> +where F : Float + ToNalgebraRealField, + I : AlgIteratorFactory>, + for<'b> &'b A::Observable : std::ops::Neg, + A::Observable : std::ops::MulAssign, + GA : SupportGenerator + Clone, + A : ForwardModel, F, PreadjointCodomain = BTFN> + + Lipschitz<𝒟, FloatType=F>, + BTA : BTSearch>, + G𝒟 : SupportGenerator + Clone, + 𝒟 : DiscreteMeasureOp, F, PreCodomain = PreBTFN>, + 𝒟::Codomain : RealMapping, + S: RealMapping + LocalAnalysis, N>, + K: RealMapping + LocalAnalysis, N>, + BTNodeLookup: BTNode, N>, + Cube: P2Minimise, F>, + PlotLookup : Plotting, + DiscreteMeasure, F> : SpikeMerging { + + pointsource_fb_reg(opA, b, NonnegRadonRegTerm(α), op𝒟, config, iterator, plotter) +} +#[deprecated(note = "Use `generic_pointsource_fb_reg`")] +pub fn generic_pointsource_fb<'a, F, I, A, GA, 𝒟, BTA, G𝒟, S, K, Spec, const N : usize>( + opA : &'a A, + α : F, + op𝒟 : &'a 𝒟, + τ : F, + config : &FBGenericConfig, + iterator : I, + plotter : SeqPlotter, + residual : A::Observable, + specialisation : Spec, +) -> DiscreteMeasure, F> +where F : Float + ToNalgebraRealField, + I : AlgIteratorFactory>, + Spec : FBSpecialisation, + A::Observable : std::ops::MulAssign, + GA : SupportGenerator + Clone, + A : ForwardModel, F, PreadjointCodomain = BTFN> + + Lipschitz<𝒟, FloatType=F>, + BTA : BTSearch>, + G𝒟 : SupportGenerator + Clone, + 𝒟 : DiscreteMeasureOp, F, PreCodomain = PreBTFN>, + 𝒟::Codomain : RealMapping, + S: RealMapping + LocalAnalysis, N>, + K: RealMapping + LocalAnalysis, N>, + BTNodeLookup: BTNode, N>, + Cube: P2Minimise, F>, + PlotLookup : Plotting, + DiscreteMeasure, F> : SpikeMerging { + generic_pointsource_fb_reg(opA, NonnegRadonRegTerm(α), op𝒟, τ, config, iterator, plotter, + residual, specialisation) +} diff -r 9869fa1e0ccd -r d29d1fcf5423 src/frank_wolfe.rs --- a/src/frank_wolfe.rs Sun Dec 11 23:19:17 2022 +0200 +++ b/src/frank_wolfe.rs Sun Dec 11 23:25:53 2022 +0200 @@ -53,7 +53,7 @@ use crate::forward_model::ForwardModel; #[allow(unused_imports)] // Used in documentation use crate::subproblem::{ - quadratic_nonneg, + unconstrained::quadratic_unconstrained, InnerSettings, InnerMethod, }; @@ -166,7 +166,7 @@ // where C = √m satisfies ‖x‖_1 ≤ C ‖x‖_2. Since we are intested in ‖A_*A‖, no // square root is needed when we scale: let inner_τ = inner.τ0 / (findim_data.opAnorm_squared * F::cast_from(μ.len())); - let iters = quadratic_nonneg(inner.method, &Ã, &g̃, α, &mut x, inner_τ, iterator); + let iters = quadratic_unconstrained(inner.method, &Ã, &g̃, α, &mut x, inner_τ, iterator); // Update masses of μ based on solution of finite-dimensional subproblem. μ.set_masses_dvector(&x); diff -r 9869fa1e0ccd -r d29d1fcf5423 src/main.rs --- a/src/main.rs Sun Dec 11 23:19:17 2022 +0200 +++ b/src/main.rs Sun Dec 11 23:25:53 2022 +0200 @@ -33,6 +33,7 @@ pub mod plot; pub mod subproblem; pub mod tolerance; +pub mod regularisation; pub mod fb; pub mod frank_wolfe; pub mod pdps; diff -r 9869fa1e0ccd -r d29d1fcf5423 src/pdps.rs --- a/src/pdps.rs Sun Dec 11 23:19:17 2022 +0200 +++ b/src/pdps.rs Sun Dec 11 23:25:53 2022 +0200 @@ -7,7 +7,7 @@ [arXiv:2212.02991](https://arxiv.org/abs/2212.02991). The main routine is [`pointsource_pdps`]. It is based on specilisatinn of -[`generic_pointsource_fb`] through relevant [`FBSpecialisation`] implementations. +[`generic_pointsource_fb_reg`] through relevant [`FBSpecialisation`] implementations. Both norm-2-squared and norm-1 data terms are supported. That is, implemented are solvers for
$$ @@ -88,8 +88,10 @@ use crate::fb::{ FBGenericConfig, FBSpecialisation, - generic_pointsource_fb + generic_pointsource_fb_reg, + RegTerm, }; +use crate::regularisation::NonnegRadonRegTerm; /// Acceleration #[derive(Clone, Copy, Eq, PartialEq, Serialize, Deserialize, ValueEnum, Debug)] @@ -154,7 +156,7 @@ } } -/// Specialisation of [`generic_pointsource_fb`] to PDPS. +/// Specialisation of [`generic_pointsource_fb_reg`] to PDPS. pub struct PDPS< 'a, F : Float + ToNalgebraRealField, @@ -302,10 +304,10 @@ /// /// Returns the final iterate. #[replace_float_literals(F::cast_from(literal))] -pub fn pointsource_pdps<'a, F, I, A, GA, 𝒟, BTA, G𝒟, S, K, D, const N : usize>( +pub fn pointsource_pdps_reg<'a, F, I, A, GA, 𝒟, BTA, G𝒟, S, K, D, Reg, const N : usize>( opA : &'a A, b : &'a A::Observable, - α : F, + reg : Reg, op𝒟 : &'a 𝒟, config : &PDPSConfig, iterator : I, @@ -328,11 +330,11 @@ S: RealMapping + LocalAnalysis, N>, K: RealMapping + LocalAnalysis, N>, BTNodeLookup: BTNode, N>, - Cube: P2Minimise, F>, PlotLookup : Plotting, DiscreteMeasure, F> : SpikeMerging, PDPS<'a, F, A, D, N> : FBSpecialisation, - D : Subdifferentiable { + D : Subdifferentiable, + Reg : RegTerm { let y = dataterm.some_subdifferential(-b); let l = opA.lipschitz_factor(&op𝒟).unwrap().sqrt(); @@ -349,8 +351,46 @@ y_prev : y.clone(), }; - generic_pointsource_fb( - opA, α, op𝒟, τ, &config.insertion, iterator, plotter, y, - pdps + generic_pointsource_fb_reg( + opA, reg, op𝒟, τ, &config.insertion, iterator, plotter, y, pdps ) -} \ No newline at end of file +} + +// +// Deprecated interfaces +// + +#[deprecated(note = "Use `pointsource_pdps_reg`")] +pub fn pointsource_pdps<'a, F, I, A, GA, 𝒟, BTA, G𝒟, S, K, D, const N : usize>( + opA : &'a A, + b : &'a A::Observable, + α : F, + op𝒟 : &'a 𝒟, + config : &PDPSConfig, + iterator : I, + plotter : SeqPlotter, + dataterm : D, +) -> DiscreteMeasure, F> +where F : Float + ToNalgebraRealField, + I : AlgIteratorFactory>, + for<'b> &'b A::Observable : std::ops::Neg + + std::ops::Add, + A::Observable : std::ops::MulAssign, + GA : SupportGenerator + Clone, + A : ForwardModel, F, PreadjointCodomain = BTFN> + + Lipschitz<𝒟, FloatType=F>, + BTA : BTSearch>, + G𝒟 : SupportGenerator + Clone, + 𝒟 : DiscreteMeasureOp, F, PreCodomain = PreBTFN>, + 𝒟::Codomain : RealMapping, + S: RealMapping + LocalAnalysis, N>, + K: RealMapping + LocalAnalysis, N>, + BTNodeLookup: BTNode, N>, + Cube: P2Minimise, F>, + PlotLookup : Plotting, + DiscreteMeasure, F> : SpikeMerging, + PDPS<'a, F, A, D, N> : FBSpecialisation, + D : Subdifferentiable { + + pointsource_pdps_reg(opA, b, NonnegRadonRegTerm(α), op𝒟, config, iterator, plotter, dataterm) +} diff -r 9869fa1e0ccd -r d29d1fcf5423 src/regularisation.rs --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/regularisation.rs Sun Dec 11 23:25:53 2022 +0200 @@ -0,0 +1,84 @@ +/*! +Regularisation terms +*/ + +use serde::{Serialize, Deserialize}; +use alg_tools::norms::Norm; +use alg_tools::linops::Apply; +use alg_tools::loc::Loc; +use crate::types::*; +use crate::measures::{ + DiscreteMeasure, + Radon +}; +#[allow(unused_imports)] // Used by documentation. +use crate::fb::generic_pointsource_fb_reg; + +/// The regularisation term $α\\|μ\\|\_{ℳ(Ω)} + δ_{≥ 0}(μ)$ for [`generic_pointsource_fb_reg`]. +/// +/// The only member of the struct is the regularisation parameter α. +#[derive(Copy, Clone, Debug, Serialize, Deserialize)] +pub struct NonnegRadonRegTerm(pub F /* α */); + +impl<'a, F : Float> NonnegRadonRegTerm { + /// Returns the regularisation parameter + pub fn α(&self) -> F { + let &NonnegRadonRegTerm(α) = self; + α + } +} + +impl<'a, F : Float, const N : usize> Apply<&'a DiscreteMeasure, F>> +for NonnegRadonRegTerm { + type Output = F; + + fn apply(&self, μ : &'a DiscreteMeasure, F>) -> F { + self.α() * μ.norm(Radon) + } +} + + +/// The regularisation term $α\|μ\|_{ℳ(Ω)}$ for [`generic_pointsource_fb_reg`]. +/// +/// The only member of the struct is the regularisation parameter α. +#[derive(Copy, Clone, Debug, Serialize, Deserialize)] +pub struct RadonRegTerm(pub F /* α */); + + +impl<'a, F : Float> RadonRegTerm { + /// Returns the regularisation parameter + pub fn α(&self) -> F { + let &RadonRegTerm(α) = self; + α + } +} + +impl<'a, F : Float, const N : usize> Apply<&'a DiscreteMeasure, F>> +for RadonRegTerm { + type Output = F; + + fn apply(&self, μ : &'a DiscreteMeasure, F>) -> F { + self.α() * μ.norm(Radon) + } +} + +/// Regularisation term configuration +#[derive(Clone, Copy, Eq, PartialEq, Serialize, Deserialize, Debug)] +pub enum Regularisation { + /// $α \\|μ\\|\_{ℳ(Ω)}$ + Radon(F), + /// $α\\|μ\\|\_{ℳ(Ω)} + δ_{≥ 0}(μ)$ + NonnegRadon(F), +} + +impl<'a, F : Float, const N : usize> Apply<&'a DiscreteMeasure, F>> +for Regularisation { + type Output = F; + + fn apply(&self, μ : &'a DiscreteMeasure, F>) -> F { + match *self { + Self::Radon(α) => RadonRegTerm(α).apply(μ), + Self::NonnegRadon(α) => NonnegRadonRegTerm(α).apply(μ), + } + } +} diff -r 9869fa1e0ccd -r d29d1fcf5423 src/run.rs --- a/src/run.rs Sun Dec 11 23:19:17 2022 +0200 +++ b/src/run.rs Sun Dec 11 23:25:53 2022 +0200 @@ -34,7 +34,7 @@ use alg_tools::mapping::RealMapping; use alg_tools::nalgebra_support::ToNalgebraRealField; use alg_tools::euclidean::Euclidean; -use alg_tools::norms::{Norm, L1}; +use alg_tools::norms::L1; use alg_tools::lingrid::lingrid; use alg_tools::sets::SetOrd; @@ -45,13 +45,14 @@ use crate::forward_model::*; use crate::fb::{ FBConfig, - pointsource_fb, - FBMetaAlgorithm, FBGenericConfig, + pointsource_fb_reg, + FBMetaAlgorithm, + FBGenericConfig, }; use crate::pdps::{ PDPSConfig, L2Squared, - pointsource_pdps, + pointsource_pdps_reg, }; use crate::frank_wolfe::{ FWConfig, @@ -65,6 +66,7 @@ use crate::plot::*; use crate::{AlgorithmOverrides, CommandLineArgs}; use crate::tolerance::Tolerance; +use crate::regularisation::{Regularisation, RadonRegTerm, NonnegRadonRegTerm}; /// Available algorithms and their configurations #[derive(Copy, Clone, Debug, Serialize, Deserialize)] @@ -276,7 +278,7 @@ /// Struct for experiment configurations #[derive(Debug, Clone, Serialize)] -pub struct Experiment +pub struct ExperimentV2 where F : Float, [usize; N] : Serialize, NoiseDistr : Distribution, @@ -300,8 +302,8 @@ pub kernel : K, /// True point sources pub μ_hat : DiscreteMeasure, F>, - /// Regularisation parameter - pub α : F, + /// Regularisation term and parameter + pub regularisation : Regularisation, /// For plotting : how wide should the kernels be plotted pub kernel_plot_width : F, /// Data term @@ -322,8 +324,12 @@ -> Named>; } +// *** macro boilerplate *** +macro_rules! impl_experiment { +($type:ident, $reg_field:ident, $reg_convert:path) => { +// *** macro *** impl RunnableExperiment for -Named> +Named<$type> where F : ClapFloat + nalgebra::RealField + ToNalgebraRealField, [usize; N] : Serialize, S : Sensor + Copy + Serialize + std::fmt::Debug, @@ -356,12 +362,14 @@ // Get experiment configuration let &Named { name : ref experiment_name, - data : Experiment { + data : $type { domain, sensor_count, ref noise_distr, sensor, spread, kernel, - ref μ_hat, α, kernel_plot_width, dataterm, noise_seed, + ref μ_hat, /*regularisation,*/ kernel_plot_width, dataterm, noise_seed, .. } } = self; + #[allow(deprecated)] + let regularisation = $reg_convert(self.data.$reg_field); println!("{}\n{}", format!("Performing experiment {}…", experiment_name).cyan(), @@ -420,7 +428,12 @@ format!("{:?}", iterator_options).bright_black(), format!("{:?}", alg).bright_black()); }; - + let not_implemented = || { + let msg = format!("Algorithm “{alg_name}” not implemented for \ + dataterm {dataterm:?} and regularisation {regularisation:?}. \ + Skipping.").red(); + eprintln!("{}", msg); + }; // Create Logger and IteratorFactory let mut logger = Logger::new(); let findim_data = prepare_optimise_weights(&opA); @@ -437,20 +450,18 @@ this_iters, .. } = data; - let post_value = match postprocessing { - None => value, - Some(mut μ) => { - match dataterm { - DataTerm::L2Squared => { - optimise_weights( - &mut μ, &opA, &b, α, &findim_data, &inner_config, - inner_it - ); - dataterm.value_at_residual(opA.apply(&μ) - &b) + α * μ.norm(Radon) - }, - _ => value, - } - } + let post_value = match (postprocessing, dataterm, regularisation) { + (Some(mut μ), DataTerm::L2Squared, Regularisation::Radon(α)) => { + // Comparison postprocessing is only implemented for the case handled + // by the FW variants. + optimise_weights( + &mut μ, &opA, &b, α, &findim_data, &inner_config, + inner_it + ); + dataterm.value_at_residual(opA.apply(&μ) - &b) + + regularisation.apply(&μ) + }, + _ => value, }; CSVLog { iter, @@ -477,30 +488,72 @@ // Run the algorithm let start = Instant::now(); let start_cpu = ProcessTime::now(); - let μ : DiscreteMeasure, F> = match (alg, dataterm) { - (AlgorithmConfig::FB(ref algconfig), DataTerm::L2Squared) => { - running(); - pointsource_fb(&opA, &b, α, &op𝒟, &algconfig, iterator, plotter) + let μ = match alg { + AlgorithmConfig::FB(ref algconfig) => { + match (regularisation, dataterm) { + (Regularisation::NonnegRadon(α), DataTerm::L2Squared) => { + running(); + pointsource_fb_reg( + &opA, &b, NonnegRadonRegTerm(α), &op𝒟, algconfig, + iterator, plotter + ) + }, + (Regularisation::Radon(α), DataTerm::L2Squared) => { + running(); + pointsource_fb_reg( + &opA, &b, RadonRegTerm(α), &op𝒟, algconfig, + iterator, plotter + ) + }, + _ => { + not_implemented(); + continue + } + } }, - (AlgorithmConfig::FW(ref algconfig), DataTerm::L2Squared) => { - running(); - pointsource_fw(&opA, &b, α, &algconfig, iterator, plotter) - }, - (AlgorithmConfig::PDPS(ref algconfig), DataTerm::L2Squared) => { + AlgorithmConfig::PDPS(ref algconfig) => { running(); - pointsource_pdps(&opA, &b, α, &op𝒟, &algconfig, iterator, plotter, L2Squared) - }, - (AlgorithmConfig::PDPS(ref algconfig), DataTerm::L1) => { - running(); - pointsource_pdps(&opA, &b, α, &op𝒟, &algconfig, iterator, plotter, L1) + match (regularisation, dataterm) { + (Regularisation::NonnegRadon(α), DataTerm::L2Squared) => { + pointsource_pdps_reg( + &opA, &b, NonnegRadonRegTerm(α), &op𝒟, algconfig, + iterator, plotter, L2Squared + ) + }, + (Regularisation::Radon(α),DataTerm::L2Squared) => { + pointsource_pdps_reg( + &opA, &b, RadonRegTerm(α), &op𝒟, algconfig, + iterator, plotter, L2Squared + ) + }, + (Regularisation::NonnegRadon(α), DataTerm::L1) => { + pointsource_pdps_reg( + &opA, &b, NonnegRadonRegTerm(α), &op𝒟, algconfig, + iterator, plotter, L1 + ) + }, + (Regularisation::Radon(α), DataTerm::L1) => { + pointsource_pdps_reg( + &opA, &b, RadonRegTerm(α), &op𝒟, algconfig, + iterator, plotter, L1 + ) + }, + } }, - _ => { - let msg = format!("Algorithm “{alg_name}” not implemented for \ - dataterm {dataterm:?}. Skipping.").red(); - eprintln!("{}", msg); - continue + AlgorithmConfig::FW(ref algconfig) => { + match (regularisation, dataterm) { + (Regularisation::Radon(α), DataTerm::L2Squared) => { + running(); + pointsource_fw(&opA, &b, α, algconfig, iterator, plotter) + }, + _ => { + not_implemented(); + continue + } + } } }; + let elapsed = start.elapsed().as_secs_f64(); let cpu_time = start_cpu.elapsed().as_secs_f64(); @@ -520,6 +573,11 @@ Ok(()) } } +// *** macro end boiler plate *** +}} +// *** actual code *** + +impl_experiment!(ExperimentV2, regularisation, std::convert::identity); /// Plot experiment setup #[replace_float_literals(F::cast_from(literal))] @@ -589,3 +647,46 @@ opA.write_observable(&b, pfx("b_noisy")) } +// +// Deprecated interface +// + +/// Struct for experiment configurations +#[derive(Debug, Clone, Serialize)] +pub struct Experiment +where F : Float, + [usize; N] : Serialize, + NoiseDistr : Distribution, + S : Sensor, + P : Spread, + K : SimpleConvolutionKernel, +{ + /// Domain $Ω$. + pub domain : Cube, + /// Number of sensors along each dimension + pub sensor_count : [usize; N], + /// Noise distribution + pub noise_distr : NoiseDistr, + /// Seed for random noise generation (for repeatable experiments) + pub noise_seed : u64, + /// Sensor $θ$; $θ * ψ$ forms the forward operator $𝒜$. + pub sensor : S, + /// Spread $ψ$; $θ * ψ$ forms the forward operator $𝒜$. + pub spread : P, + /// Kernel $ρ$ of $𝒟$. + pub kernel : K, + /// True point sources + pub μ_hat : DiscreteMeasure, F>, + /// Regularisation parameter + #[deprecated(note = "Use [`ExperimentV2`], which replaces `α` by more generic `regularisation`")] + pub α : F, + /// For plotting : how wide should the kernels be plotted + pub kernel_plot_width : F, + /// Data term + pub dataterm : DataTerm, + /// A map of default configurations for algorithms + #[serde(skip)] + pub algorithm_defaults : HashMap>, +} + +impl_experiment!(Experiment, α, Regularisation::NonnegRadon); diff -r 9869fa1e0ccd -r d29d1fcf5423 src/subproblem.rs --- a/src/subproblem.rs Sun Dec 11 23:19:17 2022 +0200 +++ b/src/subproblem.rs Sun Dec 11 23:25:53 2022 +0200 @@ -1,25 +1,27 @@ -//! Iterative algorithms for solving finite-dimensional subproblems. +/*! +Iterative algorithms for solving the finite-dimensional subproblem. +*/ use serde::{Serialize, Deserialize}; -use nalgebra::{DVector, DMatrix}; use numeric_literals::replace_float_literals; -use itertools::{izip, Itertools}; -use colored::Colorize; - -use alg_tools::iter::Mappable; -use alg_tools::error::NumericalError; use alg_tools::iterate::{ - AlgIteratorFactory, - AlgIteratorState, AlgIteratorOptions, Verbose, - Step, + }; -use alg_tools::linops::GEMV; -use alg_tools::nalgebra_support::ToNalgebraRealField; use crate::types::*; +pub mod nonneg; +pub mod unconstrained; + +#[deprecated(since = "1.0.1", note = "Moved to submodule nonneg")] +pub use nonneg::{ + quadratic_nonneg, + quadratic_nonneg_ssn, + quadratic_nonneg_fb +}; + /// Method for solving finite-dimensional subproblems #[derive(Clone, Copy, Eq, PartialEq, Serialize, Deserialize, Debug)] #[allow(dead_code)] @@ -65,309 +67,3 @@ } } } - -/// Compute the proximal operator of $x \mapsto x + \delta\_{[0, \infty)}$, i.e., -/// the non-negativity contrained soft-thresholding operator. -#[inline] -#[replace_float_literals(F::cast_from(literal))] -fn nonneg_soft_thresholding(v : F, λ : F) -> F { - (v - λ).max(0.0) -} - -/// Forward-backward splitting implementation of [`quadratic_nonneg`]. -/// For detailed documentation of the inputs and outputs, refer to there. -/// -/// The `λ` component of the model is handled in the proximal step instead of the gradient step -/// for potential performance improvements. -#[replace_float_literals(F::cast_from(literal).to_nalgebra_mixed())] -pub fn quadratic_nonneg_fb( - mA : &DMatrix, - g : &DVector, - //c_ : F, - λ_ : F, - x : &mut DVector, - τ_ : F, - iterator : I -) -> usize -where F : Float + ToNalgebraRealField, - I : AlgIteratorFactory -{ - let mut xprev = x.clone(); - //let c = c_.to_nalgebra_mixed(); - let λ = λ_.to_nalgebra_mixed(); - let τ = τ_.to_nalgebra_mixed(); - let τλ = τ * λ; - let mut v = DVector::zeros(x.len()); - let mut iters = 0; - - iterator.iterate(|state| { - // Replace `x` with $x - τ[Ax-g]= [x + τg]- τAx$ - v.copy_from(g); // v = g - v.axpy(1.0, x, τ); // v = x + τ*g - v.sygemv(-τ, mA, x, 1.0); // v = [x + τg]- τAx - let backup = state.if_verbose(|| { - xprev.copy_from(x) - }); - // Calculate the proximal map - x.iter_mut().zip(v.iter()).for_each(|(x_i, &v_i)| { - *x_i = nonneg_soft_thresholding(v_i, τλ); - }); - - iters +=1; - - backup.map(|_| { - // The subdifferential of the objective is $Ax - g + λ + ∂ δ_{≥ 0}(x)$. - // We return the minimal ∞-norm over all subderivatives. - v.copy_from(g); // d = g - mA.gemv(&mut v, 1.0, x, -1.0); // d = Ax - g - let mut val = 0.0; - for (&v_i, &x_i) in izip!(v.iter(), x.iter()) { - let d = v_i + λ; - if x_i > 0.0 || d < 0.0 { - val = val.max(d.abs()); - } - } - F::from_nalgebra_mixed(val) - }) - }); - - iters -} - -/// Semismooth Newton implementation of [`quadratic_nonneg`]. -/// -/// For detailed documentation of the inputs, refer to there. -/// This function returns the number of iterations taken if there was no inversion failure, -/// -/// ## Method derivation -/// -/// **The below may look like garbage. Sorry, but rustdoc is obsolete rubbish -/// that doesn't directly support by-now standard-in-markdown LaTeX math. Instead it -/// forces one into unreliable KaTeX autorender postprocessing andescape hell and that -/// it doesn't even process correctly.** -/// -///

-/// For the objective -/// $$ -/// J(x) = \frac{1}{2} x^⊤Ax - g^⊤ x + λ{\vec 1}^⊤ x + c + δ_{≥ 0}(x), -/// $$ -/// we have the optimality condition -/// $$ -/// x - \mathop{\mathrm{prox}}_{τλ{\vec 1}^⊤ + δ_{≥ 0}}(x - τ[Ax-g^⊤]) = 0, -/// $$ -/// which we write as -/// $$ -/// x - [G ∘ F](x)=0 -/// $$ -/// for -/// $$ -/// G(x) = \mathop{\mathrm{prox}}_{λ{\vec 1}^⊤ + δ_{≥ 0}} -/// \quad\text{and}\quad -/// F(x) = x - τ Ax + τ g^⊤ -/// $$ -/// We can use Newton derivative chain rule to compute -/// $D_N[G ∘ F](x) = D_N G(F(x)) D_N F(x)$, where -/// $D_N F(x) = \mathop{\mathrm{Id}} - τ A$, -/// and $[D_N G(F(x))]_i = 1$ for inactive coordinates and $=0$ for active coordinates. -///

-/// -///

-/// The method itself involves solving $D_N[Id - G ∘ F](x^k) s^k = - [Id - G ∘ F](x^k)$ and -/// updating $x^{k+1} = x^k + s^k$. Consequently -/// $$ -/// s^k - D_N G(F(x^k)) [s^k - τ As^k] = - x^k + [G ∘ F](x^k) -/// $$ -/// For $𝒜$ the set of active coordinates and $ℐ$ the set of inactive coordinates, this -/// expands as -/// $$ -/// [τ A_{ℐ × ℐ}]s^k_ℐ = - x^k_ℐ + [G ∘ F](x^k)_ℐ - [τ A_{ℐ × 𝒜}]s^k_𝒜 -/// $$ -/// and -/// $$ -/// s^k_𝒜 = - x^k_𝒜 + [G ∘ F](x^k)_𝒜. -/// $$ -/// Thus on $𝒜$ the update $[x^k + s^k]_𝒜 = [G ∘ F](x^k)_𝒜$ is just the forward-backward update. -///

-/// -///

-/// We need to detect stopping by a subdifferential and return $x$ satisfying $x ≥ 0$, -/// which is in general not true for the SSN. We therefore use that $[G ∘ F](x^k)$ is a valid -/// forward-backward step. -///

-#[replace_float_literals(F::cast_from(literal).to_nalgebra_mixed())] -pub fn quadratic_nonneg_ssn( - mA : &DMatrix, - g : &DVector, - //c_ : F, - λ_ : F, - x : &mut DVector, - τ_ : F, - iterator : I -) -> Result -where F : Float + ToNalgebraRealField, - I : AlgIteratorFactory -{ - let n = x.len(); - let mut xprev = x.clone(); - let mut v = DVector::zeros(n); - //let c = c_.to_nalgebra_mixed(); - let λ = λ_.to_nalgebra_mixed(); - let τ = τ_.to_nalgebra_mixed(); - let τλ = τ * λ; - let mut inact : Vec = Vec::from_iter(std::iter::repeat(false).take(n)); - let mut s = DVector::zeros(0); - let mut decomp = nalgebra::linalg::LU::new(DMatrix::zeros(0, 0)); - let mut iters = 0; - - let res = iterator.iterate_fallible(|state| { - // 1. Perform delayed SSN-update based on previously computed step on active - // coordinates. The step is delayed to the beginning of the loop because - // the SSN step may violate constraints, so we arrange `x` to contain at the - // end of the loop the valid FB step that forms part of the SSN step - let mut si = s.iter(); - for (&ast, x_i, xprev_i) in izip!(inact.iter(), x.iter_mut(), xprev.iter_mut()) { - if ast { - *x_i = *xprev_i + *si.next().unwrap() - } - *xprev_i = *x_i; - } - - //xprev.copy_from(x); - - // 2. Calculate FB step. - // 2.1. Replace `x` with $x⁻ - τ[Ax⁻-g]= [x⁻ + τg]- τAx⁻$ - x.axpy(τ, g, 1.0); // x = x⁻ + τ*g - x.sygemv(-τ, mA, &xprev, 1.0); // x = [x⁻ + τg]- τAx⁻ - // 2.2. Calculate prox and set of active coordinates at the same time - let mut act_changed = false; - let mut n_inact = 0; - for (x_i, ast) in izip!(x.iter_mut(), inact.iter_mut()) { - if *x_i > τλ { - *x_i -= τλ; - if !*ast { - act_changed = true; - *ast = true; - } - n_inact += 1; - } else { - *x_i = 0.0; - if *ast { - act_changed = true; - *ast = false; - } - } - } - - // *** x now contains forward-backward step *** - - // 3. Solve SSN step `s`. - // 3.1 Construct [τ A_{ℐ × ℐ}] if the set of inactive coordinates has changed. - if act_changed { - let decomp_iter = inact.iter().cartesian_product(inact.iter()).zip(mA.iter()); - let decomp_constr = decomp_iter.filter_map(|((&i_inact, &j_inact), &mAij)| { - //(i_inact && j_inact).then_some(mAij * τ) - (i_inact && j_inact).then_some(mAij) // 🔺 below matches removal of τ - }); - let mat = DMatrix::from_iterator(n_inact, n_inact, decomp_constr); - decomp = nalgebra::linalg::LU::new(mat); - } - - // 3.2 Solve `s` = $s_ℐ^k$ from - // $[τ A_{ℐ × ℐ}]s^k_ℐ = - x^k_ℐ + [G ∘ F](x^k)_ℐ - [τ A_{ℐ × 𝒜}]s^k_𝒜$. - // With current variable setup we have $[G ∘ F](x^k) = $`x` and $x^k = x⁻$ = `xprev`, - // so the system to solve is $[τ A_{ℐ × ℐ}]s^k_ℐ = (x-x⁻)_ℐ - [τ A_{ℐ × 𝒜}](x-x⁻)_𝒜$ - // The matrix $[τ A_{ℐ × ℐ}]$ we have already LU-decomposed above into `decomp`. - s = if n_inact > 0 { - // 3.2.1 Construct `rhs` = $(x-x⁻)_ℐ - [τ A_{ℐ × 𝒜}](x-x⁻)_𝒜$ - let inactfilt = inact.iter().copied(); - let rhs_iter = izip!(x.iter(), xprev.iter(), mA.row_iter()).filter_zip(inactfilt); - let rhs_constr = rhs_iter.map(|(&x_i, &xprev_i, mAi)| { - // Calculate row i of [τ A_{ℐ × 𝒜}]s^k_𝒜 = [τ A_{ℐ × 𝒜}](x-xprev)_𝒜 - let actfilt = inact.iter().copied().map(std::ops::Not::not); - let actit = izip!(x.iter(), xprev.iter(), mAi.iter()).filter_zip(actfilt); - let actpart = actit.map(|(&x_j, &xprev_j, &mAij)| { - mAij * (x_j - xprev_j) - }).sum(); - // Subtract it from [x-prev]_i - //x_i - xprev_i - τ * actpart - (x_i - xprev_i) / τ - actpart // 🔺 change matches removal of τ above - }); - let mut rhs = DVector::from_iterator(n_inact, rhs_constr); - assert_eq!(rhs.len(), n_inact); - // Solve the system - if !decomp.solve_mut(&mut rhs) { - return Step::Failure(NumericalError( - "Failed to solve linear system for subproblem SSN." - )) - } - rhs - } else { - DVector::zeros(0) - }; - - iters += 1; - - // 4. Report solution quality - state.if_verbose(|| { - // Calculate subdifferential at the FB step `x` that hasn't yet had `s` yet added. - // The subdifferential of the objective is $Ax - g + λ + ∂ δ_{≥ 0}(x)$. - // We return the minimal ∞-norm over all subderivatives. - v.copy_from(g); // d = g - mA.gemv(&mut v, 1.0, x, -1.0); // d = Ax - g - let mut val = 0.0; - for (&v_i, &x_i) in izip!(v.iter(), x.iter()) { - let d = v_i + λ; - if x_i > 0.0 || d < 0.0 { - val = val.max(d.abs()); - } - } - F::from_nalgebra_mixed(val) - }) - }); - - res.map(|_| iters) -} - -/// This function applies an iterative method for the solution of the quadratic non-negativity -/// constrained problem -///
$$ -/// \min_{x ∈ ℝ^n} \frac{1}{2} x^⊤Ax - g^⊤ x + λ{\vec 1}^⊤ x + c + δ_{≥ 0}(x). -/// $$
-/// Semismooth Newton or forward-backward are supported based on the setting in `method`. -/// The parameter `mA` is matrix $A$, and `g` and `λ` are as in the mathematical formulation. -/// The constant $c$ does not need to be provided. The step length parameter is `τ` while -/// `x` contains the initial iterate and on return the final one. The `iterator` controls -/// stopping. The “verbose” value output by all methods is the $ℓ\_∞$ distance of some -/// subdifferential of the objective to zero. -/// -/// Interior point methods could offer a further alternative, for example, the one in: -/// -/// * Valkonen T. - _A method for weighted projections to the positive definite -/// cone_, . -/// -/// This function returns the number of iterations taken. -pub fn quadratic_nonneg( - method : InnerMethod, - mA : &DMatrix, - g : &DVector, - //c_ : F, - λ : F, - x : &mut DVector, - τ : F, - iterator : I -) -> usize -where F : Float + ToNalgebraRealField, - I : AlgIteratorFactory -{ - - match method { - InnerMethod::FB => - quadratic_nonneg_fb(mA, g, λ, x, τ, iterator), - InnerMethod::SSN => - quadratic_nonneg_ssn(mA, g, λ, x, τ, iterator).unwrap_or_else(|e| { - println!("{}", format!("{e}. Using FB fallback.").red()); - let ins = InnerSettings::::default(); - quadratic_nonneg_fb(mA, g, λ, x, τ, ins.iterator_options) - }) - } -} diff -r 9869fa1e0ccd -r d29d1fcf5423 src/subproblem/nonneg.rs --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/subproblem/nonneg.rs Sun Dec 11 23:25:53 2022 +0200 @@ -0,0 +1,333 @@ +/*! +Iterative algorithms for solving the finite-dimensional subproblem with non-negativity constraints. +*/ + +use nalgebra::{DVector, DMatrix}; +use numeric_literals::replace_float_literals; +use itertools::{izip, Itertools}; +use colored::Colorize; + +use alg_tools::iter::Mappable; +use alg_tools::error::NumericalError; +use alg_tools::iterate::{ + AlgIteratorFactory, + AlgIteratorState, + Step, +}; +use alg_tools::linops::GEMV; +use alg_tools::nalgebra_support::ToNalgebraRealField; + +use crate::types::*; +use super::{ + InnerMethod, + InnerSettings +}; + +/// Compute the proximal operator of $x \mapsto x + \delta\_{[0, \infty)}$, i.e., +/// the non-negativity contrained soft-thresholding operator. +#[inline] +#[replace_float_literals(F::cast_from(literal))] +fn nonneg_soft_thresholding(v : F, λ : F) -> F { + (v - λ).max(0.0) +} + +/// Returns the ∞-norm minimal subdifferential of $x ↦ x^⊤Ax - g^⊤ x + λ\vec 1^⊤ x δ_{≥ 0}(x)$ +/// at $x$. +/// +/// `v` will be modified and cannot be trusted to contain useful values afterwards. +#[replace_float_literals(F::cast_from(literal).to_nalgebra_mixed())] +fn min_subdifferential( + v : &mut DVector, + mA : &DMatrix, + x : &DVector, + g : &DVector, + λ : F::MixedType +) -> F { + v.copy_from(g); + mA.gemv(v, 1.0, x, -1.0); // v = Ax - g + let mut val = 0.0; + for (&v_i, &x_i) in izip!(v.iter(), x.iter()) { + // The subdifferential of the objective is $Ax - g + λ + ∂ δ_{≥ 0}(x)$. + let d = v_i + λ; + if x_i > 0.0 || d < 0.0 { + val = val.max(d.abs()); + } + } + F::from_nalgebra_mixed(val) +} + +/// Forward-backward splitting implementation of [`quadratic_nonneg`]. +/// For detailed documentation of the inputs and outputs, refer to there. +/// +/// The `λ` component of the model is handled in the proximal step instead of the gradient step +/// for potential performance improvements. +#[replace_float_literals(F::cast_from(literal).to_nalgebra_mixed())] +pub fn quadratic_nonneg_fb( + mA : &DMatrix, + g : &DVector, + //c_ : F, + λ_ : F, + x : &mut DVector, + τ_ : F, + iterator : I +) -> usize +where F : Float + ToNalgebraRealField, + I : AlgIteratorFactory +{ + let mut xprev = x.clone(); + //let c = c_.to_nalgebra_mixed(); + let λ = λ_.to_nalgebra_mixed(); + let τ = τ_.to_nalgebra_mixed(); + let τλ = τ * λ; + let mut v = DVector::zeros(x.len()); + let mut iters = 0; + + iterator.iterate(|state| { + // Replace `x` with $x - τ[Ax-g]= [x + τg]- τAx$ + v.copy_from(g); // v = g + v.axpy(1.0, x, τ); // v = x + τ*g + v.sygemv(-τ, mA, x, 1.0); // v = [x + τg]- τAx + let backup = state.if_verbose(|| { + xprev.copy_from(x) + }); + // Calculate the proximal map + x.iter_mut().zip(v.iter()).for_each(|(x_i, &v_i)| { + *x_i = nonneg_soft_thresholding(v_i, τλ); + }); + + iters +=1; + + backup.map(|_| { + min_subdifferential(&mut v, mA, x, g, λ) + }) + }); + + iters +} + +/// Semismooth Newton implementation of [`quadratic_nonneg`]. +/// +/// For detailed documentation of the inputs, refer to there. +/// This function returns the number of iterations taken if there was no inversion failure, +/// +/// ## Method derivation +/// +/// **The below may look like garbage. Sorry, but rustdoc is obsolete rubbish +/// that doesn't directly support by-now standard-in-markdown LaTeX math. Instead it +/// forces one into unreliable KaTeX autorender postprocessing andescape hell and that +/// it doesn't even process correctly.** +/// +///

+/// For the objective +/// $$ +/// J(x) = \frac{1}{2} x^⊤Ax - g^⊤ x + λ{\vec 1}^⊤ x + c + δ_{≥ 0}(x), +/// $$ +/// we have the optimality condition +/// $$ +/// x - \mathop{\mathrm{prox}}_{τλ{\vec 1}^⊤ + δ_{≥ 0}}(x - τ[Ax-g^⊤]) = 0, +/// $$ +/// which we write as +/// $$ +/// x - [G ∘ F](x)=0 +/// $$ +/// for +/// $$ +/// G(x) = \mathop{\mathrm{prox}}_{λ{\vec 1}^⊤ + δ_{≥ 0}} +/// \quad\text{and}\quad +/// F(x) = x - τ Ax + τ g^⊤ +/// $$ +/// We can use Newton derivative chain rule to compute +/// $D_N[G ∘ F](x) = D_N G(F(x)) D_N F(x)$, where +/// $D_N F(x) = \mathop{\mathrm{Id}} - τ A$, +/// and $[D_N G(F(x))]_i = 1$ for inactive coordinates and $=0$ for active coordinates. +///

+/// +///

+/// The method itself involves solving $D_N[Id - G ∘ F](x^k) s^k = - [Id - G ∘ F](x^k)$ and +/// updating $x^{k+1} = x^k + s^k$. Consequently +/// $$ +/// s^k - D_N G(F(x^k)) [s^k - τ As^k] = - x^k + [G ∘ F](x^k) +/// $$ +/// For $𝒜$ the set of active coordinates and $ℐ$ the set of inactive coordinates, this +/// expands as +/// $$ +/// [τ A_{ℐ × ℐ}]s^k_ℐ = - x^k_ℐ + [G ∘ F](x^k)_ℐ - [τ A_{ℐ × 𝒜}]s^k_𝒜 +/// $$ +/// and +/// $$ +/// s^k_𝒜 = - x^k_𝒜 + [G ∘ F](x^k)_𝒜. +/// $$ +/// Thus on $𝒜$ the update $[x^k + s^k]_𝒜 = [G ∘ F](x^k)_𝒜$ is just the forward-backward update. +///

+/// +///

+/// We need to detect stopping by a subdifferential and return $x$ satisfying $x ≥ 0$, +/// which is in general not true for the SSN. We therefore use that $[G ∘ F](x^k)$ is a valid +/// forward-backward step. +///

+#[replace_float_literals(F::cast_from(literal).to_nalgebra_mixed())] +pub fn quadratic_nonneg_ssn( + mA : &DMatrix, + g : &DVector, + //c_ : F, + λ_ : F, + x : &mut DVector, + τ_ : F, + iterator : I +) -> Result +where F : Float + ToNalgebraRealField, + I : AlgIteratorFactory +{ + let n = x.len(); + let mut xprev = x.clone(); + let mut v = DVector::zeros(n); + //let c = c_.to_nalgebra_mixed(); + let λ = λ_.to_nalgebra_mixed(); + let τ = τ_.to_nalgebra_mixed(); + let τλ = τ * λ; + let mut inact : Vec = Vec::from_iter(std::iter::repeat(false).take(n)); + let mut s = DVector::zeros(0); + let mut decomp = nalgebra::linalg::LU::new(DMatrix::zeros(0, 0)); + let mut iters = 0; + + let res = iterator.iterate_fallible(|state| { + // 1. Perform delayed SSN-update based on previously computed step on active + // coordinates. The step is delayed to the beginning of the loop because + // the SSN step may violate constraints, so we arrange `x` to contain at the + // end of the loop the valid FB step that forms part of the SSN step + let mut si = s.iter(); + for (&ast, x_i, xprev_i) in izip!(inact.iter(), x.iter_mut(), xprev.iter_mut()) { + if ast { + *x_i = *xprev_i + *si.next().unwrap() + } + *xprev_i = *x_i; + } + + //xprev.copy_from(x); + + // 2. Calculate FB step. + // 2.1. Replace `x` with $x⁻ - τ[Ax⁻-g]= [x⁻ + τg]- τAx⁻$ + x.axpy(τ, g, 1.0); // x = x⁻ + τ*g + x.sygemv(-τ, mA, &xprev, 1.0); // x = [x⁻ + τg]- τAx⁻ + // 2.2. Calculate prox and set of active coordinates at the same time + let mut act_changed = false; + let mut n_inact = 0; + for (x_i, ast) in izip!(x.iter_mut(), inact.iter_mut()) { + if *x_i > τλ { + *x_i -= τλ; + if !*ast { + act_changed = true; + *ast = true; + } + n_inact += 1; + } else { + *x_i = 0.0; + if *ast { + act_changed = true; + *ast = false; + } + } + } + + // *** x now contains forward-backward step *** + + // 3. Solve SSN step `s`. + // 3.1 Construct [τ A_{ℐ × ℐ}] if the set of inactive coordinates has changed. + if act_changed { + let decomp_iter = inact.iter().cartesian_product(inact.iter()).zip(mA.iter()); + let decomp_constr = decomp_iter.filter_map(|((&i_inact, &j_inact), &mAij)| { + //(i_inact && j_inact).then_some(mAij * τ) + (i_inact && j_inact).then_some(mAij) // 🔺 below matches removal of τ + }); + let mat = DMatrix::from_iterator(n_inact, n_inact, decomp_constr); + decomp = nalgebra::linalg::LU::new(mat); + } + + // 3.2 Solve `s` = $s_ℐ^k$ from + // $[τ A_{ℐ × ℐ}]s^k_ℐ = - x^k_ℐ + [G ∘ F](x^k)_ℐ - [τ A_{ℐ × 𝒜}]s^k_𝒜$. + // With current variable setup we have $[G ∘ F](x^k) = $`x` and $x^k = x⁻$ = `xprev`, + // so the system to solve is $[τ A_{ℐ × ℐ}]s^k_ℐ = (x-x⁻)_ℐ - [τ A_{ℐ × 𝒜}](x-x⁻)_𝒜$ + // The matrix $[τ A_{ℐ × ℐ}]$ we have already LU-decomposed above into `decomp`. + s = if n_inact > 0 { + // 3.2.1 Construct `rhs` = $(x-x⁻)_ℐ - [τ A_{ℐ × 𝒜}](x-x⁻)_𝒜$ + let inactfilt = inact.iter().copied(); + let rhs_iter = izip!(x.iter(), xprev.iter(), mA.row_iter()).filter_zip(inactfilt); + let rhs_constr = rhs_iter.map(|(&x_i, &xprev_i, mAi)| { + // Calculate row i of [τ A_{ℐ × 𝒜}]s^k_𝒜 = [τ A_{ℐ × 𝒜}](x-xprev)_𝒜 + let actfilt = inact.iter().copied().map(std::ops::Not::not); + let actit = izip!(x.iter(), xprev.iter(), mAi.iter()).filter_zip(actfilt); + let actpart = actit.map(|(&x_j, &xprev_j, &mAij)| { + mAij * (x_j - xprev_j) + }).sum(); + // Subtract it from [x-prev]_i + //x_i - xprev_i - τ * actpart + (x_i - xprev_i) / τ - actpart // 🔺 change matches removal of τ above + }); + let mut rhs = DVector::from_iterator(n_inact, rhs_constr); + assert_eq!(rhs.len(), n_inact); + // Solve the system + if !decomp.solve_mut(&mut rhs) { + return Step::Failure(NumericalError( + "Failed to solve linear system for subproblem SSN." + )) + } + rhs + } else { + DVector::zeros(0) + }; + + iters += 1; + + // 4. Report solution quality + state.if_verbose(|| { + // Calculate subdifferential at the FB step `x` that hasn't yet had `s` yet added. + min_subdifferential(&mut v, mA, x, g, λ) + }) + }); + + res.map(|_| iters) +} + +/// This function applies an iterative method for the solution of the quadratic non-negativity +/// constrained problem +///
$$ +/// \min_{x ∈ ℝ^n} \frac{1}{2} x^⊤Ax - g^⊤ x + λ{\vec 1}^⊤ x + c + δ_{≥ 0}(x). +/// $$
+/// Semismooth Newton or forward-backward are supported based on the setting in `method`. +/// The parameter `mA` is matrix $A$, and `g` and `λ` are as in the mathematical formulation. +/// The constant $c$ does not need to be provided. The step length parameter is `τ` while +/// `x` contains the initial iterate and on return the final one. The `iterator` controls +/// stopping. The “verbose” value output by all methods is the $ℓ\_∞$ distance of some +/// subdifferential of the objective to zero. +/// +/// Interior point methods could offer a further alternative, for example, the one in: +/// +/// * Valkonen T. - _A method for weighted projections to the positive definite +/// cone_, . +/// +/// This function returns the number of iterations taken. +pub fn quadratic_nonneg( + method : InnerMethod, + mA : &DMatrix, + g : &DVector, + //c_ : F, + λ : F, + x : &mut DVector, + τ : F, + iterator : I +) -> usize +where F : Float + ToNalgebraRealField, + I : AlgIteratorFactory +{ + + match method { + InnerMethod::FB => + quadratic_nonneg_fb(mA, g, λ, x, τ, iterator), + InnerMethod::SSN => + quadratic_nonneg_ssn(mA, g, λ, x, τ, iterator).unwrap_or_else(|e| { + println!("{}", format!("{e}. Using FB fallback.").red()); + let ins = InnerSettings::::default(); + quadratic_nonneg_fb(mA, g, λ, x, τ, ins.iterator_options) + }) + } +} diff -r 9869fa1e0ccd -r d29d1fcf5423 src/subproblem/unconstrained.rs --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/src/subproblem/unconstrained.rs Sun Dec 11 23:25:53 2022 +0200 @@ -0,0 +1,288 @@ +/*! +Iterative algorithms for solving the finite-dimensional subproblem without constraints. +*/ + +use nalgebra::{DVector, DMatrix}; +use numeric_literals::replace_float_literals; +use itertools::{izip, Itertools}; +use colored::Colorize; +use std::cmp::Ordering::*; + +use alg_tools::iter::Mappable; +use alg_tools::error::NumericalError; +use alg_tools::iterate::{ + AlgIteratorFactory, + AlgIteratorState, + Step, +}; +use alg_tools::linops::GEMV; +use alg_tools::nalgebra_support::ToNalgebraRealField; + +use crate::types::*; +use super::{ + InnerMethod, + InnerSettings +}; + +/// Compute the proximal operator of $x \mapsto |x|$, i.e., the soft-thresholding operator. +#[inline] +#[replace_float_literals(F::cast_from(literal))] +fn soft_thresholding(v : F, λ : F) -> F { + if v > λ { + v - λ + } else if v < -λ { + v + λ + } else { + 0.0 + } +} + +/// Returns the ∞-norm minimal subdifferential of $x ↦ x^⊤Ax - g^⊤ x + λ\|x\|₁$ at $x$. +/// +/// `v` will be modified and cannot be trusted to contain useful values afterwards. +#[replace_float_literals(F::cast_from(literal).to_nalgebra_mixed())] +fn min_subdifferential( + v : &mut DVector, + mA : &DMatrix, + x : &DVector, + g : &DVector, + λ : F::MixedType +) -> F { + v.copy_from(g); + mA.gemv(v, 1.0, x, -1.0); // v = Ax - g + let mut val = 0.0; + for (&v_i, &x_i) in izip!(v.iter(), x.iter()) { + // The subdifferential at x is $Ax - g + λ ∂‖·‖₁(x)$. + val = val.max(match x_i.partial_cmp(&0.0) { + Some(Greater) => v_i + λ, + Some(Less) => v_i - λ, + Some(Equal) => soft_thresholding(v_i, λ), + None => F::MixedType::nan(), + }) + } + F::from_nalgebra_mixed(val) +} + + +/// Forward-backward splitting implementation of [`quadratic_unconstrained`]. +/// For detailed documentation of the inputs and outputs, refer to there. +/// +/// The `λ` component of the model is handled in the proximal step instead of the gradient step +/// for potential performance improvements. +#[replace_float_literals(F::cast_from(literal).to_nalgebra_mixed())] +pub fn quadratic_unconstrained_fb( + mA : &DMatrix, + g : &DVector, + //c_ : F, + λ_ : F, + x : &mut DVector, + τ_ : F, + iterator : I +) -> usize +where F : Float + ToNalgebraRealField, + I : AlgIteratorFactory +{ + let mut xprev = x.clone(); + //let c = c_.to_nalgebra_mixed(); + let λ = λ_.to_nalgebra_mixed(); + let τ = τ_.to_nalgebra_mixed(); + let τλ = τ * λ; + let mut v = DVector::zeros(x.len()); + let mut iters = 0; + + iterator.iterate(|state| { + // Replace `x` with $x - τ[Ax-g]= [x + τg]- τAx$ + v.copy_from(g); // v = g + v.axpy(1.0, x, τ); // v = x + τ*g + v.sygemv(-τ, mA, x, 1.0); // v = [x + τg]- τAx + let backup = state.if_verbose(|| { + xprev.copy_from(x) + }); + // Calculate the proximal map + x.iter_mut().zip(v.iter()).for_each(|(x_i, &v_i)| { + *x_i = soft_thresholding(v_i, τλ); + }); + + iters +=1; + + backup.map(|_| { + min_subdifferential(&mut v, mA, x, g, λ) + }) + }); + + iters +} + +/// Semismooth Newton implementation of [`quadratic_unconstrained`]. +/// +/// For detailed documentation of the inputs, refer to there. +/// This function returns the number of iterations taken if there was no inversion failure, +/// +/// For method derivarion, see the documentation for [`super::nonneg::quadratic_nonneg`]. +#[replace_float_literals(F::cast_from(literal).to_nalgebra_mixed())] +pub fn quadratic_unconstrained_ssn( + mA : &DMatrix, + g : &DVector, + //c_ : F, + λ_ : F, + x : &mut DVector, + τ_ : F, + iterator : I +) -> Result +where F : Float + ToNalgebraRealField, + I : AlgIteratorFactory +{ + let n = x.len(); + let mut xprev = x.clone(); + let mut v = DVector::zeros(n); + //let c = c_.to_nalgebra_mixed(); + let λ = λ_.to_nalgebra_mixed(); + let τ = τ_.to_nalgebra_mixed(); + let τλ = τ * λ; + let mut inact : Vec = Vec::from_iter(std::iter::repeat(false).take(n)); + let mut s = DVector::zeros(0); + let mut decomp = nalgebra::linalg::LU::new(DMatrix::zeros(0, 0)); + let mut iters = 0; + + let res = iterator.iterate_fallible(|state| { + // 1. Perform delayed SSN-update based on previously computed step on active + // coordinates. The step is delayed to the beginning of the loop because + // the SSN step may violate constraints, so we arrange `x` to contain at the + // end of the loop the valid FB step that forms part of the SSN step + let mut si = s.iter(); + for (&ast, x_i, xprev_i) in izip!(inact.iter(), x.iter_mut(), xprev.iter_mut()) { + if ast { + *x_i = *xprev_i + *si.next().unwrap() + } + *xprev_i = *x_i; + } + + //xprev.copy_from(x); + + // 2. Calculate FB step. + // 2.1. Replace `x` with $x⁻ - τ[Ax⁻-g]= [x⁻ + τg]- τAx⁻$ + x.axpy(τ, g, 1.0); // x = x⁻ + τ*g + x.sygemv(-τ, mA, &xprev, 1.0); // x = [x⁻ + τg]- τAx⁻ + // 2.2. Calculate prox and set of active coordinates at the same time + let mut act_changed = false; + let mut n_inact = 0; + for (x_i, ast) in izip!(x.iter_mut(), inact.iter_mut()) { + if *x_i > τλ { + *x_i -= τλ; + if !*ast { + act_changed = true; + *ast = true; + } + n_inact += 1; + } else if *x_i < -τλ { + *x_i += τλ; + if !*ast { + act_changed = true; + *ast = true; + } + n_inact += 1; + } else { + *x_i = 0.0; + if *ast { + act_changed = true; + *ast = false; + } + } + } + + // *** x now contains forward-backward step *** + + // 3. Solve SSN step `s`. + // 3.1 Construct [τ A_{ℐ × ℐ}] if the set of inactive coordinates has changed. + if act_changed { + let decomp_iter = inact.iter().cartesian_product(inact.iter()).zip(mA.iter()); + let decomp_constr = decomp_iter.filter_map(|((&i_inact, &j_inact), &mAij)| { + //(i_inact && j_inact).then_some(mAij * τ) + (i_inact && j_inact).then_some(mAij) // 🔺 below matches removal of τ + }); + let mat = DMatrix::from_iterator(n_inact, n_inact, decomp_constr); + decomp = nalgebra::linalg::LU::new(mat); + } + + // 3.2 Solve `s` = $s_ℐ^k$ from + // $[τ A_{ℐ × ℐ}]s^k_ℐ = - x^k_ℐ + [G ∘ F](x^k)_ℐ - [τ A_{ℐ × 𝒜}]s^k_𝒜$. + // With current variable setup we have $[G ∘ F](x^k) = $`x` and $x^k = x⁻$ = `xprev`, + // so the system to solve is $[τ A_{ℐ × ℐ}]s^k_ℐ = (x-x⁻)_ℐ - [τ A_{ℐ × 𝒜}](x-x⁻)_𝒜$ + // The matrix $[τ A_{ℐ × ℐ}]$ we have already LU-decomposed above into `decomp`. + s = if n_inact > 0 { + // 3.2.1 Construct `rhs` = $(x-x⁻)_ℐ - [τ A_{ℐ × 𝒜}](x-x⁻)_𝒜$ + let inactfilt = inact.iter().copied(); + let rhs_iter = izip!(x.iter(), xprev.iter(), mA.row_iter()).filter_zip(inactfilt); + let rhs_constr = rhs_iter.map(|(&x_i, &xprev_i, mAi)| { + // Calculate row i of [τ A_{ℐ × 𝒜}]s^k_𝒜 = [τ A_{ℐ × 𝒜}](x-xprev)_𝒜 + let actfilt = inact.iter().copied().map(std::ops::Not::not); + let actit = izip!(x.iter(), xprev.iter(), mAi.iter()).filter_zip(actfilt); + let actpart = actit.map(|(&x_j, &xprev_j, &mAij)| { + mAij * (x_j - xprev_j) + }).sum(); + // Subtract it from [x-prev]_i + //x_i - xprev_i - τ * actpart + (x_i - xprev_i) / τ - actpart // 🔺 change matches removal of τ above + }); + let mut rhs = DVector::from_iterator(n_inact, rhs_constr); + assert_eq!(rhs.len(), n_inact); + // Solve the system + if !decomp.solve_mut(&mut rhs) { + return Step::Failure(NumericalError( + "Failed to solve linear system for subproblem SSN." + )) + } + rhs + } else { + DVector::zeros(0) + }; + + iters += 1; + + // 4. Report solution quality + state.if_verbose(|| { + // Calculate subdifferential at the FB step `x` that hasn't yet had `s` yet added. + min_subdifferential(&mut v, mA, x, g, λ) + }) + }); + + res.map(|_| iters) +} + +/// This function applies an iterative method for the solution of the problem +///
$$ +/// \min_{x ∈ ℝ^n} \frac{1}{2} x^⊤Ax - g^⊤ x + λ\|x\|₁ + c. +/// $$
+/// Semismooth Newton or forward-backward are supported based on the setting in `method`. +/// The parameter `mA` is matrix $A$, and `g` and `λ` are as in the mathematical formulation. +/// The constant $c$ does not need to be provided. The step length parameter is `τ` while +/// `x` contains the initial iterate and on return the final one. The `iterator` controls +/// stopping. The “verbose” value output by all methods is the $ℓ\_∞$ distance of some +/// subdifferential of the objective to zero. +/// +/// This function returns the number of iterations taken. +pub fn quadratic_unconstrained( + method : InnerMethod, + mA : &DMatrix, + g : &DVector, + //c_ : F, + λ : F, + x : &mut DVector, + τ : F, + iterator : I +) -> usize +where F : Float + ToNalgebraRealField, + I : AlgIteratorFactory +{ + + match method { + InnerMethod::FB => + quadratic_unconstrained_fb(mA, g, λ, x, τ, iterator), + InnerMethod::SSN => + quadratic_unconstrained_ssn(mA, g, λ, x, τ, iterator).unwrap_or_else(|e| { + println!("{}", format!("{e}. Using FB fallback.").red()); + let ins = InnerSettings::::default(); + quadratic_unconstrained_fb(mA, g, λ, x, τ, ins.iterator_options) + }) + } +} diff -r 9869fa1e0ccd -r d29d1fcf5423 src/types.rs --- a/src/types.rs Sun Dec 11 23:19:17 2022 +0200 +++ b/src/types.rs Sun Dec 11 23:25:53 2022 +0200 @@ -95,4 +95,3 @@ } } } -