Curious whether anyone has thoughts on the following question about model design in Python.
Suppose I have an input to a numerical model that could be either a single float or an array corresponding to locations on a grid. Hydraulic roughness (e.g., Manning's n) could be an example.
I would like the model user to have two options for input variables:
(1) read them from an input file, or
(2) create them in a Python script or session and pass them to the model as items in a dictionary.
For #2, handling is straightforward: create either a float or an array of floats and include it as an item in your parameter dictionary. However, type hints become harder to use because there are two valid types.
For #1, it makes sense to have “small” variables (like a single float) inside a text-based input file (say, in YAML format). But “large” variables (like an array of roughness values) would be better off in a separate file.
So, one way to handle #1 would be to give EITHER a float value OR a string that names a separate file containing the desired array values. PRO: simple and convenient; the model code detects whether the value is float or str and acts accordingly. CON: again, harder to use type hints because now you’re expecting any of three different variable types (a float, an array of float, or a string naming a separate input file).
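To make the "float OR file name" idea concrete, here is a rough sketch of how the detection could look with a type hint covering all three cases. The names `GridParam` and `resolve_param` are made up for illustration, and it assumes the separate file is plain text that `np.loadtxt` can read; a real model might use a different file format.

```python
from typing import Union

import numpy as np

# Hypothetical alias for the three accepted forms of a gridded parameter:
# a single float, an array of floats, or a string naming a file of values.
GridParam = Union[float, np.ndarray, str]


def resolve_param(value: GridParam) -> Union[float, np.ndarray]:
    """Return a float or an array, loading from file if given a file name."""
    if isinstance(value, str):
        # Assumption: whitespace-delimited text file readable by np.loadtxt.
        return np.loadtxt(value)
    if isinstance(value, np.ndarray):
        return value.astype(float)
    # Anything else is treated as a scalar (e.g. an int or float from YAML).
    return float(value)
```

With this, `resolve_param(0.03)` gives back a float, while `resolve_param("roughness.txt")` (a hypothetical file name) would load an array. The union type is a bit wider than a single type, but the hint still documents exactly what the model accepts.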
Any thoughts on pros vs. cons of the “clean-ness” of sticking to single types versus the convenience of dynamic typing?
You can also make sure that within your model everything consists of arrays and use a single-element array to represent the 1D model. That way the variable types are only on the outside interface, but the typing inside your model is simple and consistent.
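A minimal sketch of that "arrays everywhere inside" idea, assuming NumPy. The function names `as_array` and `as_grid_array` are hypothetical, and whether you keep a single-element array or expand the scalar to the full grid is a design choice for the model.

```python
import numpy as np


def as_array(value):
    """Convert a scalar or array input to a float array with at least 1 dimension."""
    return np.atleast_1d(np.asarray(value, dtype=float))


def as_grid_array(value, shape):
    """Expand a scalar (or compatible array) to a writable array of the grid shape.

    Raises ValueError if the input cannot be broadcast to `shape`.
    """
    arr = np.asarray(value, dtype=float)
    # broadcast_to returns a read-only view, so copy to get a normal array.
    return np.broadcast_to(arr, shape).copy()
```

The conversion happens once at the model boundary, so the internal code can be annotated with `np.ndarray` only, even though the user-facing input accepts several types.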
I agree, I think your approach is good and having different possible input types is probably a good thing.
@BSchilperoort - Have you ever tried Schema for validating data? I tend to use this but, to be honest, I find it really clunky to use. I was wondering if you had experience using both and could comment. At a quick glance, they look kind of similar (though the Pydantic docs are much better and maybe it’s more broadly used).
There is a small upside of Schema though: it only uses base Python and is only a single 1k-line .py file. Pydantic’s core is written in Rust nowadays, making it fast, but you do need the 2MB binary wheels (although these are available for basically any system, including WebAssembly Python).