Thursday, August 13, 2015

Troubleshooting models that don’t run at all

by Laura Condon
Keywords: watershed model, getting started, troubleshooting

Manual Location:
Little Washita annotated example: 3.6.2
Running ParFlow: 3.2

Common problems that could cause your model not to run:
This post lists some common reasons why a model might not run at all. It is by no means an exhaustive list, but it’s a good place to start. You can tell that your model isn’t actually running if it exits before it starts solving anything and you don’t get a kinsol.log file in your outputs. In general, this means that you have somehow misspecified your inputs.

If this is the case, you might get a ‘segmentation fault’ error message, or there could be other errors reported in the out.txt file.
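The check described above can be scripted. The following is a minimal sketch, assuming the solver log ends in kinsol.log and the run log ends in out.txt (possibly with a run-name prefix); the function name and the file patterns are assumptions for illustration, not part of ParFlow.

```python
from pathlib import Path


def diagnose_run(run_dir):
    """Return (started, messages) for a ParFlow run directory.

    Assumes the solver log ends in 'kinsol.log' and the run log ends in
    'out.txt', possibly with a run-name prefix. Both patterns are assumptions.
    """
    run_dir = Path(run_dir)
    # No kinsol log means the run exited before it started solving.
    started = any(run_dir.glob("*kinsol.log"))
    messages = []
    if not started:
        # Collect whatever the run reported in its out.txt file(s).
        for log in sorted(run_dir.glob("*out.txt")):
            text = log.read_text(errors="replace").strip()
            if text:
                messages.append(f"{log.name}:\n{text}")
    return started, messages


if __name__ == "__main__":
    started, messages = diagnose_run(".")
    print("solver started" if started else "no kinsol.log found")
    for message in messages:
        print(message)
```

If no kinsol log is found and out.txt is empty, the problem is more likely one of the installation or file distribution issues below.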

Problem: ParFlow is not actually installed correctly.
Fix: Go back to the installation instructions and run the ParFlow test problems. If these are failing, then the problem is with your install, not your model.

Problem: You are missing a key that is required in your tcl script.
Fix: If this is the case, the missing key will be reported in the out.txt file. Add it to your tcl script and run again.

Problem: You don’t have all of your model inputs where ParFlow can find them.
Fix: Go back and check that all of your inputs are in the directory you are running from and that you have the correct path in your tcl script.
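A small script can do this check before you launch a run. This is a sketch only: the function name is hypothetical, and the list of input names has to be copied by hand from your own tcl script.

```python
from pathlib import Path


def missing_inputs(run_dir, input_names):
    """Return the input files that cannot be found.

    Relative names are resolved against run_dir; absolute paths are
    checked as given.
    """
    run_dir = Path(run_dir)
    return [name for name in input_names if not (run_dir / name).is_file()]


if __name__ == "__main__":
    # Example file names only; replace with the inputs named in your script.
    for name in missing_inputs(".", ["slopex.pfb", "slopey.pfb"]):
        print(f"missing input: {name}")
```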

Problem: You have distributed your files in the vertical dimension.
Fix: Under most conditions you should only divide your domain laterally and keep your Process.Topology.R set to 1 (see Jones and Woodward, 2001 for a detailed discussion of this).

Problem: You are running on multiple processors but you haven't distributed all of your inputs.
Fix: If your processor topology is not 1 for P, Q and R, then you are running on multiple processors and you need to distribute any file that you are using as an input in your tcl script. You can do this using the pfdist command (refer to the Little Washita Annotated input script in the manual section 3.6.2 for an example). If this is your problem you will likely have an error message in your out.txt file saying that ParFlow can’t find a file that it’s looking for.

Problem: You distributed your slope file, or any other two-dimensional input file, in three dimensions.
Fix: Slope files are two-dimensional inputs, so in your script you need to set nz to 1 before you distribute the slope files and then set it back to your 3D nz before you distribute the other inputs (refer to the Little Washita Annotated input script in the manual section 3.6.2 for an example).

Problem: You are mixing and matching distributed file types. If your inputs were distributed with an older version of ParFlow, or with a version that was configured without the “--with-amps-sequential-io” flag, then the pfdist command generated a separate distributed file for every processor (e.g. slopex.pfb.0, slopex.pfb.1 ... slopex.pfb.324). In the new format, a single dist pointer file is created (e.g. slopex.pfb.dist). If you have input files that were distributed the old way and you are running with the new distributed format, the run will fail.
Fix: Redistribute all of your files with the version of ParFlow you will be using to run your simulation.
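To see whether a run directory contains both kinds of distributed files, you can classify them by file name. The sketch below relies only on the two naming patterns given above (name.pfb.N for the per-processor format and name.pfb.dist for the single pointer file); the function names are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

# Per-processor pieces look like slopex.pfb.0, slopex.pfb.1, ...
_PER_PROCESSOR = re.compile(r"^(?P<base>.+\.pfb)\.\d+$")


def distribution_formats(run_dir):
    """Map each distributed pfb input to the format(s) found for it."""
    formats = defaultdict(set)
    for path in Path(run_dir).iterdir():
        name = path.name
        match = _PER_PROCESSOR.match(name)
        if match:
            formats[match.group("base")].add("per-processor")
        elif name.endswith(".pfb.dist"):
            formats[name[: -len(".dist")]].add("single-dist")
    return dict(formats)


def has_mixed_formats(run_dir):
    """True if both the old and the new format appear in run_dir."""
    kinds = set()
    for found in distribution_formats(run_dir).values():
        kinds |= found
    return len(kinds) > 1


if __name__ == "__main__":
    for base, found in sorted(distribution_formats(".").items()):
        print(base, sorted(found))
    if has_mixed_formats("."):
        print("mixed distribution formats: redistribute all inputs")
```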

Problem: One of your input files is corrupted or not formatted correctly.
Fix: Try using the PFTools commands to convert your pfb inputs to silo (see manual section 4.3 example 1). If this fails, then you know that your input is somehow corrupted. Go back and check that you have formatted your inputs correctly and that they have the correct dimensions. If the file conversion succeeds, you can look at your inputs and make sure that everything looks right.
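For a quick look at a file's dimensions without converting it, you can read the pfb header directly. This sketch assumes the header layout described in the file format section of the ParFlow manual: big-endian values, three doubles (X, Y, Z), three integers (NX, NY, NZ), three doubles (DX, DY, DZ) and one integer subgrid count. Check that against the manual for your version; the function name is hypothetical.

```python
import struct

# Assumed pfb header layout (big-endian, 64 bytes in total):
# X, Y, Z as doubles; NX, NY, NZ as ints; DX, DY, DZ as doubles;
# number of subgrids as an int.
_HEADER = struct.Struct(">3d3i3di")


def read_pfb_header(path):
    """Return the grid description stored at the start of a pfb file."""
    with open(path, "rb") as handle:
        raw = handle.read(_HEADER.size)
    if len(raw) < _HEADER.size:
        raise ValueError(f"{path}: file too short to hold a pfb header")
    x, y, z, nx, ny, nz, dx, dy, dz, n_subgrids = _HEADER.unpack(raw)
    return {
        "origin": (x, y, z),
        "shape": (nx, ny, nz),
        "spacing": (dx, dy, dz),
        "n_subgrids": n_subgrids,
    }


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, read_pfb_header(name))
```

A slope file should report nz equal to 1, and every three-dimensional input should match the nx, ny and nz of your computational grid. Implausible values (negative or enormous dimensions) suggest the file is corrupted or is not a pfb file at all.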

References:

Jones, J.E. and C.S. Woodward, Newton–Krylov-multigrid solvers for large-scale, highly heterogeneous, variably saturated flow problems, Advances in Water Resources 24:763-774, 2001
