How do we know that the codes solving our simulation problems are accurate? It is now more vital than ever that our analysis codes are verified rigorously and independently. Who is responsible for code verification? What happens when we find an error? The answers to these questions matter, and they have a significant impact on how far our industry can be trusted.
NAFEMS has code verification at its very heart. Our initial working groups were set up solely to produce benchmark tests for this very purpose, and we see it as our role to ensure that all software codes aspire to the same standard of accuracy.
I’ll say it quietly: “here at NAFEMS, there are some people who don’t think this is a glamorous topic.” Getting Code Verification on the board as an Analysis Agenda topic was not an easy sell. Generating interest in simulation by showing off all the amazing new techniques - now that’s what we should be focusing on… but what use is all that exciting stuff if it isn’t giving you the right answer?
Getting the wrong answer is even more galling if you have followed all the best-practice advice, triple-checked your input, and had the company expert review the methods you are using, and then, through no fault of your own, there is a bug. Something you couldn’t have predicted or planned for, something you have zero control over, has occurred, and you are left in the position that everyone running a simulation dreads. You have to backtrack and explain why the results you previously presented so proudly might not be quite as reliable as you had indicated. Decision makers don’t care why it happened. From their perspective, hitting a bug and the simulation department making an error amount to the same thing. Not only does the rework cost time and money, it reduces your credibility and diminishes the confidence that the decision makers in your organisation have in your simulation capability.
The simulation community seems to be aware that code verification is a topic worthy of some attention. The NAFEMS marketing department put together a short survey to take the community’s temperature on the Analysis Agenda topics. They asked how important each topic was to our industry, and out of the 16 listed topics “Code Verification” ranked second in the percentage of respondents (over 45% of you) who thought it was “Vital to our Industry”. Dig down a few questions into the survey, however, and you can see that despite respondents indicating that Code Verification is an important topic, the majority of companies had no plans to invest in this area.
This doesn’t come as a surprise. If you were to list all the different factors that can lead to variability in a simulation result, where would “software error” come on your list? A great line I heard at an ASME V&V Symposium was “The biggest decision that I make that impacts the uncertainty in my simulation results is who I ask to run the job”. Sure, software errors are a concern, but there are bigger fish to fry: education, best practice, and… shouldn’t the software vendor take care of this anyway?
No one is arguing that point: the bulk of code verification activity falls to the software vendor. Most commercial vendors have a rigorous QA process, and Code Verification is part of their “business as usual” activities. The large vendors have verification manuals that can be viewed by their customers, and they frequently make a subset of test cases available so that their users can run the verification problems for themselves. Altair [1] and BETA CAE Systems [2] use the NAFEMS benchmarks as part of their QA processes, and you can find their verification manuals in the NAFEMS resource centre. If you are interested in further details of what the major vendors do in terms of Code Verification, check out William Bryan's extended abstract "Verifying and Validating Software in a Regulated Environment" [3]. Understandably, only a limited number of hardware configurations can be evaluated. If you, as a user, want to be whiter than white, then you need to run on a fully supported configuration, and it can be challenging to track down in-stock, cost-effective compilers and graphics cards.

In my experience the large industrial users, the “big boys”, have a good awareness of the issues around Code Verification and share responsibility for testing the code. They will often perform extensive in-house testing on any new version of the code before rolling it out; because of this rigour it isn’t uncommon for their analysts to be using a version of the code that is years behind the latest release. Researchers normally operate at the cutting edge, where the simulation work involves user-defined subroutines or plugins to extend the functionality of the software. They are generally aware that extensive testing is needed, often because they are the ones introducing the features that need testing; their more relaxed attitude to Code Verification shows instead in their greater use of open-source codes. Most of us, however, fall somewhere between these camps. Sure, we would love to run the available verification test cases, but realistically there isn’t the time or budget for this.
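For teams that do find some bandwidth, even a lightweight script that runs a handful of benchmark models and compares a monitored result against a published reference value makes the kind of in-house release testing described above repeatable. The sketch below is illustrative only: the benchmark names, target values, tolerances, and the run_benchmark() hook are hypothetical placeholders, not any vendor's real API or real NAFEMS target data.

```python
# A minimal sketch of an in-house acceptance check for a new solver release.
# Assumes you can run each benchmark model in batch and extract one scalar
# result from it; the names, targets, and tolerances below are illustrative.

def run_benchmark(name: str) -> float:
    """Placeholder: launch the solver on the named benchmark model and
    return the monitored quantity (e.g. a peak stress or a frequency)."""
    raise NotImplementedError("hook this up to your own solver scripting")

# Hypothetical reference values and acceptable relative tolerances.
BENCHMARKS = {
    "thick_plate_stress":   (-5.38, 0.02),   # target value, 2% tolerance
    "cylinder_frequency_1": (12.50, 0.01),   # target value, 1% tolerance
}

def check_release() -> list[str]:
    failures = []
    for name, (target, rel_tol) in BENCHMARKS.items():
        value = run_benchmark(name)
        rel_err = abs(value - target) / abs(target)
        status = "PASS" if rel_err <= rel_tol else "FAIL"
        print(f"{name}: got {value:.4g}, target {target:.4g}, "
              f"rel. error {rel_err:.2%} -> {status}")
        if status == "FAIL":
            failures.append(name)
    return failures

if __name__ == "__main__":
    raise SystemExit(1 if check_release() else 0)
```

Run against the outgoing and the incoming version of the code, a report like this gives you a concrete, low-effort record of what was checked before a new release was adopted.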
I used to work for one of the major simulation vendors. We had a rigorous QA system that met ISO 9001, yet we still filed a significant number of bug reports each week. This isn’t a dig at the software vendors: the big general-purpose analysis codes contain hundreds of elements and analysis procedures and an almost unlimited number of ways to connect these capabilities together. It isn’t possible to perform rigorous Code Verification on every possible combination of features an analyst might want to use, so simulation engineers should be aware that there is no guarantee their particular use case has been tested. Users need to acknowledge that the tools are not error free. The majority of end users can’t lift the hood and poke around in the code, so naturally most of us assume we get a free pass. We put an S.E.P. (Somebody Else’s Problem) field over the code verification issue and get on with the real engineering.
The original S.E.P. field concept, from Life, the Universe and Everything [4]
So does the end user have a role to play? My take is that the user needs to be aware and informed. If you are using a new method, element, or procedure, then test, test and test again. Use simple benchmark models that isolate the features you are interested in; these sorts of simple benchmark tests were one of the core outputs NAFEMS started to produce 35 years ago. Over the years we have updated the archive, and recently our Simulation Governance and Management Working Group has been discussing the benchmark archive and asking whether we should be producing more rigorous tests. Should we be updating these benchmarks to incorporate modern Code Verification methods such as the Method of Manufactured Solutions (see the upcoming events/resources tab for a great web event on this technique by Dr Oberkampf and Dr Howell, both members of the NAFEMS Simulation Governance and Management Working Group)?
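To make the idea concrete, here is a minimal sketch of the Method of Manufactured Solutions applied to a toy 1D diffusion problem, -u''(x) = f(x) on [0, 1]. The governing equation, the manufactured solution, and the little finite-difference solver are all illustrative stand-ins chosen for brevity, not any particular commercial code or NAFEMS benchmark. The key steps are the ones the technique prescribes: choose a solution, derive the source term that makes it exact, then confirm the code reproduces the expected order of accuracy as the mesh is refined.

```python
# Method of Manufactured Solutions sketch for -u''(x) = f(x) on [0, 1]
# with Dirichlet boundary conditions taken from the manufactured solution.

import numpy as np
import sympy as sp

# 1. Manufacture a solution and derive the source term that makes it exact.
x = sp.symbols("x")
u_exact = sp.sin(sp.pi * x) + x**2          # chosen, not solved for
f_source = -sp.diff(u_exact, x, 2)          # substituted into -u'' = f

u_fn = sp.lambdify(x, u_exact, "numpy")
f_fn = sp.lambdify(x, f_source, "numpy")

def solve(n):
    """Second-order central-difference solve of -u'' = f on n intervals."""
    xs = np.linspace(0.0, 1.0, n + 1)
    h = xs[1] - xs[0]
    # Tridiagonal system for the interior nodes.
    A = (np.diag(2.0 * np.ones(n - 1))
         + np.diag(-np.ones(n - 2), 1)
         + np.diag(-np.ones(n - 2), -1))
    b = h**2 * f_fn(xs[1:-1])
    b[0] += u_fn(xs[0])                     # fold boundary values into RHS
    b[-1] += u_fn(xs[-1])
    u = np.empty(n + 1)
    u[0], u[-1] = u_fn(xs[0]), u_fn(xs[-1])
    u[1:-1] = np.linalg.solve(A, b)
    return xs, u

# 2. Refine the grid and check the observed order of accuracy.
errors = {}
for n in (16, 32, 64):
    xs, u = solve(n)
    errors[n] = np.max(np.abs(u - u_fn(xs)))

for coarse, fine in ((16, 32), (32, 64)):
    p = np.log2(errors[coarse] / errors[fine])
    print(f"n={coarse}->{fine}: observed order ~ {p:.2f} (expect ~2)")
```

If the observed order of accuracy matches the formal order of the scheme, that is strong evidence the code solves the equations it claims to solve; if it does not, you have found a bug (or a limitation) before it found you.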
This area is still evolving, and the trend towards cloud computing creates some interesting new wrinkles. When I first started in this industry, each new software release was sent to you on a CD. The old hands in the consultancy I worked for would take that shiny box and put it up on a high shelf: “We’ll wait for the first maintenance release; all those eager beavers can find the bugs for us.” We protected ourselves by hiding behind the herd. One downside of cloud computing is that users lose that control. Will you know for sure what version of the code you are running, and on what configuration? On the other hand, cloud computing allows vendors to be more agile and to push bug fixes as soon as they are available. New features can be tested easily by a subset of the community and, if successful, rolled out quickly to the rest of the user base. As it becomes easier for developers to respond to bugs and quickly roll out fixes, the simulation community needs to ensure that vendors do not reduce the effort invested in code verification. The goal should not be to fix bugs quickly but to make sure they are not there in the first place.
Ian Symington, Technical Officer, NAFEMS
[1] Altair, "Optistruct Verification Problems Manual," 6 January 2020. [Online]. Available: https://www.nafems.org/publications/resource_center/manual_01/. [Accessed 20 November 2020].
[2] BETA CAE Systems, "EPILYSIS Verification Manual," 23 July 2018. [Online]. Available: https://www.nafems.org/publications/resource_center/manual_02/. [Accessed 20 November 2020].
[3] W. Bryan, "Verifying and Validating Software in a Regulated Environment," NAFEMS Americas Conference, June 2016. [Online]. Available: https://www.nafems.org/publications/resource_center/c_jun_16_usa_70b/
[4] D. Adams, Life, the Universe and Everything, Pan Books, 1982.