YERC Tests Hypotheses Explaining Differences in Water Quality Results

Total Phosphorus results (left) from nine sites on the Yellowstone River between Gardiner and Livingston, showing data from in-house analyses (blue bars), data from external independent analyses (clear, black-outlined bars), and the difference between data analyzed under these two methods but from samples collected at the same time and location (red lines). Why such a big difference? Orthophosphate results (right), on the other hand, were similar between the two methods. We are currently testing hypotheses that could help explain why…

In our last Field Notes post, YERC Interprets This Summer’s RiverNET Results (Sept. 20, 2019), we discussed our surprise at finding significant differences between results from samples collected at the same time and location but analyzed using two different methods: one conducted in-house at the YERC lab using low-cost but low-sensitivity equipment (Hach DR900 colorimeters), the other conducted at an independent, external, EPA-certified lab (Energy Labs in Helena) using much more precise (but also much more expensive) analysis tools. That independent validation data, funded in large part by the Montana Department of Environmental Quality’s Volunteer Monitoring Program, is essential for QA/QC of any science project, and it is especially important for RiverNET as we develop techniques that are less expensive, more user-friendly, and faster to return data, so that we can collect more data at more locations more frequently. And while our program follows more of a qualitative (general picture) than a quantitative (precise numbers) approach, aiming to detect natural trends from a “30,000-foot view” as well as deviations from those trends that would serve as red flags indicating the need for a closer look, it is still important that our results be accurate so they can be interpreted correctly and put to the greatest use.
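For readers who want to poke at this kind of validation data themselves, here is a minimal sketch, in Python, of the paired comparison behind those plots. The site names and values are hypothetical stand-ins, not our actual results; the only number carried over from the text is the +/- 0.05 mg/L accuracy reported for our in-house Total Phosphorus method.

```python
# Minimal sketch of a paired in-house vs. external-lab comparison.
# Site names and values are hypothetical stand-ins, NOT actual
# RiverNET results; only the +/- 0.05 mg/L tolerance comes from the
# reported accuracy of the in-house (Hach DR900) method.

ACCURACY_MG_L = 0.05  # reported in-house accuracy (+/- mg/L)

# (site, in-house mg/L, external lab mg/L) -- hypothetical values
paired_results = [
    ("Site A", 0.42, 0.25),
    ("Site B", 0.30, 0.28),
    ("Site C", 0.55, 0.20),
]

for site, in_house, external in paired_results:
    diff = in_house - external
    flag = "CHECK" if abs(diff) > ACCURACY_MG_L else "ok"
    print(f"{site}: difference = {diff:+.2f} mg/L [{flag}]")
```

Differences that fall outside the reported accuracy band get flagged for a closer look, which is exactly the pattern we saw in the mid-August Total Phosphorus data.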

So last week, we collected our third and final round of paired samples for comparative analysis this season. And earlier this month, we also conducted a separate experiment to test one of the hypotheses described in the last Field Note, namely:

Was residue from our equipment cleaning procedure influencing our Total Phosphorus results?

Our thinking was this: the mid-August Total Phosphorus results, shown in the plots above, showed a major difference between the two methods, averaging about 0.15 mg/L and in some cases reaching 0.35 mg/L, much greater than the +/- 0.05 mg/L accuracy range reported for our in-house analytical method (shown by the grey bars on the plots). Results for Orthophosphate (a phosphorus compound), on the other hand, agreed quite well between the two methods. Because the Total Phosphorus test also captures phosphorus compounds other than Orthophosphate, some of which are associated with detergents, we hypothesized that our process of cleaning equipment between tests may have been leaving residues that contaminated the next sample (even though we use phosphate-free soap in that process). So we set up an experiment to test this hypothesis:

(1) We cleaned six sampling jars, three with each of two cleaning methods: the control group followed our standard cleaning procedure using just the phosphate-free soap, while the treatment group followed that same procedure plus a final rinse in hydrochloric acid, which should have removed any phosphates that remained.

(2) We then collected samples in all six jars at the same time and location, and analyzed them for Total Phosphorus right away using the same analysis procedure.

(3) If the cleaning process was indeed contaminating the samples, we expected to find lower Total Phosphorus results in the treatment group, i.e., in the jars that received the hydrochloric acid rinse (a simple way to run this comparison is sketched below).
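Here is a minimal sketch, in Python, of the control-versus-treatment comparison described in step (3). The six values are hypothetical placeholders, not our measured results, and with only three jars per group a formal test is arguably overkill, but it makes the logic of the comparison explicit.

```python
# Sketch of the control-vs-treatment comparison for the jar experiment.
# The six values are hypothetical placeholders, NOT actual YERC results.
from scipy import stats

control = [0.20, 0.22, 0.18]    # soap-only jars (mg/L Total Phosphorus)
treatment = [0.19, 0.21, 0.20]  # soap + HCl-rinse jars (mg/L)

# If cleaning residue were inflating results, the control mean should
# sit clearly above the treatment mean.
print(f"mean control   = {sum(control) / len(control):.3f} mg/L")
print(f"mean treatment = {sum(treatment) / len(treatment):.3f} mg/L")

# Welch's t-test (no equal-variance assumption); a t near zero and a
# large p-value offer no support for the contamination hypothesis.
t_stat, p_value = stats.ttest_ind(control, treatment, equal_var=False)
print(f"Welch t = {t_stat:.2f}, two-sided p = {p_value:.2f}")
```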

The results of this test: no difference between the control and the treatment groups. In fact, the lowest test result came from one of the control group samples, completely contrary to expectations. In science-speak, we can say that “our test failed to support the hypothesis that our cleaning procedure was influencing analysis results.” In layman’s terms, we can say that our cleaning procedure was not the problem, and it is back to the drawing board…

So, what now? Do we scratch our heads, shrug, and give up… no! We come up with a new hypothesis, run another experiment to test it, and report back to you with the results. And we are already halfway there, with a new hypothesis and a plan for an experiment:

The equipment we are using for our in-house methods was originally intended for testing drinking water, which typically has a pH value near 7, whereas water in the Upper Yellowstone is naturally more basic, with pH values averaging around 9: are these higher pH values affecting our results?

As we analyze the samples collected this week, we will split off a subset and analyze those samples both at their natural pH values (the control group) and at pH values lowered by adding a hydrochloric acid solution (the treatment group); lower results in the treatment group would support this hypothesis. Advice, ideas, and expertise (not to mention some quick math calculations and good humor) from Elaine Westbrook, the ChemStores manager at Montana State University, were a great help in designing this experiment (a back-of-the-envelope version of that acid-dose math is sketched below). Stay tuned…
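For the curious, here is a back-of-the-envelope sketch of the acid-dose arithmetic mentioned above. It assumes an unbuffered sample, which river water decidedly is not (carbonate alkalinity means the real dose has to be worked out by titration), and the sample volume and acid concentration are hypothetical, so treat it strictly as an order-of-magnitude illustration rather than our actual protocol.

```python
# Naive estimate of the HCl volume needed to drop a water sample from
# pH 9 to pH 7. Assumes an UNBUFFERED sample; real river water carries
# carbonate alkalinity, so the actual dose must be found by titration.
# Sample volume and acid molarity below are hypothetical.

def hcl_volume_ml(sample_ml, ph_start, ph_target, hcl_molarity):
    """HCl volume (mL) to neutralize the excess hydroxide in pure water."""
    # Excess OH- relative to the target pH, in mol/L (Kw = 1e-14 at 25 C)
    oh_excess = 10 ** (ph_start - 14) - 10 ** (ph_target - 14)
    mol_h_needed = oh_excess * sample_ml / 1000.0  # mol of H+ required
    return mol_h_needed / hcl_molarity * 1000.0    # convert L -> mL

# Example: a 100 mL sample and 0.1 M HCl -- only about 10 microliters.
print(f"{hcl_volume_ml(100, 9.0, 7.0, 0.1) * 1000:.1f} uL of 0.1 M HCl")
```

The tiny volume is the point: in unbuffered water almost no acid is needed, which is why the buffering capacity of the actual river samples dominates the real calculation.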

Patrick Cross is YERC’s research director and the RiverNET project manager