Guo et al. investigate the variability in C-Q relationships in relation to the catchment hydrological conditions (more specifically the BFI) for several water quality parameters across several climate regions in Australia. The authors make use of an impressive data set from an arid region and apply a Bayesian Hierarchical Approach including the BFI, which allows understanding spatial patterns of export dynamics. This study can thus provide an important contribution for understanding solute transport beyond temperate regions. However, the manuscript still needs more clarity on the research questions and key messages, as well as methodological improvements. I recommend major revisions, as substantial improvements are still necessary.
General comments:
One of my concerns is that given the fact that previous studies in the same region, using the same dataset and similar methods (as I read from the text), are not accessible or provided (under review or in preparation), it is not possible to judge the additional value of this study. The preceding studies (Lintern and Liu) are referenced both when defining the research goals and in the method sections. I definitely see the value of investigating C-Q relationships in various climate zones, but this was also done by these referenced studies. It is hard to judge the additional value without knowing what was shown already.
The motivation for investigating the BFI impact on C-Q relationships was not convincing to me. It needs to be clear 1) why we need to know that and 2) what exactly we do not know yet. The first question is not satisfactorily addressed: Why do you want to focus on BFI, and why is it useful to investigate this relationship? For the second: To my knowledge, and in contrast to what you state (see also my comments below), the influence of BFI on the spatial variability of C-Q relationships has been discussed in several previous studies. However, I agree that studies have been biased towards temperate climates. I think the latter should be the main motivation, while the literature review on the control of BFI generally needs to be extended. It is not correct that it has not been investigated. There are studies using BFI as a descriptor for explaining the variability in export behaviour of different solutes, including several studies that you have cited in the introduction, but in support of other statements. For example, Minaudo et al. 2019 stated, “we found for NO3− that high BFI values, low W2, and low erosion differentiated C-Q dilution patterns from non-significant and mobilization types”. Ebeling et al. 2021, Moatar et al. 2017 and Musolff et al. 2015 have also used the BFI to explain variability in C-Q relationships among catchments for several solutes. Moatar et al. 2020 also investigated the impact of discharge flashiness on C-Q slopes and subsequently load flashiness. As BFI and Q flashiness are closely linked, this needs to be mentioned in the introduction. These studies also need to be discussed in relation to your study in the discussion. Also see further comments below.
Some of the methods seem inappropriate, especially as there are too few data for some climate-solute combinations to fit robust models/regressions and interpret them (further comments below). Besides, I do not see the value of investigating the BFI impact within each climate zone individually, i.e. separating the climate zones and fitting different models, instead of investigating the BFI impact across the whole climate variability. I think it would be more valuable to know what effect the BFI has across the whole climatic variability, i.e. the continuum of variations. The climate zones could be represented by their characteristics such as precipitation amount, seasonality, aridity, temperature, etc. Even within the climate zones those variables vary and could potentially explain the deviations not explained by BFI.
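To illustrate the kind of pooled analysis meant here, a minimal sketch follows. It uses ordinary least squares as a simple stand-in, not the authors' Bayesian hierarchical model, and the function name, argument names and covariate choice (e.g. an aridity index) are hypothetical.

```python
import numpy as np

def fit_slope_model(cq_slope, bfi_m, climate_covariates):
    """Ordinary least squares of catchment C-Q slopes on BFI_m plus
    continuous climate descriptors (hypothetical helper, OLS stand-in).

    cq_slope           : (n,) C-Q slope per catchment
    bfi_m              : (n,) median BFI per catchment
    climate_covariates : (n,) or (n, k) descriptors such as aridity
    Returns coefficients ordered as [intercept, BFI_m, covariates...].
    """
    y = np.asarray(cq_slope, dtype=float)
    X = np.column_stack([
        np.ones_like(y),                            # intercept
        np.asarray(bfi_m, dtype=float),             # BFI effect
        np.asarray(climate_covariates, dtype=float) # climate continuum
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef
```

All catchments enter a single fit, so the BFI coefficient describes the effect across the full climatic gradient rather than within separately fitted zones.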
The interpretation of BFI_m in terms of variability of flow paths is not convincing to me as you could easily and more directly use the range of BFI to determine the relationships between C-Q slope variability with BFI ranges. I think it would be good to look at this instead of speculating, as you have the data at hand and Figure 3b is not convincing enough for this interpretation, in my opinion. Instead of the range BFI_h-BFI_l you could also consider other metrics of variability.
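A minimal sketch of how such variability metrics could be computed per catchment and then compared with BFI_m and the C-Q slopes. The function name and the input layout (a dict mapping a catchment ID to its instantaneous BFI values) are assumptions for illustration; the 10th and 90th percentiles correspond to BFI_l and BFI_h as used in the manuscript.

```python
import numpy as np

def bfi_variability(bfi_by_catchment):
    """Summarise instantaneous BFI values per catchment (hypothetical helper).

    bfi_by_catchment: dict mapping catchment ID -> sequence of BFI values.
    Returns a dict of dicts with the median and three spread metrics.
    """
    summary = {}
    for cid, values in bfi_by_catchment.items():
        b = np.asarray(values, dtype=float)
        b = b[~np.isnan(b)]  # drop missing days
        p10, p25, p50, p75, p90 = np.percentile(b, [10, 25, 50, 75, 90])
        summary[cid] = {
            "bfi_m": p50,               # median BFI
            "range_10_90": p90 - p10,   # BFI_h - BFI_l
            "iqr": p75 - p25,           # interquartile range
            "std": b.std(ddof=1),       # sample standard deviation
        }
    return summary
```

Plotting any of these spread metrics against the C-Q slope would test the flow-path interpretation directly instead of inferring it from BFI_m.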
Linked to the methodology, I have a concern about the conceptualisation in Figure 7 and main conclusions. I think some methodological approaches and results/evidence are not robust and clear enough to generalise the results in the given way.
The discussion lacks a comparison with relevant previous studies, in particular those investigating hydrological controls (such as the BFI, flashiness, etc.) on the spatial variability of C-Q relationships. The discussion needs to be extended.
Specific comments:
L17: Does the baseflow contribution in a catchment impact the concentration itself or the C-Q relationship? For me, it seems like spatial and temporal dimensions are mixed up here.
L18: This is not true: “these patterns have not yet been investigated across large spatial scales”, e.g. Minaudo et al. 2019 (see also related comments).
L48: The reference of “variable, which” is unclear; do you mean the studies here? Please revise.
L57: This sentence needs revision. The concentration variability within one catchment regarding the contribution of baseflow or quickflow to the current discharge is represented by its C-Q relationship, i.e. the variability “within a particular catchment”. However, the cited studies also investigate the differences in C-Q relationships among catchments. Therefore, the provided references do not fit this sentence, in my opinion. This also leads to the next sentence being incorrect. It defines the research gap as the differences among the catchments regarding the hydrological average behaviour not being investigated and understood. E.g. Minaudo et al. 2019 (and others, see main comment) considered BFI in the analysis of C-Q relationship variability among catchments.
L91: “nitrate-nitrite”: I am not sure what you mean by that. Is it the sum of both nitrate and nitrite concentrations? It should be defined once in the manuscript.
L105: “unaffected”: this is a strong word, I suggest saying “more robust”.
L106: plural “span”
L111: “met the above criteria across all the six water quality variables”: to me this sounds as if the stations needed to meet the criteria for all six variables, which was not the case from what I read in the next sentence and the following. I suggest revising this formulation.
L144: Does that mean you fit equation 1 only to baseflow discharge? How can this work if C is a mix of baseflow and quickflow concentrations? This sentence is unclear, please revise.
L146: “the C-Q slopes of all catchments are following a normal distribution with a ‘grand mean’” This works if the represented catchments cover the range of variability well. This would not be true if certain catchment types are overrepresented, would it?
L146f: “Then the variation of C-Q slopes between catchments, away from beta_0, are explained by changes in catchment BFI.” This would only be true if the BFI were the “only” controlling variable.
Fig2: I suggest adding axis labels and ticks for panels a and b.
L185: “together with flow” I do not understand this. If c=f(Q) in the C-Q relationship this is already included.
L214 “surface flow” is imprecise. There is not just baseflow and surface flow
L217: “In contrast, a catchment with a high BFI_m generally has a large range in instantaneous BFIs” I see several high BFI_m with not very high ranges in BFI. Fig3b rather looks like a bell shape with highest ranges for medium values, not like a linear relationship.
Fig 3: Boxplots can create misleading impressions if the sample sizes are very different, which I see from Figure 1. I would suggest adapting the boxplot widths according to the sample size and/or adding the sample size to the plot for each climate zone (probably the numbers in the x axis labels?). For the right panel, I would suggest colouring the dots according to their climate zone as in Figure 1, possibly also for the left panel with different hue or saturation values to distinguish the different BFI quantiles.
L255: “BFI-based model has only marginally lower performance … BFI-based model, while having the capacity to predict C-Q slope across space, can predict water quality almost as well as using the observed C-Q slope.” I do not understand this comparison of individual C-Q slope with a model explaining the variability in C-Q slopes. It sounds like you were expecting worse performance, while actually a more complex model should improve performance to be a valid approach.
L269: “confidence intervals”? I do not know credible intervals
Table 1: Why do you think the NSE values of your BFI-based model are lower than those of the baseline model? Does it actually make sense to fit a more complex model in this case?
Section 3.2: I do not like this whole section. I think the statements derived from selected examples are not representative of the distribution of TSS export patterns across catchments within the corresponding climate zones (Fig6).
Figure 5: What is the modelled effect? Is it dBFI_climate from equation 3? What is the NSE above each subplot describing?
L278-285: I do not agree with this approach and subsequent observation. Fitting linear relationships is not appropriate for the given observations and “consistent diverging” behaviour goes beyond what can be interpreted here. Especially for the last two sentences: When looking at the overall point clouds, there is not clearly increasing variance (diverging behaviour) with higher BFI_m. In my opinion, weak relationships are overinterpreted here. These are also transferred to conclusions.
L283: ‘grand mean’ I know you have used this term before, but I think it is not well chosen, as it does not tell that it is about the “solute-specific base C-Q slope”. Consider changing the term
Figure 6: In my opinion, it is not justifiable to fit linear regressions to all combinations of solutes and climate zones, because several combinations have 1) sample sizes that are too small, 2) clearly non-linear relationships, and 3) in some cases a strong influence of outliers, e.g. the Tropical NOx fit. The legend titles should be improved.
How does the modelled effect from Figure 5 relate to the slope of the linear regressions in Figure 6? This seems somewhat redundant to me.
Figure 7: Firstly, the generalisation shown is questionable (see main and other comments). E.g. Figure 3b shows that the highest BFI ranges occur at medium BFI_m, suggesting that catchments with high BFI might also have more stable flow conditions with generally higher groundwater contributions. Secondly, this figure takes a lot of time to understand and could benefit from some reworking, in line with the other adaptations. The figure text is unclear without reading the main text, and the meaning of a1, a2, b1, b2 only becomes clear on a second reading. I suggest selecting other identifiers. The spatial organisation, e.g. the link between the upper and the bottom panel and between the left, middle and right columns, is not visually clear.
L332-334: This statement could benefit from checking also mean concentrations: are activated sources low or high in more quickflow dominated catchments?
L398-391: I cannot follow this point unfortunately due to missing information. Moreover, the characteristics land use, geology and climate are all integrated in the base flow index, which is a resulting hydrologic characteristic. This point definitely also needs a discussion part, including further literature and potential controls.
L392: Why is the Bayesian hierarchical model more effective than multiple linear regressions with BFI or other multivariate models? For me, Figure 5 (outcome of the Bayesian model) and Figure 6 (not a direct model output) were somewhat redundant. This is not covered sufficiently in the discussion section.
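For the NSE comparison raised under L255 and Table 1, the standard Nash-Sutcliffe efficiency definition is sketched below so that the baseline and BFI-based models can be compared on identical observations. The function and argument names are hypothetical, and whether the authors computed NSE on raw or transformed concentrations is not known to me.

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations.

    1.0 is a perfect fit; 0.0 means the model is no better than
    predicting the observed mean. Pairs with a missing value are dropped.
    """
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    mask = ~(np.isnan(obs) | np.isnan(sim))
    obs, sim = obs[mask], sim[mask]
    sse = np.sum((obs - sim) ** 2)
    total = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - sse / total
```

Evaluating both models with the same function and the same set of observations makes the performance difference in Table 1 directly interpretable.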
SUPPLEMENTS:
Figure S3: I do not understand why the BFI_m should be shown per solute, if the BFI depends on discharge and not on concentration.
REFERENCES:
Ehrhardt et al. has already been published in final form, please change the reference to: Ehrhardt, S., Kumar, R., Fleckenstein, J. H., Attinger, S., & Musolff, A. (2019). Trajectories of nitrate input and output in three nested catchments along a land use gradient. Hydrol. Earth Syst. Sci., 23(9), 3503-3524. https://www.hydrol-earth-syst-sci.net/23/3503/2019/
This manuscript presents a synthesis of baseflow effects on C-Q relationships in watersheds across Australia. The authors have leveraged a Bayesian Hierarchical Model in this research. Overall, I think the research is solid, the manuscript is well written, and it can become an important contribution to the literature on riverine C-Q relationships. I provide below some comments to the author, which I hope can help improve the manuscript.
1. The use of Bayesian Hierarchical Model should be more fully justified. I am aware of the research the team has done in the past few years involving Bayesian approaches, but why is it used in this work on C-Q relationships. Please provide your reasoning in the Introduction, probably the last paragraph.
2. The authors reported that the Bayesian Hierarchical Model can explain over half of the observed variability in concentration of TSS, EC and P species across all catchments (93% for EC, 63% for TP, 63% for SRP, and 60% for TSS). I feel the intention has switched here from understanding C-Q relationships to predicting water-quality concentrations, which seems to be a distraction to me. Moreover, what is the benefit of adopting the Bayesian Hierarchical Model, given that many statistical models (e.g., WRTDS) have been developed and can probably provide more accurate estimates?
3. Figure 3b: It is not a strictly positive relationship for the entire range of BFI_m. The variability continues to increase with BFI_m up to ~ 0.5 and then starts to decrease with BFI_m. The latter part of the curve seems largely ignored in the manuscript, including Discussion. The same observation holds true for the individual constituents (Figure S5).
4. It would be interesting to investigate the effects of seasons and antecedent discharge conditions (wet vs. dry), both of which may change the response of C-Q slope to the BFI_m metric. There may be strong contrasts between, for example, growing vs. non-growing seasons. Toward the end of the manuscript, the authors have briefly pointed out the possibility of season effects. I think it is probably beyond your scope to look into these effects in this paper, but I encourage the authors to provide a brief discussion to point out that the response of C-Q slope to the BFI_m metric can vary among different seasons, among different antecedent discharge conditions, and even among different periods. In the latter regard, it is reported that anthropogenic disturbances and/or management actions that occurred in the catchment can cause the C-Q relationship to change. For example, Zhang (2018) provides an investigation of C-Q relationships for different river flows and years: https://doi.org/10.1016/j.scitotenv.2017.09.221.
5. The term BFI_m (median BFI) is not self-evident in the Abstract. Given the importance of this metric, I encourage the authors to define it more clearly in the Abstract.
6. The authors have used BFI_l and BFI_h to represent the variability of BFI, which makes sense to me. I may have used 2.5% and 97.5% instead but 10% and 90% are fine. By the way, have you considered using standard deviation to capture the spread, which may help shorten the manuscript in terms of text and figures presented? I think an argument can be added to the end of Section 2.1, which favors the use of BFI_l and BFI_h, that these are percentile based and hence are more robust to outliers.
7. For days with multiple samples, is it necessary to pre-calculate the average concentration? Why not keep all the samples in the analysis? In addition, it may be helpful to provide a table that quantifies the fraction of such days in the record.
8. BFI calculation: I am curious about the use of 0.98 for alpha in the baseflow filter. Did Ladson et al. (2013) recommend this value? What is the rationale?
9. Figure 2: Please add numbers and units (even if hypothetical) on the y-axis for panels a and b.
10. Equations 2-3: Consider changing 𝛿𝐵𝐹𝐼𝑐𝑙𝑖𝑚𝑎𝑡𝑒 to 𝛿BFI_𝑐𝑙𝑖𝑚𝑎𝑡𝑒. (Move "BFI" to the subscript.) At first glance, I thought this is the product of two variables (𝛿 and BFIclimate).
11. Section 3.3.1, including Table 1: I would like to refer back to my comment above. The NSE values do not seem to be comparable to more established approaches such as WRTDS. What is the value of showing these results? Should the baseline model or the BFI-based model be used for predicting concentrations? Why not those other established approaches?
12. Section 3.3.2: According to published literature on many catchments around the world, SRP is a minor component of TP, whereas NOx is a major component of TN. It is quite interesting that in these Australian catchments, NOx/TN is quite small. This presents a strong contrast to many regions and may be discussed with a couple of sentences.
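As context for comment 8 on the filter parameter, a minimal single-pass sketch of a Lyne and Hollick type recursive digital filter is given here, mainly to show where alpha enters. The single forward pass, the zero initial quickflow and the function names are my simplifying assumptions; the authors' implementation may differ, for example in the number of passes or the treatment of the series ends.

```python
import numpy as np

def lyne_hollick_baseflow(q, alpha=0.98):
    """Single forward pass of a Lyne-Hollick type filter (simplified sketch).

    q     : sequence of streamflow values at regular time steps
    alpha : filter parameter; larger values give a smoother, lower baseflow
    Returns the baseflow series, constrained to lie between 0 and q.
    """
    q = np.asarray(q, dtype=float)
    quick = np.zeros_like(q)  # assumption: no quickflow at the first step
    for i in range(1, len(q)):
        f = alpha * quick[i - 1] + 0.5 * (1.0 + alpha) * (q[i] - q[i - 1])
        quick[i] = min(max(f, 0.0), q[i])  # keep 0 <= quickflow <= flow
    return q - quick

def baseflow_index(q, alpha=0.98):
    """Long-term BFI: total baseflow divided by total streamflow."""
    q = np.asarray(q, dtype=float)
    return lyne_hollick_baseflow(q, alpha).sum() / q.sum()
```

Re-running such a filter with a few alternative alpha values would show how sensitive BFI_m, BFI_l and BFI_h are to this choice.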