Monitoring and maintaining the calibration of a weather radar network is an important task. Especially for dual-pol radars, multiple complementary monitoring sources are needed to assess the calibration of a radar. This work considers one particular source for monitoring the calibration of ground-based weather radars. Using the NASA GPM, it is shown that there is a systematic negative bias for all surface radars, assuming that the GPM is taken as the reference. Among the weather radars, the bias relative to the GPM data varies between -6.6 and -1.3 dB, which is significant, but in part appears to be attributed to differences in hardware and the age of the systems (without actual proof). There seems to be no attempt to explain the large biases. On the other hand, dedicated shipborne radar measurements compared with selected radars show a consistency of the measurements within +/-1 dB, which includes the Broome radar. Using shipborne radars to assess the calibration of operational weather radars is unique.
For the Broome radar the authors find a bias of -6.6 dB compared to GPM data. The authors do not attempt to find the source of those biases, and no plan is laid out for the next steps. The relative consistency between the shipborne radar data and the continental radars, compared to the large variability found between the GPM radar data and the ship radar/radar network, suggests that a much more thorough investigation is needed before the satellite data can serve as one source for the calibration monitoring of the Australian weather radar network. In the conclusion (l.303) you state that you want to get "insights into the accuracy" of spaceborne radar obs "to calibrate national operational radar network". I would argue that you missed this goal. In principle you compare reflectivities in a carefully designed radar/radar comparison without going into the details that would explain obvious differences.
A more thorough literature survey is missing. This should include references on how other met services monitor the calibration of their radars. I would expect at least a brief summary of how the BOM maintains and calibrates the operational radar network, and of how the calibration is done on the ships. Since this is a paper that deals with calibration, this is essential to me.
A more general comment: the term calibration is sometimes used in a very loose way. When doing a calibration, a normed reference is used to determine the calibration. I wouldn't consider the GPM a "normed reference" (see e.g. the wording in l.39), since it is also a remote sensing instrument which has to be calibrated. So I suggest being more careful with the wording throughout the manuscript.
To conclude, this is in principle an important investigation, which falls short on assessing the sources of calibration errors and the observed relative differences between the sensors. Without such an assessment, the results remain inconclusive.
Some more specific comments:
How do you do the frequency correction from Ka -> C band? Or is there already a C-Band product you can use?
l 95: what is “dark art” about calibration???
l 115: Please include the solar monitoring results for the Berrimah and Geraldton radars. That would be helpful for understanding possible error sources.
I miss the Geraldton radar in the results. Why is it missing? (it is given in Figure 1)
l.116: raw reflectivities: I assume you mean unfiltered data, no clutter correction applied, no range averaging? Please state clearly what you mean by "raw".
l 121 ff: Is there no need to separate by hydrometeor (HM) type depending on the height of the radar bin? Why not?
l. 404: Table 2: please include the names of the radars instead of the numbers… makes it easier to read.
L 417: Figure 2: no data for Learmonth? Why show this graph?
L 426: Figure 3: in the caption, the difference between (a) and (b) is not explained. Please describe it briefly, as captions should be self-explanatory. Or refer to the text.
Dampier with 6.3 dB bias (Fig 2): did you check the calibration procedure? How often are radars calibrated in the network? This should be mentioned somewhere. A 6.3 dB bias is dramatic.
L 271 “natural variability of the calibration figure”: I would suggest to reserve the term calibration for a “real calibration” where you compare against a normed reference. Here you consider a relative adjustment (so far you quantify the difference, but you seem to suggest the satellite could be taken as the truth).
You only discuss the relative differences between satellite and surface/ship-based observations. A bias of 3 dB for your ship-based radar compared to GPM is significant. I'm a bit surprised that you don't try to assess the possible source of the bias. I assume that more staff are available for this instrument to do a more thorough investigation of the technical aspects of calibration, to really pin down (or rule out) the origin of the bias, checking all relevant elements of your radar hardware, and to use the sun as a reference to verify your receiver calibration. This should also include a check of the GPM data, perhaps using other data sources like disdrometer measurements (if available).
Assessing the calibration of a radar is a multisource task. I think you have all the ingredients together and you should go through the exercise of assessing the absolute calibration of your radars. Once you have the numbers together, it may be an elegant approach to use a spaceborne sensor to make a relative adjustment of your radar calibration.
I wonder how the GPM products compare to rain gauge estimates in Australia? Any hint? Would be worthwhile in the literature survey.
The authors evaluate and assess the accuracy of the recently developed radar calibration framework used to monitor the calibration accuracy of all operational radars of the Australian weather radar network in real time. The technique is based on the comparison with spaceborne Ku-band radar observations from GPM, applying a Volume Matching Method (VMM). After an additionally available ship radar (OceanPOL) and the radars from the network have been calibrated separately by comparison with the spaceborne measurements, measurements of all radars of the Australian network are compared to those of OceanPOL. These more accurate ship – ground radar comparisons are considered an indirect evaluation of the GPM validation technique and are exploited to demonstrate the value of using such GPM data as a single source of reference for the calibration of a whole national network. Indeed, for all seven radars the calibration difference with the ship radar lies within ± 0.5 dB.
Intercomparisons of gridded radar observations also revealed the potential to estimate calibration differences between radars with overlapping coverage to within about 0.3 dB at daily time scale and about 1 dB at hourly time scale, which can be exploited for additional calibration monitoring.
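To make the intercomparison idea concrete, the following is a minimal sketch of how a calibration difference between two radars could be estimated from co-located gridded reflectivities. It is not the authors' method: the function name `calibration_difference`, the flat-list input layout and the 10 dBZ lower threshold `min_dbz` are assumptions made only for illustration.

```python
import math


def calibration_difference(dbz_a, dbz_b, min_dbz=10.0):
    """Mean and spread of (A - B) reflectivity differences in dB.

    dbz_a, dbz_b: co-located reflectivities (dBZ) from two radars on the
    same grid, flattened to equal-length sequences (hypothetical layout).
    min_dbz: assumed lower threshold; pairs where either value is below it
    or not finite (NaN, inf) are discarded.
    """
    diffs = [
        a - b
        for a, b in zip(dbz_a, dbz_b)
        if math.isfinite(a) and math.isfinite(b)
        and a >= min_dbz and b >= min_dbz
    ]
    n = len(diffs)
    if n == 0:
        # No valid overlap: nothing can be said about the difference.
        return math.nan, math.nan, 0
    mean = sum(diffs) / n
    # Population standard deviation of the pairwise differences.
    std = math.sqrt(sum((d - mean) ** 2 for d in diffs) / n)
    return mean, std, n
```

Called on one day or one hour of overlapping grid points, the returned mean would be the estimated calibration difference for that period, and the sample count indicates how much weight the estimate deserves.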
The accuracy and value of the outlined calibration strategy are of interest to the community. Furthermore, my list of edits and suggestions provided below includes nothing severe, and therefore I suggest this manuscript for publication after their consideration. E.g. in several places the authors refer the reader to upcoming 'later' explanations without being precise. If references to later subsections are required more often, restructuring the manuscript may also be an option. In one or two places the introduction of subsections would make the structure of the text more transparent, and for some aspects I am also missing more detailed explanations (see 'major' points, even though they are not really major, but I distinguish them from pure formulation issues).
Major points:
Line 13: Maybe the advantage of using a ship-based radar should be shortly indicated here?
Lines 16/17: What about the range of differences before the calibration? Please see also my comment regarding line 165.
Line 29: The pointing accuracy is not mentioned again in this manuscript, right?
Lines 38/39: Are there references demonstrating/documenting the accuracy of the GPM radar? You also say in Line „…whose calibration is very accurately tracked by NASA.“ Can you be more precise here?
Line 50: Reads a bit weird that the advantages of using the ship radar will be discussed ‚later‘ (no precise statement), followed by the precise structure of the article in the following lines (section 2, section 3, section 4 contains this and that).
Lines 79, „more accurate source of reference“: For the third time the authors indicate here the special role /higher accuracy of the ship radar without explanation. I would suggest to unravel the secret earlier. I was wondering already while reading the abstract, why the ship radar is a reference. If I understand correctly, the solution is provided in lines 192ff and I would suggest to summarize this idea also in the abstract.
Line 91: What kind of additional quality control is done?
Line 123: Just out of curiosity: Quite often ring structures are generated when gridding/compositing radar data. Do you encounter similar problems? If yes, this could also impact the comparison.
Line 129ff: Not really clear to me what is done. The radius of influence is only applied to the same elevation, but how do you decide whether an adjacent elevation is included?
Line 139: What about corrections for the ship movement? I thought such kind of things introduce different uncertainties for ship radars, but the authors only mention the superiority of OceanPOL.
Line 170, „discussed later and shown as black dots in Fig. 4)“: Such references to later paragraphs should be avoided if possible, or at least be more precise. When/where instead of ‚later‘? Or restructuring the manuscript should be considered in case too many references to later paragraphs are needed.
Line 188, frequency conversion: Should be mentioned how this is done/taken into account.
Line 188, most problematic in the melting layer: The melting layer is not excluded from the comparison? I suggest to be more precise how the comparison is performed.
Line 214, „We will get back to that point shortly.“: Again, referring to an imprecise future is not optimal.
Line 247: Maybe nicer, more structured, to introduce 2 subsections 3.1 and 3.2, with subsection 3.2 starting here dealing with the day-to-day variability.
Line 254: Again referring the reader to the unknown future „which will be discussed in more detail later.“ Please avoid.
Minor points:
Lines 81 and 83 contradict each other: Table shows all radars used in this study including different frequencies, but then it says this study uses only the C-band radars.
Line 90: Did you ever apply the consistency of polarimetric variables for the calibration of OceanPOL and check the agreement with your calibration based on the GPM measurements?
Line 91, Version 5 of the GPM 2AKu product: A reference would be nice here.
Line 137/138: Are you using stratiform and convective events for the comparison?
Line 165, „All calibration results are summarized in Fig. 2.“: The calibration results should be mentioned in the text, not only written in the panels of the Fig.
Line 168/169: Does the older estimate also include RCA checks for all radars, or just for radar 63?
Line 173, „Looking at the time series of GPM calibration estimates for other radars than 63 …“: Why not refer now to radars 63 AND 29?
Lines 173ff: So, in the end only radar 16 shows variations? Why not write that directly instead of starting with 29, then adding 63 and finally the others? It would be easier to read.
Line 182, ‚In a perfect world‘…: I am not a native speaker, but this sounds rather colloquial to me.
Figure 3 caption: Here I suggest to write that the comparison with radar 63 is shown, but in the text you can write that the overall strategy for the OceanPOL comparison with any radar is illustrated using radar 63 in Fig. 3. I also suggest to rewrite the explanation of what is shown in panel b. Maybe something like „…a percentage of all OceanPOL reflectivity values in a resolved 0.5dB bin“. And I suggest to write „The number of samples N is 141978 (see panel a).“
Line 200: Instead of „as on the left panel of Fig. 3)“ -> Fig. 3a
Line 202: Better „Comparison of …provides a better….and allows the detection…“
Line 206: Better „reflectivities less than 35 dBZ mostly contributed…“
Line 210: Delete ‚and‘
Line 215, „When including all days of observations for radars 63 and 77“: Is there a need to emphasize this? For the other radars not all measurements available are used?
Lines 217: ‚See‘ Fig. 4 for ….
Line 218, „The next best operational radar is radar 70 (Perth).“: Please reformulate.
Line 289: …derived from all hourly, not from all daily estimates, right?
Lines 304/305: „A major advantage of using a single source of reference is that all radars of the network are calibrated in the same way.“ This is also the case when other methods are used. If you choose to use the consistency method for your network, you also use the same method for your entire network. Not sure what you want to express here.
Lines 305-308: I guess this sentence can be better formulated. Bit hard to read.
Lines 350ff: For several publications the DOIs are provided, for others not.