Manuscript soil-2021-138 by Zefang Shen and co-authors addresses the repeatability of the spectrometers and the accuracy of spectroscopic models built with seven statistical and machine learning algorithms. The authors found that the miniaturised spectrometers and their combinations predicted 24 of the 29 soil properties with moderate or greater accuracy. The repeatability of the miniaturised NIR spectrometers was similar to that of the full-range, portable spectrometer. The manuscript is potentially of interest for remote sensing applications and falls within the scope of that field, but in my opinion it does not fit the scope of this journal. The work lacks a clear soil application, and some parts of the manuscript are very difficult to understand, which makes it hard to appreciate the originality of the paper. As written, the manuscript does not explain the main results, which should be connected with the aim of the research. Instead, the aim (L85-90, pg. 4) focuses only on several multivariate approaches and reads as a mere exercise in multivariate statistics applied to remote sensing equipment. Neither the novelty nor the knowledge gap that the manuscript and its objectives should fill is explained. For example, it is unclear whether the novelty lies in 1) hyperspectral quantitative analysis (L155, to provide comparison data for the 29 soil chemical, physical and biological properties to be assessed using spectroscopic methods); 2) the comparison of spectral range, resolution, price, weight and dimensions of the miniaturised and portable spectrometers used in this study (?); 3) the assessment of spectroscopic modelling with different statistical and machine learning algorithms; 4) the accuracy of the spectrometers' estimates and their repeatability; 5) the assessment of the spectroscopic modelling algorithms (L160) with data from plots; or (finally!) a model analysis of physicochemical indicators of polluted soil (?).
The gap that emerges from the state of the art should be clarified from the beginning. In my opinion the authors have to clarify in the state of the art: 1) why they use these soil health indicators in mine site rehabilitation; 2) whether the procedure adopted is able to predict the level of contamination or soil health; 3) what the limits of their predictive model are; 4) whether the model can be used elsewhere, because the authors do not compare their results with similar studies. The second problem of this work is the preprocessing method, which is very confusing. The Materials and Methods section (L160-185) does not provide sufficient detail to follow the manuscript. Regarding methods, PLSR, RF, SVM, XGBoost, Cubist, GPRL and GPRP are used: an incredible set of algorithms, with no explanation of the selection criteria, their limits, or even whether they are designed for these instruments. The reader therefore has to assume, or simply imagine, whether the spectra were preprocessed: whether the raw data were filtered with Savitzky-Golay (SGR), perhaps with multiplicative scatter correction (MSC) or standard normal variate (SNV), and whether they were treated with linear baseline correction (LBC), peak normalization (N) or mean centring (MC). All of this preprocessing is left unexplained, whereas the rest of the methods can only be found in the results, discussion and conclusion sections. It is not explained why these methods were used and not others, or how they relate to each other, nor is there sufficient detail to understand what was done and how these methods achieved the objectives of the manuscript. I am sorry for the Authors, but no revision can improve this work at this point. Many other comments could be made on both the 'Materials and methods' and 'Results and discussion' sections, but this would be pointless, because the comments above are more than sufficient to recommend rejection of the manuscript.
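To make concrete the level of detail that is being requested, two of the transformations named above can be stated in a few lines. The following is only a minimal sketch under stated assumptions, not the authors' procedure: the function names `snv` and `msc` are hypothetical, SNV is assumed to use the sample standard deviation, MSC is assumed to use the mean spectrum of the set as its reference, and the example values are invented for illustration.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own sample standard deviation (assumption: n - 1)."""
    m = mean(spectrum)
    s = stdev(spectrum)
    if s == 0:
        raise ValueError("SNV is undefined for a flat spectrum")
    return [(v - m) / s for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction: regress each spectrum on the mean
    spectrum (assumed reference) and remove the fitted offset and slope."""
    n_bands = len(spectra[0])
    ref = [mean(s[j] for s in spectra) for j in range(n_bands)]
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    if sxx == 0:
        raise ValueError("MSC is undefined for a flat reference spectrum")
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Ordinary least squares fit: s ~ intercept + slope * ref
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / sxx
        if slope == 0:
            raise ValueError("MSC is undefined when the fitted slope is zero")
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


# Illustrative values only (not data from the manuscript).
example = [[1.0, 2.0, 3.0], [3.0, 5.0, 7.0]]
snv_first = snv(example[0])
msc_all = msc(example)
```

A Methods section that states which of these steps were applied, in which order and with which settings would allow the reader to reproduce the spectra that entered the models.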
Reply to comment C1 on manuscript soil-2021-138 by Zefang Shen and co-authors, entitled “Miniaturised visible and near-infrared spectrometers for assessing soil health indicators in mine site rehabilitation”.
Comment 1. The Authors reply: “The manuscript fits well in the scope of SOIL, e.g. under soil protection and remediation and soil and methods”. I think that the purpose of the paper is still outside the scope of this journal. One would expect the paper to cover some aspects of soil science (spatial variability, spatial dependence, application in soil remediation, soil contamination, toxic element bioavailability, etc.), but the main results go another way. For example, the only comment on remediation is at line 350, where the Authors state generically that “Practitioners can then effectively identify the need for early interventions to establish positive soil health trajectories. In addition, spectroscopy could facilitate the evaluation of soil degradation, more timely identification and remediation of ecologically hostile conditions, and more effective monitoring of the change in soil properties in response to restoration activities.” The Authors do not discuss whether the soils are contaminated by Zn or Cu, what the limits of this contamination are, whether it is related to any other toxic elements, or how these mining soils can actually be remediated according to the health indicators.
Comment 2. The Authors reply: “The `soil application' pertains to the development of a rapid and cost-effective soil analytical method for the assessment of mining soil and for the purpose of post-mining soil rehabilitation.” In addition, the Authors wrote: “The soil samples that we used in our analysis originate from two depths, 0-20 cm and 50-70 cm.” In my opinion the authors need to structure the data set more hierarchically, according to soil classification. For example, the data collected could be grouped by horizon rather than by depth. In addition, the effects on spectral reflectance could be re-examined by grouping the soils by order. The authors correctly assign soil orders from the Australian Soil Classification (Isbell, 2002), but this classification is then totally ignored in the multivariate statistical analysis. The name of a soil, especially when detailed, gives a synthetic but irreplaceable description of the soil, its main properties and, perhaps more importantly, the processes that formed it. The soil name also enables a posteriori inferences on aspects not directly taken into account in a given work. The sampling methodology needed to assess vertical spatial variability was totally ignored.
The Authors reply: “We evaluated miniaturised spectrometers alone and in combinations, which is important as different spectral ranges contain different and potentially unique information; We compared seven different algorithms to demonstrate the robustness of the spectroscopic method; We validated the models using two methods to prevent overly promising results”. In addition, the Authors propose a new Figure 1 showing the experimental design of the study, which should summarise the methodology. In my opinion this figure makes the research even more confusing by combining instrumentation, algorithms and applications. The point is that the reader should not need to read many other papers to grasp the main features and elements of this one. In addition to the bibliography listed in the manuscript, the reader would have to read another 20 papers cited in the reply to the reviewers in order to finally understand the procedures, the instruments, the calibration set-up and the algorithms presented in this paper.
Comment 3. I remain sceptical about their interpretation of the healthy soil trend. The authors reply on this point: “We have already clarified the comments around the aims and that this research isn't about `remote sensing', but here, we must stress that this research isn't only a statistical exercise”, yet about half of the discussion concerns the comparison of algorithms and spectroradiometer performance. In my opinion, this study merely repeats well-known findings on predictive modelling approaches. In any case, there are two fundamental problems with the manuscript's evaluation of chemometric techniques for predicting soil chemical and biological properties from visNIR reflectance with this dataset. The first is that the dataset compares soil properties only on the basis of depth, which makes any defensible conclusion difficult, and there is no physical basis for some of the soil biological properties reported at 50-70 cm (CO2 emission?). The second is that the paper does not convincingly discuss the limitations of the approach and the potential biases due to the assumptions made. I do not think the authors can draw valid conclusions, because they did not have a true basis for comparison between the instruments used; for example, there is no indication of the optical set-up of the miniaturised spectrometers, the distance from the sample, the illumination, etc. The results are also limited by the quality of the sensor. To make accurate predictions, most systems need as much data as possible. On top of that, the soil data come from soil profiles. Machines cannot tell which data are good or bad; they can only process the data given to them.
Comment 4. I am sorry, but I still do not understand the Authors' reply. According to their reply, miniaturised spectrometers can represent a future “for assessing soil health indicators”, but I would have expected this paper to include some concrete application, such as the quantification or contamination mapping of copper or zinc, highlighting the hot-spot areas where activities need to be concentrated. I remain sceptical about the novelty of this research.
Comment 5. The authors reply: “We believe that the submitted manuscript addresses … soil physical, chemical and biological properties (or indicators) for assessments of soil health in mining.” Some soil properties are predicted using visNIR without any clear physical reasoning as to why visNIR should be able to measure them, for example CO2 emission, fungal richness and diversity, microbial community composition, and Fe and Mn (DTPA extracted?). In particular, these parameters were measured in topsoils and then compared with those at 50-70 cm, where biological activity is generally considered low.
Conclusion. The paper is generally well written but not always to the point, and may be suited only to specialists in soil spectroscopy. Unfortunately, I have several concerns about the approach used that could affect both the reliability of some data and the interpretation of the results. In short, I do not consider this work original enough, or of sufficient added value, to deserve publication in this journal, at least in its present state. I am sorry for the Authors, but I am no longer available for further review of this paper.
The paper discusses several statistical and machine learning algorithms for evaluating the spectrometers and the prediction accuracy of the models. Many soil physical, chemical and biological properties are targeted. It would provide a reference for further NIR applications. The paper should be improved before publication. Some suggestions are listed below.
Please give details of the experimental design for the spectroscopic measurements. How were the data in Figure 2 obtained?
Please give algorithmic details of the methods involved.
I do not think that 56 samples are enough for modelling. Please justify and validate this; otherwise I am not convinced by the results.
The definition of Lin’s concordance correlation is not given.
An analysis of prediction errors (such as RMSE) is needed in the Results and Discussion section.
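For reference, the two statistics requested in the last two suggestions can be written compactly. The sketch below is illustrative only and is not taken from the manuscript: the function names `rmse` and `lins_ccc` are hypothetical, the concordance coefficient is assumed to follow Lin's original definition with population (1/n) variances and covariance, and the example values are invented.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def lins_ccc(observed, predicted):
    """Lin's concordance correlation coefficient:
    2 * cov / (var_obs + var_pred + (mean_obs - mean_pred) ** 2),
    using 1/n variances and covariance (assumption stated in the text)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Illustrative values only: predictions shifted by a constant bias of +1.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]
example_rmse = rmse(obs, pred)      # 1.0
example_ccc = lins_ccc(obs, pred)   # 5/7, about 0.714
```

In this example the predictions are perfectly correlated with the observations, yet the constant bias lowers the concordance coefficient below 1 and gives a non-zero RMSE, which is why reporting both alongside the correlation is informative.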