Sunil SAHA, Debabrata SARKAR, Prolay MONDAL
Department of Geography, Raiganj University, Raiganj, Uttar Dinajpur, West Bengal, 733134, India
Keywords: Soil erosion risk; Revised Universal Soil Loss Equation (RUSLE); Analytic Hierarchy Process (AHP); Machine learning algorithms; Kappa coefficient; Ratlam District; India
ABSTRACT Evaluation of physical and quantitative data of soil erosion is crucial to the sustainable development of the environment. Extreme land degradation through different forms of erosion is one of the major problems in the sub-tropical monsoon-dominated region. In India, soil erosion is one of the major geo-environmental issues. Thus, identifying soil erosion risk zones and taking preventative actions are vital for crop production management. Soil erosion is induced by climate change, topographic conditions, soil texture, agricultural systems, and land management. In this research, the soil erosion risk zones of Ratlam District were determined by employing the Geographic Information System (GIS), Revised Universal Soil Loss Equation (RUSLE), Analytic Hierarchy Process (AHP), and machine learning algorithms (Random Forest and Reduced Error Pruning (REP) tree). RUSLE measured the rainfall erosivity (R), soil erodibility (K), length of slope and steepness (LS), land cover and management (C), and support practices (P) factors. The Kappa statistic was used to assess model reliability, and it was found that Random Forest and AHP have higher reliability than the other models. About 14.73% (715.94 km²) of the study area has very low risk of soil erosion, with an average soil erosion rate of 0.00–7.00×10³ kg/(hm²•a), while about 7.46% (362.52 km²) of the study area has very high risk of soil erosion, with an average soil erosion rate of 30.00×10³–48.00×10³ kg/(hm²•a). Slope, elevation, stream density, Stream Power Index (SPI), rainfall, and land use and land cover (LULC) all affect soil erosion. The current study could help government and non-government agencies to plan developmental projects and policies accordingly. Moreover, the outcomes of the present research could be used to prevent, monitor, and control soil erosion in the study area by employing restoration measures.
Soil erosion is a phenomenon of continuous deterioration in the soil caused by various environmental forces, one of which is deforestation. Plant roots hold soil in place while also mitigating the effects of atmospheric forces, thus preventing soil erosion. However, as a result of widespread deforestation, soil erosion is posing a serious threat to human civilization, particularly in areas with light- and medium-textured soils. Soil erosion is a global problem that can deplete nutrients and leave the surface soil less fertile. Increased surface flow over more impermeable subsoil could reduce the water supply available to vegetation. Land degradation has been a severe issue in the Yellow River Basin of China, where it typically takes the form of soil erosion brought on by climate change and insensitive human activity (Gao et al., 2018). Severe soil erosion in the Loess Plateau in China causes a number of ecological and economic issues, including decreased agricultural productivity, worsened rural poverty, reduced biodiversity, and sediment deposition in the lower reaches of the Yellow River, China (Li et al., 2017). Land degradation also poses a significant problem for environmental preservation in semi-arid and arid regions, where it manifests itself as soil erosion (Buttafuoco et al., 2012). Rocky desertification is a common process of land degradation accompanied by severe soil erosion, large-scale bare bedrock, a significant reduction in land productivity, and a desert-like landscape on the surface as a result of excessive social and economic activities (Guo et al., 2022). Rocky desertification causes soil erosion, loss of aquatic habitat, river sediment deposition, road damage, and reduced agricultural production, which will have a negative impact on sustainable development (Sharada et al., 2013; Gayen and Saha, 2017).
Over the last few decades, soil erosion has had a worldwide effect on the natural environment, resources, and agricultural productivity (Bakker et al., 2005; Prasannakumar et al., 2012). Climate change accelerates soil erosion by modifying weather and temperature patterns and gradually altering rainfall and land use and land cover (LULC), thus resulting in floods, droughts, and famine (Nearing et al., 2004; Zhao et al., 2013; Thapa, 2019). In addition, sediment accumulation in rivers impacts reservoirs and lakes by raising their repair costs and eventually rendering them unusable (Samaras and Kautitas, 2014; Masroor et al., 2022). Soil erosion has a significant impact on the agriculture sector in India, causing the aggradation of river beds, reservoir siltation, etc. About 1.30×10⁶ km² of land in India is affected by severe soil erosion resulting from rill and gully erosion, changing agriculture, sandy desertification, desertification, poor drainage, and waterlogging (Kothyari, 1996). An estimated 3.29×10⁶ km² of land, or approximately 53.00% of India’s overall territorial land, is projected to be eroding at a rate of 16.00×10³ kg/(hm²•a) (Jain et al., 2001; Pandey et al., 2009). It is predicted that approximately 5.33×10¹² kg of soil is removed yearly in India for a variety of reasons (Narayan and Babu, 1983; Pandey et al., 2007). It is estimated that the annual loss in the production of major crops in India due to soil erosion is 7.20×10⁹ kg, or 4.00%–6.30% of India’s annual agricultural output (Saroha, 2017). For a soil protection program to work, it is important to measure soil erosion and identify the priority areas for improvement and management.
Soil erosion is influenced by many factors,such as the type of soil,climatic condition,landscape,and LULC,as well as their interactions.Besides,the slope,aspect,shape of the surface,elevation,stream power,and Topographic Wetness Index (TWI) all affect the process of soil erosion.The slope and aspect will have a significant effect on the runoff process.The steeper slope leads to greater runoff,and therefore less infiltration.The runoff caused by the slope can find a direction nearby,resulting in soil erosion as the increase of runoff speed.Based on the required empirical and/or physical data sets,researchers use numerous soil erosion models like the Universal Soil Loss Equation (USLE),Morgan-Morgan-Finney (Morgan et al.,1984),European Soil Erosion Model (Morgan et al.,1992),Griffith University Soil Erosion Template (Ciesiolka et al.,1995;Coughlan and Rose,1997),Limburg Soil Erosion Model(de Roo et al.,1998),and Water Erosion Prediction Project (Laftan et al.,1998) to assess soil erosion.USLE was created by Wischmeier and Smith (1965) and is one of the most commonly used analytical models for measuring soil erosion.Initially,USLE was developed primarily to estimate soil erosion on farmland or areas of gently sloping terrain.Integrating USLE with remote sensing (RS) and geographic information systems (GIS) techniques can effectively assist in evaluating erosion-prone locations (Sharada et al.,1993;Welde,2016).Additionally,previous research indicates that integrating USLE with a grid-cell-based GIS enables the successful study of spatially dispersed soil erosion (Renschler et al.,1998;Onyando et al.,2005;Onori et al.,2006;Bhattarai and Dutta,2007).The developed USLE forms,i.e.,Revised Universal Soil Loss Equation (RUSLE) and Modified Universal Soil Loss Equation,are now used in a large number of experiments to measure soil erosion (Wischmeier and Smith,1978;Lee and Lee,2006).
Assessing the risk of soil erosion with conventional approaches is cumbersome. To estimate soil erosion, GIS and RS have been utilised successfully in conjunction with many different traditional methods such as RUSLE and Analytic Hierarchy Process (AHP) (Pradeep et al., 2015; Tairi et al., 2019; Das et al., 2020), and with machine learning algorithms such as Random Forest and Reduced Error Pruning (REP) tree (Arabameri et al., 2020). AHP remains one of the most popular methods for structuring complex decisions, and it is widely used because it is flexible and easy to apply. In AHP, a decision problem can be described at several hierarchical levels. With the help of a pairwise comparison matrix built on a scale of relative importance, AHP yields the weights of the selected factors that affect the suitability of the site (Kachouri et al., 2015).
For any number of scenarios,including cropping systems,management strategies,and erosion control practices,RUSLE can anticipate an average yearly rate of soil erosion for a site of interest (Angima et al.,2002).The confluence of existing models and machine learning algorithms for soil erosion (Mosavi et al.,2020),field data,and RS technology will allow for the development of new approaches.By using GIS,additional research might appear to be an asset (Xu et al.,2009;Gansri and Ramesh,2015).Shinde et al.(2010) demonstrated that RUSLE can predict the potential of soil erosion on a cell-by-cell basis.When scrambling to determine all the spatial patterns of soil erosion across a vast area,RUSLE works well.Aslam et al.(2021) used AHP for mapping soil erosion susceptibility and found that elevation,slope,curvature,Normalised Difference Water Index (NDWI),and rainfall are the most significant factors in influencing soil erosion.Saha et al.(2022) employed machine learning algorithms to explore land degradation in the eastern plateau region in India.Phinzi et al.(2021) employed RUSLE and Random Forest algorithm to explain the relationship between soil erosion and influencing factors.In the last few decades,RS and GIS have been employed to precisely estimate soil erosion on a watershed and basin scale (Zhu,2015;Singh and Panda,2017),along with RUSLE,AHP,and machine learning algorithms.For the estimation of the spatial distribution of soil erosion in the Dolakha District in Nepal,Thapa (2020) successfully employed RUSLE with GIS.According to Zhou and Wu (2008),GIS improves the model process by allowing the quantification of the impact of a single aspect on the total output.REP tree is also a useful technique in soil erosion mapping (Nhu et al.,2020;Arabameri et al.,2021).
Assessing soil erosion hazards using traditional techniques is time-consuming and costly (Singh et al., 1992; Pandey et al., 2007; Singh and Panda, 2017). Integrating current soil erosion models, field data, and information from satellite sensors through GIS appears to be beneficial for future studies (Gitas et al., 2009; Xu et al., 2009). A Digital Elevation Model (DEM) is a necessary input for simulating soil erosion. Agriculture employs a sizable percentage of the population in Ratlam District. Due to a lack of fertility, regions prone to soil erosion are underutilised for agriculture. The remaining productive agricultural fields are in moderate danger of soil erosion. Given the subtropical hot summer and general dryness that characterise the climatic setting of the study area, the effects of soil erosion on fertile land make this study critical for estimating soil erosion and assisting in the preservation of the remaining productive lands. In this study, the soil erosion risk zone of Ratlam District was modelled using geospatial technology, RUSLE, AHP, and machine learning algorithms. The current study could assist the government and non-governmental organisations in appropriately implementing development initiatives and policies. Furthermore, the findings of this study could be used to prevent, monitor, and manage soil erosion in the study region by using corrective restoration strategies.
Ratlam District is located in the northwest of Madhya Pradesh in India, and it is situated in the Ganga and Mahi river basins (Fig. 1). It is a significant tribal district in Madhya Pradesh’s Malwa region. Ratlam District is bounded on the north by the Mandsaur District, the south by the Jhabua and Dhar districts, the east by the Ujjain and Shajapur districts, the west by Banswara District (Rajasthan), and the northeast by Jhalawar District (Rajasthan). Ratlam District encompasses the latitudes of 23°05′–23°52′N and the longitudes of 74°31′–74°41′E, with a geographical area of 4861 km². The Chambal River flows west to east across the northern portion of Ratlam District. The Kshipra, Maleni, and Pingla rivers are significant tributaries to the Chambal River in Ratlam District. The Mahi River flows through the southwestern part of Ratlam District. Bageri, Jammer, Karan, Pundia, Bunad, Pampavati, and Telni are the major tributaries of the Mahi River. Ratlam District has gently sloping, hilly, and undulating terrain, with elevations ranging from 267 to 608 m a.s.l. The land mostly slopes towards the northeast. The high areas in the study district were identified as having major soil erosion and facing deforestation problems caused by tree cutting, overgrazing, and the loss of natural vegetation. The climate in this region is sub-tropical and rated as “Cwa” as per the Köppen climate classification system, with hot summers and cool winters (https://en.climate-data.org/asia/india/madhyapradesh/ratlam-24506/). As per climate records dating back 30 years, the annual temperature of Ratlam District ranged from 9.00°C (in winter) to 40.5°C (in summer) (https://en.climate-data.org/asia/india/madhya-pradesh/ratlam-24506/). This region receives an annual average rainfall of 1200.19 mm, the majority of which falls between July and August (https://en.climate-data.org/asia/india/madhya-pradesh/ratlam-24506/). Fine soil, loamy, fine loamy, gravelly loamy, and coarse loamy are the major soil types. Agriculture is the prevailing land use in the study area. The land cover in the study area varies from natural forests in the northeast to cultivated crops such as groundnut, sunflower, paddy, and corn in the mid-north parts. Waste fields or deteriorated land with or without bush were also identified in Ratlam District, as well as stony wastes or barren rocks. Ratlam District is more prone to soil erosion and associated environmental degradation as a result of heavy rainfall in a short period, changing patterns of agriculture, thin surface soil layer, natural denudation, the prevalence of barren hills, etc.
Fig.1.Location of the study area (Ratlam District) in Madhya Pradesh (a) and overview of Ratlam District (b).
3.1.Data sources
The data used in this study were collected from various secondary sources to estimate the values of several factors of the RUSLE model. A Landsat 8 Operational Land Imager (OLI) image with a spatial resolution of 30.0 m×30.0 m was downloaded from the United States Geological Survey (USGS) Earth Explorer Data Portal on 10 October 2020 to produce the LULC and Normalized Difference Vegetation Index (NDVI) maps of the study area, which were used to measure the support practice (P) and land cover and management (C) factors of RUSLE. The majority of agricultural fields were under cultivation at the time of image acquisition, which was confirmed by the field survey. Because the resolutions of the raw Landsat 8 OLI, DEM, and soil data are not identical, we standardized all of these data to a resolution of 30.0 m×30.0 m using the re-sampling technique in the GIS environment. The DEM was downloaded from the Alaska Data Portal with 12.5 m resolution, and thereafter the drainage pattern, slope, flow accumulation, and flow direction were estimated to derive the slope length and steepness (LS) factor. We obtained rainfall data from the Center for Hydrometeorology and Remote Sensing (CHRS) Data Portal to estimate the rainfall erosivity (R) factor, and collected data on soil properties such as sand (%), silt (%), clay (%), and soil organic carbon (SOC) (%) from the Soil Grid Data Portal (https://soilgrids.org/) to calculate the soil erodibility (K) factor. To avoid propagating errors into the estimation, we corrected and rescaled all collected data in the GIS environment to a spatial resolution of 30 m and the same scale. A detailed description of data sources is given in Table 1.
Table 1 Description of data used in this study.
3.2.Methods
3.2.1.Soil erosion modelling using Revised Universal Soil Loss Equation(RUSLE)
Difficulties associated with soil erosion, displacement, and sediment deposition in rivers have persisted throughout geologic time in almost all parts of the world (Buttafuoco et al., 2012). Nonetheless, growing human intervention in the environment has exacerbated the situation in recent years. In the present study, RUSLE was integrated with GIS and RS on a grid-cell basis for a detailed analysis of average soil erosion. RUSLE is perhaps the most commonly used computerised version of USLE, which is a statistical model designed to estimate annual average soil erosion per unit area. In this study, RUSLE was utilised to measure the probability of soil erosion by taking into consideration the R, K, LS, C, and P factors. The raster layer of each RUSLE factor was reclassified into several classes using the reclassify tool in the GIS environment (Singh and Panda, 2017) by the natural break classification method. RUSLE is a predictive erosion model utilised exclusively to forecast the effects of sheet and rill erosion under given cropping and management systems. It is suitable for estimating mean soil erosion over a longer period of time (Ganasri and Ramesh, 2016).
$$A = R \times K \times LS \times C \times P, \quad (1)$$
where A is the annual average soil erosion (×10³ kg/(hm²•a)); R is the rainfall erosivity factor (MJ•mm/(hm²•h•a)); K is the soil erodibility factor (×10³ kg•h•MJ/mm); LS is the slope length and steepness factor; C is the land cover and management factor; and P is the conservation practices factor.
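The multiplicative structure of RUSLE lends itself to a cell-by-cell raster computation. The following is a minimal sketch, assuming the five factor layers have already been resampled to a common grid and are held as equally shaped lists of rows; the function names and the sample values are hypothetical and are not taken from the study.

```python
def rusle_soil_loss(r, k, ls, c, p):
    """Annual average soil erosion A = R * K * LS * C * P for one grid cell."""
    return r * k * ls * c * p


def rusle_grid(r_grid, k_grid, ls_grid, c_grid, p_grid):
    """Apply the RUSLE product cell by cell to five equally shaped 2-D grids."""
    return [
        # zip(*rows) pairs up the five factor values that share one cell.
        [rusle_soil_loss(r, k, ls, c, p) for r, k, ls, c, p in zip(*rows)]
        for rows in zip(r_grid, k_grid, ls_grid, c_grid, p_grid)
    ]


# Hypothetical 1 x 2 grids, for illustration only.
a_grid = rusle_grid(
    [[500.0, 600.0]],  # R
    [[0.5, 0.25]],     # K
    [[2.0, 1.0]],      # LS
    [[0.1, 0.5]],      # C
    [[1.0, 0.5]],      # P
)
```

Because every factor enters as a plain multiplier, a cell with any factor equal to zero (for example, C close to 0 under dense cover) yields zero predicted erosion regardless of the other layers.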
In this research work, all the methods that were applied to calculate the RUSLE factors are applicable to Indian climatic and physiographic conditions. These methods have previously been widely used to map erosion-prone regions under Indian conditions (Singh et al., 1992; Pandey et al., 2007; Singh and Panda, 2017; Gayen et al., 2019).
3.2.2.Generation of the soil erosion factors using RUSLE
Rainfall, runoff, and drainage play a major role in soil erosion and therefore are commonly known as the variables of the R factor; this factor expresses the capacity of rainfall to induce soil erosion. To estimate the R factor, the Fournier index of Arnoldus (1980) was used, because this method is more suitable for tropical monsoon-type climates. The rainfall data were collected from the CHRS Data Portal (12 sample spots). These data were interpolated over Ratlam District using the Inverse Distance Weighting interpolation technique, and the rainfall erosivity was converted using Equations 2 (Hui et al., 2010) and 3.
where F is the Fournier index; i stands for month (i=1, 2, …, 12); $r_i$ is the rainfall in the i-th month (mm); and p is the annual precipitation (rainfall) (mm).
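As an illustration of the index defined by these symbols, the sketch below assumes the modified Fournier form associated with Arnoldus (1980), i.e., the sum of squared monthly rainfall divided by annual precipitation. The conversion from F to the R factor used in the study is not reproduced here, and the function name is hypothetical.

```python
def fournier_index(monthly_rainfall_mm):
    """Modified Fournier index F = sum(r_i ** 2) / p (assumed form).

    monthly_rainfall_mm: twelve monthly rainfall totals r_i in mm.
    p is taken as the sum of the twelve monthly totals.
    """
    if len(monthly_rainfall_mm) != 12:
        raise ValueError("expected 12 monthly rainfall values")
    annual = sum(monthly_rainfall_mm)
    if annual <= 0:
        return 0.0  # no rainfall, so no rainfall-driven erosivity
    return sum(r * r for r in monthly_rainfall_mm) / annual
```

For the same annual total, the index grows as rainfall becomes more concentrated in a few months, which is why it suits monsoon-type regimes: twelve equal months of 100 mm give F = 100, whereas 1200 mm falling in a single month gives F = 1200.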
The K factor expresses the susceptibility of soil to the erosive effects of rainfall and runoff in a particular area, and it depends on the soil materials. It reflects the susceptibility of soil particles to detachment, the transportability of sediment, and the amount of runoff generated by a given rainfall input. The value of the K factor was estimated using the equation of William (1995, 2000) and the Soil Grid data. Equations 5–8 were used to calculate the determinants of the K factor of RUSLE.
where $f_{csand}$ is a factor that gives a low K factor for soils containing a lot of coarse sand and a high K factor for soils with very little sand; $f_{cl-si}$ is a factor that gives low soil erodibility for soils with a high clay-to-silt ratio; $f_{SOC}$ is a factor that reduces soil erodibility for soils with high organic carbon content; and $f_{hisand}$ is a factor that decreases erodibility for soils with extremely high sand content.
where $m_s$ is the sand fraction content (diameter of 0.050–2.000 mm) (%); $m_{silt}$ is the silt fraction content (diameter of 0.002–0.050 mm) (%); $m_c$ is the clay fraction content (diameter <0.002 mm) (%); and SOC is the soil organic carbon content (%).
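The four determinants can be combined as in the following sketch. It assumes the widely cited EPIC form of the Williams (1995) equations (as documented, for example, for the SWAT model); the coefficients below come from that commonly published form and are not confirmed by the text of this study, and any unit conversion applied by the authors is not shown.

```python
import math


def k_factor_williams(sand, silt, clay, soc):
    """Soil erodibility from texture and organic carbon (all inputs in %).

    Assumed EPIC/Williams (1995) form: K = f_csand * f_cl_si * f_soc * f_hisand.
    """
    if clay + silt <= 0:
        raise ValueError("clay + silt must be positive")
    sn1 = 1.0 - sand / 100.0
    # Low for soils rich in coarse sand, higher for soils with little sand.
    f_csand = 0.2 + 0.3 * math.exp(-0.256 * sand * (1.0 - silt / 100.0))
    # Low for soils with a high clay-to-silt ratio.
    f_cl_si = (silt / (clay + silt)) ** 0.3
    # Reduces erodibility as organic carbon increases.
    f_soc = 1.0 - 0.25 * soc / (soc + math.exp(3.72 - 2.95 * soc))
    # Reduces erodibility for extremely sandy soils.
    f_hisand = 1.0 - 0.7 * sn1 / (sn1 + math.exp(-5.51 + 22.9 * sn1))
    return f_csand * f_cl_si * f_soc * f_hisand


# Hypothetical soil sample: 40% sand, 40% silt, 20% clay, 1% SOC.
k_example = k_factor_williams(40.0, 40.0, 20.0, 1.0)
```

Under this form, silt-rich soils with little organic carbon return the highest values, which is consistent with the qualitative description of the four factors above.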
The topsoil cover was defined by all soil fractions,including sand,silt,clay,and organic carbon,since it is directly influenced by raindrop energy.
The LS factor represents the ratio of soil erosion under given conditions to that at a site with the standard slope steepness and length. Slope length is defined as the distance from the point of origin of overland flow to the point where deposition begins on the downward slope. To generate the LS factor, we obtained a DEM with 12.5 m resolution (resampled to 30.0 m) from the Alaska Data Portal. For data pre-processing, the fill sink algorithm was used to fill sinks in the DEM using the spill elevation method. After calculating the flow direction and flow accumulation from the slope, the slope length was measured using the following equation in the GIS environment. A larger value of the LS factor indicates a higher potential for soil erosion and vice versa. In this research work, the LS factor (Eq. 9) was estimated using the approach of Moore and Burch (1985).
where L is the slope length (m), which is the product of flow accumulation and resolution of DEM (Moore and Burch, 1985); and θ is the slope in degree (°).
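A minimal sketch of an LS computation in the spirit of Moore and Burch follows. The constants 22.13 and 0.0896 and the exponents 0.4 and 1.3 are those of a frequently used GIS formulation and are assumptions here; the exact form of Eq. 9 in the study may differ, and the function name is hypothetical.

```python
import math


def ls_factor(flow_accumulation, cell_size_m, slope_deg, m=0.4, n=1.3):
    """LS = (L / 22.13) ** m * (sin(theta) / 0.0896) ** n (assumed form).

    L is the slope length in metres, taken as flow accumulation (cells)
    multiplied by the DEM resolution; theta is the slope in degrees.
    """
    slope_length = flow_accumulation * cell_size_m
    sin_theta = math.sin(math.radians(slope_deg))
    return (slope_length / 22.13) ** m * (sin_theta / 0.0896) ** n


# Hypothetical cell: 10 contributing cells on a 30 m grid with a 5 degree slope.
ls_example = ls_factor(10, 30.0, 5.0)
```

In this form, the value equals 1 on the reference plot (22.13 m long, sin θ = 0.0896) and increases faster with slope steepness than with slope length, because the steepness exponent is larger.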
The C factor is the ratio of soil erosion from land under a given vegetation cover to the corresponding erosion from bare, uncovered land. The relevance of this factor is determined by the soil cover, management practices, and stage of crop development, as well as by the fact that erosive rain can occur at any time. The C factor ranges from 0 to 1, depending on the vegetation cover and the cropping management systems used to reduce soil erosion. The NDVI is a commonly used measure of the vitality of green plants, and it is calculated by the following formula (Eq. 10):
$$NDVI = \frac{Band\ 5 - Band\ 4}{Band\ 5 + Band\ 4}, \quad (10)$$
where Band 5 is the near infrared band; and Band 4 is the red band.
The C factor was estimated using Equations 11 and 12 (van der Knijff et al., 1999; Durigon et al., 2014), which were adopted by Colman (2018), using the raster calculator in the GIS environment.
where $C_{VK}$ is based on NDVI for European climate conditions; α and β determine the shape of the curve that associates NDVI with the C factor; and $C_{RA}$ is based on NDVI for tropical climate zones with more intense rainfall.
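The NDVI-to-C conversions can be sketched as follows. The exponential form with α = 2 and β = 1 is the one commonly attributed to van der Knijff et al. (1999), and the linear form is the one commonly attributed to Durigon et al. (2014); both are assumptions about Equations 11 and 12, whose exact expressions are not shown in the text. Function names are hypothetical.

```python
import math


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); Landsat 8 OLI Band 5 and Band 4."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on no-data pixels
    return (nir - red) / total


def c_factor_vk(ndvi_value, alpha=2.0, beta=1.0):
    """Assumed van der Knijff form: C = exp(-alpha * NDVI / (beta - NDVI))."""
    if ndvi_value >= beta:
        return 0.0  # limit of the exponential as NDVI approaches beta
    # Clamp to 1.0 because negative NDVI would otherwise give C > 1.
    return min(1.0, math.exp(-alpha * ndvi_value / (beta - ndvi_value)))


def c_factor_ra(ndvi_value):
    """Assumed Durigon form for tropical conditions: C = (1 - NDVI) / 2."""
    return (1.0 - ndvi_value) / 2.0
```

Both forms decrease as NDVI increases, so denser vegetation yields a lower C factor; the exponential form drops off much faster at moderate NDVI than the linear one.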
The P factor in RUSLE is the ratio of soil erosion under a given support practice to the corresponding soil erosion under straight-line uphill and downhill tillage (Naqvi et al., 2012). The P factor expresses the effects of the surface management activities that are used to mitigate soil erosion. Terracing, strip cropping, and contour ploughing are examples of these management activities (Marondedze et al., 2020). The P factor values range from 0.00 to 1.00 and depend on the land management practices (Roslee et al., 2019). The values of the P factor were generated from the LULC map of the study area. To classify the LULC, we used the natural colour combination (4-3-2) after compositing the Landsat imagery. Thereafter, the maximum likelihood classification method was applied in the GIS environment. Maximum likelihood classification determines the probability that a given pixel belongs to a specific class, assuming that the statistics for each class in each band are normally distributed. During the field survey and documentation of farmers’ perceptions, it was found that farmers grew a diverse range of crops such as groundnut, sunflower, paddy, and corn, but the lands were not designated for a specific crop. These lands alternated between various crops and seasons. As a result, mapping crop-specific LULC is very challenging, which is why we classified all crops as “cultivated crops”. The average acceptable values of the P factor for each land cover type are shown in Table 6.
Table 2 Relative importance scale for AHP’s pair-wise matrix formation (Saaty,1980).
3.2.3.Soil erosion modelling using Analytic Hierarchy Process(AHP)
The AHP is a way to use mathematics and psychology to organise and analyse difficult choices (Saaty,1980).
The primary pair-wise (Eq.13) matrix classification in AHP (Saaty,1980;Chandio et al.,2011) is based on 1–9 scales of relative importance (Table 2),with 1 indicating “equally important” and 9 indicating “absolute importance”(Saaty,1977).
where a indicates the criterion value of a matrix cell, i.e., the relative importance of one criterion compared with another.
The primary pairwise comparison matrix has been analysed using Octave 6.2.0 (GUI) (Eaton et al.,1977) software to estimate the weight of the variables and the consistency of the matrix (Table 3).
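The weight and consistency estimation can be illustrated with the sketch below. It is not the Octave script used in the study: it assumes the standard Saaty procedure (principal eigenvector for the weights, consistency ratio from the principal eigenvalue and Saaty's random index), approximates the eigenvector by power iteration, and uses a hypothetical 3×3 matrix.

```python
# Saaty's random consistency index (RI) for matrices of order 1 to 10.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(matrix, iterations=100):
    """Return (weights, lambda_max, consistency_ratio) for a pairwise matrix."""
    n = len(matrix)
    if n not in RANDOM_INDEX:
        raise ValueError("random index is tabulated only for n = 1..10")
    weights = [1.0 / n] * n
    for _ in range(iterations):
        # Power iteration: multiply by the matrix and renormalise to sum 1.
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    product = [sum(matrix[i][j] * weights[j] for j in range(n))
               for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return weights, lambda_max, cr


# Hypothetical, perfectly consistent comparison of three criteria.
example_matrix = [[1.0, 2.0, 4.0],
                  [0.5, 1.0, 2.0],
                  [0.25, 0.5, 1.0]]
example_weights, example_lambda, example_cr = ahp_weights(example_matrix)
```

A consistency ratio below 0.10 is the threshold conventionally recommended by Saaty for accepting a pairwise comparison matrix; larger values suggest that the judgements should be revised.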
The REP tree is a fast decision tree learner that builds a decision or regression tree using information gain as the splitting criterion and then prunes it using reduced-error pruning (Mohamed et al., 2012; Kalmegh, 2015). It applies the logic of the regression tree to generate several trees in different iterations (Mallick et al., 2021) and then selects the best of these trees as the representative one. The mean square error of the tree predictions is used as the metric. The Waikato Environment for Knowledge Analysis (WEKA) platform (The University of Waikato, Hamilton, New Zealand) was used to run the REP tree (Saleh et al., 2020), as it is a good platform for running several machine learning algorithms.
3.2.4.Soil erosion modelling using machine learning algorithms
Random Forest is an instance of the general technique of random decision forests (Phinzi et al., 2021). A random decision forest is an ensemble learning technique for classification, regression, and other tasks (Paul et al., 2021). Random Forest can deal with large datasets with many dimensions, improves the accuracy of the model, and reduces overfitting. It is based on the concept of ensemble learning, which is a method for solving difficult problems and improving the performance of a model by combining different classifiers (Cheng et al., 2018).
3.2.5.Weighted Linear Combination Method(WLCM)
The WLCM is a rule for using GIS to make composite maps.It is a common method for making decisions in the GIS environment (Malczewski,2000).The WLCM is an analytical approach that can be used for multi-factor decision-making or when more than one factor deserves to be taken into account (Eq.14).A criterion is any major consideration that is taken into account.Each criterion is weighted to reflect how important it is.
where n is the number of variables; $Criteria_1$ is the first variable; and rating indicates the estimated weight of the variable using AHP.
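For a single grid cell, the combination rule can be sketched as a weighted sum of the criterion values. This assumes the usual WLC form, in which each criterion is multiplied by its AHP-derived weight and the products are summed; dividing by the sum of the weights is an added safeguard and has no effect when the weights already sum to 1. The function name and the sample values are hypothetical.

```python
def weighted_linear_combination(criteria_values, weights):
    """Composite score = sum(weight_i * criterion_i) / sum(weight_i)."""
    if len(criteria_values) != len(weights):
        raise ValueError("one weight is required per criterion")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(w * x for w, x in zip(weights, criteria_values))
    return weighted_sum / total_weight


# Hypothetical cell: three reclassified criteria (1 = low risk, 5 = high risk).
score = weighted_linear_combination([1, 3, 5], [0.5, 0.3, 0.2])
```

Applying the same function to every cell of the reclassified criterion rasters produces the composite risk map, which can then be divided into risk classes.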
3.2.6.Model calibration and validation
The Kappa statistic can be used to compare an observed accuracy to an expected accuracy (random chance) (Lillesand and Kiefer, 1995). Specifically, the Kappa statistic is used not only to evaluate a single classifier but also to compare two or more classifiers with each other. After the calculation of the overall accuracy (Eqs. 15–17), the expected probability of chance agreement can be calculated from the confusion matrix (Ukrainski, 2019).
Thereafter, the Kappa coefficient is estimated using the accuracy and the expected probability of chance agreement (McHugh, 2012). Inter-rater reliability is typically evaluated using the Kappa statistic. Rater reliability matters because it represents the degree to which the data collected for the study are accurate representations of the measured variables. The interpretation of the Kappa coefficient values is shown in Table 4, where a value less than 0.00 indicates poor strength of reliability and 1.00 indicates almost perfect strength of reliability (Kurande et al., 2013).
where TV is the true value; SV is the sample value; $P_C$ is the expected probability of chance agreement; $x_{ci}$ is the cell value from the confusion matrix table; $x_{ri}$ is the row-wise performance value of the cell; and K stands for the Kappa coefficient. The flow diagram of the whole study is given in Figure 2.
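The calculation can be illustrated with the standard Cohen's Kappa computed from a square confusion matrix. This is a sketch of the general procedure rather than a reproduction of Eqs. 15–17; the function name and the sample matrix are hypothetical.

```python
def cohen_kappa(confusion):
    """Kappa = (P_o - P_c) / (1 - P_c) from a square confusion matrix.

    P_o is the overall accuracy (diagonal sum / total) and P_c is the
    expected probability of chance agreement from row and column totals.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    if expected == 1.0:
        return 1.0  # degenerate case: every sample falls in a single class
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-class matrix (rows: reference, columns: predicted).
kappa_example = cohen_kappa([[20, 5], [10, 15]])
```

In this example the overall accuracy is 0.70 and the chance agreement is 0.50, so the Kappa coefficient is 0.40, noticeably lower than the raw accuracy because half of the agreement could have arisen by chance.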
Table 4 Interpretation of the Kappa coefficient values by Landis and Koch scale (Kurande et al.,2013).
4.1.Execution of RUSLE factors
4.1.1.Rainfall erosivity(R)factor
Previous studies (e.g., Angima et al., 2003; Dabral et al., 2008) showed that the average soil erosion rate in catchment areas in highland regions is more sensitive to rainfall. In this study, the R factor was found to be the most predominant and sensitive determinant of soil erosion, as the study area is characterized by hot summers and general dryness. The annual average rainfall of the study area is about 992.90 mm (Central Ground Water Board, 2013). The R factor values range from 432 to 750 MJ•mm/(hm²•h•a), with 27.52% (1347.39 km²) of the area being most vulnerable to soil erosion and 9.09% (442.02 km²) being least vulnerable (Fig. 3a; Table 5). The least vulnerable parts of the study area are in the north and northeast, while the most vulnerable parts are in the south.
4.1.2.Soil erodibility(K)factor
The K factor expresses the resistance of soil or surface material to erosion (Gayen et al., 2020), the ease of sediment transport, and the volume and rate of runoff for a given rainfall input, as calculated under standard conditions (Kim et al., 2006). The K factor is affected in particular by particle size distribution, organic matter composition, structure, and permeability (Dissanayake et al., 2019). The calculated K factor values of the study area range from 0.10×10⁻⁶ to more than 0.40×10⁻⁶ kg•h•MJ/mm, with 27.89% (1355.73 km²) of the study area being the most vulnerable to soil erosion and 5.84% (283.88 km²) of the study area having the lowest K factor values (Fig. 3b; Table 5). The northern, north-eastern, and south-western parts of the study area are more vulnerable to soil erosion, accounting for 56.23% (K factor values of 0.15×10⁻⁶–0.25×10⁻⁶ kg•h•MJ/mm) of the total area.
Fig. 2. Flow diagram of the whole study. DEM, Digital Elevation Model; A, annual average soil erosion; R, rainfall erosivity; K, soil erodibility; LS, length of the slope and steepness; C, land cover and management; P, support practice; SOC, soil organic carbon; NDVI, Normalized Difference Vegetation Index; LULC, land use and land cover; RUSLE, Revised Universal Soil Loss Equation; AHP, Analytic Hierarchy Process; TWI, Topographic Wetness Index; SPI, Stream Power Index; REP tree, Reduced Error Pruning tree.
4.1.3.Slope length and steepness(LS)factor
The LS factor represents the vulnerability of a given location to topographic erosion (Bhandari et al.,2015).In this study,it was verified that the LS factor is one of the most predominant and sensitive determinants of soil erosion as the Malwa Plateau (east),Sailana Plateau,Western hills of Sailana,Chambal valley,and Mahi valley are the major physiographic units (Central Ground Water Board,2013).Slope steepness is a measure of the slope’s effect on the rate of soil erosion.The gradient of the landscape has a greater effect on soil erosion than the length of the slope (Teh et al.,2011).Table 5 shows that approximately 64.90% of the region has a very low slope,and 26.89% of the area is covered by a moderately steep and very steep slope.The northern,eastern,and north-western parts of Ratlam District are least susceptible to soil erosion,with LS factor values less than 0.52 (Fig.3c).
4.1.4.Land cover and management(C)factor
The C factor describes the condition of the land in terms of vegetation density. Higher C factor values suggest a greater propensity for soil erosion since they indicate barren land with less green cover (Tanyaş et al., 2015). The NDVI map obtained from Landsat-8 imagery was used to determine vegetation cover and management data. The spatial distribution of NDVI in the study area was classified into five classes: –0.15 to 0.10, 0.10 to 0.15, 0.15 to 0.20, 0.20 to 0.28, and 0.28 to 0.55. The relative effect of management choices can be compared directly through changes in the C factor, which ranges from near 0.000 for well-covered land to 1.000 for barren or bare areas. The C factor values of Ratlam District range from 0.022 to 1.000 (Fig. 3d; Table 5). About 20.37% of the total area has less green cover, which makes it more vulnerable to soil erosion. In addition, 60.20% of the total area (2926.24 km²) is moderately to highly vulnerable to soil erosion, with C factor values of 0.050 to 0.500.
Fig.3.Spatial distribution of RUSLE factors.(a),R factor;(b),K factor;(c),LS factor;(d),C factor;(e),P factor.
Table 5 Area distribution of RUSLE factors (rainfall erosivity (R),soil erodibility (K),slope length steepness (LS),land cover and management (C),and support practice (P) factors) in different classes (based on natural break).
4.1.5.Support practice(P)factor
The P factor is the ratio of soil erosion associated with a specific support practice to the corresponding loss with up-and-down slope cultivation (Prasannakumar et al.,2011).The P factor reflects the impact of specific soil management practices such as contour cultivation,strip cropping,terrace cultivation,and subsurface drainage.The LULC map produced from Landsat-8 imagery was used to evaluate the P factor values.The LULC of the study area was categorised into five classes:upland,vegetative land,water body,crop land,and built-up land (Fig.3e;Table 6).Figure 3e depicts the P factor values of the study area.About 51.72% of the study area has P factor values of 0.90–1.00 and is the most susceptible to soil erosion,whereas only 16.12% of the study area is least sensitive to soil erosion,with P factor values less than 0.50 (Table 5).
Table 6 P factor values of various LULC classes.
4.2.Execution of soil erosion factors for Analytic Hierarchy Process (AHP) and machine learning algorithms
Figure 4 shows the spatial distribution of the selected variables (aspect,slope,elevation,TWI,stream density,SPI,NDVI,distance from the river,rainfall,flow direction,flow accumulation,geomorphology,LULC,sand,silt,clay,and SOC) and meteorological station for the evaluation of soil erosion risk zone.Aspect is the direction of a slope (Stage,1976).It is measured clockwise from 0.00° to 360.00°,where 0.00° is facing north,90.00° is facing east,180.00° is facing south,and 270.00° is facing west (Fig.4a).The soil erosion tendency is determined by the aspect as well as the slope (Liu et al.,2001).About 49.77% of the study area (2419.08 km2) belongs to the low slope (<1.78°) and only 1.21% of the area (58.93 km2) has a very high slope (15.64°–56.99°) (Fig.4b).Elevation (Fig.4c) is the height above the ground or another surface,as well as a high place or position (Wang et al.,2019).About 63.13% of the area (3068.75 km2) has moderate to high elevation (403–540 m).TWI simply has physical importance based on gravity runoff and does not take into account other factors (Ma et al.,2010).The TWI is routinely used to quantitatively replicate soil moisture conditions in a watershed,and it is also the most regularly used indicator for static soil moisture content (Sharma,2010).About 72.09% of the total area (3507.28 km2) has very low TWI (<7.9) (Fig.4d),which indicates low soil moisture content.Stream density is calculated by dividing the length of all channels within the basin by the basin’s area (Ali et al.,2018).The stream density of 60.15% of the area (2923.58 km2) is between 0.297 and 1.012 m/m2 (Fig.4e).The Stream Power Index (SPI) is a measure of the erosive power of flowing water.The SPI is calculated from the slope and the contributing area (Danielson,2013).About 60.79% of the area (2954.87 km2) has high SPI values (1,488,865–2,732,985) (Fig.4f),indicating that the region is more prone to soil erosion.NDVI is especially useful for monitoring vegetation on a continental to global scale because it can account for changes in illumination,slope,and viewing angle.Nevertheless,NDVI tends to saturate where vegetation is dense,and it is sensitive to the colour of the underlying soil (Ayalew et al.,2020).About 62.64% of the total area (3044.92 km2) has low to moderate vegetation cover (Fig.4g),which is more prone to soil erosion.The erosion intensity is also determined by distance from the river (Gayen and Saha,2017) (Fig.4h).Rainfall is the foremost cause of soil erosion (Römkens et al.,2002) and has a direct effect on soil particles,the breakdown of soil aggregates,and the movement of eroded sediment.About 62.83% of the total area (3054.08 km2) is located in the heavy rainfall zone with average annual rainfall of 1150.00–1429.00 mm (Fig.4i).High soil erosion risk can be attributed to fields with low flow accumulation (Jain and Kothyari,2000).Flowing water is a major factor in erosion (Fig.4j and k).Rocks and soil can be worn away by moving water (Römkens et al.,2002).Water breaks up minerals in rocks and carries the minerals with it.Geomorphology is a good indicator (Hancock et al.,2010) of the rock structure.Geomorphologically,the study area is classified as ‘anthropogenic origin’,‘denudational origin’,‘fluvial origin’,‘structural origin’,and ‘water body’ (Fig.4l).The proportion of soil erosion is affected by the type of LULC (Kidane et al.,2019).Improper agriculture,less vegetation cover,and unscientific growth aggravate the quantity of soil erosion (Fig.4m).
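The two terrain indices described above are commonly written as SPI = A_s·tan(β) and TWI = ln(A_s/tan(β)), where A_s is the specific catchment area and β is the local slope. The sketch below assumes these common formulations; the paper does not state which variant or scaling was used, and the function name `spi_twi` and the sample inputs are hypothetical.

```python
import numpy as np

def spi_twi(specific_catchment_area, slope_deg, eps=1e-6):
    """Return (SPI, TWI) from specific catchment area and slope in degrees.

    Assumed formulations: SPI = A_s * tan(beta), TWI = ln(A_s / tan(beta)).
    """
    a_s = np.asarray(specific_catchment_area, dtype=float)
    tan_b = np.tan(np.radians(np.asarray(slope_deg, dtype=float)))
    spi = a_s * tan_b
    # Clamp the slope term only for TWI to avoid division by zero on flat cells.
    twi = np.log(a_s / np.maximum(tan_b, eps))
    return spi, twi

# Hypothetical cells: catchment area in metres, slope in degrees.
spi, twi = spi_twi([50.0, 400.0, 1200.0], [1.5, 8.0, 20.0])
print(spi, twi)
```

Steeper cells with large contributing areas yield higher SPI, while gentle cells with large contributing areas yield higher TWI.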
4.3.Generation of soil erosion risk zone maps
The final soil erosion risk zone maps were generated in the GIS environment and categorised into priority scales such as Zone I,Zone II,Zone III,Zone IV,and Zone V for Indian circumstances (Singh et al.,1992;Pandey et al.,2007;Singh and Panda,2017),where Zone I depicts the least vulnerability of soil erosion and Zone V shows the highest vulnerability of soil erosion (Ganasri and Ramesh,2016).
4.3.1.Analysis of soil erosion risk zones using RUSLE
The soil erosion risk zones were generated in the GIS environment and categorised into five classes:Zone I (very low),Zone II (low),Zone III (moderate),Zone IV (high),and Zone V (very high).To achieve the final output of soil erosion risk using the RUSLE model,we multiplied the RUSLE factors together and generated the thematic map (Fig.5a;Table 7).The spatial distribution of the soil erosion risk in the study area was as follows:Zone I with an area of 1177.15 km2 (24.22% of the study area),Zone II with 19.74% (959.43 km2),Zone III with 18.72% (910.21 km2),Zone IV with 22.48% (1092.55 km2),and Zone V with 14.85% (721.66 km2) (Table 7).
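The multiplication of the RUSLE factors, A = R × K × LS × C × P, can be sketched as a cell-by-cell raster product. This is a minimal illustration and not the authors' GIS workflow. The function name `rusle_soil_loss` and all factor values below are hypothetical, except the R value of 586.36, which is the three-year average reported in this study.

```python
import numpy as np

def rusle_soil_loss(R, K, LS, C, P):
    """Return A = R * K * LS * C * P for scalars or equally shaped grids."""
    R, K, LS, C, P = (np.asarray(x, dtype=float) for x in (R, K, LS, C, P))
    return R * K * LS * C * P

# Hypothetical 2 x 2 factor grids; R is applied as a single scalar here.
K = np.array([[0.20, 0.25], [0.30, 0.22]])
LS = np.array([[0.40, 1.50], [3.20, 0.52]])
C = np.array([[0.05, 0.50], [1.00, 0.022]])
P = np.array([[0.50, 0.90], [1.00, 0.70]])

A = rusle_soil_loss(586.36, K, LS, C, P)
print(A)
```

The units of the result follow from the units of the input layers, so the factor rasters must be prepared consistently before they are multiplied.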
4.3.2.Analysis of soil erosion risk zones using AHP
The final map of soil erosion risk zones was generated by employing WLCM on AHP weights in the GIS environment (Fig.5b).Zone I has an area of 1381.72 km2(28.42% of the study area),Zone II has 23.44% (1139.45 km2),Zone III has 24.04% (1168.47 km2),Zone IV has 13.10% (636.90 km2),and Zone V has 10.99% (534.46 km2)(Table 7).
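A weighted linear combination of factor layers can be sketched as follows, assuming each layer has already been reclassified to a common score scale. The weights and scores are hypothetical placeholders, not the AHP weights derived in this study, and the function name `weighted_linear_combination` is illustrative.

```python
import numpy as np

def weighted_linear_combination(layers, weights):
    """Return the weighted sum of equally shaped score layers.

    Weights are normalised so that they sum to 1 before being applied.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    stack = np.stack([np.asarray(layer, dtype=float) for layer in layers])
    # Contract the weight vector against the layer axis of the stack.
    return np.tensordot(weights, stack, axes=1)

# Hypothetical reclassified scores (1 = low risk, 5 = high risk).
slope = np.array([[1, 3], [5, 2]])
rainfall = np.array([[2, 2], [4, 4]])
lulc = np.array([[1, 5], [3, 3]])

risk = weighted_linear_combination([slope, rainfall, lulc], [0.5, 0.3, 0.2])
print(risk)
```

The continuous index produced this way would then be sliced into the five zones.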
Fig.4.Spatial distribution of the selected variables and meteorological station for the evaluation of soil erosion risk zone.(a),aspect;(b),slope;(c),elevation;(d),TWI;(e),stream density;(f),SPI;(g),NDVI;(h),distance from river;(i),rainfall;(j),flow accumulation;(k),flow direction;(l),geomorphology;(m),LULC;(n),sand;(o),silt;(p),clay;(q),SOC;(r),sample point.
4.3.3.Analysis of soil erosion risk zones using machine learning algorithms
The prediction values for machine learning algorithms were calculated using the WEKA platform.Thereafter,the soil erosion risk zone maps (Fig.5c and d) were generated in the GIS environment.It is found that for REP tree algorithm,only 5.23% (254.27 km2) of the study area belongs to Zone I,whereas for Random Forest analysis,it is about 1.04%(50.63 km2).
4.3.4.Average soil erosion risk zones
Twenty soil erosion waypoints were collected with GPS during a field visit to the study area for ground-truth verification.Soil erosion probability points were extracted from the final soil erosion risk map.According to the study’s results,14.73% (715.94 km2) of the study area belongs to Zone I,which has the lowest risk of soil erosion.Only 7.46% (362.52 km2) of the study area is in Zone V,which has the highest risk of soil erosion (Table 7).
The findings of this study are particularly noteworthy because India is densely populated and,as a result,food consumption is rising rapidly (Pandey et al.,2020).This increases the stress on the area already under cultivation.In the field,it was found that farmers were not well informed about how to maximise harvests while preserving soil health.This study conducted in Ratlam District will help to identify which fields are at high soil erosion risk and which are least susceptible.Based on this prioritization,the government can implement appropriate soil erosion control programs,benefiting sustainable development.The K and LS factors can be changed slowly and gradually by adding organic matter and establishing terraces.On the other hand,the C and P factors can be changed and reduced rapidly each season through planting and residue management.Due to a lack of real-world data,it is very hard to determine how water erosion affects the productivity and economic viability of current farming systems.From this study,it is found that the southern and south-western parts of Ratlam District are the most vulnerable to soil erosion.Mahi,Chambal,and Shipra are the main rivers in the southern and south-western parts of Ratlam District.The south-western parts of the study area are moderately vulnerable to soil erosion.
Fig.5.Spatial distribution of soil erosion risk zone in the study area using different techniques.(a),RUSLE;(b),AHP;(c),Random Forest;(d) REP tree.
Table 7 Area distribution of different soil erosion risk zones using different techniques.
4.4.Soil erosion map
The average annual soil erosion (Fig.6) of the study area was estimated using a three-year averaged R factor value of 586.36 MJ•mm/(hm2•a).The calculation was conducted using the raster calculator in the GIS environment.The annual average soil erosion was calculated using RUSLE parameters at a 30 m×30 m grid scale.Finally,the projected average annual soil erosion on a grid-cell basis was consolidated and categorized into priority scales for Indian circumstances:Zone I with annual soil erosion of 0.00–7.00×103 kg/(hm2•a),Zone II with annual soil erosion of 7.00×103–11.00×103 kg/(hm2•a),Zone III with annual soil erosion of 11.00×103–20.00×103 kg/(hm2•a),Zone IV with annual soil erosion of 20.00×103–30.00×103 kg/(hm2•a),and Zone V with annual soil erosion of 30.00×103–48.00×103 kg/(hm2•a).The erosion severity and rate map produced by this analysis makes it easier to devise better land management measures.The cross-comparison of the results also shows the role of geo-environmental variables in increasing or decreasing the risk of soil erosion.
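The zoning of erosion rates can be sketched as a simple threshold lookup using the class limits stated above. The text does not say which zone a value exactly on a shared boundary belongs to, so assigning it to the lower zone is an assumption of this sketch. The names `ZONE_UPPER_BOUNDS` and `classify_erosion` are hypothetical.

```python
import numpy as np

# Upper limits of Zones I to IV, in 10^3 kg/(hm2*a), taken from the text.
ZONE_UPPER_BOUNDS = np.array([7.0, 11.0, 20.0, 30.0])
ZONE_LABELS = np.array(["Zone I", "Zone II", "Zone III", "Zone IV", "Zone V"])

def classify_erosion(rate_thousand_kg):
    """Map erosion rates in 10^3 kg/(hm2*a) to priority zones.

    Assumption: a value exactly on a class boundary falls in the lower zone.
    """
    rates = np.asarray(rate_thousand_kg, dtype=float)
    idx = np.digitize(rates, ZONE_UPPER_BOUNDS, right=True)
    return ZONE_LABELS[idx]

# Hypothetical cell values.
print(classify_erosion([3.2, 9.5, 15.0, 25.0, 40.0]))
```

Applied to the full 30 m × 30 m grid, counting cells per zone and multiplying by the cell area gives the area of each zone.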
Soil erosion risk is particularly high in the southern,south-western,and some sparsely populated regions in the eastern and northern parts of the study area.The estimated rate of soil erosion in the study area varies from 7.00×103 to 48.00×103 kg/(hm2•a) and the amount of soil erosion will increase as a consequence of climate change.Prasannakumar et al.(2011) studied the Siruvani watershed in Kerala,India,and estimated soil erosion ranging from 0.00 to 14.92×103 kg/(hm2•a) using the RUSLE factors (LS,P,R,K,and C).In the Sanjal watershed,India,the soil erosion rates were estimated by Dutta et al.(2015) using TRMM data,RUSLE,and GIS,and these rates ranged from 0.20×103 to 61.40×103 kg/(hm2•a).Using RUSLE and GIS,Shit et al.(2015) estimated soil erosion of more than 1.00×104 kg/(hm2•a) in the Kasai-Subarnarekha River interfluve (India).According to Pal and Chakrabortty (2019),the Arkosa River Basin in India experienced soil erosion at a rate of 0.00–16.00×103 kg/(hm2•a).Using the RUSLE model,Marondedze and Schutt (2020) determined that the Epworth Area of Zimbabwe has an average annual soil erosion of 1.12×103 kg/(hm2•a).
4.5.Validation of the results using Kappa statistic
In this study,a soil erosion inventory map with 100 training points was used to collect information from many different sites.The dataset was randomly split into training (70.00%) and testing (30.00%) samples so that the reliability of the different models could be judged.From the reliability estimation,it is found that the accuracy of the applied models is good enough to simulate the risk potential of soil erosion.The overall accuracy of RUSLE is 0.779 (Kappa coefficient of 0.723),whereas the accuracy of AHP is 0.788 (Kappa coefficient of 0.736),of REP tree is 0.788 (Kappa coefficient of 0.735),and of Random Forest is 0.808 (Kappa coefficient of 0.760).If the value of Kappa coefficient is in the range of 0.610–0.800,then the reliability of the model is substantial (Table 7).
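Cohen's Kappa compares the observed agreement p_o with the agreement p_e expected by chance: κ = (p_o − p_e)/(1 − p_e). The sketch below computes it from a confusion matrix. The matrix shown is hypothetical because the study's confusion matrices are not reproduced here; only the four Kappa values in `reported` are taken from the text, and the function names are illustrative.

```python
import numpy as np

def cohen_kappa(confusion):
    """Return Cohen's Kappa for a square confusion matrix of counts."""
    m = np.asarray(confusion, dtype=float)
    n = m.sum()
    p_o = np.trace(m) / n                                # observed agreement
    p_e = (m.sum(axis=0) * m.sum(axis=1)).sum() / n**2   # chance agreement
    return (p_o - p_e) / (1.0 - p_e)

def is_substantial(kappa):
    """Check the 0.610-0.800 'substantial' range quoted in the text."""
    return 0.610 <= kappa <= 0.800

# Hypothetical two-class confusion matrix (rows: observed, columns: predicted).
example = [[20, 5], [10, 15]]
print(cohen_kappa(example))

# Kappa coefficients reported in this study.
reported = {"RUSLE": 0.723, "AHP": 0.736, "REP tree": 0.735, "Random Forest": 0.760}
print({name: is_substantial(k) for name, k in reported.items()})
```

Because Kappa discounts chance agreement, it is lower than the overall accuracy for the same classification, which matches the pattern in the values reported above.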
Fig.6.Spatial distribution of average annual soil erosion.
This article explains how analytical soil erosion estimation models such as RUSLE and AHP and machine learning algorithms such as Random Forest and REP tree can be used in conjunction with GIS to assess the current situation of soil erosion and possible zones in Ratlam District.The assessment reveals that about 362.52 km2 of land is at very high risk (Zone V) of soil erosion,with an average soil erosion of 30.00×103–48.00×103 kg/(hm2•a),whereas about 2486.29 km2 of land is at moderate to high risk of soil erosion,with an average soil erosion of 11.00×103–30.00×103 kg/(hm2•a).The south-western and northern portions of the study area are more susceptible to soil erosion because R,LS,LULC,and NDVI are the most influential variables in soil erosion.While this study attempted to evaluate the risk of soil erosion using RUSLE,AHP,machine learning algorithms,and GIS,it has some limitations,including uncertainty in RUSLE model-generated data owing to the lack of site-specific parameterization.A thorough ground survey to create the most trustworthy database is extremely expensive and time-consuming.On the basis of the above findings,soil erosion and loss are greater in the slope plateau,the slopes that were stripped of rocks,and the places with less vegetation.Soil erosion is more likely to occur in the Chambal and Mahi river basins,because the LULC has changed so much and there are not enough measures to protect the soil.Therefore,soil conservation priority should be assigned to Zone IV and Zone V,where the regions are at high risk of soil erosion.Additionally,it is critical to monitor threatened productive land and implement early soil erosion measures to protect these resources.As a result,the developed soil erosion hazard map can be an effective means of conservation and protection,as well as assist governmental and non-governmental organisations in implementing sustainable development projects in the study area.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The authors gratefully acknowledge all the agencies,especially the Indian Meteorological Department (IMD),Geological Survey of India,Survey of India (SOI),United States Geological Survey (USGS),and Alaska Satellite Facility,for providing data.We are also grateful to Dr.Prithvish Nag,former Vice Chancellor of Mahatma Gandhi Kashi Vidyapith,Varanasi,India,and former Director of National Atlas and Thematic Mapping Organization,Kolkata,India,for his assistance and direction in completing our research.We would like to acknowledge Dr.Gopal Chandra Debnath from Visva Bharati in West Bengal and Dr.Narayan Chandra Ghosh from Rabindra Bharati University in West Bengal,India,for their help with data analysis,model validation,and the improvement of the English.