An Improved Three ‐ point Method for Power Flow Calculation based on Halton Sequence

: With the development of electric power system, it is necessary to expand the scale of new energy power generation to meet the demand of peak cutting and valley filling. The new energy power generation presents intermittenity and volatility, so it is a hot research topic to use the theory of point estimation method to study the power system planning based on uncertainty quantity. This paper improved the three-point estimation method in probabilistic power flow calculation, combined with Halton sequence sampling method to make sampling samples more uniform, and then added a pair of estimation points on the basis of the three-point estimation method, so as to improve the precision of power flow calculation without calculating the fourth moment. The proposed method is verified on IEEE14 and IEEE33 node systems, and the results verify the rationality of the improved three-point method based on Halton sequence.


Introduction
Since the beginning of the 21st century, photovoltaic and wind power devices have been increasing and their market share has been increasing. At the same time, as fuel-powered vehicles are gradually replaced by electric vehicles, the introduction of these devices into the grid also brings new operational stability problems to the power grid. By calculating these randomness, probabilistic power flow evaluates the stable operation of power system at higher level. In the relevant research of power system, if the input variable is random, mathematical modeling is carried out according to its uncertainty. Probabilistic power flow is mainly based on the known data, such as circuit voltage complex information, combined with the calculation formula and correlation relationship to solve it, and finally solve the circuit node and branch related power flow distribution. The research status of probabilistic power flow calculation is introduced in literature [1]. Currently, the commonly used methods of probabilistic power flow calculation include numerical method, approximate method and analytical method. Monte Carlo simulation is the most common numerical method in probabilistic power flow calculation. This method is generally superior to other methods in terms of accuracy, but it needs to process more data and calculate more than the other two methods. Therefore, in practical calculation, Monte Carlo simulation method is mainly used as a reference method to evaluate other methods. The current research direction of probabilistic power flow calculation is to reduce the amount of calculation as much as possible on the premise of ensuring the accuracy. Literature [2] analyzes the input variables of the function and explains the relationship between the variables. Probabilistic power flow calculation uses different methods. Numerical method has the advantage of high accuracy, such as Monte Carlo simulation method. However, due to its many calculation steps, it is mainly used as a reference for other methods in practical applications. Other improved versions of the method have been developed, such as the Monte Carlo simulation method based on Latin Hypercube sampling. The main representative method of the approximate method is the point estimation method, which mainly uses the point estimation method and the function accumulation formula to analyze and calculate the input variables, and obtains the weight of each point and the result matrix. At present, various improved versions of point estimation methods have been developed, such as probabilistic power flow developed by point estimation method in literature [3], sampling improvement of point estimation method in literature [4], convolution calculation combined in the improvement of point estimation method in literature [5], and Fourier series added in literature [6] on the basis of convolution calculation. For the distribution of multiple peak values, the spline method was rearranged in literature [7], and the Gaussian method was used in literature [8] for hybrid approximation of the function distribution, which can ensure a higher level of calculation accuracy after improvement.
In view of the insufficient accuracy of the existing threepoint estimation method, combined with the current idea that increasing the number of estimation points can improve the calculation accuracy, this paper proposes an improved threepoint method based on Halton sequence, by adding an extra pair of estimation points to improve the calculation accuracy, through the improved IEEE14 and IEEE33 node system to verify the correctness of the proposed algorithm. In this method, the convolutional calculation step is omitted, and the input variables are further fitted together on the basis of the original calculation, and the correlation is fully considered. The advantage of this method in calculation is that it cannot consider whether it is linear function.

Halton Sequence
Halton sequence sampling belongs to the low difference category of sequence sampling method, which can generate the required sample data for calculation at one time, and the obtained sampling values are evenly distributed, which is convenient for calculation. Compared with Latin hypercube sampling method, this method can improve the computing efficiency.
Halton sequence belongs to the class of infinite sequences, and its dimension can be assumed to be s. This sequence is improved based on Vander sequence, which is Vander sequence when Halton sequence is one-dimensional. In calculation, any data can be expressed by the base value of b, as shown in the following formula: Suppose the function is H b (n), When defining it, the definition formula is shown below: Based on the above formula, it can be analyzed that Halton sequence is composed of different prime numbers. The theoretical basis and practical application of Halton sequence are described in literature [9]. Fig 1 compares Halton sampling method and LHS sampling method respectively, takes 100 samples, and obtains two-dimensional sampling distribution diagram of the two methods. Fig 1 shows that: Halton sequence is more efficient in data acquisition, and its samples are distributed more evenly on the plane, which also confirms the sampling advantages of this method mentioned in literature [9].

Improve the Three-Point Method
When studying the operation stability of traditional power network, PQ nodes, such as the active power of generators and the voltage of nodes, will be taken as system control variables in power flow calculation. The control variables mentioned here are generally selected for specific optimization objectives, and relevant data has been set before calculation.
In this paper, the geographical location and installed capacity of the distributed power supply are taken into account when establishing the model of the distribution network including the distributed power supply. In the distributed power supply including wind power, solar power and other energy sources, the output of wind turbines and photovoltaic cells will be affected by random factors such as light intensity and wind speed. When analyzing the distributed power supply containing the above random effects, the random variables affecting the output are replaced by disturbance variables, and the constructed disturbance variables are taken as vectors. In the power flow calculation of the power system, state variables such as the voltage of each node of the power system and the active and reactive power distribution of branches are generally solved. In the analysis of the two, it is found that there is a functional relationship between the disturbance variable and the state variable to be solved, and the specific function is shown in the following: For the distribution network containing uncertain factors, the general deterministic power flow can no longer meet the needs of calculation, and the probabilistic power flow is replacing the traditional deterministic power flow and becoming the mainstream method for computing the grid connection of distributed power. When solving the operation problem of distribution network with uncertain factors, the state variable of power system can be obtained according to the distribution probability function related to disturbance variable and the function formula related to state variable under the premise that the control variable has been determined by probabilistic power flow.
In the research of point estimation method, it can be known that the more estimated points, the more accurate the estimated value of output random variable, but the more estimated points means that a higher order formula needs to be calculated. Here, taking five-point estimation method as an example, the eighth order center distance of input random variable needs to be known, so a larger amount of computation is needed. At the same time, the obtained center distance above the fourth order has no great physical significance in practical application, and the numerical solution obtained in the calculation may be non-real solution.
In this paper, on the basis of the existing three-point estimation method, an additional pair of estimation points are added for additional calculation. In order to improve the calculation accuracy of the three-point estimation method, it is unnecessary to calculate the higher moment of random variable above the fourth order. Firstly, the improvement of the three-point estimation method is to add a pair of new nonuniform probability estimation points on the basis of the original to carry out additional calculations, and then set the weight of each estimation point corresponding to the pair as w k,i =1/2n. According to the definition of position in the threepoint estimation method, combined with Chebyshev correlation inequality, it can be seen that, It is less likely to get results outside the mean x k,i , where is several times standard deviation, so only moments of the third order and above are ignored. The values of the standard positions ζ k,1 and ζ_ k,2 of the new estimate point is shown as follows: On the basis of the above, the weight of position w k,i and the weight of standard position are added to compare. Considering the influence of skewness, different values are taken for weight and standard position respectively, so the formula is shown as follows: , , , , , 2 , , , , , The added estimate points above take into account more information about variables, such as skewness. As shown in the above formula, compared with the standard position and weight corresponding to each position, the standard position corresponding to each estimation point is not of uniform probability, and the effect of input random variable skewness is taken into account. The improved three-point estimation method can be regarded as the combination of three-point method and two-point method. The accuracy of this method is similar to that of five-point estimation method, but it does not need to calculate the high-order moment. Therefore, there are certain advantages in the difficulty of calculation.

Modeling of Uncertain Variables
Including Correlation

Probability Model of Photovoltaic Power Generation
Similar to wind power generation, solar energy is random and greatly affected by weather, which also brings instability to photovoltaic grid-connection. The main factors affecting the performance of photovoltaic battery are the light intensity, surface temperature and air humidity of photovoltaic device surface, and the light intensity is the most important factor. When the light intensity of the area reaches a certain value, The photovoltaic generating unit generates electricity at full capacity. If the photovoltaic generating unit cannot receive sufficient light, it will be forced to reduce the generating capacity. In numerous photovoltaic studies, light intensity mainly follows β mode distribution and Weibull distribution [10], etc. In this paper, β mode distribution is adopted for modeling.
The approximate intensity of light is calculated from the data. The mean value μ and the standard deviation value σ obtained from the historical data are correlated as shown in the following equations: Similar to the output law of wind turbines, the output of photovoltaic generating units can be divided into two types, namely saturated output and unsaturated output [11] The output formula of photovoltaic generating set is shown as follows: In formula: PVG --Rated capacity of photovoltaic generating set; r --Rated illumination of photovoltaic generating set; PVG --Power factor Angle. Using basic mathematical methods to remove the variables of day, season and year as time span, the output of photovoltaic unit can be expressed as: In the formula, γ is the illumination intensity of the locality; A is the surface area of the equipment receiving light; η refers to the output efficiency of photovoltaic generating set. ΔT is the error caused by the predicted temperature difference of the equipment; k is the temperature correlation coefficient. This data is generally provided by the photovoltaic generator set manufacturer.
The illumination intensity of the location indicates the illumination loss caused by different clouds and weather changes in the same time [12]. The data show that the illumination intensity of the relevant area is shown in the following formula: In formula: is the maximum light intensity that occurs in the region. The photovoltaic power generation function is shown in the following equation: In formula: as Upper limit, it can be assumed that the errors generated in the operation of photovoltaic power generation cells conform to the traditional normal distribution, which indicates that the traditional normal distribution can be used to correct the errors in the output of photovoltaic units.

Probabilistic Model of Wind Power Generation
In this paper, Weibull distribution is used to model wind speed distribution. Compared with other mathematical models, Weibull distribution has advantages of simple structure, small calculation amount and high accuracy. This model has been widely used in the current research of wind power generation. Two parameters (scale parameter and shape parameter) are set for Weibull distribution to carry out statistical description of the variation law of wind speed. The function formula is shown as follows: In the formula, the above two parameters used in Weibull distribution can be calculated. In the formula, v is the regional wind speed, the shape parameter is dimensionless, and the unit of the scale parameter is m/s.
The mathematical expectation and variance are shown in the following equation respectively: Its distribution function is shown in the following equation: During normal operation of a wind turbine, the output of the generator depends on a number of factors, including the wind speed in the location and the wind speed cut into the generator, the rated wind speed of the generator, and the wind speed cut into the generator.
When the wind speed is above the actual wind speed, the working efficiency of the wind turbine is very low or even zero, and the working condition at this time is not suitable for the operation of the wind turbine. When the regional wind speed is above the force point, the output efficiency of the wind turbine increases linearly with the increase of the actual wind speed, which also determines the randomness of the wind turbine processing. The specific output situation is shown in the following formula: In the formula: --Rated capacity of wind turbine; --Cut in wind speed of a wind turbine; n --Rated fan speed; co --Cut out fan wind speed; WTG --Power factor Angle of wind turbine.

Load Probability Model
Load is an indispensable part of power system research, so it is necessary to establish an accurate model to analyze the load. In the current analysis and calculation of probabilistic power flow, most of them take normal distribution as the model of load to describe its probability distribution. The active and reactive power parameters of nodes and branches are as follows: In the formula: P , P is the expected value and the variance of , Q , Q is the expected value and the variance of .

Correlation Processing
When calculating probabilistic power flow, it is necessary to consider the correlation between wind power generation, photovoltaic power generation and load. If the correlation is ignored, there will be a large difference between the calculated result and the actual result. Nataf transformation can convert the non-normal correlation input variable into a normal independent distribution variable [13].
The correlation between random input variables is generally expressed by the covariance matrix, which usually reflects the variables related to each other among multiple variables and is usually obtained by the distribution function of its diversity. However, in some cases, the result cannot be obtained. In addition, the covariance matrix can be calculated through the statistical data. If it is not true, the relevant bivariate dependence rule is used to solve it. The solved matrix belongs to the absolute covariance matrix, and the value of most elements in the matrix is small, and then the Pearson coefficient factor is solved through the matrix.
Suppose the n-dimensional variable, which is regarded as the non-normal dependent input variable X= [x 1 , x 2 , ..., x n ], and C X as the coefficient matrix with correlation. In the calculation of matrix, we assume that the standard variable is the independent input variable Y= [y 1 , y 2 , ..., y n ] with normal distribution, and its corresponding coefficient matrix is set as C y , and then the corresponding correlation position coefficient is solved. The relationship between the above two variables is shown in the following equation: In the formula, F i (x i ) and Φ -1 are respectively the cumulative distribution of the function and the standard correlation normal distribution function. The left side of the middle sign is the correlation factor, and the calculation formula is as follows: Where, σ i and σ j are the average variance values of x i and x j respectively. The relationship between the two is shown in the following formula: The calculation steps of the above formula are many, and the correlation coefficient needs to be calculated repeatedly, so a certain amount of calculation will be increased. The coefficient factor of wind power generation can be calculated by referring to the empirical formula in literature [14].

Example Flow
The specific calculation process of the improved threepoint method is as follows: 1. Input data: Relevant input variables are obtained by sampling method based on Halton sequence, and the point estimation method is calculated on the data based on the correlation model established for random variables such as photovoltaic and power generation output in the system; 2. List calculation formulas and statistical calculation data; 3. Combine the estimated points to be selected for the random variable, calculate the correlation coefficients of each position of the fourth random variable by the above formula, and obtain the correlation coefficients of each position of the random variable according to the above data; 4. Carry out deterministic probabilistic power flow at the selected point, and calculate by using Oxrafa or other power flow calculation methods. When the output of random variable is obtained, analyze and calculate its estimated value; 5. Repeat Step 4, calculate 2+1 times, and then the next step; 6. The origin moment, average value and other relevant information of the output variable were obtained according to the calculation results of the previous step. Finally, the probability density function (PDF) and cumulative distribution function (CDF) of the state variable were obtained by the Cornish-Fisher series expansion.

Example Analysis
Firstly, the accuracy of the improved three-point method is verified. MATLAB platform is used to test the modified IEEE 14-node system and the modified IEEE 33-node system, and the results obtained by Monte Carlo simulation method with relevant properties are taken as references to evaluate the improved three-point method in terms of calculation accuracy. At the same time, the computational complexity is also taken into consideration. The calculation time is used as the reference scale as far as possible, and the following error formula is used as the evaluation standard: Where, μ MCS and σ MCS are the expected value and mean variance of the output variables calculated by Monte Carlo simulation; μ PEM and σ PEM are the expected value and mean variance of the relevant output variables calculated by the improved three-point method. N r is the number of related output variables.
Firstly, the proposed method is verified on the improved IEEE 14 node. It is assumed that the active power of the IEEE 14 bus system obeys normal distribution, and the load adopts the constant power factor model. The Monte Carlo simulation results of N=10000 times are used as reference. The coefficient of variation (CV) is defined here as the ratio between the standard deviation and the mean. Table 5-1 lists the mean μ and standard deviation σ of the selected output variables when the coefficient of variation of IEEE 14 bus system is 0.15. The selected power flow output results are respectively the voltage amplitude and phase Angle at bus No. 10, the injected power at bus No. 1, the injected reactive power at bus No. 3, and the active and reactive power at line No. 2-3. The mean values and standard deviations of these PLF outputs obtained by the above three methods are very close to those obtained by Monte Carlo simulation. It can be seen from the data in the table that the accuracy of the twoand three-point estimation method and the improved threepoint estimation method is not much different from that of the Monte Carlo simulation method. Next, simulation analysis is carried out for the advantages of the improved three-point method. The load is simplified in the modified IEEE 14 node system. The power factor of each node is set to 0.75, and the correlation coefficient between loads is set to 0.8. Two 30MW and 25MW wind turbines are added into the node system respectively. The two turbines operate in a power factor of 1. At the same time, the shape and scale parameters are set as 7.0 and 1.6 respectively according to the operation conditions of the wind turbines. The inlet wind speed, rated wind speed and excised wind speed are set as 4m/s, 15m/s and 25m/s respectively. The improved threepoint method was applied to the above model calculation, and the Monte Carlo simulation method of 10000 times was used as a reference to compare the results obtained by the threepoint estimation method. Finally, the load coefficient of variation of 0.05 and 0.1 were analyzed respectively, as shown in Table 2 and Table 3 below.   As can be seen from the above, the overall output variable of the improved three-point method is superior to the threepoint estimation method. The mean and standard deviation errors of the population are verified. The accuracy of the improved three-point method has been greatly improved. Table 4 lists the time of different algorithms. As can be seen from the above table, the time of the improved three-point estimation method has not increased much, but the accuracy can be maintained at a good level. When the coefficient of variation is 0.05, the voltage amplitude at node 4, the active power at line 1-2, the reactive power at line 1-5 and the probability density function (PDF) and the cumulative distribution function (CDF) are respectively shown in Fig 2 -Fig 5, which is presented in the form of unit value.  In the modified IEEE 33 node system, the base power is set to 10MVA and the base voltage is set to 11kV. For simplified processing of the selected system, the power factor is set as 0.75 when calculating reactive power, and there is a correlation between each load, and the correlation coefficient is set as 0.8. Compared with the two wind power plants mentioned above, in order to improve the accuracy, an additional wind turbine is added to the system, that is, three wind turbines are simulated at the same time. The unit capacity is set to the same 200kW, and the power factor is 1 when running. Other parameters are the same as those of the IEEE 14-node system. The calculation is carried out on the basis of the above formula, and the calculation results of Monte Carlo simulation method are also taken as references to judge the superiority of the improved three-point method.
Here, the coefficient of variation of load is set as 0.1, and the error results are listed in Table 5. According to Table 5, we can see that the total mean error and the relative error of standard deviation of the improved three-point method are reduced, indicating that the improved three-point estimation method has improved the calculation accuracy.

Conclusion
In view of the insufficient accuracy of the current threepoint estimation method, the author adopts the improved three-point method based on Halton sequence sampling, which adds a pair of extra estimation points on the basis of the three-point estimation method to improve the calculation accuracy. This method can reduce the calculation of high order center distance and avoid the non-real solutions that may appear in the standardized center distance. When the correlation of input variables is involved, the method collects the correlation information between variables so as to calculate the accuracy of the final result.