Energy saving is becoming increasingly important. In production, energy can be saved by improving the operational method and the machine infrastructure, but doing so also increases the complexity of flow-shop scheduling. As one of the data mining technologies, the Grey Wolf Optimization Algorithm is widely applied to various mathematical problems in engineering. However, the algorithm is still immature and has some defects. Therefore, we propose an improved multiobjective model based on the Grey Wolf Optimization Algorithm combined with a Kalman filter and a reinforcement learning operator. The Kalman filter is introduced to bring the solution set closer to the Pareto-optimal front, and the reinforcement learning operator improves the convergence speed and solving ability of the algorithm. Tests on six benchmark functions show that the proposed algorithm is better than the original algorithm and the other comparison algorithms in terms of search accuracy and solution set diversity. The improved multiobjective model proposed in this paper is conducive to solving energy-saving problems in flow-shop scheduling, and it is of great practical value in engineering and management.

Many mathematical problems in scientific research and practical engineering are essentially multiobjective optimization problems. The analysis of constrained multiobjective optimization algorithms has become a research hotspot in recent years.

Various optimization algorithms exist in the literature, such as the Improved Multiobjective Grey Wolf Optimizer (IMOGWO), which is hybridized with the fast nondominated sorting strategy [

In previous research, some scholars proposed a differential evolution algorithm based on two-population search mechanism, which randomly deletes one of the two individuals with the smallest Euclidean distance [

In several studies, penalty terms were applied to modify the objective function value of each individual. During evolution, feasible nondominated solutions were retained, and infeasible solutions with a small degree of constraint violation were also retained [

As for the improved elite selection strategy, it can make the solution set more widely distributed by setting preference points and expand the application of constrained multiobjective optimization algorithm to high-dimensional problems by combining with Deb criterion [

Up to now, plenty of differential evolution algorithms have been proposed, which minimize the value of the objective function for the feasible solution and minimize the degree of constraint violation for the infeasible solution [

Therefore, we propose an improved Multiobjective Grey Wolf Optimizer combined with Kalman filtering and reinforcement learning (MKGWO) in this paper. The main innovation of the algorithm is that the Kalman filter, which drives the solution set toward the Pareto-optimal front, is introduced into a static multiobjective algorithm. It combines the characteristics of the Kalman filter with the robustness, reliability, and high efficiency of the reinforcement learning system when solving problems [

The scheduling problem is an interdisciplinary field of research, which involves operations research, computer science, control theory, industrial engineering, and many other disciplines [

From a different perspective, combining data mining technology and mathematical logic, we establish an improved multiobjective operation model based on the Grey Wolf Optimization Algorithm that takes energy-saving problems in engineering into account. The results show that the algorithm can successfully approximate the Pareto front of the flow-shop scheduling problem, and it is of great practical value in engineering and management.

As briefly mentioned in the introduction, multiobjective optimization refers to the optimization of a problem with more than one objective function [_{i} indicates the

In single-objective optimization, solutions can be compared easily due to the unary objective function. For maximization problems, solution

(Pareto dominance).

Suppose that there are two vectors such as

Vector

The definition of Pareto optimality is as follows.

(Pareto optimality).

A solution

A set including all the nondominated solutions of a problem is called Pareto-optimal set, and it is defined as follows.

(Pareto optimality set).

The set of all Pareto-optimal solutions is called Pareto set as follows:

A set containing the corresponding objective values of Pareto-optimal solution in Pareto-optimal set is called Pareto-optimal front [

(Pareto-optimal front).

A set containing the values of objective functions for Pareto solutions set is
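The definitions above can be illustrated with a short Python sketch. It assumes the usual componentwise form of Pareto dominance for a maximization problem (no worse in every objective and strictly better in at least one). The function names `dominates` and `pareto_front` and the sample data are illustrative and are not taken from the original paper.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if objective vector u Pareto-dominates v (maximization)."""
    no_worse = all(a >= b for a, b in zip(u, v))
    strictly_better = any(a > b for a, b in zip(u, v))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the objective vectors that no other vector dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical objective vectors for a two-objective maximization problem.
objectives = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (2.0, 2.0), (1.0, 1.0)]
front = pareto_front(objectives)
```

Here `(2.0, 2.0)` and `(1.0, 1.0)` are dominated by `(3.0, 3.0)`, so only the first three vectors remain in `front`.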

The MOGWO algorithm was proposed by Mirjalili et al. [

The vectors

Position updating mechanism of search agents and effects.

The following formulas are run constantly for each search agent during optimization in order to simulate the hunting and find promising regions of search space:
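Since the formulas are not reproduced in this text, the following Python sketch shows the position update in the form in which the Grey Wolf Optimizer is commonly stated: coefficients A = 2a·r1 − a and C = 2·r2, distance D = |C·X_leader − X|, a candidate X_leader − A·D for each of the three leaders, and the average of the three candidates as the new position. This form, the linear decrease of `a` from 2 to 0, and all function names are assumptions made for illustration.

```python
import random
from typing import List

Vector = List[float]


def step_towards(leader: Vector, wolf: Vector, a: float,
                 rng: random.Random) -> Vector:
    """Candidate position guided by a single leader (alpha, beta, or delta)."""
    candidate = []
    for x_lead, x in zip(leader, wolf):
        A = 2.0 * a * rng.random() - a   # exploration/exploitation coefficient
        C = 2.0 * rng.random()           # random weight on the leader position
        D = abs(C * x_lead - x)          # distance to the (weighted) leader
        candidate.append(x_lead - A * D)
    return candidate


def update_position(wolf: Vector, alpha: Vector, beta: Vector, delta: Vector,
                    a: float, rng: random.Random) -> Vector:
    """Average the three leader-guided candidates to get the new position."""
    x1 = step_towards(alpha, wolf, a, rng)
    x2 = step_towards(beta, wolf, a, rng)
    x3 = step_towards(delta, wolf, a, rng)
    return [(p + q + r) / 3.0 for p, q, r in zip(x1, x2, x3)]


def a_schedule(t: int, max_iter: int) -> float:
    """Assumed control parameter: decreases linearly from 2 to 0."""
    return 2.0 * (1.0 - t / max_iter)


rng = random.Random(0)
new_position = update_position([0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0],
                               a_schedule(10, 100), rng)
```

With a large `a` the coefficient A can exceed 1 in magnitude and push a wolf away from the leaders (exploration); as `a` approaches 0 the new position collapses onto the mean of the three leaders (exploitation).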

The traditional Multiobjective Grey Wolf Optimizer is a multiobjective optimization algorithm inspired by the group predation of grey wolves. It uses a fixed-size external archive to store nondominated solutions. When solving static multiobjective problems, the plain Multiobjective Grey Wolf Optimizer lacks a good strategy for driving the population forward, so the solution set does not get close to the Pareto-optimal front and its diversity is low [

In 1960, R. E. Kalman published a paper describing a method that can process a time series of measurements and predict unknown variables more precisely than estimates based on a single measurement alone. This method is referred to as the Kalman filter. The Kalman filter maintains state vectors, which describe the system state, along with their uncertainties. The equations of the Kalman filter fall into two groups, time update and measurement update equations, which are performed recursively to make predictions. Here, the Kalman filter is used to make predictions directly for future generations in the decision space, and the two major steps are described below [

The measurement update equations are responsible for incorporating a new measurement into the a priori estimate to obtain an improved a posteriori estimate [

The time update equations are responsible for projecting forward the current state and error covariance estimates to obtain the a priori estimates for the next step. New solutions are predicted based on the corrected Kalman filter associated with each individual in the decision space [

Pareto-optimal solutions will then be used to update the reference points and subproblems. The specific equations for the two steps are presented in the following [

Time update step:

Measurement update step:
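The two steps can be sketched in Python for a scalar state. The sketch assumes the standard linear Kalman filter with state transition `f`, observation model `h`, process noise `q`, and measurement noise `r`. These symbols, their default values, and the class name `ScalarKalman` are illustrative assumptions, not the matrices used in the paper.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    x: float          # state estimate
    p: float          # error covariance of the estimate
    f: float = 1.0    # state transition (assumed constant-state model)
    h: float = 1.0    # observation model
    q: float = 1e-4   # process noise variance (illustrative value)
    r: float = 1e-2   # measurement noise variance (illustrative value)

    def time_update(self) -> float:
        """Project the state and covariance forward (a priori estimate)."""
        self.x = self.f * self.x
        self.p = self.f * self.p * self.f + self.q
        return self.x

    def measurement_update(self, z: float) -> float:
        """Blend a new measurement z into the estimate (a posteriori)."""
        gain = self.p * self.h / (self.h * self.p * self.h + self.r)
        self.x = self.x + gain * (z - self.h * self.x)
        self.p = (1.0 - gain * self.h) * self.p
        return self.x


# Noisy measurements of a quantity whose true value is close to 1.0.
kf = ScalarKalman(x=0.0, p=1.0)
for z in [1.1, 0.9, 1.05, 0.95, 1.0]:
    kf.time_update()
    kf.measurement_update(z)
```

After a few measurements the estimate `kf.x` settles near 1.0 and the covariance `kf.p` shrinks, which is the behavior the recursive time and measurement updates are meant to produce.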

Here is an example, so we can understand the Kalman filter more intuitively.

As shown in Figure

Kalman filter principle explanation.

The Kalman filter has been widely applied to dynamic multiobjective algorithms to ensure that they converge to the Pareto-optimal front in time when the problem changes. It can be said that the Kalman filter is currently one of the most effective methods for making the population converge to the Pareto-optimal set and the solution set converge to the Pareto-optimal front. Therefore, this paper applies it in the opposite direction, to a static multiobjective algorithm, so that the algorithm converges to the Pareto-optimal front faster.

In order to overcome the defects of MOGWO described above, this paper proposes a multiobjective Grey Wolf Algorithm based on Kalman filter transformation, which is hereinafter referred to as MKGWO.

After each iteration, a new grey wolf population is generated by the newly generated grey wolf population and the previous-generation grey wolf population using Kalman filter through update probability

In the field of big data and machine learning, data mining and learning technology can be divided into supervised learning, unsupervised learning, and reinforcement learning. Reinforcement learning grew out of animal learning and parameter perturbation adaptive control theory, and it learns a mapping from environmental states to actions. It is a machine learning method that can adapt to and interact with the environment. Unlike supervised learning, which uses positive and negative examples to tell the agent which action to take, reinforcement learning finds the optimal behavioral strategy by trial and error.

As is shown in Figure

Principle of reinforcement learning.

In MOGWO improved based on Kalman filter operator, we found that

Reinforcement learning operator operation process.

The reinforcement learning method used here is based on snap–drift neural network. It switches between snap mode and drift mode. In this operator, agent (MOGWO) accepts the state (snap or drift) and reward value (

MKGWO’s algorithm flow framework is as follows (Algorithm

Initialize the grey wolf population

Initialize

Calculate the objective values for each search agent

Find the nondominated solutions and initialize the archive with them

Exclude alpha from the archive temporarily to avoid selecting the same leader

Exclude beta from the archive temporarily to avoid selecting the same leader

Add back alpha and beta to the archive

while (

for each search agent

Update the position of the current search agent by equations (

end for

Update

Invoke Kalman filter by equations (

Invoke reinforcement learning operator by equations (

Calculate the objective values of all search agents

Find the nondominated solutions

Update the archive with respect to the obtained nondominated solutions

If the archive is full

Run the grid mechanism to omit one of the current archive solutions

Add the new solution to the archive

End if

Exclude alpha from the archive temporarily to avoid selecting the same leader

Exclude beta from the archive temporarily to avoid selecting the same leader

Add back alpha and beta to the archive
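The archive steps of the flow above can be sketched as follows. This is a simplified illustration under stated assumptions: all objectives are minimized, the objective space is divided into a uniform grid, and when the archive is over capacity a randomly chosen solution is removed from the most crowded grid cell. Grid-based archives in the literature often select the cell probabilistically instead, so this deterministic cell choice is a swapped-in simplification. The function names and the `divisions` parameter are hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, ...]


def dominates(u: Point, v: Point) -> bool:
    """u dominates v when every objective is minimized."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def nondominated(points: List[Point]) -> List[Point]:
    unique = list(dict.fromkeys(points))  # drop exact duplicates, keep order
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def grid_cell(p: Point, lows: Point, highs: Point,
              divisions: int) -> Tuple[int, ...]:
    """Index of the grid hypercube that contains objective vector p."""
    cell = []
    for value, lo, hi in zip(p, lows, highs):
        width = (hi - lo) / divisions or 1.0  # guard against a flat objective
        cell.append(min(int((value - lo) / width), divisions - 1))
    return tuple(cell)


def update_archive(archive: List[Point], candidates: List[Point],
                   capacity: int, divisions: int = 4,
                   rng: Optional[random.Random] = None) -> List[Point]:
    rng = rng or random.Random()
    merged = nondominated(archive + candidates)
    while len(merged) > capacity:
        dims = range(len(merged[0]))
        lows = tuple(min(p[i] for p in merged) for i in dims)
        highs = tuple(max(p[i] for p in merged) for i in dims)
        cells: Dict[Tuple[int, ...], List[Point]] = {}
        for p in merged:
            cells.setdefault(grid_cell(p, lows, highs, divisions), []).append(p)
        crowded = max(cells.values(), key=len)   # most populated hypercube
        merged.remove(rng.choice(crowded))       # omit one of its solutions
    return merged


archive = [(0.0, 4.0), (4.0, 0.0)]
candidates = [(1.0, 1.0), (2.0, 2.0), (1.1, 0.95)]
result = update_archive(archive, candidates, capacity=3, divisions=2,
                        rng=random.Random(1))
```

In this example `(2.0, 2.0)` is dropped because `(1.0, 1.0)` dominates it, and one of the two solutions that share a grid cell is then removed to respect the capacity, so the extreme solutions stay in the archive.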

To test the performance of MKGWO, this section compares it with MOGWO, MOPSO, NSGA2, MOEA/D, and PESA2 in simulation experiments and analyzes the benchmark functions and the related performance metrics.

The operating environment of the simulation experiment is as follows: the machine is Dawning 5000A supercomputer. Xeon X5620 CPU (4 cores)

In this paper, six benchmark functions are selected to evaluate the performance of the algorithm. This group of benchmark functions is widely used in the test of multiobjective optimization algorithm. The function names, dimensions, ranges, and expressions are shown in Table

Benchmark function equation.

Function name | Equation | Dimension | Search boundary |
---|---|---|---|
Kursawe | | 3 | |
Schaffer | | 1 | |
Viennet2 | | 2 | |
Viennet3 | | 2 | |
ZDT1 | | 30 | |
ZDT6 | | 10 | |

For the performance metric, we have used Inverted Generational Distance (IGD) for measuring convergence. The Spacing (SP) is employed to quantify and measure the coverage. The mathematical formulation of IGD is similar to that of Generational Distance (GD). This modified measure formula is as follows:

The mathematical formulation of the SP and MS measures is as follows:
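As the formulas are not reproduced in this text, the following Python sketch computes IGD and SP under commonly used definitions, which are assumptions here: IGD is the square root of the summed squared distances from each true Pareto-optimal point to its nearest obtained solution, divided by the number of true points, and SP is the standard deviation of the nearest-neighbor distances (sum of absolute objective differences) within the obtained set. The function names and sample fronts are illustrative.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def igd(true_front: List[Point], obtained: List[Point]) -> float:
    """Inverted Generational Distance: lower means closer to the true front."""
    total = 0.0
    for t in true_front:
        nearest = min(math.dist(t, o) for o in obtained)
        total += nearest * nearest
    return math.sqrt(total) / len(true_front)


def spacing(obtained: List[Point]) -> float:
    """Spacing: lower means the obtained solutions are more evenly spread."""
    n = len(obtained)
    nearest = []
    for i, p in enumerate(obtained):
        nearest.append(min(sum(abs(a - b) for a, b in zip(p, q))
                           for j, q in enumerate(obtained) if j != i))
    mean = sum(nearest) / n
    return math.sqrt(sum((mean - d) ** 2 for d in nearest) / (n - 1))


# Hypothetical true front and two obtained solution sets.
true_front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
perfect = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
sparse = [(0.0, 1.0), (1.0, 0.0)]

igd_perfect = igd(true_front, perfect)
igd_sparse = igd(true_front, sparse)
sp_perfect = spacing(perfect)
```

A set that coincides with the true front gives an IGD of zero, while the sparse set is penalized for missing the middle of the front. Evenly spaced solutions give an SP of zero.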

In the simulation experiment, the population size of each algorithm is 200, the archive size is 200, and the number of iterations is 500. Each algorithm was run independently 30 times, and the minimum value, maximum value, average value, and variance were taken as the results. The remaining parameters are shown in Table

Algorithms’ parameters.

Algorithms | Parameters |
---|---|
KMGWO | Alpha = 0.1; beta = 4; gamma = 2 |
MOGWO | Alpha = 0.1; beta = 4; gamma = 2 |
MOPSO | |
NSGA2 | pc = 0.9; pm = 0.5 |
MOEA/D | Gamma = 0.5 |
PESA2 | |

The Pareto diagram in Figures

Kursawe’s test result.

Schaffer’s test result.

Viennet2 test result.

Viennet3 test result.

ZDT1 test result.

ZDT6 test result.

Concerning the IGD metric, KMGWO clearly outperforms MOGWO on almost all of the problems. As shown in Table

Comparison of algorithm running results’ IGD.

Algorithm | Statistic | Kursawe | Schaffer | Viennet2 | Viennet3 | ZDT1 | ZDT6 |
---|---|---|---|---|---|---|---|
KMGWO | Min | 0.001268 | 0.000548 | 0.000195 | 0.00013 | 4.15 | 0.000336 |
KMGWO | Max | 0.001467 | 0.000606 | 0.000221 | 0.000663 | 5.97 | 0.000473 |
KMGWO | Mean | 0.001367 | 0.000577 | 0.000208 | 0.000397 | 5.06 | 0.000404 |
KMGWO | Std | 0.00014 | 4.1 | 1.84 | 0.000377 | 1.28 | 9.7 |
MOGWO | Min | 0.001429 | 0.000626 | 0.000207 | 0.000222 | 2.96 | 0.002448 |
MOGWO | Max | 0.00208 | 0.000689 | 0.000218 | 0.000265 | 3.66 | 0.01636 |
MOGWO | Mean | 0.001755 | 0.000657 | 0.000213 | 0.000244 | 3.31 | 0.009404 |
MOGWO | Std | 0.000461 | 4.47 | 7.86 | 2.99 | 4.93 | 0.009838 |
MOPSO | Min | 0.002178 | 0.000669 | 0.00036 | 0.0002 | 0.000267 | 0.026323 |
MOPSO | Max | 0.002426 | 0.000705 | 0.00039 | 0.000221 | 0.000296 | 0.030315 |
MOPSO | Mean | 0.002302 | 0.000687 | 0.000375 | 0.00021 | 0.000281 | 0.028319 |
MOPSO | Std | 0.000175 | 2.61 | 2.08 | 1.44 | 2.11 | 0.002823 |
NSGA2 | Min | 0.402172 | 0.09739 | 1.538041 | 0.85775 | 0.092674 | 0.134332 |
NSGA2 | Max | 0.410001 | 0.097398 | 1.592757 | 0.870869 | 0.109724 | 0.185229 |
NSGA2 | Mean | 0.406086 | 0.097394 | 1.565399 | 0.864309 | 0.101199 | 0.15978 |
NSGA2 | Std | 0.005536 | 5.83 | 0.03869 | 0.009277 | 0.012056 | 0.035989 |
MOEA/D | Min | 0.001268 | 0.000605 | 2.33 | 0.000105 | 3.44 | 0.000864 |
MOEA/D | Max | 0.004135 | 0.000613 | 4.94 | 0.000208 | 4.66 | 0.018473 |
MOEA/D | Mean | 0.002701 | 0.000609 | 3.64 | 0.000156 | 4.05 | 0.009668 |
MOEA/D | Std | 0.002027 | 5.98 | 1.85 | 7.28 | 8.6 | 0.012451 |
PESA2 | Min | 0.001274 | 0.000628 | 0.000183 | 0.000195 | 0.002272 | 0.021336 |
PESA2 | Max | 0.001549 | 0.000644 | 0.000347 | 0.000323 | 0.002847 | 0.052005 |
PESA2 | Mean | 0.001412 | 0.000636 | 0.000265 | 0.000259 | 0.00256 | 0.036671 |
PESA2 | Std | 0.000194 | 1.15 | 0.000116 | 9.05 | 0.000407 | 0.021687 |

Table

Comparison of algorithm running results’ SP.

Algorithm | Statistic | Kursawe | Schaffer | Viennet2 | Viennet3 | ZDT1 | ZDT6 |
---|---|---|---|---|---|---|---|
KMGWO | Min | 1.869705 | 0.464979 | 0.229326 | 1.681636 | 0.057698 | 0.170041 |
KMGWO | Max | 2.140332 | 0.628268 | 0.354561 | 2.394785 | 0.067846 | 0.255117 |
KMGWO | Mean | 2.005019 | 0.546624 | 0.291944 | 2.038211 | 0.062772 | 0.212579 |
KMGWO | Std | 0.191362 | 0.115463 | 0.088554 | 0.504272 | 0.007176 | 0.060158 |
MOGWO | Min | 2.071943 | 0.544736 | 0.338835 | 1.691569 | 0.058589 | 0.082575 |
MOGWO | Max | 2.255649 | 0.588034 | 0.395661 | 2.20073 | 0.061941 | 0.180822 |
MOGWO | Mean | 2.163796 | 0.566385 | 0.367248 | 1.946149 | 0.060265 | 0.131699 |
MOGWO | Std | 0.129899 | 0.030616 | 0.040182 | 0.360031 | 0.00237 | 0.069471 |
MOPSO | Min | 1.812654 | 0.597622 | 0.406334 | 2.205442 | 0.074537 | 0.309447 |
MOPSO | Max | 1.948818 | 0.601261 | 0.41442 | 2.221322 | 0.076896 | 0.435343 |
MOPSO | Mean | 1.880736 | 0.599441 | 0.410377 | 2.213382 | 0.075716 | 0.372395 |
MOPSO | Std | 0.096282 | 0.002573 | 0.005718 | 0.011228 | 0.001668 | 0.089021 |
NSGA2 | Min | 1.442097 | 1.198419 | 0.225471 | 2.55751 | 0.342942 | 0.245284 |
NSGA2 | Max | 1.490551 | 1.228526 | 0.2299 | 2.645984 | 0.343657 | 0.372529 |
NSGA2 | Mean | 1.466324 | 1.213472 | 0.227685 | 2.601747 | 0.343299 | 0.308907 |
NSGA2 | Std | 0.034262 | 0.021289 | 0.003132 | 0.06256 | 0.000505 | 0.089976 |
MOEA/D | Min | 1.645038 | 0.234028 | 0.238373 | 0.097528 | 0.054964 | 0.072255 |
MOEA/D | Max | 1.655435 | 0.295418 | 0.312842 | 0.145355 | 0.063264 | 0.15472 |
MOEA/D | Mean | 1.650237 | 0.264723 | 0.275607 | 0.121441 | 0.059114 | 0.113488 |
MOEA/D | Std | 0.007352 | 0.04341 | 0.052658 | 0.033819 | 0.005869 | 0.058311 |
PESA2 | Min | 2.091793 | 0.599484 | 0.276225 | 2.362664 | 0.069884 | 0.22178 |
PESA2 | Max | 2.160544 | 0.638922 | 0.34284 | 2.398225 | 0.076021 | 0.814949 |
PESA2 | Mean | 2.126169 | 0.619203 | 0.309533 | 2.380444 | 0.072953 | 0.518364 |
PESA2 | Std | 0.048614 | 0.027887 | 0.047104 | 0.025145 | 0.00434 | 0.419434 |

The low-carbon scheduling problem in the flow shop studied in this section can be described as follows:

At each stage, the machine has several adjustable speed gears for production. From the point of view of energy consumption, the machine has four different states: processing (the machine is machining a job), start-up (the machine is being prepared for a new job), standby (the machine is idle), and off (the machine is turned off). Under normal circumstances, when the machine works at a higher speed, the processing time is shortened, but the corresponding energy consumption increases. Therefore, this problem aims to minimize both the completion time and the energy consumption index. Because of these characteristics, the problem studied in this section is much more complicated than the traditional flow-shop scheduling problem. The other settings of the problem are as follows.

Each job is processed continuously in the workshop; in other words, its processing cannot be interrupted. Machines are allowed to be idle, and there are unlimited buffers between stages. A machine is started when its first job is to be processed and is shut down when all the jobs are finished. The machine speed cannot be adjusted while a job is being processed.

In order to present the mathematical model of the problem, we first define the following mathematical symbols according to the above description of the problem.

The symbols are defined as follows:

The decision variables are as follows:

Based on these mathematical symbols, the mixed-integer programming model of the flow-shop low-carbon scheduling problem is presented as follows:

Objective function:

Constraint condition:

Formula (

This section gives a simple example of three jobs and three stages, each with three different processing speeds. Table

Processing time and corresponding power.

Process | Job | Job | Job | ||||||
---|---|---|---|---|---|---|---|---|---|

(28, 4) | (22, 7) | (15, 11) | (22, 4) | (20, 7) | (18, 11) | (20, 4) | (18, 7) | (17, 11) | |

(20, 5) | (16, 8) | (12, 12) | (21, 5) | (18, 8) | (15, 12) | (23, 5) | (20, 8) | (17, 12) | |

(19, 5) | (15, 7) | (11, 10) | (20, 5) | (16, 7) | (12, 10) | (23, 5) | (20, 7) | (16, 10) |

Table

Sequence-dependent start-up time.

Job | Job | Job | Job | ||||||
---|---|---|---|---|---|---|---|---|---|

5 | 8 | 4 | 4 | 6 | 8 | 6 | 4 | 4 | |

6 | 4 | 6 | 2 | 3 | 2 | 4 | 6 | 2 | |

2 | 4 | 8 | 2 | 4 | 2 | 6 | 4 | 4 |

To solve the above flow-shop scheduling problem, this paper sets the parameters of KMGWO as follows: a population size of 20, an archive size of 20, and 100 iterations. Figure

Experimental results of Pareto chart.

Table

Flow-shop scheduling.

Process | Job 2 Begin | Job 2 Start | Job 2 Work | Job 2 End | Job 3 Begin | Job 3 Start | Job 3 Work | Job 3 End | Job 1 Begin | Job 1 Start | Job 1 Work | Job 1 End |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Process 1 | 0 | 2 | 15 | 17 | 17 | 2 | 22 | 41 | 41 | 6 | 20 | 67 |
Process 2 | 14 | 3 | 20 | 37 | 37 | 4 | 21 | 62 | 63 | 4 | 23 | 90 |
Process 3 | 35 | 2 | 19 | 56 | 60 | 2 | 20 | 82 | 86 | 4 | 23 | 113 |
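The schedule in the table above can be checked with simple arithmetic. The Python sketch below copies the Begin, Start, Work, and End values from the table and assumes that Start denotes the start-up time, so that End = Begin + Start + Work for every operation; the makespan is then the largest End value. The variable names are illustrative.

```python
# (job, begin, start-up time, work time, end) for each process, in table order.
schedule = {
    "Process 1": [("Job 2", 0, 2, 15, 17), ("Job 3", 17, 2, 22, 41),
                  ("Job 1", 41, 6, 20, 67)],
    "Process 2": [("Job 2", 14, 3, 20, 37), ("Job 3", 37, 4, 21, 62),
                  ("Job 1", 63, 4, 23, 90)],
    "Process 3": [("Job 2", 35, 2, 19, 56), ("Job 3", 60, 2, 20, 82),
                  ("Job 1", 86, 4, 23, 113)],
}

# Operations whose End does not equal Begin + Start + Work.
mismatches = [
    (process, job)
    for process, operations in schedule.items()
    for job, begin, startup, work, end in operations
    if begin + startup + work != end
]

# Completion time of the whole schedule: the latest End over all operations.
makespan = max(end
               for operations in schedule.values()
               for _, _, _, _, end in operations)

print("inconsistent operations:", mismatches)
print("makespan:", makespan)  # makespan is 113
```

Every row satisfies the assumed relation, and the makespan of this particular solution is 113 time units, the End value of Job 1 on Process 3.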

Experimental results Gantt chart.

In this paper, an improved multiobjective operational model based on the Grey Wolf Optimization Algorithm combined with Kalman filtering and reinforcement learning (KMGWO) is proposed, which combines data mining technology and mathematical logic. With the Kalman filter, the algorithm drives the solution set toward the true Pareto front. The reinforcement learning operator is applied to make better use of the dominant positions in the population, and adaptive parameters are used instead of human intervention. The results on six benchmark functions show that the algorithm performs better than the comparison algorithms in approximating the true Pareto-optimal solution set and in keeping the solution set uniform. On the energy-saving flow-shop scheduling example, KMGWO performs well and is therefore suitable for solving practical optimization problems. The operational model can be applied to machine learning, engineering optimization design, and other important areas, thus improving energy saving in production management.

The data used to support the findings of this study are included within the article.

The authors declare that there are no conflicts of interest regarding the publication of this paper.

This work was supported by the National Social Science Foundation of China (NSSFC) under Grant no. 17BGL238.