Burtea R, Tsay C, 2024, Constrained continuous-action reinforcement learning for supply chain inventory management, Computers and Chemical Engineering, Vol: 181, ISSN: 0098-1354
Reinforcement learning (RL) is a promising solution for difficult decision-making problems, such as inventory management in chemical supply chains. However, enabling RL to explicitly consider known environment constraints is crucial for safe deployment in practical applications. This work incorporates recent tools for optimization over trained neural networks to introduce two algorithms for safe training and deployment of RL, with a focus on supply chains. Specifically, we use optimization over trained neural-network state–action value functions (i.e., a critic function) to directly incorporate constraints when computing actions in a continuous action space. Furthermore, we introduce a second algorithm that guarantees constraint satisfaction during deployment by directly implementing actions from constrained optimization of a trained value function. The algorithms are compared against state-of-the-art algorithms TRPO, CPO, and RCPO using a computational supply chain case study.
Tsay C, Qvist S, 2023, Integrating process and power grid models for optimal design and demand response operation of giga-scale green hydrogen, AIChE Journal, ISSN: 0001-1541
Folch JP, Lee RM, Shafei B, et al., 2023, Combining multi-fidelity modelling and asynchronous batch Bayesian Optimization, Computers & Chemical Engineering, Vol: 172, ISSN: 0098-1354
Burtea RA, Tsay C, 2023, Safe deployment of reinforcement learning using deterministic optimization over neural networks, Computer Aided Chemical Engineering, Pages: 1643-1648
Enabling reinforcement learning (RL) to explicitly consider constraints is important for safe deployment in real-world process systems. This work exploits recent developments in deep RL and optimization over trained neural networks to introduce algorithms for safe training and deployment of RL agents. We show how optimization over trained neural-network state-action value functions (i.e., a critic function) can explicitly incorporate constraints and describe two corresponding RL algorithms: the first uses constrained optimization of the critic to give optimal actions for training an actor, while the second guarantees constraint satisfaction by directly implementing actions from optimizing a trained critic model. The two algorithms are tested on a supply chain case study from OR-Gym and are compared against state-of-the-art algorithms TRPO, CPO, and RCPO.
Bermúdez JS, del Rio Chanona A, Tsay C, 2023, Distributional Constrained Reinforcement Learning for Supply Chain Optimization, Computer Aided Chemical Engineering, Pages: 1649-1654
This work studies reinforcement learning (RL) in the context of multi-period supply chains subject to constraints, e.g., on inventory. We introduce Distributional Constrained Policy Optimization (DCPO), a novel approach for reliable constraint satisfaction in RL. Our approach is based on Constrained Policy Optimization (CPO), which is subject to approximation errors that in practice lead it to converge to infeasible policies. We address this issue by incorporating aspects of distributional RL. Using a supply chain case study, we show that DCPO improves the rate at which the RL policy converges and ensures reliable constraint satisfaction by the end of training. The proposed method also greatly reduces the variance of returns between runs; this result is significant in the context of policy gradient methods, which intrinsically introduce high variance during training.
Zhao S, Tsay C, Kronqvist J, 2023, Model-Based Feature Selection for Neural Networks: A Mixed-Integer Programming Approach, Pages: 223-238, ISBN: 9783031445040
In this work, we develop a novel input feature selection framework for ReLU-based deep neural networks (DNNs), which builds upon a mixed-integer optimization approach. While the method is generally applicable to various classification tasks, we focus on finding input features for image classification for clarity of presentation. The idea is to use a trained DNN, or an ensemble of trained DNNs, to identify the salient input features. The input feature selection is formulated as a sequence of mixed-integer linear programming (MILP) problems that find sets of sparse inputs that maximize the classification confidence of each category. These “inverse” problems are regularized by the number of inputs selected for each category and by distribution constraints. Numerical results on the well-known MNIST and FashionMNIST datasets show that the proposed input feature selection allows us to drastically reduce the size of the input to ∼ 15% while maintaining a good classification accuracy. This allows us to design DNNs with significantly fewer connections, reducing computational effort and producing DNNs that are more robust towards adversarial attacks.
Thebelt A, Tsay C, Lee RM, et al., 2022, Tree ensemble kernels for Bayesian optimization with known constraints over mixed-feature spaces, 36th Conference on Neural Information Processing Systems (NeurIPS 2022), ISSN: 1049-5258
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search, as they achieve good predictive performance with little or no manual tuning, naturally handle discrete feature spaces, and are relatively insensitive to outliers in the training data. Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function. To address both points simultaneously, we propose using the kernel interpretation of tree ensembles as a Gaussian Process prior to obtain model variance estimates, and we develop a compatible optimization formulation for the acquisition function. The latter further allows us to seamlessly integrate known constraints to improve sampling efficiency by considering domain-knowledge in engineering settings and modeling search space symmetries, e.g., hierarchical relationships in neural architecture search. Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
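The kernel interpretation mentioned in this abstract can be illustrated with a minimal sketch. One common reading is that two points are similar in proportion to the number of trees that route them to the same leaf. The toy ensemble of decision stumps and the names `leaf_index` and `tree_kernel` below are hypothetical, chosen for illustration only; this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

# Each hypothetical "tree" is a decision stump: (feature index, threshold).
Stump = Tuple[int, float]


def leaf_index(stump: Stump, x: Sequence[float]) -> int:
    """Return 0 if x falls in the left leaf of the stump, 1 otherwise."""
    feature, threshold = stump
    return 0 if x[feature] <= threshold else 1


def tree_kernel(ensemble: List[Stump], x: Sequence[float], y: Sequence[float]) -> float:
    """Fraction of trees in which x and y land in the same leaf."""
    same = sum(leaf_index(s, x) == leaf_index(s, y) for s in ensemble)
    return same / len(ensemble)


# Assumed toy ensemble over two features (not fitted to any data).
ensemble = [(0, 0.5), (0, 1.5), (1, 0.0), (1, 2.0)]
a, b = (0.2, 1.0), (1.0, 1.0)

k_ab = tree_kernel(ensemble, a, b)  # only the first stump separates a and b
k_aa = tree_kernel(ensemble, a, a)  # a point always shares every leaf with itself
```

Because the kernel only counts shared leaves, it is piecewise constant in the inputs, which is consistent with the abstract's remark that the acquisition function needs a dedicated optimization formulation.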
Folch JP, Lee RM, Shafei B, et al., 2022, Combining multi-fidelity modelling and asynchronous batch Bayesian optimization, Publisher: arXiv
Bayesian Optimization is a useful tool for experiment design. Unfortunately, the classical, sequential setting of Bayesian Optimization does not translate well into laboratory experiments, for instance battery design, where measurements may come from different sources and their evaluations may require significant waiting times. Multi-fidelity Bayesian Optimization addresses the setting with measurements from different sources. Asynchronous batch Bayesian Optimization provides a framework to select new experiments before the results of the prior experiments are revealed. This paper proposes an algorithm combining multi-fidelity and asynchronous batch methods. We empirically study the algorithm behavior, and show it can outperform single-fidelity batch methods and multi-fidelity sequential methods. As an application, we consider designing electrode materials for optimal performance in pouch cells using experiments with coin cells to approximate battery performance.
Ceccon F, Jalving J, Haddad J, et al., 2022, OMLT: Optimization & Machine Learning Toolkit, Journal of Machine Learning Research, Vol: 23, ISSN: 1532-4435
The optimization and machine learning toolkit (OMLT) is an open-source software package incorporating neural network and gradient-boosted tree surrogate models, which have been trained using machine learning, into larger optimization problems. We discuss the advances in optimization technology that made OMLT possible and show how OMLT seamlessly integrates with the algebraic modeling language Pyomo. We demonstrate how to use OMLT for solving decision-making problems in both computer science and engineering.
Thebelt A, Wiebe J, Kronqvist JPF, et al., 2022, Maximizing information from chemical engineering data sets: Applications to machine learning, Chemical Engineering Science, Vol: 252, Pages: 1-14, ISSN: 0009-2509
It is well-documented how artificial intelligence can have (and already is having) a big impact on chemical engineering. But classical machine learning approaches may be weak for many chemical engineering applications. This review discusses how challenging data characteristics arise in chemical engineering applications. We identify four characteristics of data arising in chemical engineering applications that make applying classical artificial intelligence approaches difficult: (1) high variance, low volume data, (2) low variance, high volume data, (3) noisy / corrupt / missing data, and (4) restricted data with physics-based limitations. For each of these four data characteristics, we discuss applications where these data characteristics arise and show how current chemical engineering research is extending the fields of data science and machine learning to incorporate these challenges. Finally, we identify several challenges for future research.
Kelley MT, Tsay C, Cao Y, et al., 2022, A data-driven linear formulation of the optimal demand response scheduling problem for an industrial air separation unit, Chemical Engineering Science, Vol: 252, ISSN: 0009-2509
Thebelt A, Tsay C, Lee R, et al., 2022, Multi-objective constrained optimization for energy applications via tree ensembles, Applied Energy, Vol: 306, Pages: 1-15, ISSN: 0306-2619
Energy systems optimization problems are complex due to strongly non-linear system behavior and multiple competing objectives, e.g. economic gain vs. environmental impact. Moreover, a large number of input variables and different variable types, e.g. continuous and categorical, are challenges commonly present in real-world applications. In some cases, proposed optimal solutions need to obey explicit input constraints related to physical properties or safety-critical operating conditions. This paper proposes a novel data-driven strategy using tree ensembles for constrained multi-objective optimization of black-box problems with heterogeneous variable spaces for which underlying system dynamics are either too complex to model or unknown. In an extensive case study comprised of synthetic benchmarks and relevant energy applications we demonstrate the competitive performance and sampling efficiency of the proposed algorithm compared to other state-of-the-art tools, making it a useful all-in-one solution for real-world applications with limited evaluation budgets.
Cronjaeger C, Pattison RC, Tsay C, 2022, Tensor-Based Autoencoder Models for Hyperspectral Produce Data, Computer Aided Chemical Engineering, Pages: 1585-1590
Effectively monitoring and controlling product quality is critical in produce supply chain management. Hyperspectral imaging has emerged as a promising technique for monitoring food products, but the size of hyperspectral datasets complicates storage and processing. This work develops a novel architecture for autoencoder models that is well-suited for nonlinear subspace learning on tensorial, hyperspectral data. In particular, separate sub-models are used to (de)compress each mode of the data tensor, preserving spatial locality information and greatly reducing the number of autoencoder parameters. The approach enables memory-efficient training, nonlinear dimensionality reduction, and multi-task learning, as demonstrated by a real-world case study.
Folch JP, Zhang S, Lee RM, et al., 2022, SnAKe: Bayesian Optimization via Pathwise Exploration, ISSN: 1049-5258
Bayesian Optimization is a very effective tool for optimizing expensive black-box functions. Inspired by applications developing and characterizing reaction chemistry using droplet microfluidic reactors, we consider a novel setting where the expense of evaluating the function can increase significantly when making large input changes between iterations. We further assume we are working asynchronously, meaning we have to select new queries before evaluating previous experiments. This paper investigates the problem and introduces 'Sequential Bayesian Optimization via Adaptive Connecting Samples' (SnAKe), which provides a solution by considering large batches of queries and preemptively building optimization paths that minimize input costs. We investigate some convergence properties and empirically show that the algorithm is able to achieve regret similar to classical Bayesian Optimization algorithms in both synchronous and asynchronous settings, while reducing input costs significantly. We show the method is robust to the choice of its single hyper-parameter and provide a parameter-free alternative.
Ceccon F, Jalving J, Haddad J, et al., 2022, Presentation abstract: Optimization formulations for machine learning surrogates, Computer Aided Chemical Engineering, Pages: 57-58
In many process systems engineering applications, we seek to integrate surrogate models, e.g. already-trained neural network and gradient-boosted tree models, into larger decision-making problems. This presentation explores different ways to automatically take the machine learning surrogate model and produce an optimization formulation. Our goal is to automate the entire workflow of decision-making with surrogate models from input data to optimization formulation. This presentation discusses our progress towards this goal, gives examples of previous successes, and elicits a conversation with colleagues about the path forward.
Tsay C, 2021, Sobolev trained neural network surrogate models for optimization, Computers & Chemical Engineering, Vol: 153, Pages: 1-14, ISSN: 0098-1354
Neural network surrogate models are often used to replace complex mathematical models in black-box and grey-box optimization. This strategy essentially uses samples generated from a complex model to fit a data-driven, reduced-order model more amenable for optimization. Neural network models can be trained in Sobolev spaces, i.e., models are trained to match the complex function not only in terms of output values, but also the values of their derivatives to arbitrary degree. This paper examines the direct impacts of Sobolev training on neural network surrogate models embedded in optimization problems, and proposes a systematic strategy for scaling Sobolev-space targets during NN training. In particular, it is shown that Sobolev training results in surrogate models with more accurate derivatives (in addition to more accurately predicting outputs), with direct benefits in gradient-based optimization. Three case studies demonstrate the approach: black-box optimization of the Himmelblau function, and grey-box optimizations of a two-phase flash separator and two flashes in series. The results show that the advantages of Sobolev training are especially significant in cases of low data volume and/or optimal points near the boundary of the training dataset—areas where NN models traditionally struggle.
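The idea of matching derivatives as well as output values can be sketched as a loss function. The example below is a simplified illustration that uses closed-form toy surrogates for sin(x) instead of a neural network; the name `sobolev_loss` and the `grad_weight` scaling parameter are assumptions made for this sketch and do not reproduce the scaling strategy proposed in the paper.

```python
import math
from typing import Callable, Sequence


def sobolev_loss(
    model: Callable[[float], float],
    model_grad: Callable[[float], float],
    xs: Sequence[float],
    target: Callable[[float], float],
    target_grad: Callable[[float], float],
    grad_weight: float = 1.0,
) -> float:
    """Mean squared error on values plus weighted mean squared error on first derivatives."""
    n = len(xs)
    value_err = sum((model(x) - target(x)) ** 2 for x in xs) / n
    grad_err = sum((model_grad(x) - target_grad(x)) ** 2 for x in xs) / n
    return value_err + grad_weight * grad_err


# Sample points on [-1, 1].
xs = [i / 10 for i in range(-10, 11)]

# Two hypothetical surrogates for sin(x), each given as (function, derivative):
# the linear and cubic Taylor truncations around zero.
linear = (lambda x: x, lambda x: 1.0)
cubic = (lambda x: x - x**3 / 6, lambda x: 1.0 - x**2 / 2)

loss_linear = sobolev_loss(*linear, xs, math.sin, math.cos)
loss_cubic = sobolev_loss(*cubic, xs, math.sin, math.cos)
```

Setting `grad_weight` to zero recovers an ordinary value-only mean squared error, so the derivative term is the only difference from conventional training in this sketch.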
Tsay C, Baldea M, 2021, Non-dimensional feature engineering and data-driven modeling for microchannel reactor control, IFAC 2020 World Congress, Publisher: Elsevier BV, Pages: 11295-11300, ISSN: 2405-8963
Catalytic plate microchannel reactors (CPRs) are a promising means for modular hydrogen/fuels production from distributed natural gas resources. However, the equipment miniaturization presents challenges for process control, including spatially-distributed models, limited availability of measurements, and fast process time constants. In the present paper, we investigate the use of data-driven models—specifically, artificial neural networks (ANNs)—to estimate temperature “hotspots” within CPRs. We prescribe nonlinear transformations of the model inputs in the form of well-known dimensionless quantities (e.g., Reynolds number), and we show that these engineered features can improve the prediction capability of computationally parsimonious ANNs using a first-principles reactor model. Finally, we present a simulation case study that demonstrates the use of a trained ANN for inferential model predictive control.
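As a small illustration of a dimensionless engineered feature, the sketch below computes the Reynolds number, Re = ρuL/μ, from raw measurements and appends it to a feature vector. The function names, the choice of raw inputs, and the numerical values are hypothetical; the abstract names the Reynolds number only as one example and does not specify the full feature set.

```python
from typing import Dict, List


def reynolds_number(density: float, velocity: float, length: float, viscosity: float) -> float:
    """Re = density * velocity * length / viscosity, in consistent SI units."""
    if viscosity <= 0:
        raise ValueError("viscosity must be positive")
    return density * velocity * length / viscosity


def engineered_features(raw: Dict[str, float]) -> List[float]:
    """Return the raw inputs (in a fixed order) followed by the Reynolds number."""
    keys = ["density", "velocity", "length", "viscosity"]
    re = reynolds_number(*(raw[k] for k in keys))
    return [raw[k] for k in keys] + [re]


# Assumed example values: a water-like fluid in a 1 cm channel.
sample = {"density": 1000.0, "velocity": 1.0, "length": 0.01, "viscosity": 0.001}
features = engineered_features(sample)
re_value = features[-1]
```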
Seo K, Tsay C, Edgar TF, et al., 2021, Economic Optimization of Carbon Capture Processes Using Ionic Liquids: Toward Flexibility in Capture Rate and Feed Composition, ACS Sustainable Chemistry & Engineering, Vol: 9, Pages: 4823-4839, ISSN: 2168-0485
Tsay C, Kronqvist J, Thebelt A, et al., 2021, Partition-based formulations for mixed-integer optimization of trained ReLU neural networks, Publisher: arXiv
This paper introduces a class of mixed-integer formulations for trained ReLU neural networks. The approach balances model size and tightness by partitioning node inputs into a number of groups and forming the convex hull over the partitions via disjunctive programming. At one extreme, one partition per input recovers the convex hull of a node, i.e., the tightest possible formulation for each node. For fewer partitions, we develop smaller relaxations that approximate the convex hull, and show that they outperform existing formulations. Specifically, we propose strategies for partitioning variables based on theoretical motivations and validate these strategies using extensive computational experiments. Furthermore, the proposed scheme complements known algorithmic approaches, e.g., optimization-based bound tightening captures dependencies within a partition.
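For context, a widely used existing mixed-integer encoding of a single ReLU node y = max(0, z), with known bounds lower ≤ z ≤ upper and a binary variable σ, is the big-M formulation: y ≥ 0, y ≥ z, y ≤ z − lower·(1 − σ), y ≤ upper·σ. The sketch below checks these constraints by brute force on a grid. It shows only this baseline, not the partition-based formulation introduced in the paper, and the function names and bounds are assumptions.

```python
def relu(z: float) -> float:
    """Reference ReLU activation."""
    return max(0.0, z)


def bigm_feasible(z: float, y: float, sigma: int,
                  lower: float, upper: float, tol: float = 1e-9) -> bool:
    """Check the big-M constraints for y = max(0, z) with lower <= z <= upper."""
    return (
        y >= -tol                                  # y >= 0
        and y >= z - tol                           # y >= z
        and y <= z - lower * (1 - sigma) + tol     # binding when sigma = 1 (active node)
        and y <= upper * sigma + tol               # forces y = 0 when sigma = 0
    )


# Assumed pre-activation bounds with lower < 0 < upper.
lower, upper = -2.0, 3.0
grid = [lower + 0.25 * i for i in range(21)]  # -2.0, -1.75, ..., 3.0

# The true ReLU output is feasible for at least one value of the binary variable.
exact_ok = all(
    any(bigm_feasible(z, relu(z), s, lower, upper) for s in (0, 1)) for z in grid
)

# A perturbed output is rejected for both values of the binary variable.
wrong_rejected = all(
    not any(bigm_feasible(z, relu(z) + 0.1, s, lower, upper) for s in (0, 1))
    for z in grid
)
```

With σ restricted to {0, 1} the constraints are exact; when σ is relaxed to the interval [0, 1], the feasible region becomes looser than the convex hull of the node, and this gap between model size and tightness is the trade-off the abstract describes.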
Kronqvist J, Misener R, Tsay C, 2021, Between steps: Intermediate relaxations between big-M and convex hull formulations, Publisher: arXiv
This work develops a class of relaxations in between the big-M and convex hull formulations of disjunctions, drawing advantages from both. The proposed "P-split" formulations split convex additively separable constraints into P partitions and form the convex hull of the partitioned disjuncts. Parameter P represents the trade-off of model size vs. relaxation strength. We examine the novel formulations and prove that, under certain assumptions, the relaxations form a hierarchy starting from a big-M equivalent and converging to the convex hull. We computationally compare the proposed formulations to big-M and convex hull formulations on a test set including: K-means clustering, P_ball problems, and ReLU neural networks. The computational results show that the intermediate P-split formulations can form strong outer approximations of the convex hull with fewer variables and constraints than the extended convex hull formulations, giving significant computational advantages over both the big-M and convex hull.
Tsay C, Kronqvist J, Thebelt A, et al., 2021, Partition-Based Formulations for Mixed-Integer Optimization of Trained ReLU Neural Networks, 35th Conference on Neural Information Processing Systems (NeurIPS), Publisher: Neural Information Processing Systems (NIPS), ISSN: 1049-5258
Tsay C, Cao Y, Wang Y, et al., 2021, Identification and Online Updating of Dynamic Models for Demand Response of an Industrial Air Separation Unit, 16th IFAC Symposium on Advanced Control of Chemical Processes (ADCHEM), Publisher: Elsevier, Pages: 140-145, ISSN: 2405-8963
Seo K, Tsay C, Hong B, et al., 2020, Rate-Based Process Optimization and Sensitivity Analysis for Ionic-Liquid-Based Post-Combustion Carbon Capture, ACS Sustainable Chemistry & Engineering, Vol: 8, Pages: 10242-10258, ISSN: 2168-0485
Tsay C, Lejarza F, Stadtherr MA, et al., 2020, Modeling, state estimation, and optimal control for the US COVID-19 outbreak, Scientific Reports, Vol: 10, ISSN: 2045-2322
Caspari A, Tsay C, Mhamdi A, et al., 2020, The integration of scheduling and control: Top-down vs. bottom-up, Journal of Process Control, Vol: 91, Pages: 50-62, ISSN: 0959-1524
Tsay C, Baldea M, 2020, Integrating production scheduling and process control using latent variable dynamic models, Control Engineering Practice, Vol: 94, ISSN: 0967-0661
Tsay C, Pattison RC, Zhang Y, et al., 2019, Rate-based modeling and economic optimization of next-generation amine-based carbon capture plants, Applied Energy, Vol: 252, Pages: 1-15, ISSN: 0306-2619
Amine scrubbing processes remain an important technology for mitigating the contribution of carbon emissions to global warming and climate change. Like other chemical processes, they can benefit from computer-aided optimization at the design stage, but systematic optimization procedures are rarely employed due to the challenges of simulating the requisite rate-based mass transfer and reaction models. This paper presents a novel approach for the simulation and optimization of rate-based columns, with specific application to the absorber and stripper columns found in (amine-) solvent-based carbon capture processes. The approach is based on pseudo-transient continuation, and the resulting column models are easily incorporated into large-scale process flowsheets with other previously developed pseudo-transient models. We demonstrate that the proposed approach allows for gradient-based optimization of next-generation amine scrubbing processes by considering a complex carbon capture process under three different operating conditions. The results provide general insight into the design of amine scrubbing processes, and shadow prices at the optimal point(s) suggest potential avenues for improving the process economics. The effects of carbon dioxide removal percentage and flue gas composition on process economics are briefly analyzed.
Tsay C, Baldea M, 2019, 110th Anniversary: Using Data to Bridge the Time and Length Scales of Process Systems, Industrial & Engineering Chemistry Research, Vol: 58, Pages: 16696-16708, ISSN: 0888-5885
Tsay C, Li Z, 2019, Automating Visual Inspection of Lyophilized Drug Products With Multi-Input Deep Neural Networks, 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), Publisher: IEEE