Relational DBMSs allow users to extend the standard declarative SQL language surface using User Defined Functions (UDFs) that implement custom behavior. While UDFs offer many advantages, it is well-known amongst practitioners that they can cause severe degradation in query performance. This degradation is due to the fact that state-ofthe-art query optimizers treat UDFs as black boxes and do not reason about them during optimization.
We demonstrate Froid, a framework for optimizing UDFs by opening up this black box and exposing its underlying operations to the query optimizer. It achieves this by systematically translating the entire body of an imperative multistatement UDF into a single relational algebraic expression. Thereby, any query invoking this UDF is transformed into a query with a nested sub-query that is semantically equivalent to the UDF. We then leverage existing sub-query optimization techniques and thereby get efficient, set-oriented, parallel query plans as opposed to inefficient, iterative, serial execution of UDFs.
We demonstrate the benefits of Froid including performance gains of up to multiple orders of magnitude on real workloads. Froid is available as a feature of Microsoft SQL Server 2019 called ‘Scalar UDF Inlining’.