Audio cross-fade implementation via rational functions

Implementing the audio cross-fade effect requires shaping the fade profile of the audio content that is to be faded out, as well as customizing the fade of a second audio content, which is to be faded in, in order to achieve a smooth transition between the two audio contents. As in the case of adjustable fades, audio cross-fades are usually carried out off-line, with various transcendental functions employed to shape the fades (both fade-out and fade-in). To minimize the delay between receiving the position within the cross-fade and returning both the volume of the faded-out section and the volume of the faded-in one, we consider that, during the cross-fade, the audio volume of each of the two overlapping sections is the output of a rational function, i.e. a mapping defined by a rational fraction. A plain HTML5/JavaScript implementation, ready to be tested in any major browser, is put forward in the paper in order to highlight the suitability of the suggested approach to cross-fade customization for real-time computing.


Introduction
In audio engineering, the cross-fade effect belongs to audio volume processing. It is well known and widely used for carrying out a smooth transition between two separate audio contents. Performing a cross-fade involves the simultaneous application of fade-out and fade-in effects so that the transition from the first content to the second one sounds more natural, that is, without inserting silence. More precisely, during the same time interval, the first audio content is faded out while the second audio content is faded in. Since the length of the fade-out applied to the first audio content is identical to the length of the fade-in applied to the second audio content, one identifies a faded-out audio section that overlaps a faded-in section [1]-[6].
Practically, customizing the audio cross-fade involves setting the length of the overlapping region, which is the cross-fade length, as well as shaping the fades to be applied to the two overlapping audio sections [3]-[6]. Since audio fades are traditionally implemented by means of transcendental functions (e.g. logarithm, exponential, sine) that impose the time evolution of the audio volume, the audio cross-fade is mostly implemented within audio editors by constructing and concurrently evaluating two distinct transcendental functions.
However, with the increasing emphasis placed on reactive computing [7]-[15], the primary concern is now to minimize the delay between receiving the input, i.e. the playback position, and returning the output, i.e. the audio volume of a given faded section. In this context, it has been shown that employing particular rational functions to shape the audio fades provides both effectiveness and high versatility [16], [17]. Hence, in order to maintain the suitability for reactive computing, the cross-fade implementation is carried out by considering that the volume of each of the two overlapping sections is precisely the output of an appropriate rational function.

Cross-fade customization
To fade the first audio content out, we consider the rational function (1) of [16], which maps the current position within the audio cross-fade to the audio volume. In (1), one identifies [16], [17]:
- the mapping output, i.e. the volume of the first audio content, which is to be faded out,
- the mapping input, i.e. the position within the cross-fade effect, defined as the difference between the playback position within the first audio content and the position corresponding to cross-fade initiation,
- the audio cross-fade length.
Both the high level of versatility and the suitability for reactive computing, obtained by implementing rational function (1) to customize the fade-out transition, have been discussed in detail [16], [17]. It can also be observed that, by adopting (1), the resulting fade-out effects can act, in real time, similarly to fade-outs of exponential, logarithmic and S-curve shape, respectively [16].
Having at hand function (1), whose integer parameter and three coefficients decide the fade-out profile, and in order to preserve the suitability for real-time computing, we adopt the composite function (2), written out as the rational function (3), with a view to fading the second audio content in. With rational function (1) employed to fade the first audio content out and rational function (3) used to fade the second audio content in, the two audio volumes are exactly the same at the cross-fade midpoint, i.e. at the position equal to half the cross-fade length.
Hence, one may introduce the common value of the two volumes at the cross-fade midpoint, as in (5). Relationship (6) plainly reveals that the volume of the second audio content (to be faded in) at the beginning of the cross-fade is identical to the volume of the first audio content (to be faded out) at the end of the cross-fade. On the other hand, evaluating relation (3) at the position equal to the cross-fade length yields relation (7). Thus, the employment of the composite function (2) ensures that, at the end of the cross-fade, the volume of the second audio content, which is to be faded in, matches the volume of the first audio content at cross-fade initiation, i.e. the maximum volume. Imposing now the value of the volume of the first audio content at the beginning of the cross-fade, which is precisely the volume of the second audio content at the end of the cross-fade, and taking into account that the fade-out leads to silence while the fade-in starts out from silence, one may set conditions (8). On the other hand, according to (5), the audio volumes at the cross-fade halfway point are given by (9), wherein the ratio stands for the ratio of the volume of the first or of the second audio content at the cross-fade midpoint to the initial volume of the first audio content, which is also the final volume of the second audio content.
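The endpoint and midpoint relationships discussed above can be checked with a minimal sketch of the composition step. It assumes that the inner function of the composite mapping simply mirrors the position about the cross-fade midpoint; the helper name `makeFadeIn` and the linear placeholder fade-out are hypothetical and are not taken from the paper.

```javascript
// Hypothetical helper: derive a fade-in from any fade-out function by
// evaluating the fade-out at the mirrored position tauF - tau
// (assumed inner map of the composite function).
function makeFadeIn(fadeOut, tauF) {
  return (tau) => fadeOut(tauF - tau);
}

// Placeholder fade-out used only to exercise the helper: a linear ramp,
// not the rational function (1) of the paper.
const tauF = 5; // cross-fade length, in seconds
const linearFadeOut = (tau) => 1 - tau / tauF;
const fadeIn = makeFadeIn(linearFadeOut, tauF);

// Fade-in at the start equals fade-out at the end (silence).
console.log(fadeIn(0), linearFadeOut(tauF));
// Fade-in at the end equals fade-out at the start (maximum volume).
console.log(fadeIn(tauF), linearFadeOut(0));
// Both volumes coincide at the cross-fade midpoint.
console.log(fadeIn(tauF / 2), linearFadeOut(tauF / 2));
```

Because the helper only relies on the mirrored position, the same three relationships hold for any fade-out function passed to it, whatever its shape.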
Having in view relationship (1) along with conditions (8) and (9), one obtains the coefficients (10), which enter both rational fraction (1) and rational fraction (3) and which yield the levels of the audio contents during the cross-fade [16]. Furthermore, relation (1), with the coefficients given by (10), confirms that the volume of the first audio content decays during the cross-fade. More precisely, (1) and (10) lead to a negative rate of change of the volume of the first audio content [16]. Similarly, according to relation (2), which defines the composite function that maps the position within the cross-fade to the volume of the second audio content, one obtains the rate of change of the volume of the second audio content. Taking now (12) into consideration, it plainly follows that (2) accurately depicts a fade-in audio effect, i.e. the volume of the second audio content increases during the cross-fade.
With mappings (1) and (3) employed for implementing audio fade-outs and fade-ins, respectively, customizing the cross-fade implies setting the cross-fade length, selecting the integer parameter of (1) out of the set {1, 2, 3, 4}, and imposing the value of the ratio defined by (9), i.e. the ratio of the volume of the faded-out section at the halfway point to the volume of the same audio section at cross-fade initiation. The high level of versatility of the cross-fade effect obtained by implementing (1) and (3) is due to the fact that, according to relations (11)-(14), the rate of change of the faded-out volume is negative and the rate of change of the faded-in volume is positive throughout the cross-fade, for any value of the integer parameter within the set {1, 2, 3, 4} and for any value of the ratio defined by (9) situated in the open interval (0, 1), that is, for any volume of the two overlapped audio sections that is imposed at the cross-fade midpoint and is less than the initial volume of the faded-out section.
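How three conditions fix three coefficients can be illustrated by a short worked derivation. The fraction below is only an assumed form chosen for illustration, and all symbols (v_O, v_I, tau, tau_F, v_M, gamma, n, a, b, c) are notation introduced here; neither is claimed to coincide with the expressions of [16].

```latex
\begin{align*}
% Assumed fade-out fraction and mirrored fade-in (illustrative only)
v_O(\tau) &= \frac{a\,\tau^{n} + b}{c\,\tau^{n} + 1}, \qquad
v_I(\tau) = v_O(\tau_F - \tau), \qquad 0 \le \tau \le \tau_F,\; n \in \{1,2,3,4\} \\
% Conditions: maximum volume at start, silence at end, ratio gamma at midpoint
v_O(0) &= v_M, \qquad v_O(\tau_F) = 0, \qquad
v_O\!\left(\tfrac{\tau_F}{2}\right) = \gamma\, v_M, \qquad 0 < \gamma < 1 \\
% Coefficients that follow from the three conditions
b &= v_M, \qquad a = -\frac{v_M}{\tau_F^{\,n}}, \qquad
c = \frac{2^{n}(1-\gamma) - 1}{\gamma\,\tau_F^{\,n}} \\
% Rate of change of the fade-out volume
\frac{\mathrm{d}v_O}{\mathrm{d}\tau} &= \frac{n\,\tau^{n-1}\,(a - b\,c)}{\left(c\,\tau^{n} + 1\right)^{2}}, \qquad
a - b\,c = -\frac{v_M\,(2^{n}-1)(1-\gamma)}{\gamma\,\tau_F^{\,n}} < 0 \\
% Rate of change of the fade-in volume
\frac{\mathrm{d}v_I}{\mathrm{d}\tau} &= -\left.\frac{\mathrm{d}v_O}{\mathrm{d}s}\right|_{s=\tau_F-\tau} > 0
\qquad (0 < \tau < \tau_F)
\end{align*}
```

Under this assumed form, the denominator stays positive on the whole interval because c tau_F^n + 1 = (2^n - 1)(1 - gamma)/gamma > 0, so the signs above hold for every gamma in (0, 1) and every n in the stated set. For n = 1 and gamma = 0.5 the coefficient c vanishes and both fades become linear.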
To emphasize the versatility of the cross-fade effect obtained by implementing rational functions (1) and (3), cross-fades obtained for different parameter values are illustrated in Fig. 1 up to Fig. 4.
As mentioned, according to (9), the ratio relates the volume of the audio sections imposed at the cross-fade midpoint to the maximum volume of the audio sections. For the sake of simplicity, we have assumed that the maximum volume in Fig. 1 up to Fig. 4 has the value 1. In this case, the ratio introduced by (9) is just the audio volume imposed at the halfway point of the cross-fades, that is, at the position of 2.5 s. The cross-fade of Fig. 1 that corresponds to the integer parameter equal to 1 and to the midpoint volume of 0.5 is carried out by applying linear fades. On the other hand, the cross-fades corresponding to the midpoint volume of 0.5 and represented in Fig. 2, Fig. 3, and Fig. 4, obtained by successively increasing the value of the integer parameter in mappings (1) and (3), are performed via fades that incorporate attributes of the S-curve shape [4]-[6]. Thus, even if the volume at the midpoint is preserved at 0.5, in contrast to the effect depicted in Fig. 1, the cross-fades of Fig. 2, Fig. 3 and Fig. 4 maintain the faded-out sections at a higher level at the beginning of the effect and at a lower level towards its end and, at the same time, keep the faded-in sections at a lower level at the beginning of the cross-fade and at a higher level in its ending region. The same applies to the cross-fades corresponding to the midpoint volumes of 0.2 and 0.7, respectively, given that, both in the beginning and in the ending region of a fade-out obtained by implementing (1), the absolute value of the rate of change of the audio volume decreases with the integer parameter of fraction (1) [16], whilst both in the beginning and in the ending region of a fade-in obtained by implementing the composite function (3), the rate of change of the audio volume decreases with the same parameter in (3).
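The qualitative behaviour described for the figures can be reproduced numerically with a small sketch. It assumes, purely for illustration, a fade-out fraction of the form (a·tau^n + b)/(c·tau^n + 1) with coefficients fixed by the start, end and midpoint conditions, and a fade-in obtained by mirroring the position; the helper `fadeCoeffs` and the names `n` and `gamma` are hypothetical, and the form is not claimed to be the one used in [16].

```javascript
// Assumed coefficients: volume vM at tau = 0, silence at tau = tauF,
// and gamma * vM at the midpoint tauF / 2 (illustrative form only).
function fadeCoeffs(tauF, n, gamma, vM = 1) {
  const tn = Math.pow(tauF, n);
  return {
    a: -vM / tn,
    b: vM,
    c: (Math.pow(2, n) * (1 - gamma) - 1) / (gamma * tn),
  };
}

// Fade-out volume at position tau (assumed rational fraction).
function vO(tau, n, a, b, c) {
  const tn = Math.pow(tau, n);
  return (a * tn + b) / (c * tn + 1);
}

// Fade-in volume: the fade-out evaluated at the mirrored position.
function vI(tau, tauF, n, a, b, c) {
  return vO(tauF - tau, n, a, b, c);
}

const tauF = 5;    // cross-fade length, in seconds
const gamma = 0.5; // midpoint volume for a maximum volume of 1
const rows = [1, 2, 3, 4].map((n) => {
  const { a, b, c } = fadeCoeffs(tauF, n, gamma);
  return {
    n,
    outEarly: vO(0.5, n, a, b, c),
    outMid: vO(2.5, n, a, b, c),
    outLate: vO(4.5, n, a, b, c),
    inEarly: vI(0.5, tauF, n, a, b, c),
    inLate: vI(4.5, tauF, n, a, b, c),
  };
});
console.table(rows);
```

With these assumptions, the row for `n = 1` is a linear fade, while larger `n` keeps the fade-out higher near the start and lower near the end, with the midpoint volume unchanged.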

JavaScript implementation
In order to check the suggested technique of customizing audio cross-fades from the perspective of reactive computing, we have prepared an HTML5/JavaScript implementation. Assuming that the locators (URLs) of the audio files are provided by the user as function arguments, the code is ready to run in any major browser. As designed, the application initiates a 5 s cross-fade at the playback position of 20 s within the audio contents. To illustrate the efficiency of the technique, we have set the parameter values, i.e. the function arguments, so that the cross-fade corresponds to the midpoint volume of 0.7 in Fig. 4, that is, to the value 4 of the integer parameter in mappings (1) and (3). For discretization, the "setInterval()" method of the "window" object is employed to evaluate the outputs of (1) and (3), with the integer parameter equal to 4, once every 50 ms.
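The discretization can be previewed off-line, without audio objects or timers, by sampling both volumes on the 50 ms grid over the 5 s cross-fade. The sketch assumes an illustrative fade-out fraction (a·tau^n + b)/(c·tau^n + 1) and a mirrored fade-in, which may differ from the paper's expressions; `fadeCoeffs`, `n` and `gamma` are hypothetical names. A real "setInterval()" timer is not perfectly periodic, so the grid below is an idealization.

```javascript
// Assumed coefficients for a maximum volume of 1 (illustrative form only).
function fadeCoeffs(tauF, n, gamma) {
  const tn = Math.pow(tauF, n);
  return { a: -1 / tn, b: 1, c: (Math.pow(2, n) * (1 - gamma) - 1) / (gamma * tn) };
}

// Fade-out volume at position tau (assumed rational fraction).
function vO(tau, n, a, b, c) {
  const tn = Math.pow(tau, n);
  return (a * tn + b) / (c * tn + 1);
}

// Fade-in volume: the fade-out evaluated at the mirrored position.
function vI(tau, tauF, n, a, b, c) {
  return vO(tauF - tau, n, a, b, c);
}

const tauF = 5;    // cross-fade length, in seconds
const stepMs = 50; // sampling period, in milliseconds
const n = 4;       // integer parameter of the fades
const gamma = 0.7; // midpoint volume for a maximum volume of 1

// The coefficients are computed once; each sample then costs only a few
// multiplications and one division per fade.
const { a, b, c } = fadeCoeffs(tauF, n, gamma);
const steps = Math.round((tauF * 1000) / stepMs);
const samples = [];
for (let i = 0; i <= steps; i++) {
  const tau = (i * stepMs) / 1000;
  samples.push({ tau, fadeOut: vO(tau, n, a, b, c), fadeIn: vI(tau, tauF, n, a, b, c) });
}

console.log(samples.length, samples[0], samples[steps / 2], samples[steps]);
```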
In order to obtain the outputs of mappings (1) and (3) for a given playback position within the cross-fade, functions "vO()" and "vI()" are invoked. In addition to the parameters of function "vO()", which implements (1), function "vI()" takes the parameter "tauF", i.e. the cross-fade length. This is because function "vI()" is just the implementation of the composite function (2), used here to fade the second audio content in. Both "vO()" and "vI()" are invoked inside function "setVols()", which is designed to automate the volume of each audio content. When it is called, function "setVols()" sets the "volume" property of the two audio objects, namely "audE1" and "audE2", which point to the appropriate audio contents. Before the cross-fade, the volume of the first audio content, which is to be faded out, is kept at the value of 1, which is both the highest value allowed in HTML5 for the audio volume [13], [14] and the highest volume of the transitions depicted in Fig. 1 up to Fig. 4. During the cross-fade, the "volume" property of audio object "audE1", associated with the audio content to be faded out, is set to the return value of function "vO()", while the "volume" property of audio object "audE2", associated with the audio content to be faded in, is set to the return value of function "vI()".
It has to be emphasized that the implementation is organized so that not only the playback position corresponding to cross-fade initiation but also the cross-fade attributes (length, shape of the fade-out and fade-in transitions) can easily be managed solely by means of the arguments (parameter values) passed to function "playAndCrossFade()".
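The structure described above can be sketched as follows. The function names "vO()", "vI()", "setVols()" and "playAndCrossFade()" and the objects "audE1" and "audE2" come from the text, whereas the argument lists, the helper `fadeCoeffs`, the assumed fade-out fraction (a·tau^n + b)/(c·tau^n + 1), and the choice of starting the second content at cross-fade initiation are assumptions made only for this illustration.

```javascript
// Assumed coefficients for a maximum volume of 1 (illustrative form only).
function fadeCoeffs(tauF, n, gamma) {
  const tn = Math.pow(tauF, n);
  return { a: -1 / tn, b: 1, c: (Math.pow(2, n) * (1 - gamma) - 1) / (gamma * tn) };
}

// Fade-out volume at position tau (assumed rational fraction).
function vO(tau, n, a, b, c) {
  const tn = Math.pow(tau, n);
  return (a * tn + b) / (c * tn + 1);
}

// Fade-in volume: the fade-out evaluated at the mirrored position.
function vI(tau, tauF, n, a, b, c) {
  return vO(tauF - tau, n, a, b, c);
}

// Keep values inside [0, 1]; setting "volume" outside this range throws.
const clamp01 = (v) => Math.min(1, Math.max(0, v));

// Sets both volumes from the playback position of the first content.
// Returns true once the cross-fade has completed.
function setVols(audE1, audE2, tauI, tauF, n, k) {
  const tau = audE1.currentTime - tauI; // position within the cross-fade
  if (tau <= 0) { audE1.volume = 1; audE2.volume = 0; return false; }
  if (tau >= tauF) { audE1.volume = 0; audE2.volume = 1; return true; }
  audE1.volume = clamp01(vO(tau, n, k.a, k.b, k.c));
  audE2.volume = clamp01(vI(tau, tauF, n, k.a, k.b, k.c));
  return false;
}

// Hypothetical signature: URLs, cross-fade start (s), length (s),
// integer parameter, and midpoint volume ratio. Browser only.
function playAndCrossFade(url1, url2, tauI, tauF, n, gamma) {
  const audE1 = new Audio(url1);
  const audE2 = new Audio(url2);
  const k = fadeCoeffs(tauF, n, gamma); // computed once, reused on each tick
  audE2.volume = 0;
  audE1.play();
  const timer = setInterval(() => {
    // Assumption: the second content starts when the cross-fade begins.
    if (audE2.paused && audE1.currentTime >= tauI) audE2.play();
    if (setVols(audE1, audE2, tauI, tauF, n, k)) {
      audE1.pause();
      clearInterval(timer);
    }
  }, 50);
  return timer;
}
```

Under these assumptions, a call such as `playAndCrossFade(urlA, urlB, 20, 5, 4, 0.7)`, with `urlA` and `urlB` standing for user-supplied locators, would correspond to the 5 s cross-fade initiated at 20 s. Browsers may block playback that is not triggered by a user gesture, so the call would typically be placed in a click handler.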

Conclusion
Both the fade-out and the fade-in effects required for carrying out audio cross-fades are usually implemented off-line by means of various transcendental functions, such as the logarithm, sine, and exponential functions. Nevertheless, from the perspective of real-time computing, obtaining both the volume of the audio section to be faded out and the volume of the section to be faded in exclusively by evaluating different transcendental functions could be regarded as very time-consuming.
Consequently, to ensure that, during the cross-fade, the audio volume of each of the two overlapping sections is returned without noticeable delay, we have adopted appropriate rational functions to implement the fade-out and the fade-in effects. The input of both functions is the playback position within the cross-fade, whilst the output of one function is the volume of the faded-out audio section and the output of the other is the volume of the faded-in section.
The two mappings that control the audio volume have been chosen so that, at the cross-fade midpoint, the volume of the faded-out section equals the volume of the faded-in section. Hence, the cross-fade can be customized straightforwardly just by setting the ratio of the midpoint volume of the section to be faded out to the initial volume of the same section, i.e., practically, to the maximum volume of each of the two audio sections.
The high level of versatility of the resulting cross-fade effect is illustrated by the transitions in Fig. 1 up to Fig. 4, where, in contrast to the cross-fades obtained for the midpoint volume ratios of 0.5 and 0.7, the cross-fades corresponding to the midpoint volume ratio of 0.2 introduce a volume drop in the middle area of the effect and are therefore appropriate when the tempi of the overlapped audio sections are exceedingly different. On the other hand, both the efficiency and the convenience of customizing the cross-fade by means of the proposed rational functions are validated by a plain JavaScript implementation, in which discretization is achieved by employing the "setInterval()" method of the "window" object.