The frame is given as follows, for 1 ≤ n

Frame blocking and windowing:

Frame blocking is employed to extract the feature parameters. Each frame is windowed (Gupta 2016) to minimize the discontinuities of the signal at both ends of the frame. The windowing process can be expressed as follows:


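$$\hat{y}_k(n) = y_k(n)\, w(n)$$

where $\hat{y}_k(n)$ denotes the windowed independent components in the $k$-th frequency bin, $y_k(n)$ denotes the same components before windowing, and $w(n)$ denotes the Hamming window (Gupta 2016), which is defined as

$$w(n) = 0.54 - 0.46 \cos\!\left(\frac{2\pi n}{N-1}\right), \qquad 0 \le n \le N-1$$

where $N$ is the frame length.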
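The frame blocking and windowing step can be illustrated with a minimal sketch in plain Python. The function names, the frame length, and the hop size are assumptions chosen for illustration and are not taken from the original work; the window uses the standard Hamming definition.

```python
import math


def hamming(N):
    """Standard Hamming window: 0.54 - 0.46*cos(2*pi*n/(N-1)), 0 <= n <= N-1."""
    if N == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1)) for n in range(N)]


def frame_and_window(signal, frame_len, hop):
    """Split `signal` into overlapping frames and window each frame.

    `frame_len` and `hop` are illustrative parameters (assumptions).
    Trailing samples that do not fill a whole frame are dropped.
    """
    w = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Multiply sample by sample with the window to taper the frame ends.
        frames.append([s * wn for s, wn in zip(frame, w)])
    return frames


# Example: a constant signal of 10 samples, frames of 4 samples, hop of 2.
frames = frame_and_window([1.0] * 10, frame_len=4, hop=2)
```

Because the example signal is constant, each windowed frame simply reproduces the window shape, which makes the tapering at the frame ends easy to inspect.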

Linear Predictive Cepstral Coefficients (LPCC):

Feature extraction represents the speech signal by a finite number of measures. Each feature represents the spectrum of the speech signal in a windowed frame. The coefficients of the autoregressive model are chosen to minimize the difference between the predicted and the actual sample values.

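LPC analysis (Gupta 2016) is an effective method for estimating the parameters of speech signals. The predictor coefficients are obtained by solving the normal equations

$$\mathbf{R}\,\mathbf{a} = \mathbf{r}$$

where $\mathbf{R}$ denotes the $p \times p$ autocorrelation matrix.

The autocorrelation vector is given as

$$\mathbf{r} = [\, r(1),\ r(2),\ \ldots,\ r(p) \,]^{T}$$

The filter coefficient vector is given as

$$\mathbf{a} = [\, a_1,\ a_2,\ \ldots,\ a_p \,]^{T}$$

The matrix of equations to be solved is

$$\begin{bmatrix} r(0) & r(1) & \cdots & r(p-1) \\ r(1) & r(0) & \cdots & r(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ r(p-1) & r(p-2) & \cdots & r(0) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} r(1) \\ r(2) \\ \vdots \\ r(p) \end{bmatrix}$$

where $r(n)$ represents the autocorrelation function of the windowed speech signal at lag $n$.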
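As an illustration of how the autocorrelation equations can be solved, the following sketch uses the Levinson-Durbin recursion, a standard solver for Toeplitz systems. It is not necessarily the solver used in the original work. It assumes the convention that a sample s(m) is predicted as the sum of a_k s(m-k) for k = 1..p; the function names are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r(n) = sum_m x(m) * x(m + n) for lags 0..max_lag."""
    N = len(x)
    return [sum(x[m] * x[m + n] for m in range(N - n)) for n in range(max_lag + 1)]


def levinson_durbin(r, p):
    """Solve the Toeplitz normal equations R a = r for a_1..a_p.

    `r` holds r(0)..r(p). Returns (a, err), where a[k-1] = a_k and
    err is the remaining prediction error energy.
    """
    a = [0.0] * (p + 1)  # a[0] is unused so that a[k] matches a_k
    err = r[0]
    for i in range(1, p + 1):
        if err == 0.0:
            break  # all-zero frame: nothing left to predict
        # Reflection coefficient for order i.
        k_i = (r[i] - sum(a[k] * r[i - k] for k in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        err *= 1.0 - k_i * k_i
    return a[1:], err


# Example: autocorrelation of a short frame, and a first-order process
# whose autocorrelation decays by a factor of 0.5 per lag.
r_example = autocorr([1.0, 2.0, 3.0], 2)
a, err = levinson_durbin([1.0, 0.5, 0.25], 2)
```

For the second example the recursion recovers a single non-zero coefficient, as expected for a first-order autoregressive process.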

Cepstral analysis is the process of finding the cepstrum of a speech sequence. Cepstral coefficients (Gupta 2016) can be computed from the LPC coefficients through a recursive procedure. The cepstral coefficients obtained in this way are called Linear Predictive Cepstral Coefficients (LPCC).

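The recursive procedure is given as follows:

$$c_n = a_n + \sum_{k=1}^{n-1} \frac{k}{n}\, c_k\, a_{n-k}, \qquad 1 \le n \le p$$

$$c_n = \sum_{k=n-p}^{n-1} \frac{k}{n}\, c_k\, a_{n-k}, \qquad n > p$$

where $a_k$ are the LPC coefficients, $c_n$ are the cepstral coefficients and $p$ is the order of the LPC analysis.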
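The two cases of the recursion (1 ≤ n ≤ p and n > p) can be sketched as a single loop. This is a minimal illustration of the standard LPC-to-cepstrum recursion, assuming the prediction convention in which s(m) is approximated by the sum of a_k s(m-k); the function name is hypothetical.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c_1..c_{n_ceps}.

    a[k-1] holds a_k for k = 1..p.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused so that c[n] matches c_n
    for n in range(1, n_ceps + 1):
        # The a_n term is present only while n <= p.
        acc = a[n - 1] if n <= p else 0.0
        # k runs over 1..n-1 for n <= p and over n-p..n-1 for n > p.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


# Example: a first-order predictor with a_1 = 0.5.
ceps = lpc_to_cepstrum([0.5], 3)
```

For a single-pole model with a_1 = 0.5, the cepstral coefficients follow the closed form 0.5^n / n, which gives a simple sanity check on the recursion.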

Thus each speech sample is modeled as a linear combination of the previous p samples. Therefore, the speech production model can also be described as a linear prediction model or an autoregressive model. Here "p" indicates the order of the LPC analysis, and the excitation signal $e_m$ computed here can be termed the prediction error signal or residual signal of the LPC analysis. LPC analysis yields a smoothed spectrum, so most of the influence of the excitation is discarded.

Dynamic Time Warping (DTW):

[Flow chart blocks: Frequency bin; Frame blocking and windowing; Extracting feature parameters (LPCC); Reference model; DTW algorithm; Need inverse (Y/N); Inverse; Establish new model; Check if it is last frequency bin (Y/N); End]

Fig3. Flow chart for Dynamic Time Warping

Dynamic Time Warping is used to compute the distance between two time series. A time series is a list of samples taken from an original signal, ordered by the time at which the samples were obtained. A simple matching distance between two time series can be obtained by resampling one of them and then comparing the two sample by sample.

The drawback of this approach is that it does not produce intuitive results, because the compared samples may not correspond well. The DTW algorithm removes this discrepancy by finding the optimal alignment between the sample points of the two time series. The algorithm is called “time warping” because it warps the time axes of the two series in such a way that corresponding samples appear at the same location on a common time axis.
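The alignment idea can be illustrated with a minimal sketch of the classic dynamic-programming form of DTW. It compares scalar sequences with the absolute difference as local cost; in the setting described here the sequences would instead hold LPCC feature vectors with a vector distance, which is an assumption about how the measure is applied. The function name is hypothetical.

```python
def dtw_distance(x, y):
    """Accumulated cost of the optimal warping path between x and y."""
    n, m = len(x), len(y)
    inf = float("inf")
    # D[i][j] = minimal accumulated cost of aligning x[:i] with y[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# The second series repeats one sample; DTW absorbs the repetition.
d_same = dtw_distance([1, 2, 3], [1, 2, 2, 3])
d_diff = dtw_distance([0, 0], [1, 1])
```

A sample-by-sample comparison of the first pair would require resampling and would report a mismatch, whereas the warped alignment reports zero distance.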

The adjustment matrix P can be determined from the minimum distortion of the independent components between two adjacent frequency bins.

The minimum distortion is computed between the independent components of the two bins. The maximum value among the correlation coefficients will be located on the diagonal; the maximum correlation coefficients are set to 1 and the others to 0. The two possible adjustment matrices can then be represented as

$$P = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{or} \quad P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

When the adjustment matrix P (Noboru Murata 2001) is

$$P = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

the independent components located on the diagonal come from the same speech source and their positions do not need to be adjusted. Otherwise, their order should be inverted.

Thus the permutation ambiguity is resolved by multiplying the scaled independent components by the permutation matrix. Each scaled independent component is multiplied by the permutation matrix P (D. S. Jayaraman 2002).

Finally, a new reference template is created, and it replaces the previously stored template.
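The choice between the two adjustment matrices can be sketched for the two-source case as follows. This is an illustration under stated assumptions: the `distortion` function is a simple stand-in (sum of absolute differences) for the DTW distance described in the text, and all function names are hypothetical.

```python
def distortion(u, v):
    """Stand-in distortion measure (assumption): sum of absolute differences.

    Any distance with the same signature, such as a DTW distance,
    could be substituted here.
    """
    return sum(abs(a - b) for a, b in zip(u, v))


def adjustment_matrix(ref, cur):
    """Pick the 2x2 permutation matrix with the smaller total distortion.

    `ref` holds the two reference components, `cur` the two components
    of the current frequency bin.
    """
    keep = distortion(ref[0], cur[0]) + distortion(ref[1], cur[1])
    swap = distortion(ref[0], cur[1]) + distortion(ref[1], cur[0])
    return [[1, 0], [0, 1]] if keep <= swap else [[0, 1], [1, 0]]


def apply_permutation(P, comps):
    """Multiply the stacked components [comps[0]; comps[1]] by P."""
    return [
        [P[i][0] * a + P[i][1] * b for a, b in zip(comps[0], comps[1])]
        for i in range(2)
    ]


# The current bin delivers the two components in swapped order.
ref = [[1, 2, 3], [9, 8, 7]]
cur = [[9, 8, 7], [1, 2, 3]]
P = adjustment_matrix(ref, cur)
aligned = apply_permutation(P, cur)
```

After alignment, the components of the current bin appear in the same order as the reference, so they can serve as the new reference template for the next bin.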

Perceptual Evaluation of Speech Quality (PESQ):

Quality evaluation of speech processing is important in the field of BSS, which has been growing in recent years, when speech signals are considered. For convolutive BSS, the quality of algorithms is usually measured using the signal-to-interference ratio (SIR), but this requires knowledge of the mixing conditions, and the SIR is difficult to determine in a real-time environment. Therefore Perceptual Evaluation of Speech Quality (PESQ) is adopted. In PESQ, the reference signal REF and the degraded signal DEG are both sampled at the same rate (in Hz). Both the narrowband measure (NB-PESQ) and the wideband measure (WB-PESQ) can be computed; the mode is selected through the MODE parameter. The PESQ result is reported as a score value.

Simulated Results

Step1:

Initially, three input signals are read, and their spectrogram representation is plotted as time (x-axis) versus amplitude (y-axis).

Fig4. Input signals and their spectrogram representation

The input signal is taken in the range ‘-1’ to ‘+1’; no clipping occurs within this range, so the full fidelity of the signal is preserved. If the signal exceeds this range, the audio is clipped to ‘-1’ or ‘+1’.

Step2:

The three input signals are mixed, and the spectrogram representation of the mixtures is as follows.

Fig5. Mixed input signal

The mixed signal is stored as a ‘3x50000double’ array, that is, three mixtures of 50000 samples each. It is obtained by generating a random matrix and multiplying it with the transpose of the input signals.
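The clipping and mixing steps can be sketched in plain Python as follows. This is an illustration under assumptions: it models the mixing as an instantaneous product X = A S with a random square matrix A, uses very short example signals instead of 50000 samples, and all function names are hypothetical.

```python
import random


def clip(signal, lo=-1.0, hi=1.0):
    """Limit every sample to the range [lo, hi]."""
    return [min(hi, max(lo, s)) for s in signal]


def mix(sources, rng):
    """Mix the sources with a random square matrix A (instantaneous model).

    Returns (A, X), where X[i][t] = sum_j A[i][j] * sources[j][t].
    """
    n = len(sources)
    length = len(sources[0])
    A = [[rng.random() for _ in range(n)] for _ in range(n)]
    X = [
        [sum(A[i][j] * sources[j][t] for j in range(n)) for t in range(length)]
        for i in range(n)
    ]
    return A, X


rng = random.Random(0)  # fixed seed so that the sketch is repeatable
sources = [
    clip([0.2, -1.5, 0.7]),
    clip([0.0, 0.4, 2.0]),
    clip([-0.3, 0.1, 0.9]),
]
A, X = mix(sources, rng)
```

Each row of X is one mixture, so three sources yield three mixtures of the same length as the inputs.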

Step3:

Here the three mixed signals are separated. The higher-order feature parameters are extracted using RKHS, and the permutation ambiguity is resolved by DTW.

 

Fig6. Separation of target signal from the mixed signal

 

Step4:

Here the PESQ is computed between the original and the estimated signals. The PESQ is obtained as a score value, and the score is computed for all three signals.