September news
2024-09-23
1) We have added new piano models. The MVSep Piano model now comes in several variants based on the MDX23C, MelRoformer and SCNet Large neural network architectures. The model produces high-quality separation of music into piano and everything else. See the results in the table below. For comparison, the table shows metrics for the open Demucs4HT (6 stems) model and the old "mdx23c (2023.08)" model. The metric used is SDR: the higher, the better.
Algorithm name | piano (SDR) | other (SDR)
Demucs4HT (6 stems) | 2.23 | 14.51
mdx23c (2023.08) | 4.79 | 17.07
mdx23c (2024.09) | 5.59 | 17.89
MelRoformer (viperx) | 5.67 | 17.95
SCNet Large (2024.09) | 5.89 | 18.16
Ensemble (SCNet + Mel) | 6.19 | 18.47
Listen to: demo, user demos.
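As a reference for how SDR numbers like the ones above are typically computed, here is a minimal sketch in Python/numpy of the basic signal-to-distortion ratio between a reference stem and an estimated stem. The waveforms here are synthetic placeholders, and the exact evaluation protocol used by MVSep may differ.

```python
import numpy as np

def sdr(reference: np.ndarray, estimate: np.ndarray, eps: float = 1e-8) -> float:
    """Basic signal-to-distortion ratio in dB (higher is better)."""
    reference = reference.astype(np.float64)
    estimate = estimate.astype(np.float64)
    num = np.sum(reference ** 2)                 # energy of the reference stem
    den = np.sum((reference - estimate) ** 2)    # energy of the error
    return float(10.0 * np.log10((num + eps) / (den + eps)))

# Hypothetical usage with two mono waveforms of equal length:
rng = np.random.default_rng(0)
ref = rng.standard_normal(44100)                     # stand-in ground-truth piano stem
est = ref + 0.1 * rng.standard_normal(44100)         # stand-in model output with some error
print(round(sdr(ref, est), 2))                       # roughly 20 dB for this synthetic example
```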
2) We have updated our guitar models. A model based on the BSRoformer architecture by viperx has been added. The ensemble has also been updated. It is the one used by default. SDR on our test dataset increased from 7.18 to 7.51.
Listen to: demo, user demos
3) We added a new version of MelBand Roformer for vocals, which showed record results on the Synth dataset. You can select it from the list under the name "Bas Curtiz edition (SDR vocals: 11.18, SDR instrument: 17.49)" in the "MelBand Roformer (vocals, instrumental)" section.
4) We added a new algorithm to the Experimental section: "Apollo MP3 Enhancer (by JusperLee)". This algorithm improves the sound quality of MP3 files compressed with a bitrate of 128 kbps or less. The algorithm is based on the paper "Apollo: Band-sequence Modeling for High-Quality Audio Restoration" and the model is available on huggingface. Below are the spectrograms for the audio compressed to 32 kbps (left) and restored by the new algorithm (right).
Listen to: demo, user demos.
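The original spectrogram images are not reproduced here, so below is a small sketch of how such a before/after comparison can be plotted with librosa and matplotlib. The file names, and the assumption that you already have the restored file on disk, are placeholders rather than part of the MVSep pipeline.

```python
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

def plot_spec(path, ax, title):
    # Load audio at its native sample rate and draw a dB-scaled magnitude spectrogram.
    y, sr = librosa.load(path, sr=None, mono=True)
    S = librosa.amplitude_to_db(np.abs(librosa.stft(y, n_fft=2048)), ref=np.max)
    img = librosa.display.specshow(S, sr=sr, x_axis="time", y_axis="hz", ax=ax)
    ax.set_title(title)
    return img

fig, axes = plt.subplots(1, 2, figsize=(12, 4), sharey=True)
plot_spec("compressed_32kbps.mp3", axes[0], "32 kbps input")        # placeholder file
plot_spec("apollo_restored.wav", axes[1], "Apollo-restored output")  # placeholder file
plt.tight_layout()
plt.show()
```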
5) We added the "Aspiration by Sucial" algorithm. This algorithm extracts whispers from the voice. The algorithm has limited use, but may be useful to someone. The model was published in our open models topic on github and is also available for download on huggingface.
Listen to: demo, user demos.
August Updates
2024-08-21
We have many updates related to vocal models:
1) The BS Roformer (vocals, instrumental) model has been updated. SDR metrics have increased from 11.24 to 11.31 for vocals and from 17.55 to 17.62 for the instrumental part.
2) We have added a new MelBand Roformer (vocals, instrumental) model. The neural network was first proposed in the article "Mel-Band RoFormer for Music Source Separation" by a group of scientists from ByteDance. The first high-quality weights were made publicly available by Kimberley Jensen. Then the neural network was slightly modified and finetuned by the MVSep team in order to improve the quality metrics. SDR for vocals is comparable to BS Roformer: 11.17. SDR for instrumental: 17.48.
3) Due to the new MelBand Roformer model, all algorithms of the Ensemble series have increased the metrics for vocals from 11.33 to 11.50 and for instrumental from 17.63 to 17.81.
4) We have added a new SCNet (vocals, instrumental) model. The neural network was proposed in the article "SCNet: Sparse Compression Network for Music Source Separation" by a group of scientists from China. The authors made the neural network code open source, and the MVSep team was able to reproduce results similar to those presented in the published paper. First we trained a small version of SCNet, and after some time a heavier version was prepared. The quality metrics are quite close to those of the Roformer models (the top models at the moment), but still slightly inferior. SDR metrics for the large version of the network are 10.74 for vocals and 17.05 for the instrumental part.
5) An experimental model for noise removal DeNoise by aufr has been added. The model was prepared and made publicly available by aufr.
All measurements of SDR metrics were carried out on the Multisong dataset.
July updates
2024-07-20
1) We have added the ability to log in to the site via social networks.
2) A new model for drums has been added, which is significantly superior to the old ones. This is an ensemble of HTDemucs and MelRoformer models. The model is available on the website under the name "MVSep Drums (drums, other)".
Below you can find metrics on the MultiSong dataset:
HTDemucs (drums finetuned): 12.04
MelRoformer (drums): 12.76
HTDemucs + MelRoformer: 13.05
These models were also added to the ensemble algorithms, where the metric is even higher: 13.15
The previous best metrics for drums were:
Model HT Demucs (original): 11.24
In ensemble: 11.99
Examples: https://mvsep.com/en/demo?algorithm_id=44
3) We have added new models Bandit v2 for Cinematic source separation. The models divide the track into 3 components “music”, “speech” and “effects/sfx”. The model was trained on the new multilingual dataset Divide and Remaster (DnR) v3.
Examples: https://mvsep.com/en/demo?algorithm_id=45
4) We have added a new model for dividing drums into component parts (DrumSep). This model was prepared by aufr33 and jarredou. It divides the drums into 6 parts: kick, snare, toms, hh, ride, crash. We do not yet have a test dataset to check the quality of such models, so it is difficult to say which of the two available models is better.
Examples: https://mvsep.com/en/demo?algorithm_id=37
5) We have added 2 new models to remove the reverb effect. The models were prepared by anvuew and are based on the MelRoformer and BSRoformer architectures. FoxJoy's previous model was based on the MDX-B architecture and removed reverb from the entire track. The new models remove the reverb effect from vocals only. It is also difficult to say how well the new models work compared to the previous version.
Examples: https://mvsep.com/en/demo?algorithm_id=22
Summer updates
2024-07-01
We have several updates:
1) We have successfully moved to a new server and expect more stable data loading speeds for all users.
2) We have added a new leaderboard for guitar models (includes electric and acoustic): https://mvsep.com/quality_checker/leaderboard/guitar/?sort=guitar
3) We have updated our old guitar model "MVSep Guitar (guitar, other)". Previously, it used the MDX23C architecture. Now there are two versions available: the updated version of MDX23C and MelRoformer. A comparison of quality metrics on the new leaderboard is below:
Algorithm name | guitar (SDR) | other (SDR)
Demucs4HT (6 stems) | 5.22 | 12.19
mdx23c Old (2023.08) | 4.78 | 11.75
mdx23c New (2024.06) | 6.34 | 13.31
MelRoformer (2024.06) | 7.02 | 13.99
Ensemble (mdx23 + MelRoformer) | 7.18 | 14.15
4) We have added a new model, "MVSep Multichannel BS (vocals, instrumental)". This model is specially prepared for extracting vocals from multi-channel audio (5.1, 7.1, etc.). After processing, it returns multi-channel audio in the same format and at the same sample rate as the file sent to the server. We accept multichannel WAV/FLAC as input.
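As a small illustration of the expected input and output, here is a sketch that inspects a multichannel WAV/FLAC file with the soundfile library before uploading and sanity-checks the returned stem afterwards. The 5.1 layout and the file names are assumptions for the example, not requirements of the service.

```python
import soundfile as sf

path = "mix_5_1.flac"                      # placeholder multichannel input file
info = sf.info(path)
print(f"channels={info.channels}, samplerate={info.samplerate}, format={info.format}")

def same_layout(a: str, b: str) -> bool:
    """Check that two files share channel count and sample rate."""
    ia, ib = sf.info(a), sf.info(b)
    return ia.channels == ib.channels and ia.samplerate == ib.samplerate

# The model is described as returning audio in the same format and sample rate,
# so after downloading a (hypothetical) result file one could verify:
# print(same_layout("mix_5_1.flac", "mix_5_1_vocals.flac"))
```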
Move to new server
2024-06-15
We are going to move to a new server within the next week. More stable operation and a higher file upload speed are expected. Previously, many users complained about the low speed; we hope this problem will be resolved after the migration. Please report to us any problems you encounter on the new server.
Updates of bass models and ensembles
2024-05-24
We have updated our bass models. Previously, the best SDR for bass was ~12.05 for the single model HTDemucs4 FT and 12.59 in the ensemble. We have added a new model named "MVSep Bass (bass, other)": an ensemble of two models, a finetuned HTDemucs4 and a BS Roformer trained from scratch. This model has two options: you can extract bass directly from the mixture, or first extract vocals and then extract bass only from the instrumental part.
- SDR for extracting from mixture: 13.25
- SDR for extracting from instrumental: 13.42
We also updated our "Ensemble (vocals, instrum, bass, drums, other)" and "Ensemble All In". Their SDR for bass also increased from 12.59 to 13.44.
Updates of vocal models and ensembles
2024-04-04
1) After the release of the viperx BS Roformer weights, we finetuned them on our dataset and were able to improve their SDRs even further, so we added a new version of the BSRoformer weights. It is currently probably the best model available in the world.
Multisong validation dataset:
SDR vocals: 10.87 -> 11.24
SDR instrum: 17.17 -> 17.55
Synth validation dataset:
SDR vocals: 12.71 -> 13.47
SDR instrum: 12.41 -> 13.17
2) Ensembles also improved:
Ensemble (vocals, instrum) on Multisong dataset:
SDR vocals: 11.06 -> 11.33
SDR instrum: 17.37 -> 17.63
Ensemble (vocals, instrum) on Synth dataset:
SDR vocals: 13.00 -> 13.57
SDR instrum: 12.70 -> 13.27
Ensemble (vocals, instrum, bass, drums, other):
SDR vocals: 11.06 -> 11.33
SDR instrum: 17.37 -> 17.63
SDR bass: 12.57 -> 12.59
SDR drums: 11.94 -> 11.99
SDR other: 7.22 -> 7.33
3) We received reports of some "click" sounds in separated stems. We have improved our inference code, and the clicks should be gone now. Please check and report to us if the problem still exists.
Latest March updates
2024-03-29
1) ViperX released his weights for the BS Roformer model, which separates tracks into vocal and instrumental parts. The separation quality is currently the best available in the world. We added these weights to MVSep. SDR metrics increased compared to our own BS Roformer model.
Multisong dataset:
SDR vocals changed: 10.43 -> 10.87
SDR instrumental changed: 16.73 -> 17.17
Synth dataset:
SDR vocals changed: 12.45 -> 12.76
SDR instrumental changed: 12.16 -> 12.46
2) Based on the new ViperX model, we updated our Ensemble algorithms:
Ensemble (vocals, instrum) on Multisong dataset:
SDR vocals: 10.75 -> 11.06
SDR instrum: 17.06 -> 17.37
Ensemble (vocals, instrum) on Synth dataset:
SDR vocals: 12.76 -> 13.00
SDR instrum: 12.46 -> 12.70
Ensemble (vocals, instrum, bass, drums, other):
SDR vocals: 10.75 -> 11.06
SDR instrum: 17.06 -> 17.37
SDR bass: 12.53 -> 12.57
SDR drums: 11.84 -> 11.94
SDR other: 7.15 -> 7.22
3) We added more functionality to our MVSep API for developers.
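For reference, a rough sketch of what a developer workflow against the separation API can look like with the requests library. The endpoint paths, parameter names (api_token, sep_type, audiofile, hash) and response fields shown here are assumptions for illustration, so consult the official MVSep API documentation before using them.

```python
import time
import requests

API_TOKEN = "YOUR_API_TOKEN"                      # issued in your MVSep account
BASE = "https://mvsep.com/api/separation"         # assumed base path

# 1) Submit a file for separation (parameter names are assumptions).
with open("song.mp3", "rb") as f:
    resp = requests.post(
        f"{BASE}/create",
        data={"api_token": API_TOKEN, "sep_type": 9},   # 9 = hypothetical algorithm id
        files={"audiofile": f},
        timeout=120,
    )
job = resp.json()
job_hash = job["data"]["hash"]                    # assumed response layout

# 2) Poll until the job is finished, then print the result payload with stem URLs.
while True:
    status = requests.get(f"{BASE}/get", params={"hash": job_hash}, timeout=30).json()
    if status.get("status") == "done":            # assumed status field
        print(status)
        break
    time.sleep(10)
```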
February and March updates
2024-03-12
1) We have released a new high-quality model, BS Roformer v2. This is a Transformer-based architecture from the ByteDance team. Its quality metrics are slightly superior to those of MDX23C. The model continues to improve, so expect new releases in the near future. The demo can be viewed here.
2) All ensembles have been updated to take into account BS Roformer v2. The old version of the ensembles also remains available. Ensemble SDR metrics have increased:
Vocals SDR: 10.44 -> 10.75
Instrumental SDR: 16.74 -> 17.06
3) We have added the ability to download an archive of files received after separation.
4) A high-quality model Whisper (large-v3 version) from OpenAI has been added, which allows you to obtain a transcription of a song/dialogue text from arbitrary audio.
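On the site this runs server-side, but for reference, transcription with the open-source Whisper large-v3 checkpoint looks roughly like the sketch below, using the openai-whisper Python package; the audio file name is a placeholder.

```python
import whisper

# Download/load the large-v3 checkpoint (several GB; a GPU is strongly recommended).
model = whisper.load_model("large-v3")

# Transcribe an arbitrary audio file; the language is auto-detected by default.
result = model.transcribe("song.mp3")
print(result["text"])

# Per-segment timestamps are also available:
for seg in result["segments"]:
    print(f'{seg["start"]:7.2f}-{seg["end"]:7.2f}  {seg["text"]}')
```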
January updates
2024-01-13
1. All Ensembles now have the setting "Include intermediate results and max_fft, min_fft". This option outputs the results of each independent algorithm from the ensemble. Since the algorithms work differently, some of them may produce a result that is better than the final ensemble. The min_mag and max_mag outputs also allow you to filter out leaked stems in some cases (a rough sketch of this kind of magnitude filtering follows this list).
2. The Ensemble All-In algorithm now includes the results of the DrumSep algorithm.
3. Very long tracks (15 minutes and more) are now divided into parts and processed on several GPUs at once, so results are obtained faster.
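As a rough illustration of what min/max magnitude filtering over the ensemble members can mean (the actual MVSep implementation is not published here and may differ), the sketch below takes several candidate estimates of the same stem, computes their STFT magnitudes, keeps the element-wise minimum (or maximum) magnitude, and reuses the phase of the first candidate. The file names are placeholders.

```python
import numpy as np
import librosa
import soundfile as sf

def mag_ensemble(waveforms, mode="min", n_fft=2048, hop=512):
    """Combine several estimates of one stem by taking the element-wise min or max
    of their STFT magnitudes; the phase is taken from the first estimate."""
    specs = [librosa.stft(w, n_fft=n_fft, hop_length=hop) for w in waveforms]
    mags = np.stack([np.abs(s) for s in specs])
    combined = mags.min(axis=0) if mode == "min" else mags.max(axis=0)
    phase = np.angle(specs[0])
    return librosa.istft(combined * np.exp(1j * phase),
                         hop_length=hop, length=len(waveforms[0]))

# Hypothetical usage: three model outputs for the same vocal stem at the same sample rate.
names = ["vocals_model_a.wav", "vocals_model_b.wav", "vocals_model_c.wav"]  # placeholders
stems = [librosa.load(n, sr=44100, mono=True)[0] for n in names]
length = min(map(len, stems))
out = mag_ensemble([s[:length] for s in stems], mode="min")  # "min" tends to suppress leakage
sf.write("vocals_min_mag.wav", out, 44100)
```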
Christmas updates
2023-12-29
1) We have added the DrumSep model. This model produces a detailed separation of the drum track into 4 parts: 'kick', 'snare', 'cymbals', 'toms'. The DrumSep model from this github repository is used. The model has two operating modes. The first (default) mode first applies the Demucs4 HT model to the track, which extracts only the drum part; then the DrumSep model is applied. If your track consists only of drums, it makes sense to use the second mode, where the DrumSep model is applied directly to the uploaded audio. Demos are available here.
2) A similar LarsNet model has also been added, which divides the track into 5 parts: 'kick', 'snare', 'cymbals', 'toms', 'hihat'. The model used is from this github repository and was trained on the StemGMD dataset. The model has two operating modes. The first (default) applies the Demucs4 HT model to the track, which extracts only the drum part; then the LarsNet model is used. If your track consists only of drums, it makes sense to use the second mode. Unfortunately, subjectively, its separation quality is inferior to that of the DrumSep model. Demos are available here.
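The two operating modes described above amount to a simple pipeline: optionally isolate the drum bus first, then run the drum-splitting model on it. Below is a sketch of that control flow. The first stage uses the open-source demucs CLI (flags assumed from its documentation; verify against your installed version), and the second stage is a hypothetical placeholder standing in for the DrumSep/LarsNet model, which MVSep runs server-side.

```python
import subprocess
from pathlib import Path

def isolate_drums(mix_path: str, out_dir: str = "separated") -> Path:
    """Mode 1, stage 1: keep only the drum bus using the demucs CLI (assumed flags)."""
    subprocess.run(
        ["demucs", "--two-stems", "drums", "-n", "htdemucs_ft", "-o", out_dir, mix_path],
        check=True,
    )
    track = Path(mix_path).stem
    return Path(out_dir) / "htdemucs_ft" / track / "drums.wav"   # assumed output layout

def split_drum_kit(drums_path: Path) -> dict:
    """Stage 2: hypothetical stand-in for the DrumSep/LarsNet call that returns
    kick/snare/toms/cymbals stems."""
    raise NotImplementedError("placeholder for the drum-splitting model")

# Mode 1 (default): full mix -> drums stem -> kit pieces.
# stems = split_drum_kit(isolate_drums("song.mp3"))
# Mode 2: the upload is already drums-only, so call split_drum_kit directly.
```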
December updates
2023-12-21
1) We added a new model, BandIt Plus, for separating tracks into speech, music and effects. The model can be useful for television or film clips. It was prepared by the authors of the article "A Generalized Bandsplit Neural Network for Cinematic Audio Source Separation" in their repository on GitHub and was trained on the Divide and Remaster (DnR) dataset. At the moment it has the best quality metrics among similar models. You can find demos here.
Quality table (DnR dataset, test):
Algorithm name | SDR Speech | SDR Music | SDR Effects
BandIt Plus | 15.62 | 9.21 | 9.69
2) The code for almost all models has been updated so that the separation quality has slightly increased and the models became faster overall.
3) The Crowd removal model has been updated. It now has better Hollywood laughter removal.
New model for crowd noise removal
2023-11-20
We have prepared a unique model for removing crowd sounds from music recordings (applause, clapping, whistling, noise, etc.). Current metrics on our internal quality-control dataset:
* SDR crowd: 5.65
* SDR other: 19.31
Examples of how the model works can be found here and here.
November updates (MDX23C vocal model improvements)
2023-11-11
We upgraded our main MDX23C 8K FFT model to split tracks into vocal and instrumental parts. SDR metrics have increased on MultiSong Dataset and on Synth Dataset. Separation results have improved accordingly on both Ensemble 4 and Ensemble 8 models. See the changes in the table below.
Algorithm name | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental | MDX23 Leaderboard SDR Vocals
8K FFT, Full Band (Previous version) | 10.17 | 16.48 | 12.35 | 12.06 | 11.04
8K FFT, Full Band (New version) | 10.36 | 16.66 | 12.52 | 12.22 | 11.16
Ensemble 4 (Previous version) | 10.32 | 16.63 | 12.67 | 12.38 | 11.09
Ensemble 4 (New version) | 10.44 | 16.74 | 12.76 | 12.46 | 11.17
The previous version of MDX23C 8K FFT is also available for use.
September updates
2023-09-18
1) We upgraded our main MDX23C 8K FFT model to split tracks into vocal and instrumental parts. SDR metrics have increased on MultiSong Dataset and on Synth Dataset. Separation results have improved accordingly on both Ensemble 4 and Ensemble 8 models. See the changes in the table below.
Algorithm name | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental | MDX23 Leaderboard SDR Vocals
8K FFT, Full Band (Old version) | 10.01 | 16.32 | 12.07 | 11.77 | 10.85
8K FFT, Full Band (New version) | 10.17 | 16.48 | 12.35 | 12.06 | 11.04
2) We have added two new models MVSep Piano (demo) and MVSep Guitar (demo). Both models are based on the MDX23C architecture. The models produce high quality separation of music into piano/guitar part and everything else. Each of the models is available in two variants. In the first variant, the neural network model is used directly on the entire track. In the second variant, the track is first split into two parts, vocal and instrumental, and then the neural network model is applied only to the instrumental part. In the second case, the separation quality is usually a bit higher. We also prepared a small internal validation set to compare the models by the quality of separation of piano/guitar from the main track. Our model was compared with two other models (Demucs4HT (6 stems) and GSEP). For the piano, we have two validation sets. The first set includes the electric piano as part of the piano part and the second set includes only the acoustic piano.
The metric used is SDR: the larger the better. See the results in the two tables below.
Validation type | Demucs4HT (6 stems) | GSEP | MVSep Piano 2023 (Type 0) | MVSep Piano 2023 (Type 1)
Validation full | 2.4432 | 3.5589 | 4.9187 | 4.9772
Validation (only grand piano) | 4.5591 | 5.7180 | 7.2651 | 7.2948
Validation type | Demucs4HT (6 stems) | MVSep Guitar 2023 (Type 0) | MVSep Guitar 2023 (Type 1)
Validation guitar | 7.2245 | 7.7716 | 7.9251
Validation other | 13.1756 | 13.7227 | 13.8762
3) We have updated the MDX-B Karaoke model (demo). It now has better quality metrics. The MDX-B Karaoke model was originally prepared as part of the Ultimate Vocal Remover project. The model produces high quality extraction of the lead vocal part from a music track. We have also made it available in two variants. In the first variant, the neural network model is used directly on the whole track. In the second variant, the track is first divided into two parts, vocal and instrumental, and then the neural network model is applied only to the vocal part. In the second case, the separation quality is usually higher and it is possible to extract backing vocals into a separate track. The model was compared on a large validation set with two other Karaoke models from UVR (they are also available on the website). See the results in the table below.
Validation type | UVR (HP-KAROKEE-MSB2-3BAND-3090) | UVR (karokee_4band_v2_sn) | MDX-B Karaoke (Type 0) | MDX-B Karaoke (Type 1)
Validation lead vocals | 6.46 | 6.34 | 6.81 | 7.94
Validation other | 13.17 | 13.02 | 13.53 | 14.66
Validation back vocals | --- | --- | --- | 1.88
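As the Type 1 description implies, once the lead vocal has been extracted from the full vocal stem, the backing vocals can be estimated as the residual. A minimal sketch of that subtraction is below; the file names are placeholders, and this mirrors the idea rather than MVSep's exact implementation.

```python
import numpy as np
import soundfile as sf

# Placeholders: the full vocal stem from stage 1 and the lead vocal from the karaoke model.
vocals, sr = sf.read("vocals_full.wav")
lead, sr2 = sf.read("vocals_lead.wav")
assert sr == sr2, "stems must share a sample rate"

n = min(len(vocals), len(lead))
backing = vocals[:n] - lead[:n]          # residual = backing vocals estimate

sf.write("vocals_backing.wav", backing.astype(np.float32), sr)
```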
Summer updates
2023-08-08
We have a lot of updates. First of all, we redid the site from scratch. It has new features such as user registration, more informative pages, a better design, etc. We also added a set of new algorithms:
1) We have released the MDX23C models and an update for them. One of the models reached 10 SDR on the Multisong dataset. It is currently the best single model for separating vocals/instrumental.
2) We added a new algorithm, Demucs4 Vocals 2023. It is the htdemucs_ft algorithm finetuned on a big dataset. Its metrics are better than the original's, but slightly worse than MDX23C's. On some melodies it can give cleaner results.
3) We added new Ensemble algorithms. The first is "Ensemble 4 models (vocals, instrum)". It includes UVR-MDX-NET-Voc_FT, Demucs4 Vocals 2023 and two MDX23C models, and gives the highest possible quality for vocal and instrumental stems. If you need a more detailed separation including 3 more stems ("bass", "drums", "other"), you can use "Ensemble 8 models (vocals, bass, drums, other)". This ensemble gives state-of-the-art results for 4-stem separation.
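These ensembles combine the outputs of several models for the same stem. How exactly MVSep weights and mixes them is not spelled out here, but the simplest version of the idea is a plain average of aligned waveforms, sketched below with placeholder file names.

```python
import numpy as np
import soundfile as sf

# Placeholder outputs of individual models for the same vocal stem, same sample rate.
paths = ["vocals_voc_ft.wav", "vocals_demucs2023.wav",
         "vocals_mdx23c_a.wav", "vocals_mdx23c_b.wav"]

stems, sr = [], None
for p in paths:
    y, file_sr = sf.read(p)
    sr = sr or file_sr
    assert file_sr == sr, "all stems must share a sample rate"
    stems.append(y)

n = min(len(y) for y in stems)                        # align lengths before stacking
ensemble = np.mean(np.stack([y[:n] for y in stems]), axis=0)
sf.write("vocals_ensemble.wav", ensemble.astype(np.float32), sr)
```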
You can find comparison tables below (larger SDR is better).
Algorithm name | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental | MDX23 Leaderboard SDR Vocals
Ensemble of 4 models | 10.18 | 16.48 | 12.25 | 11.95 | 10.95
MDX23C, 8K FFT, Full Band | 10.01 | 16.32 | 12.07 | 11.77 | 10.85
UVR-MDX-NET-Voc_FT | 9.64 | 15.95 | 11.40 | 11.10 | 10.50
Demucs4 HT Vocals 2023 | 9.04 | 15.35 | 11.59 | 11.29 | 9.61
Demucs4 HT default (htdemucs_ft) | 8.33 | 14.63 | 10.23 | 9.94 | 9.08
Algorithm name (Multisong dataset) | SDR Bass | SDR Drums | SDR Other | SDR Vocals | SDR Instrumental
Ensemble of 8 models | 12.52 | 11.73 | 6.93 | 10.17 | 16.48
Demucs 4 HT default (htdemucs_ft) | 12.05 | 11.24 | 5.74 | 8.33 | 14.63
New MDX23C models for vocal separation
2023-07-06
* We have released new MDX23C models. They are based on code from kuielab that was prepared for the Sound Demixing Challenge 2023. The resulting models produce output covering the entire frequency spectrum and have the highest quality metrics for vocals and music on the MultiSong Dataset. A total of 4 models are available; by default, the model with the highest quality metrics is used. We are currently working on further improvements of these models. More details...
* A model consisting of an ensemble of several single MDX23C models was also prepared, which gives even better quality. It is available on the website under the title "MDX23C Ensemble".
News
2023-05-22
1. After the last update, the MDX-B algorithm produces only vocals and instrumental. This is because the other 3 stems (bass, drums, other) do not work as well as Demucs4. You can still access the old MDX-B (4 stems) in the Old Models section.
2. We added Kim_vocal_2 model (trained by Kimberley Jensen) and some other UVR MDX models. Kim_vocal_2 is now used by default.
3. We upgraded MDX processing to use overlap=0.8, so it produces a higher SDR. For example, Kim_vocal_2 alone gives 9.60 for vocals and 15.91 for instrumental on the Multisong dataset.
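Overlap here refers to overlapped chunked inference: the track is processed in windows that overlap (80% at overlap=0.8), and the overlapping predictions are averaged, which smooths chunk boundaries and raises SDR at the cost of more compute. The sketch below shows the generic idea, not the actual MDX inference code.

```python
import numpy as np

def chunked_inference(audio: np.ndarray, model, chunk: int = 44100 * 10, overlap: float = 0.8):
    """Run `model(segment) -> segment` over overlapping windows and average the overlaps.
    `model` is any callable mapping a 1-D chunk to a processed chunk of equal length."""
    hop = max(1, int(chunk * (1.0 - overlap)))       # overlap=0.8 -> hop is 20% of the chunk
    out = np.zeros_like(audio, dtype=np.float64)
    weight = np.zeros_like(audio, dtype=np.float64)
    for start in range(0, len(audio), hop):
        seg = audio[start:start + chunk]
        processed = model(seg)
        out[start:start + len(seg)] += processed[:len(seg)]
        weight[start:start + len(seg)] += 1.0
        if start + chunk >= len(audio):
            break
    return out / np.maximum(weight, 1e-8)

# Toy usage: an "identity" model returns its input unchanged, so averaging restores the signal.
audio = np.random.default_rng(0).standard_normal(44100 * 30)
restored = chunked_inference(audio, model=lambda x: x)
print(np.max(np.abs(restored - audio)) < 1e-9)       # True: overlaps averaged back correctly
```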
Reverb effect removal and other news
2023-04-30
1. A new model has been added to the site to remove the reverb effect from music tracks. It is available under the name "FoxJoy Reverb Removal (other)". Examples of reverb removal can be found here.
2. All Demucs4 HT models are now available: htdemucs_ft [quality metrics], htdemucs [quality metrics] and htdemucs_6s [quality metrics]. htdemucs_6s divides the track into 6 parts; in addition to the standard parts, it also includes piano and guitar. These models are the best for getting the bass, drums and other parts of tracks.
3. Added best quality MDX B model for vocal separation: "MDX Kimberley Jensen 2023.02.12 SDR: 9.30 (New)" [quality metrics].
MVSep Vocal Model has been added to the site
2022-11-13
1. Our own MVSep Vocal Model has been added to the site. It was trained on our own large dataset. It shows good results on test data:
Synth dataset vocal SDR: 10.4523
Synth dataset instrumental SDR: 10.1561
MUSDB18HQ dataset vocal SDR: 8.8292
MUSDB18HQ dataset instrumental SDR: 15.2719
2. We added a new model from the Facebook team: Demucs4 Hybrid Transformer.
Experimental MVSep DNR algorithm
2022-07-29
An experimental MVSep DNR algorithm has been added to the site, which divides tracks into 3 parts: music, special effects and voice. The algorithm was trained on the "Divide and Remaster" dataset. Quality Metrics:
SDR DNR for music: 6.17
SDR DNR for sfx: 7.26
SDR DNR for speech: 14.13
The algorithm is not well suited for ordinary music, but it does a good job when you need to clean the speaker's voice from extraneous background noise.
Examples of the MVSep DNR algorithm
July changes on MVSep
2022-07-07
1. We created an independent synthetic dataset to compare different music source separation algorithms. We published the dataset here, as well as an automatic judging system. A leaderboard of the best algorithms is also available.
2. A new MDX-B UVR vocal model was added. It is the latest release from the UVR team. You can choose it when selecting the MDX-B algorithm in the form.
3. New models from Ultimate Vocal Remover based on the Demucs3 architecture were added. They are available under the name UVR Demucs in the algorithm list.
Quality metrics for these algorithms, including UVR Demucs, can be found here.
April changes on MVSep
2022-04-18
1. A new algorithm, Danna Sep, was added. It is the algorithm that took 3rd place on Leaderboard A in the Sony Music Demixing Challenge.
2. A new algorithm, Byte Dance, was added. This algorithm took second place in the vocals category on Leaderboard A in the Sony Music Demixing Challenge. It is trained only on the MUSDB18HQ data and has potential in the future if more training data is added.
Quality metrics for these and other algorithms can be found here.
February changes on MVSep
2022-02-24
1. New UVR models were added: Piano, Bass, Drums and several different Vocal models. The possibility to set aggressiveness was added for the UVR models.
2. New remote GPU servers were added to process the queue. The queue size should decrease.
3. An instrumental stem was added for spleeter (vocals, drums, bass, other) and spleeter (vocals, drums, bass, piano, other).
December changes on MVSep
2021-12-23
1. Added the ability to select lossless encoding of the created audio files. Previously, only MP3 output was possible; now WAV and FLAC output has been added.
2. Added the output of the general instrumental track for all main algorithms: MDX, Demucs3 and Unmix.
3. Added translation of the site into Polish and Indonesian.
4. Added an automatic script to reset the GPU in case of errors. There should no longer be long server downtimes.
Unfortunately, all the highest-quality algorithms work very slowly, so large queues periodically form. We are thinking about what to do with this.
Three big news
2021-11-12
1. We had to move to a new server due to lack of space on the old one. The positive effect is that the video card has been changed to a more powerful one with more memory. As a result, the waiting queues have decreased and there are fewer errors associated with a lack of GPU memory. The downside is that server costs have doubled.
2. A new algorithm has been added: Ultimate Vocal Remover (UVR). It splits the track into two parts, music and vocals, and usually does it better than spleeter. There are a lot of models and different settings in the original UVR; we have chosen one of the best models and optimal settings. Perhaps a flexible choice of settings for the algorithm will be added later.
3. The winner of the Music Demixing Challenge has finally released his code. We added his models to the site under the names Demux3 Model A and Demux3 Model B. Demux3 Model B gives a better result and works better for bass and drums compared to the other models, but is slightly inferior in vocals to the MDX-B algorithm.
Below is an updated table comparing the quality of the algorithms (data for UVR are not available). The values in the table are calculated on the private Music Demixing Challenge dataset (available only to the organizers). The higher the value, the better the algorithm works.
Algorithm | Quality (Bass) | Quality (Drums) | Quality (Other) | Quality (Vocals) | Example
Spleeter (4 stems) | 5.774 | 5.845 | 4.321 | 6.939 | Example
UmxXL | 6.619 | 6.838 | 4.891 | 7.732 | Example
MDX A | 7.232 | 7.173 | 5.636 | 8.901 | Example
MDX B (Orig) | 7.495 | 7.554 | 5.533 | 8.896 | ---
MDX B (UVR) | 7.495 | 7.554 | 5.533 | 9.482 | Example
Ultimate Vocal Remover HQ | --- | --- | --- | --- | Example
Demucs 3 Model A | 8.115 | 8.037 | 5.193 | 7.968 | Example
Demucs 3 Model B | 8.856 | 8.850 | 5.978 | 8.756 | Example
Two new algorithms: MDX A and MDX B
2021-10-19
Two new algorithms for separating tracks have been added to mvsep.com: MDX A and MDX B. These models were created by the participants who took second place in the Music Demixing Challenge. Their solution code and neural network models were made publicly available. We are still waiting for the first-place solution. Even these models significantly outperform Spleeter and UmxXL in competition metrics (see the table above), but they are slower. MDX A differs from MDX B in that the first algorithm did not use external data for training, so its results are slightly worse than MDX B's. Later, the enthusiasts of the UVR project improved the vocal separation model, getting a better value for the quality metric (8.896 -> 9.482).
Several useful updates at mvsep.com
2021-08-30
* Updated software and site code. Splitting tracks is faster and more stable, and backend crashes are less and less common.
* Added a new splitting algorithm called UnMix. The algorithm has 4 models: "umxXL", "umxHQ", "umxSD", "umxSE". The highest quality is the first, "umxXL". According to the first tests, vocals separate a little worse than with spleeter, but instruments are better. In any case, a large field is now open for experimenting with tracks.
* The page with the split results has been redesigned: the original track has been added, so it is convenient to compare everything from one page. Information on sharing settings has been added, and the page displays information about the uploaded file, ID3 tags and an image (if any).
Examples of separation based on the new algorithm:
umxXL: Monk Turner Fascinoma - Its Your Birthday
umxHQ: Robin Grey - These Days
umxSD: Brad Sucks - Total Breakdown
umxSE: Paper Navy - Swan Song
And finally, some statistics. About 600-750 tracks are separated on the site per day, and more than 300,000 tracks have been split in total. We are moving towards a million.